CLAP doesn't allow describing a plugin in a manifest (like VST3 and LV2 do). A manifest lets hosts scan for plugins faster.

Also, CLAP uses 3 or 4 methods to represent MIDI data (MIDI1, MIDI1 + MPE, MIDI2, CLAP events). This requires writing several converters when implementing a host.

But CLAP is much simpler and doesn't use a COM-like system (VST3 resembles a Windows COM library, with endless interfaces and GUIDs).

Also, VST3 interfaces in the SDK are described as C++ classes with virtual functions (example: [1]), and I wonder how they achieve portability, given that the layout of such classes (vtables) is not standardized and may differ across compilers.

[1] https://github.com/steinbergmedia/vst3_pluginterfaces/blob/3...

>Also, CLAP uses 3 or 4 methods to represent MIDI data (MIDI1, MIDI1 + MPE, MIDI2, CLAP events)

Contrast with VST3, which doesn't support MIDI at all, unless you count creating thousands of dummy parameters hardcoded to MIDI controller numbers as "support."

VST3 uses proprietary events for things like note on/off and note expressions. As for MIDI controllers, the host is supposed to convert them to parameter changes.

This makes sense if you want to map a controller to a plugin parameter in a DAW. However, if you want to write a "MIDI effect", which transforms incoming MIDI data for controllers, it would be difficult.

Also, it is interesting that VST3 has an event for note expression and a separate event for polyphonic pressure, although the latter can be considered a note expression.

> As for MIDI controllers, the host is supposed to convert them to parameter changes

And nearly everyone except Steinberg considers this to be a mistake. MIDI messages (CCs, pitch bend, and so on) are _not_ parameters.

As I understand it, this is better because it gives more freedom in routing parameter values. A parameter value might come not only directly from a MIDI controller knob, but also from an automation curve, from another plugin's output, or from an automation track's output. For example, you might combine two automation curves through a multiplication plugin and route the output to a plugin's parameter input.

Or you could have an automation curve that produces a sine wave, and have a MIDI knob to modulate its amplitude, and send the result to an LPF "cutoff frequency" input, so that you can control the amount of modulation with a knob.

So VST3 (and CLAP) treat each parameter as an additional plugin input, which can be routed arbitrarily.

If a plugin instead had a single MIDI input and extracted controller changes from the MIDI stream, the scenario above would be harder to implement (you would need to convert the output of the multiplication plugin into a sequence of MIDI controller changes and merge it with the other MIDI events).

Am I missing something here? I've never developed a single plugin.

So there are a couple of problems here.

The first is that VST3 does not exist in a vacuum. Plugin developers need to ship AU (and previously VST2 concurrently with VST3) plugins. MIDI is still the basic abstraction used by higher-level wrappers for sending events to plugins, so in practice, all VST3 events get converted to MIDI anyway (for the majority of plugins).

The second thing is that parameter values are special. You do not actually want the modulation features you are talking about to touch the parameter value; you want them to be used as modulation sources on top of the parameter value. Most synths, for example, treat MIDI CC and pitch bend as modulation sources for voice pitch or other parameters (programmable by the user), not as direct controls of parameters. Keep in mind that parameters are typically serialized and saved in presets.

The third thing is that parameters are, in practice, not allowed to be dynamic. So if you want to use a MIDI controller as a modulation source, you need a dedicated parameter for it. You cannot dynamically add and remove parameters based on user input.

As an example:

> Or you could have an automation curve that produces a sine wave, and have a MIDI knob to modulate its amplitude, and send the result to an LPF "cutoff frequency" input,

This is not possible in VST3, in any host I'm aware of. You could not output parameters from one plugin and route them to another until 3.8, and I highly doubt anyone will support this.

VST3 is less flexible and more surprising in how it handles MIDI, which is why, historically, VST3 plugins have had very poor MIDI support.

I think you can find something similar in a modular synth: you can have an envelope generator that generates a control voltage and connect it to a filter's cutoff frequency input.

I don't care much how it is done in other DAWs; I just tried to design a routing system for a DAW on paper. I needed a concept that would be simple yet powerful. So I thought we could define three types of data: audio data, MIDI data, and automation (parameter) data. Every plugin may have inputs/outputs of these types, so if a plugin has 20 parameters, we can count them as 20 inputs. And we can connect inputs and outputs however we want (as long as the types match).

Of course, we also have three types of clips: audio, MIDI, and automation curve clips. And obviously we can process this data with plugins, so we can take a MIDI clip, connect it to a plugin that generates envelopes for incoming MIDI notes, and connect its output to the cutoff frequency of a filter. Why not?

Technically it is possible to process parameter data; we just have to deal with converting between different formats: some plugins might have "control voltage" inputs/outputs, while others allow changing parameters at sample- or block-precise points. And here VST3, which has a defined model for parameter interpolation, is easier to deal with than plugin formats that do not define an exact interpolation formula.

By the way, I just noticed a serious weakness in my model: it doesn't support per-note parameters/controllers; all parameters are global in my concept. Guess I need to think more.

Your point about modulating parameters is valid; however, I am not sure whether it is better to implement modulation in the host and have full control over it (what do we do if a user moves a knob while automation is playing?) or to have every plugin developer implement it themselves (as in CLAP, which supports parameter modulation).

> This is not possible in VST3,

I think it is possible - the plugin gets a sequence of parameter changes from the host and doesn't care where they come from. As I remember, plugins may also have output parameters, so it is possible to process parameter data using plugins.

So your paper design is most similar in spirit to CLAP. I would say that the actual audio/event processing bits are the "easy" part of the API.

> And here VST3, which has a defined model for parameter interpolation, is easier to deal with than plugin formats that do not define an exact interpolation formula.

So I'll just reiterate that this is not true for either plugin or host developers and that's not a minority opinion. The parameter value queue abstraction is harder to implement on both sides of the API, has worse performance, and doesn't provide much in benefit over sending a sparse list of time-stamped events and delegating smoothing to the plugin.

> As I remember, plugins may also have output parameters, so it is possible to process parameter data using plugins.

The host forwards those output parameters back to that plugin's editor, not to other plugins. You use this as a hack to support metering, although in practice since this is a VST3 quirk, few people do it. Until 3.8.0 which added the IMidiLearn2 interface there was no way to annotate MIDI mappings for output parameters, which caused hosts to swallow MIDI messages even if they should be forwarded to all plugins. I doubt that the new interface will be implemented consistently by hosts, and now there's a problem where old plugins may do the wrong thing in new versions of hosts that expect plugins to be updated (this is catastrophic behavior for audio plugins - you never want a version update to change how they behave, because it breaks old sessions). There's also no good way to consistently send what are effectively automation clips out of plugins, since the plugin does not have a view into the sequencer.

And most importantly, plugins aren't aware of other plugins. If one plugin outputs a parameter change, it is meaningless to another plugin. Maybe if both plugins implement IMidiMapping2 the host can translate the output parameter change into a MIDI event and then into another parameter change. Sounds a lot stupider than just sending discrete MIDI events.

Essentially, the design of parameters in VST3 is fragile and bad.

You are right, though, that many DAWs do not allow routing and processing parameter data (control voltages). I notice that some plugins implement this internally, especially synths, so you can draw curves and process them inside the plugin, but I think it would be better if you didn't need every plugin developer to add a curve editor and modulators.

> CLAP doesn't allow describing a plugin in a manifest (like VST3 and LV2 do). A manifest lets hosts scan for plugins faster.

VST3 only recently gained the `moduleinfo.json` functionality and support is still materialising. Besides, hosts generally do a much better job about only scanning new plugins or ones that have changed, and hosts like Bitwig even do the scanning in the background. The manifest approach is cool, but in the end, plugin DLLs just shouldn't be doing any heavy lifting until they actually need to create an instance anyway.

> Also, CLAP uses 3 or 4 methods to represent MIDI data (MIDI1, MIDI1 + MPE, MIDI2, CLAP events). This requires writing several converters when implementing a host.

I've not done the host-side work, but the plugin-side work isn't too difficult. It's the same data, just represented differently. Disclaimer: I don't support MIDI2 yet, but I support the other 3.

On the other side, VST3 has some very strange design decisions that have led me to a lot of frustration.

Having separate parameter queues for sample-accurate automation requires plugins to treat their parameters in a very specific way (basically, you need audio-rate buffers for your parameter values that are as long as the maximum host block) in order to be written efficiently. Otherwise plugins basically have to "flatten" those queues into a single queue and handle them like MIDI events, or alternately just not handle intra-block parameter values at all. JUCE still doesn't handle these events at all, which leads to situations where a VST2 build of a JUCE plugin will actually handle automation better than the VST3 build (assuming the host is splitting blocks for better automation resolution, which all modern hosts do).

duped's comment about needing to create "dummy" parameters which get mapped to MIDI CCs is spot-on as well. JUCE does this. 2048 additional parameters (128 controllers * 16 channels) just to receive CCs. At least JUCE handles those parameters sample-accurately!

There are other issues too, but I've lost track. At one point I sent a PR to Steinberg fixing a bug where their VST3 validator (!!!) was performing invalid (according to their own documentation) state transitions on plugins under test. It took me weeks to get the VST3 implementation in my plugin framework to a shippable state, and I still find more API and host bugs than I ever hit in VST2. VST3 is an absolute sprawl of API "design" and there are footguns in more places than there should be.

By contrast, CLAP support took me around 2 days, 3 if we're being pedantic. The CLAP API isn't without its warts (the UI extension in particular should be clearer about when and how a plugin is supposed to actually open a window), but it's succinct and well-documented, the warts are surmountable, and anecdotally I have only had to report one (maybe two) host bugs so far.

Again, disclaimer: I was involved in the early CLAP design efforts (largely the parameter extension) and am therefore biased, but if CLAP sucked I wouldn't shy away from saying it.

> Having separate parameter queues for sample-accurate automation requires plugins to treat their parameters in a very specific way (basically, you need audio-rate buffers for your parameter values that are as long as the maximum host block) in order to be written efficiently.

Oh, I forgot about parameters. In VST3, parameter changes use linear interpolation. So the DAW can predict how the plugin will interpret parameter values between changes, and use this to create the best piecewise-linear approximation of an automation curve (rather than merely sampling the curve every N samples uniformly, which is not perfect).

CLAP has no defined interpolation method, so every plugin would interpolate the values in its own unique and unpredictable way (and if you don't interpolate, there might be clicks). That makes it more difficult for a host to create an approximation of an automation curve. So with CLAP, "sample-precise" might not actually be sample-precise.

I didn't find anything about interpolation in the spec, but it mentions interpolation for note expressions [1]:

> A plugin may make a choice to smooth note expression streams.

Also, I thought that maybe CLAP should have used the same event for parameters and note expressions? Aren't they very similar?

> duped's comment about needing to create "dummy" parameters which get mapped to MIDI CCs is spot-on as well. JUCE does this. 2048 additional parameters (128 controllers * 16 channels) just to receive CCs. At least JUCE handles those parameters sample-accurately!

What is the purpose of this? Why does a plugin (unless it is a MIDI effect) need values for all controllers? Also, MIDI 2 has more than 128 controllers anyway, so this is a poor solution.

[1] https://github.com/free-audio/clap/blob/main/include/clap/ev...

> Oh, I forgot about parameters. In VST3, parameter changes use linear interpolation. So the DAW can predict how the plugin will interpret parameter values between changes, and use this to create the best piecewise-linear approximation of an automation curve (rather than merely sampling the curve every N samples uniformly, which is not perfect).

Can you link to any code anywhere that actually correctly uses the VST3 linear interpolation code (other than the "again_sampleaccurate" sample in the VST3 SDK)? AU also supports "ramped" sample-accurate parameter events, but I am not aware of any hosts or plugins that use this functionality.

> CLAP has no defined interpolation method, so every plugin would interpolate the values in its own unique and unpredictable way (and if you don't interpolate, there might be clicks). That makes it more difficult for a host to create an approximation of an automation curve. So with CLAP, "sample-precise" might not actually be sample-precise.

Every plugin does already interpolate values on its own. It's how plugin authors address zipper noise. VST3 would require plugin authors to sometimes use their own smoothing and sometimes use the lerped values. Again, I'm not aware of any plugins that actually implement the linear interpolated method. I think Melda? It certainly requires both building directly on the VST3 SDK and also using the sample-accurate helpers (which only showed up in 2021 with 3.7.3).

Anyway, I maintain that this is a bad design. Plugins are already smoothing their parameters (usually with 1 pole smoothing filters) and switching to this whole interpolated sample accurate VST3 system requires a pretty serious restructuring.

Personally, I would have loved having a parameter event flag in CLAP indicating whether a plugin should smooth a parameter change or snap immediately to it (for better automation sync). Got overruled, oh well.

> What is the purpose of this? Why does a plugin (unless it is a MIDI effect) need values for all controllers? Also, MIDI 2 has more than 128 controllers anyway, so this is a poor solution.

Steinberg has been saying exactly this since 2004 when VST3 was first released. Time and time again, plugin developers say that they do need them. For what? Couldn't tell you, honestly. In my case, I would have to migrate a synth plugin from MPE to also be able to use the VST3 note expressions system, and I absolutely cannot be bothered - note expressions look like a nightmare.

And this is the chief problem with VST3. The benefits are either dubious or poorly communicated, and the work required to implement these interfaces is absurd. Again: 3 days to implement CLAP vs. 3 weeks to implement VST3, and I'm still finding VST3 host bugs routinely.

> 2048 additional parameters (128 controllers * 16 channels) just to receive CCs.

It's worth mentioning that in MIDI 2 it's 2 x 16 x 16,384, plus the 128 x 16 from MIDI 1, because you've got to support both.

But to quote Steinberg devs, "plugins shouldn't handle MIDI CC at all"

They are COM classes. The vtable layout for them is specified.

I don't think GCC has a special case for handling COM classes. However, I found that GCC uses the Itanium C++ ABI on Linux, which specifies a vtable layout that happens to match the layout of COM classes. However, it is not guaranteed (for example, by the C++ standard) that other compilers use the same layout.

The ABI is stable everywhere VST3s are used. It has to be or nothing would work.

Everything would work except VST3, if it were written according to the standards.

The ABI isn't covered by C++ standards, it's target and architecture dependent. For the purposes of this discussion that ABI is stable for C++ vtables on the targets and architectures that VST3 supports.

If a compiler and linker don't follow those ABIs then it would also be close to useless for compiling or linking against shared libraries. So in the wild, all useful compilers do target the same ABIs.

GCC in MinGW on Windows is the odd duck, but most production software does not support it anyway.

> If a compiler and linker don't follow those ABIs then it would also be close to useless for compiling or linking against shared libraries.

I guess in C++ you are not supposed to link libraries produced by different compilers? Maybe you should use C-compatible interfaces in this case?

You are, you can, and people do. Sure, you should use C interfaces; that's what CLAP does, and it's easier to understand as a result.

The C standard similarly does not specify an ABI.

Not really, VST3's COM-like API just uses virtual methods, they don't guarantee layout to the same degree actual COM does with compiler support. They simply rely on the platform ABI being standardized enough.