"opinionated and limited" in what way? I'm also curious to hear how AU differs appreciably from VST3, mostly at a conceptual level (as I'm already familiar with the low-level details of both APIs).

For instance, VST3 does not support many-to-one sound processing architectures, and only awkwardly supports two-to-one -- via second-class "side-chaining". But the need for such possibilities may not be clear to those who only know the one-to-one audio flow architecture ubiquitous in digital audio workstation ("DAWWWWWWW") designs. VST3 neatly fits into that architecture; in 2025 I don't think more of it is particularly innovative. In contrast, the AudioUnit spec is open-ended and has an internal graph-based audio flow design that can readily accommodate other signal processing architectures. If you don't want to think outside the "DAW", you don't have to, but some of us musicians do.

> For instance VST3 does not support many to one, and only awkwardly supports two to one -- via second-class "side-chaining" -- sound processing architectures

This is a limitation of your host and plugins, not of VST3: plugins can declare arbitrarily many input/output buses for audio and events, which allows many-to-many connections. It's just that in practice hosts don't like this, and JUCE has a busted interface for it.
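As a rough illustration, this is the shape of how a VST3 processor declares extra buses in its `initialize()`. The identifiers are from memory of the Steinberg SDK (and `MyProcessor` is a hypothetical subclass of `AudioEffect`), so treat exact names as approximate rather than copy-paste ready:

```
// Sketch only -- names approximate the Steinberg VST3 SDK, MyProcessor is assumed
// to derive from Steinberg::Vst::AudioEffect.
tresult PLUGIN_API MyProcessor::initialize (FUnknown* context)
{
    tresult result = AudioEffect::initialize (context);
    if (result != kResultOk)
        return result;

    // Main stereo in/out, plus an auxiliary (side-chain) input and a second output bus.
    addAudioInput  (STR16 ("Main In"),   Steinberg::Vst::SpeakerArr::kStereo);
    addAudioInput  (STR16 ("Sidechain"), Steinberg::Vst::SpeakerArr::kStereo, Steinberg::Vst::kAux);
    addAudioOutput (STR16 ("Main Out"),  Steinberg::Vst::SpeakerArr::kStereo);
    addAudioOutput (STR16 ("Wet Only"),  Steinberg::Vst::SpeakerArr::kStereo);

    // Event (MIDI) buses can be multiplied the same way.
    addEventInput  (STR16 ("MIDI In"), 16);

    return kResultOk;
}
```

Whether a host actually surfaces the aux input or the extra output to the user is entirely up to the host, which is where the practical trouble starts.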

This is really interesting – AU's notion of having separate input and output "elements" (buses, more or less) is one of the worst parts of the whole API.

I understand why these design decisions were made historically, but they don't really enable any functionality that the other APIs lack. It's just that since the host can call `Render` more than once per sample block (ideally it would call it exactly once per sample block per element, but nothing stops it from calling redundantly), there's extra bookkeeping the plugin has to do around the `AudioTimeStamp`. And for what? There's nothing AU can do that the other formats can't.

If a plugin has multiple fully independent buses, the model mostly works, but if there's any interdependence then things get even more complicated. Say you have two separate stereo elements that don't have any routing between them, but there's MIDI input or sample-accurate parameter changes for the given block. Now you have to keep those events around until the host has moved on to the next block, which means the plugin has to maintain even more internal buffering. This sort of lifetime ambiguity was one of the worst parts of VST2. In VST2, how long does MIDI input last? "Until the next call to `process()`." In AUv2, how long do MIDI input data or parameter change events last? "All of the render calls of the next timestamp, but then not the timestamp after that." Great, thanks.

Modern plugins, upon receiving a `Render` call for a new timestamp, will just render all of their elements at once, internally buffer all the outputs, and then copy the right buffer out on each per-element `Render` call. So it reduces to the same thing the other APIs do, just with more pageantry.
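A minimal sketch of that caching pattern, with made-up types and a hypothetical `renderAllElements()` standing in for the plugin's actual DSP and event handling:

```
#include <algorithm>
#include <cstdint>
#include <vector>

// Made-up types for illustration; no particular SDK.
struct ElementCache {
    double cachedSampleTime = -1.0;            // timestamp we last rendered for
    std::vector<std::vector<float>> outputs;   // one cached buffer per output element
};

// Hypothetical: pulls inputs, consumes the queued MIDI/parameter events for this
// timestamp, and fills every entry of cache.outputs.
void renderAllElements (ElementCache& cache, double sampleTime, uint32_t numFrames);

void renderElement (ElementCache& cache, double sampleTime,
                    uint32_t element, uint32_t numFrames, float* dest)
{
    if (sampleTime != cache.cachedSampleTime) {
        // First Render call for this timestamp: do all the real work once.
        renderAllElements (cache, sampleTime, numFrames);
        cache.cachedSampleTime = sampleTime;
    }
    // Every per-element call (including redundant ones) is just a copy out of the cache.
    std::copy_n (cache.outputs[element].data(), numFrames, dest);
}
```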

And yet, plugin instances having to manage their own input connection types is somehow even worse. Again, I understand why it was done this way – allowing plugins to "pull" their own inputs lets an audio graph basically run itself with very little effort from the host – it can just call `Render` on the final node of the chain, and all of the inputs come along for free.
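For anyone who hasn't seen it, the host side of that pull model looks roughly like this. `AudioUnitRender` and `kAudioUnitProperty_MakeConnection` are the real CoreAudio names; the surrounding setup (wiring the chain, owning the buffer list) is assumed to have happened elsewhere:

```
#include <AudioToolbox/AudioToolbox.h>

// Render one block by asking only the *last* unit in the chain for audio. Because each
// unit's input element was wired up with kAudioUnitProperty_MakeConnection, that unit
// pulls its upstream unit(s) by calling AudioUnitRender on them, and so on up the chain.
OSStatus pullOneBlock (AudioUnit lastUnitInChain, AudioTimeStamp& ts,
                       UInt32 numFrames, AudioBufferList* buffers)
{
    AudioUnitRenderActionFlags flags = 0;
    OSStatus err = AudioUnitRender (lastUnitInChain, &flags, &ts,
                                    0 /* output element */, numFrames, buffers);
    ts.mSampleTime += numFrames;   // advance the timeline for the next block
    return err;
}
```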

It's a compelling value proposition, but unfortunately it completely prevents any sort of graph-level multithreading. Modern hosts do all sorts of graph theory to work out which parts of the graph can be executed in parallel, which means the host has to be in charge of deciding which plugins to render and when. Even Logic does this now. The "pull model" of an AU instance directly calling `Render` on its input connections is a relic of the past.
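To make the "graph theory" bit concrete, the core of what such a host does is dependency-counted scheduling. Here's a toy single-threaded version with made-up types; in a real host the currently-ready nodes get handed to a worker pool instead of processed in a loop:

```
#include <cstddef>
#include <queue>
#include <vector>

// Toy graph: node i depends on the nodes listed in inputs[i].
struct Graph {
    std::vector<std::vector<size_t>> inputs;     // upstream node indices per node
    std::vector<std::vector<size_t>> consumers;  // downstream node indices per node
};

// Process one block in dependency order. Any node whose inputs have all finished is
// "ready"; a multithreaded host renders all currently-ready nodes concurrently.
template <typename RenderNode>
void processBlock (const Graph& g, RenderNode renderNode)
{
    std::vector<size_t> pending (g.inputs.size());
    std::queue<size_t> ready;
    for (size_t n = 0; n < g.inputs.size(); ++n) {
        pending[n] = g.inputs[n].size();
        if (pending[n] == 0) ready.push (n);     // source nodes can start immediately
    }
    while (!ready.empty()) {
        size_t n = ready.front(); ready.pop();
        renderNode (n);                          // host calls the plugin's process() here
        for (size_t c : g.consumers[n])
            if (--pending[c] == 0) ready.push (c);
    }
}
```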

Anyway. VST3, CLAP, even VST2 support multiple input and output buses (hell, one of my plugins has multiple output buses for "main out", "only transients", and "everything other than a transient") – it's just a question of host support and how they're broken out. Ironically, Logic is one of the clunkiest implementations of multi-out I've seen (Bitwig is far and away the best).

As someone who uses audio plugins but doesn't develop them, I'm not familiar with the differences between these two at all. What are the main differences, and why is OP claiming that there are far better ways of doing this?

The short answer is that there really aren't. All extant audio plugin APIs/formats are basically ways of getting audio data into and out of a `process()` (sometimes called `Render`) function which gets called by the host application whenever it needs more audio from the plugin.
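Stripped of the per-format details, that shared core looks something like this (all names made up for illustration):

```
#include <cstdint>

// The common denominator of every plugin format: the host hands the plugin input
// buffers and asks it to fill output buffers, one block at a time, on the audio thread.
struct ProcessContext {
    const float* const* inputs;    // [channel][frame]
    float* const*       outputs;   // [channel][frame]
    uint32_t numChannels;
    uint32_t numFrames;
};

// Called by the host whenever it needs the next block of audio from the plugin.
void process (ProcessContext& ctx)
{
    for (uint32_t ch = 0; ch < ctx.numChannels; ++ch)
        for (uint32_t i = 0; i < ctx.numFrames; ++i)
            ctx.outputs[ch][i] = ctx.inputs[ch][i] * 0.5f;   // e.g. a fixed 6 dB gain cut
}
```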

Every API has its own pageantry, not just around the details of calling `process()` but also around exposing and manipulating things like parameters, bus configuration, state management, MIDI/note I/O, etc. There are differences in all of these (sometimes big ones), but no real crazy outliers.

At the end of the day, a plugin instance is a sort of "object", right? And the host calls methods on it. How the calls look varies considerably:

- VST2 prescribes a big `switch()` statement in a "dispatcher" function, with different constants for the "command" (or "method", more or less).
- VST3 uses COM-like vtables and `QueryInterface`.
- CLAP uses a mechanism where a plugin (or host) is queried for an "extension" (identified by a string), and the query returns a const pointer to a vtable (or NULL if the host/plugin doesn't support the extension).
- AudioUnits has some spandrels of the old Mac "Component Manager", including a `Lookup` method for mapping constants to function pointers (kind of similar to VST2, except it returns a function rather than dispatching to it directly), and then AU also has a "property" system with getters and setters for things like bus configuration, saving/loading state, parameter metadata, etc.
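Two of those styles in miniature, with entirely made-up types and opcodes (nothing here is a real SDK header):

```
#include <cstdint>
#include <cstring>

struct Plugin { float sampleRate = 44100.0f; };   // stand-in for a real plugin instance

// VST2-style: one "dispatcher" entry point, keyed by an opcode constant.
enum Opcode : int32_t { kSetSampleRate = 10, kGetNumParams = 11 };

intptr_t dispatcher (Plugin* p, int32_t opcode, intptr_t value, void* ptr, float opt)
{
    switch (opcode) {
        case kSetSampleRate: p->sampleRate = opt; return 0;   // host pushes a new rate
        case kGetNumParams:  return 3;                        // host asks a question
        default:             return 0;                        // unknown opcodes are ignored
    }
}

// CLAP-style: the host asks for an "extension" by string id and gets back a const
// vtable of function pointers, or nullptr if the plugin doesn't implement it.
struct AudioPortsExt { uint32_t (*count) (const Plugin*, bool isInput); };

static uint32_t audioPortsCount (const Plugin*, bool) { return 1; }
static const AudioPortsExt audioPortsVtable { audioPortsCount };

const void* getExtension (const Plugin*, const char* id)
{
    if (std::strcmp (id, "demo.audio-ports") == 0) return &audioPortsVtable;
    return nullptr;   // extension not supported
}
```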

I'm not sure why OP is claiming that AU is somehow unopinionated or less limited. It doesn't support any particular extensibility that the other formats don't also offer.

Is there some software to convert VST3 to AU? Or do you develop both separately?

Someone else said

> Almost all VST plugins have an AU version (like 80%-90% or so, and 99% of the major ones).

Which I noticed as well. I wondered whether supporting both requires a large time investment, or whether there are API translation layers available.

Historically there was an API translation layer that wrapped VST2 plugins as AUs (called Symbiosis), but these days the vast majority of plugin devs use a framework like JUCE, which has multiple "client-side" API implementations (VST2, VST3, AU, etc.) that all wrap around JUCE's native class hierarchy.
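The shape of that arrangement, in a made-up miniature (the real JUCE hierarchy is far bigger, but the idea is the same: one core class the author writes, thin per-format shims the framework provides):

```
#include <cstdint>
#include <vector>

// The framework's "native" processor class -- the only thing the plugin author writes.
struct CoreProcessor {
    void process (float* const* channels, uint32_t numChannels, uint32_t numFrames)
    {
        for (uint32_t ch = 0; ch < numChannels; ++ch)
            for (uint32_t i = 0; i < numFrames; ++i)
                channels[ch][i] *= 0.5f;          // trivial in-place gain, for illustration
    }
};

// Made-up stand-ins for two formats' buffer conventions.
struct FormatABuffers { float** channelPointers; uint32_t numChannels; uint32_t numFrames; };
struct FormatBBuffer  { float* interleaved;      uint32_t numChannels; uint32_t numFrames; };

// Shim 1: format A already hands us one pointer per channel -- pass straight through.
void formatAProcess (CoreProcessor& p, FormatABuffers& b)
{
    p.process (b.channelPointers, b.numChannels, b.numFrames);
}

// Shim 2: format B hands us interleaved audio, so the shim de-interleaves, processes,
// and re-interleaves. Real wrappers do this kind of translation (plus parameter and
// state plumbing), and would preallocate the scratch space rather than allocate here.
void formatBProcess (CoreProcessor& p, FormatBBuffer& b)
{
    std::vector<std::vector<float>> scratch (b.numChannels, std::vector<float> (b.numFrames));
    std::vector<float*> ptrs (b.numChannels);
    for (uint32_t ch = 0; ch < b.numChannels; ++ch) {
        ptrs[ch] = scratch[ch].data();
        for (uint32_t i = 0; i < b.numFrames; ++i)
            scratch[ch][i] = b.interleaved[i * b.numChannels + ch];
    }
    p.process (ptrs.data(), b.numChannels, b.numFrames);
    for (uint32_t ch = 0; ch < b.numChannels; ++ch)
        for (uint32_t i = 0; i < b.numFrames; ++i)
            b.interleaved[i * b.numChannels + ch] = scratch[ch][i];
}
```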

There are a few other frameworks floating around (Dplug for writing in D, a few others in C++), but JUCE is far and away the most common.

Me three, I'd like to know. I'm a producer/mixer who favors AU over VST3 plugins -- not for any opinionated reason, merely because in my experience they're slightly less error-prone in my DAW.