As a complete audio outsider, my observations are:
1. Great news! VSTs seem to fill an important role in the audio-processing software world, and having them more open must be a good thing.
2. From the things they mention, the SDK seems way larger than I had imagined, but that is normal for (software) things, I guess. "This API also enables the scheduling of tasks on the main thread from any other thread." was not easy to unpack, nor could I see the use of it in what was (to me) an audio-generation-centered API.
3. The actual post seems to be somewhat mangled: I see both proper inline links and what look like naked Markdown links, as well as bolded words that still have double asterisks around them. Much confusing.
> the SDK seems way larger than I had imagined, but that is normal for (software) things, I guess. "This API also enables the scheduling of tasks on the main thread from any other thread." was not easy to unpack, nor could I see the use of it in what was (to me) an audio-generation-centered API
VST plugins almost all have a GUI, thus the VST SDK has to support an entire cross-platform UI framework... This threading functionality is mostly about shipping input events and rendering updates back and forth to the main (UI) thread.
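To make that concrete, here's a minimal sketch of what such a cross-thread dispatch facility typically looks like - this is illustrative only, not the actual VST3 interface. Any thread can queue work; the host drains the queue on the main thread:

```cpp
#include <functional>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical dispatcher: lets any thread schedule work on the main thread.
class MainThreadDispatcher {
public:
    // Called from any thread: queue a task for the main thread.
    void post(std::function<void()> task) {
        std::lock_guard<std::mutex> lock(mutex_);
        tasks_.push(std::move(task));
    }

    // Called by the host on the main thread (e.g. once per UI tick):
    // swap the queue out under the lock, then run the tasks unlocked.
    void drainOnMainThread() {
        std::queue<std::function<void()>> pending;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            std::swap(pending, tasks_);
        }
        while (!pending.empty()) {
            pending.front()();
            pending.pop();
        }
    }

private:
    std::mutex mutex_;
    std::queue<std::function<void()>> tasks_;
};
```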
There is no single UI framework in VST. The plugin API only has interfaces for creating/destroying/resizing a GUI window. You are not required to use VSTGUI.
For context, the variation in UI between VSTs is pretty large and tends to be very creative, much like UI in games.
For better or worse, frequently worse
Like traveling back in time before UX was a word
Correct, it just hands you native handles. What you do with them is up to you.
JUCE is a popular UI framework (at least it was 10 years ago). But I've seen people somehow embed Electron apps in a VST.
Oh man, this is really starting to look like a plague.
And yet here we are discussing the value of using C++ vs other languages for real time audio processing.
Some background is needed for the threading API.
The basic threading model for plugins consists of a "main" thread and an "audio" thread. The APIs specify which methods are allowed to be called from which thread, and which may be called concurrently.
There is also a state machine for the audio-processing bits (for example, you can guarantee that processing won't happen until after the plugin has been "activated", and that it won't go from a deactivated state to processing until a specific method is called - I'm simplifying the VST3 state machine significantly).
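A rough sketch of the shape this takes - the names here are illustrative, not the real VST3 interfaces, and the real state machine has more states:

```cpp
#include <cstdint>

// Illustrative plugin interface showing the per-thread contracts and the
// lifecycle ordering these APIs document. Bracketed tags mirror the kind
// of annotations you see in the actual SDK headers.
struct IAudioPlugin {
    virtual ~IAudioPlugin() = default;

    // [main thread] Lifecycle: the host must activate the plugin before it
    // starts processing, and must stop processing before deactivating.
    virtual void setActive(bool active) = 0;
    virtual void setProcessing(bool processing) = 0;

    // [audio thread] Real-time work: only legal while active and processing;
    // must not lock, block, or allocate.
    virtual void process(float** inputs, float** outputs,
                         int32_t numFrames) = 0;
};
```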
The "main" thread is the literal main/UI thread of the application typically, or a sandboxed plugin host running in a separate process. You do your UI on this thread as well as handle most host events.
Plugins often want to do things on background threads, like streaming audio from disk or heavy work like preparing visualizations, without blocking the main UI thread (which also handles rendering and UI events - think of the JS event loop; it's bad to block it).
The threading model and state machine make it difficult to know where it's safe to spawn and join threads. You can do it in a number of places, but you also have to be careful about the lifetimes of those threads; most plugins spawn them as early as possible and shut them down as late as possible.
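That "spawn early, shut down late" pattern tends to look something like this - a minimal sketch, assuming C++20 and hypothetical hook points in the plugin lifecycle:

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical background worker, e.g. for streaming audio from disk.
class DiskStreamer {
public:
    // Called as early as the lifecycle allows, e.g. on activation (main thread).
    void start() {
        running_.store(true, std::memory_order_relaxed);
        worker_ = std::jthread([this] {
            while (running_.load(std::memory_order_relaxed)) {
                // ... refill the audio ring buffer from disk here ...
                std::this_thread::sleep_for(std::chrono::milliseconds(10));
            }
        });
    }

    // Called as late as possible, e.g. just before deactivation/destruction.
    void stop() {
        running_.store(false, std::memory_order_relaxed);
        worker_ = std::jthread{}; // joins the old thread before replacing it
    }

private:
    std::atomic<bool> running_{false};
    std::jthread worker_;
};
```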
The host also has to do a lot of this stuff on background threads and usually has its own thread pool. CLAP introduced an extension to hook into the host's thread pool, so plugins don't have to spawn threads and no longer really have to care about their lifetimes. VST3 is copying that feature.
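For reference, the shape of that CLAP extension (clap/ext/thread-pool.h) is roughly the following - paraphrased from memory rather than copied from the header, so check the real thing:

```cpp
#include <cstdint>

struct clap_plugin;
struct clap_host;

// Plugin side: the host calls exec() once per task index,
// from its own worker threads.
struct clap_plugin_thread_pool {
    void (*exec)(const clap_plugin *plugin, uint32_t task_index);
};

// Host side: the plugin calls request_exec() from the audio thread during
// processing; the call blocks until all num_tasks invocations of exec()
// have completed, and returns false if the host rejects the request.
struct clap_host_thread_pool {
    bool (*request_exec)(const clap_host *host, uint32_t num_tasks);
};
```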
When you see annotations on methods in these APIs about "main" vs "any" thread, "active", etc., they're notes to developers on where it is safe to call the methods and what synchronization is required (on both sides of the API).
If it sounds complicated that's because it is, but most of this is accidental complexity created by VST3.
Audio is often processed on a separate thread from the UI. If memory serves (it's been a while), most VSTs have a UI portion and an audio-engine portion, which can be booted together or independently. So threading is very important.
Yeah, I realized that once I finished writing my comment - it might be about communicating with the UI, since UI toolkits are usually not thread-safe enough. Thanks.
There's that, and then there's the explicit design decision that the two are very cleanly separated as it's somewhat common in more professional environments to run the audio portion on another machine entirely, and configure it on the main desktop.