Server applications don’t spawn threads per request, they use thread pools. The extra context switching due to threads waiting for I/O is negligible in practice for most applications. Asynchronous I/O becomes important when the number of simultaneous requests approaches the number of threads you can have on your system. Many applications don’t come close to that in practice.

There’s a benefit in being able to code the handling of a request in synchronous logic. A case has to be made for the particular application that it would cause performance or resource issues, before opting for asynchronous code that adds more complexity.

Thread pools are another variation on the theme. But if your threads block then your pool saturates and you can't process any more requests. So thread pools still need non-blocking operations to be efficient or you need more threads. If you have thread pools you also need a way of communicating with that pool. Maybe that exists in the framework and you don't worry about it as a developer. If you are managing a pool of threads then there's a fair amount of complexity to deal with.

I totally agree there are applications for which this is overkill and adds complexity. It's just a tool in the toolbox. Video games famously are just a single thread/main loop kind of application.

There’s also a really good operational benefit if you have limits like total RAM, database connections, etc. where being able to reason about resource usage is important. I’ve seen multiple async apps struggle with things like that because async makes it harder to reason about when resources are released.

Could you point out the issue here?

Why does async make it harder to reason about when resources are released?

Basically it’s the non-linear execution flow creating situations which are harder to reason about. Here’s an example I’m trying to help a Node team fix right now: something is blocking the main loop long enough that some of the API calls made in various places are timing out or getting auth errors due to the signature expiring between when the request was prepared and when it is actually dispatched because that’s sporadically tend of seconds instead of milliseconds. Because it’s all async calls, there are hundreds of places which have to be checked whereas if it was threaded this class of error either wouldn’t be possible or would be limited to the same thread or an explicit synchronization primitive for something like a concurrency limit on the number of simultaneous HTTP requests to a given target. Also, the call stack and other context is unhelpful until you put effort into observability for everything because you need to know what happened between hitting await and the exception deep in code which doesn’t share a call stack.

Because async usually means you've stopped having "call stack" as a useful abstraction.