hardware from 10 years ago - do you have benchmarks for more recent hardware?

https://vorner.github.io/async-bench.html

I don't think those benches are much of a flex. Even by the author's own description, you'd be fine with any of them. They all have acceptable performance and don't show any order-of-magnitude differences or non-linear scaling problems.

Further, the benches that come out best there are the non-work-stealing scenarios, not tokio.

I also suspect simply tuning the thread-based workloads more aggressively would have the same effect.

When I profile high-throughput tokio applications, there's way too much contention on shared atomics, mostly inside tokio's scheduler itself. On lower-core-count machines and where the workload is I/O-heavy, this is probably fine. So, yes, web servers.

But I'm very interested in applications that scale on machines with lots of cores and where CPU is a large part of the equation.
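
To make the shared-atomics point concrete, here's a rough std-only sketch of the general effect. To be clear, this is not tokio's scheduler and it's not a benchmark of tokio; `shared_counter` and `sharded_counter` are names I made up for illustration. The first has every thread hammer one atomic, the second keeps the hot loop thread-local and only combines results at the end.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Instant;

/// Hypothetical illustration: every thread increments the same atomic,
/// so the cache line holding it is contended on each update.
fn shared_counter(threads: u64, iters: u64) -> u64 {
    let total = Arc::new(AtomicU64::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let total = Arc::clone(&total);
            thread::spawn(move || {
                for _ in 0..iters {
                    total.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    total.load(Ordering::Relaxed)
}

/// Hypothetical illustration: each thread counts in a local variable and
/// hands back its count when joined, so the hot loop shares no state.
fn sharded_counter(threads: u64, iters: u64) -> u64 {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            thread::spawn(move || {
                let mut local = 0u64;
                for _ in 0..iters {
                    // black_box keeps the compiler from folding the loop away.
                    local = std::hint::black_box(local + 1);
                }
                local
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    // Fall back to 4 threads if the core count can't be determined.
    let threads = thread::available_parallelism()
        .map(|n| n.get() as u64)
        .unwrap_or(4);
    let iters = 5_000_000;

    let start = Instant::now();
    let a = shared_counter(threads, iters);
    println!("shared:  total={a} in {:?}", start.elapsed());

    let start = Instant::now();
    let b = sharded_counter(threads, iters);
    println!("sharded: total={b} in {:?}", start.elapsed());
}
```

Build with `--release` and try it at different thread counts. I'd expect the gap to widen as cores are added, but the numbers depend heavily on the machine, so treat it as a toy and not as evidence about any particular runtime.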