Yes, I was planning a similar experiment with UCall (https://github.com/unum-cloud/ucall), leveraging the NUMA functionality introduced in v2 of Fork Union. I don’t currently have the right hardware to test it properly, but it would be very interesting to measure how pinning behaves on machines with multiple NUMA nodes, NICs, and a balanced PCIe topology.
That's outrageous... and I don't agree with your assessment, because smol is in the same niche as Tokio (that is, an async executor, which isn't necessarily optimized for CPU-bound workloads) and isn't nearly as slow.
I think performance is a very critical property for Rust infrastructure. One can only hope that newer Tokio versions could address overheads which make everyone slower than necessary.
Tokio actually has some similarities with Rayon. Tokio is used in most Rust web servers, like Axum and Actix-web.
That’s true, though in my benchmarks Tokio came out as one of the slower parallelism-enabling projects. The article still included a comparison:
... but I now avoid comparing to Tokio since it doesn’t seem fair: fork-join style parallel processing isn’t really its primary use case.
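For anyone unsure what "fork-join style" means in this thread, here is a minimal sketch using only the Rust standard library (`std::thread::scope`). It is not how Fork Union, Rayon, or Tokio implement it; the function name `fork_join_sum` and the chunking scheme are made up for illustration.

```rust
use std::thread;

/// Hypothetical helper: sums `data` by forking up to `workers` scoped
/// threads over disjoint chunks and joining their partial sums.
fn fork_join_sum(data: &[u64], workers: usize) -> u64 {
    let workers = workers.max(1);
    // Ceiling division so every element lands in exactly one chunk;
    // clamp to 1 because `chunks` panics on a zero chunk size.
    let chunk_len = ((data.len() + workers - 1) / workers).max(1);

    thread::scope(|scope| {
        // Fork: spawn one scoped thread per chunk.
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|chunk| scope.spawn(move || chunk.iter().sum::<u64>()))
            .collect();

        // Join: wait for every worker and combine the partial results.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .sum::<u64>()
    })
}
```

Note that this spawns fresh OS threads on every call, which is exactly the kind of overhead a reusable thread pool is meant to avoid. An async executor, by contrast, is built around scheduling many tasks that mostly wait on I/O.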