Kind of a tangent, but I think "systems programming" tends to bounce back and forth between three(?) different concerns that turn out to be closely related:
1. embedded hardware, like you mentioned
2. high-performance stuff
3. "embedding" in the cross-language sense, with foreign function calls
Of course the "don't use a lot of resources" thing that makes Rust/C/C++ good for tiny hardware also tends to be helpful for performance on bigger iron. Similarly, the "don't assume much about your runtime" thing that's necessary for bare metal programming also helps a lot with interfacing with other languages. And "run on a GPU" is kind of all three of those things at once.
So yeah, which of those concerns was async Rust really designed around? All of them I guess? It's kind of like, once you put on the systems programming goggles for long enough, all of those things kind of blend together?
> So yeah, which of those concerns was async Rust really designed around? All of them I guess?
Yes, all of them. Futures needed to work on embedded platforms (so no allocation), needed to be highly optimizable (so no virtual dispatch), and needed to act reasonably in the presence of code that crosses FFI boundaries (so no stack shenanigans). Once you come to terms with these constraints, and then add Rust's other principles regarding guaranteed memory safety, references, and ownership, there's very little wiggle room for any design other than the one Rust came up with. True linear types could still improve the situation, though.
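To make the "no allocation" point concrete, here's a minimal std-only sketch (the 16-byte array is just an arbitrary illustration) showing that an async block is an ordinary struct whose size is known at compile time, and whose construction allocates nothing and runs nothing:

```rust
use std::mem::size_of_val;

fn main() {
    let data = [0u8; 16];
    // An async block compiles to an anonymous struct (a state machine).
    // Constructing it performs no heap allocation and starts no work;
    // `data` is moved into the struct itself.
    let fut = async move { data.len() };
    // Its size is known statically, like any other struct:
    let n = size_of_val(&fut);
    println!("future is {} bytes", n);
    // The captured 16-byte array has to live somewhere inside it:
    assert!(n >= 16);
}
```

This is exactly why futures can live on the stack of an embedded target, or inside another future, without ever touching an allocator.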
> so no virtual dispatch
Speaking of which, I'm kind of surprised we landed on a Waker design that requires/hand-rolls virtual dispatch. Was there an alternate universe where every `poll()` function was generic on its Waker?
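For reference, that hand-rolled dispatch lives in `std::task::RawWakerVTable`: a `Waker` is a type-erased data pointer plus a static vtable of four function pointers. A toy sketch (the no-op waker here is my own illustration, not anything official):

```rust
use std::future::Future;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// The four hand-rolled "virtual methods" of a Waker -- clone, wake,
// wake_by_ref, and drop -- each take a type-erased `*const ()` pointer.
unsafe fn vt_clone(data: *const ()) -> RawWaker {
    RawWaker::new(data, &NOOP_VTABLE)
}
unsafe fn vt_noop(_data: *const ()) {}

static NOOP_VTABLE: RawWakerVTable =
    RawWakerVTable::new(vt_clone, vt_noop, vt_noop, vt_noop);

/// A waker that does nothing when woken; just enough to poll a future.
fn noop_waker() -> Waker {
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &NOOP_VTABLE)) }
}

fn main() {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(async { 42 });
    // An already-ready future completes on the first poll:
    assert!(matches!(fut.as_mut().poll(&mut cx), Poll::Ready(42)));
    println!("ready");
}
```

One likely reason for this shape rather than a generic `poll()`: a generic method would make the `Future` trait not object-safe, so `Box<dyn Future>` and friends couldn't exist, and every executor would have to monomorphize over every waker type it might ever see.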
In my view, the major design sin was not _forcing_ failure into the outcome list.
`.await(DEADLINE)` (where the deadline is any non-zero duration, and 0 is 'reference defined' but a real number) should have been the easy interface. Either it yields a value or it doesn't, and then the programmer has to expressly handle the failure.
Deadline would only be the minimum duration after which the language, when evaluating the future / task, would return the empty set/result.
> Deadline would only be the minimum duration after which the language, when evaluating the future / task, would return the empty set/result.
This appears to be a misunderstanding of how futures work in Rust. The language doesn't evaluate futures or tasks. A future is just a struct with a poll method, sort of like how a closure in Rust is just a struct with a call method. The await keyword just inserts yield points into the state machine that the language generates for you. If you want to actually run a future, you need an executor. The executor could implement timeouts, but it's not something that the language could possibly have any way to enforce or require.
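To make that concrete, here's a toy std-only sketch (busy-polling; the name `run_with_deadline` is made up for illustration) of an executor that implements exactly the kind of deadline discussed above, entirely outside the language:

```rust
use std::future::Future;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};
use std::time::{Duration, Instant};

// Minimal no-op waker; a real executor uses the waker to learn when a
// Pending future is actually worth polling again.
unsafe fn vt_clone(data: *const ()) -> RawWaker {
    RawWaker::new(data, &VTABLE)
}
unsafe fn vt_noop(_data: *const ()) {}
static VTABLE: RawWakerVTable =
    RawWakerVTable::new(vt_clone, vt_noop, vt_noop, vt_noop);

/// Toy executor: busy-polls one future, giving up after `deadline`.
/// The timeout lives here, in the executor -- not in the language.
fn run_with_deadline<F: Future>(fut: F, deadline: Duration) -> Option<F::Output> {
    let waker = unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) };
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    let give_up_at = Instant::now() + deadline;
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(v) => return Some(v),
            Poll::Pending if Instant::now() >= give_up_at => return None,
            Poll::Pending => std::thread::yield_now(), // real executors park
        }
    }
}

fn main() {
    // A ready future completes before any deadline:
    assert_eq!(run_with_deadline(async { 42 }, Duration::from_millis(5)), Some(42));
    // A future that never completes hits the deadline, yielding the
    // "empty result" from the proposal above:
    assert_eq!(
        run_with_deadline(std::future::pending::<u32>(), Duration::from_millis(5)),
        None
    );
    println!("ok");
}
```

Note that nothing here is privileged: this is plain library code, which is why the deadline can't be a language guarantee.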
Does that imply a lot of syscalls to get the monotonic clock value? Or is there another way to do that?
On Linux there is the vDSO, which on all mainstream architectures lets you call `clock_gettime` without going through a syscall. It should take on the order of tens of nanoseconds.
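A rough way to see this for yourself (numbers vary by machine and kernel; on Linux, `Instant::now()` is backed by `clock_gettime(CLOCK_MONOTONIC)`, which the vDSO services in userspace):

```rust
use std::time::Instant;

fn main() {
    // Time a batch of monotonic clock reads and report the per-call cost.
    let iters: u32 = 1_000_000;
    let start = Instant::now();
    let mut last = start;
    for _ in 0..iters {
        last = Instant::now(); // clock_gettime via the vDSO on Linux
    }
    let per_call_ns = start.elapsed().as_nanos() / iters as u128;
    println!("~{} ns per clock read", per_call_ns);
    assert!(last >= start); // monotonic: never goes backwards
}
```

If the vDSO path weren't available (e.g. under some VMs or with certain clock sources), each read would fall back to a real syscall and this number would jump by an order of magnitude or more.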
If the scheduler is doing _any_ sort of accounting to figure out even a rough fairness balance, then whatever resolution that accounting runs at probably works.
At least for Linux, offhand, popular task scheduler tick frequencies used to be 100 and 1000 Hz.
Looks like the kernel is tracking that for tasks:
https://www.kernel.org/doc/html/latest/scheduler/sched-desig...
"In CFS the virtual runtime is expressed and tracked via the per-task p->se.vruntime (nanosec-unit) value."
I imagine the .vruntime struct field is still maintained with the newer "EEVDF Scheduler".
...
A userspace task scheduler could similarly compare the DEADLINE against that runtime value. The task would still reach that deadline after the minimum wait has passed, and could thus be 'background GCed' at a time of the language's choosing.
The issue is that no scheduler manages futures. The scheduler sees tasks; futures are just structs. See the discussion of embedded above: there is no “kernel-esque” parallel thread to do that accounting.