(Honest question from a non-Rustacean.)
How does the cancellation story differ between threads and async in Rust? Or vs async in other languages?
There's no inherent reason they should be different, but in my experience (in C++, Python, C#) cancellation is much better in async than with simple threads and blocking calls. It's near impossible to have organised socket shutdown in many languages with blocking calls, assuming a standard read thread + write thread per socket. Often the only reliable way to interrupt a socket thread is to close the socket, which may not be what you want, and in principle can leave you vulnerable to file handle reuse bugs.
Async cancellation is, depending on the language, somewhere between hard but achievable (already an improvement) and fabulous. With Trio [1] you even get the guarantee that non-compared socket operations are either completed or have no effect.
Does this work any better with Rust threads / blocking calls? My uneducated understanding is that things are actually worse in Rust async than in other languages because there's no way to catch and handle cancellations (unlike e.g. Python, which uses exceptions for that).
I'm also guessing things are no better in Ada but very happy to hear about that too.
Cancellation in Rust async is almost too easy: all you need to do is drop the future.
If you need cleanup, that still needs to be handled manually. Hopefully the async Drop trait lands soon.
Ok I could be super wrong here, but I think that's not true.
Dropping a future does not cancel a concurrently running (tokio::spawn) task. It will also not magically stop an asynchronous I/O call; it just won't block/switch from your code anymore while that continues to execute. If you have created a future but not hit .await or tokio::spawn or any of the futures:: queue handlers, then dropping it won't cancel anything either; the work just never begins.
Cancellation of a running task from outside that task actually does require explicit cancelling calls IIRC.
Edit: here, try this: https://cybernetist.com/2024/04/19/rust-tokio-task-cancellat...
Spawn is kind of a special case where it's documented that the future will be moved to the background and polled without the caller needing to do anything with the future it returns. The vast majority of futures are lazy and will not do work unless explicitly polled, which means the usual way of cancelling is to just stop polling (e.g. by awaiting the future created when joining something with a timeout; either the timeout happens before the other future completes, or the other future finishes and the timeout no longer gets polled). Dropping the future isn't technically a requirement, but in practice it's usually what will happen because there's no reason to keep around a future you'll never poll again, so most of the patterns that exist for constructing a future that finishes when you don't need it anymore rather than manually cancelling will implicitly drop any future that won't get used again (like in the join example above, where the call to `join` will take ownership of both futures and not return either of them, therefore dropping whichever one hasn't finished when returning).
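To make the "lazy until polled, cancelled by not polling" point concrete, here is a minimal sketch using only the standard library. `Worker`, `NoopWake` and `lazy_then_dropped` are hypothetical names made up for illustration, and the future is polled by hand instead of through a runtime such as Tokio, so this shows the mechanism rather than how any particular executor behaves.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A waker that does nothing; enough for polling by hand in a demo.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Hypothetical future that counts how often it is polled and never completes.
struct Worker {
    polls: Arc<AtomicUsize>,
    dropped: Arc<AtomicBool>,
}

impl Future for Worker {
    type Output = ();
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        // "Work" only happens here, i.e. only when the owner polls.
        self.polls.fetch_add(1, Ordering::SeqCst);
        Poll::Pending
    }
}

impl Drop for Worker {
    // The only (synchronous) chance to clean up when the owner gives up.
    fn drop(&mut self) {
        self.dropped.store(true, Ordering::SeqCst);
    }
}

// Returns (polls before any poll, polls after one poll, cleanup ran on drop).
fn lazy_then_dropped() -> (usize, usize, bool) {
    let polls = Arc::new(AtomicUsize::new(0));
    let dropped = Arc::new(AtomicBool::new(false));
    let mut worker = Worker { polls: polls.clone(), dropped: dropped.clone() };

    // Creating the future did no work at all.
    let before = polls.load(Ordering::SeqCst);

    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let _ = Pin::new(&mut worker).poll(&mut cx);
    let after = polls.load(Ordering::SeqCst);

    // "Cancelling" is just dropping: no further polls can ever happen.
    drop(worker);
    (before, after, dropped.load(Ordering::SeqCst))
}
```

Calling `lazy_then_dropped()` should give `(0, 1, true)`: nothing ran before the first poll, one poll did one unit of work, and dropping ran the synchronous cleanup. A task handed to `tokio::spawn` differs because the runtime, not the caller, owns and polls the future.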
So how do you do structured concurrency [1] in Rust i.e. task groups that can be cancelled together (and recursively as a tree), always waiting for all tasks to finish their cancellation before moving on? (Normally structured concurrency also involves automatically cancelling the other tasks if one fails, which I guess Rust could achieve by taking special action for Result types.)
If you can't cancel a task and its direct dependents, and wait for them to finish as part of that, I would argue that you still don't have "real" cancellation. That's not an edge case, it's the core of async functionality.
[1] https://vorpus.org/blog/notes-on-structured-concurrency-or-g...
(Too late to edit)
Hmm, maybe it's possible to layer structured concurrency on top of what Rust does (or will do with async drop)? Like, if you have a TaskGroup class and demand all tasks are spawned via that, then internally it could keep track of child tasks and make sure that they're all cancelled when the parent one is (in the task group's drop). I think? So maybe not such an issue, in principle.
I think you're on the right track here to figuring this out. Tokio's JoinSet basically does what you describe for a single level of spawning (so not recursively, but it's at least part of the way to get what you describe); the `futures` library also has a type called `FuturesUnordered` that's similar but has the tradeoff that all futures it tracks need to be the same type which allows it to avoid spawning new tasks (and by extension doesn't need to wrap the values obtained by awaiting in a Result).
Under the hood, there's nothing stopping a future from polling one or more other futures, so keeping in mind that it isn't the dropping that cancels but rather the lack of polling, you could achieve what you're describing with each future in the tree polling its children in its own poll implementation, which means that once you stop polling the "root" future in the tree, all of the others in the tree will by extension no longer get polled. You don't actually need any async Drop implementation for this because there's no special logic you need when dropping; you just stop polling, which happens automatically since you can't poll something that's been dropped anyhow.
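A rough sketch of that tree idea, again with only the standard library and hand polling. `Group`, `Leaf` and `poll_tree_once_then_drop` are hypothetical names, not types from Tokio or the `futures` crate, and this deliberately leaves out what real structured concurrency would add (error propagation, waiting for children to finish their own async cleanup).

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A waker that does nothing; enough for polling by hand in a demo.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

type BoxFuture = Pin<Box<dyn Future<Output = ()>>>;

// Hypothetical leaf task: counts polls, never completes, and decrements a
// shared "live" counter when it is dropped.
struct Leaf {
    polls: Arc<AtomicUsize>,
    live: Arc<AtomicUsize>,
}

impl Leaf {
    fn boxed(polls: &Arc<AtomicUsize>, live: &Arc<AtomicUsize>) -> BoxFuture {
        live.fetch_add(1, Ordering::SeqCst);
        Box::pin(Leaf { polls: polls.clone(), live: live.clone() })
    }
}

impl Future for Leaf {
    type Output = ();
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        self.polls.fetch_add(1, Ordering::SeqCst);
        Poll::Pending
    }
}

impl Drop for Leaf {
    fn drop(&mut self) {
        self.live.fetch_sub(1, Ordering::SeqCst);
    }
}

// Hypothetical group: polls every child from its own poll, so children only
// make progress while the group itself is being polled.
struct Group {
    children: Vec<BoxFuture>,
}

impl Future for Group {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        // Keep only the children that are still pending.
        self.children.retain_mut(|child| child.as_mut().poll(cx).is_pending());
        if self.children.is_empty() { Poll::Ready(()) } else { Poll::Pending }
    }
}

// Returns (live leaves before, total leaf polls after one root poll,
// live leaves after dropping the root).
fn poll_tree_once_then_drop() -> (usize, usize, usize) {
    let polls = Arc::new(AtomicUsize::new(0));
    let live = Arc::new(AtomicUsize::new(0));

    // root -> [leaf, subgroup -> [leaf, leaf]]
    let subgroup: BoxFuture = Box::pin(Group {
        children: vec![Leaf::boxed(&polls, &live), Leaf::boxed(&polls, &live)],
    });
    let mut root = Group { children: vec![Leaf::boxed(&polls, &live), subgroup] };
    let live_before = live.load(Ordering::SeqCst);

    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let _ = Pin::new(&mut root).poll(&mut cx);
    let polls_after = polls.load(Ordering::SeqCst);

    // Dropping the root drops the whole tree, recursively.
    drop(root);
    (live_before, polls_after, live.load(Ordering::SeqCst))
}
```

One poll of the root reaches all three leaves, and dropping the root tears down every level, so `poll_tree_once_then_drop()` should return `(3, 3, 0)`. The teardown here is purely synchronous `Drop`; a child that needs to await something during cancellation is exactly the case this sketch does not cover.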
That's a rare exception, and just a design choice of this particular library function. It had to intentionally implement a workaround integrated with the async runtime to survive normal cancellation. (BTW, the anti-cancellation workaround isn't compatible with Rust's temporary references, which can be painfully restrictive. When people say Rust's async sucks, they often actually mean `tokio::spawn()` made their life miserable).
Regular futures don't behave like this. They're passive, and can't force their owner to keep polling them, and can't prevent their owner from dropping them.
When a Future is dropped, it has only one chance to immediately do something before all of its memory is obliterated, and all of its inputs are invalidated. In practice, this requires immediately aborting all the work, as doing anything else would be either impossible (risking use-after-free bugs), or require special workarounds (e.g. io_uring can't work with the bare Future API, and requires an external drop-surviving buffer pool).
Rain showed that not all may be as simple as it seems to do it correctly.
In her presentation on async cancellation in Rust, she spoke pretty extensively on cancel safety and correctness, and I would recommend giving it a watch or read.
https://sunshowers.io/posts/cancelling-async-rust/
Yeah, that's what I'm talking about ... Cancellation where the cancelled object can't handle the cancellation, call other async operations and even (very rarely) suppress it, isn't "real" cancellation to me, having seen how essential this is.
> There's no inherent reason they should be different
There is... They're totally different things.
And yeah Rust thread cancellation is pretty much the same as in any other language - awkward to impossible. That's a fundamental feature of threads though; nothing to do with Rust.
There's no explicit cancel, but there's trivial one shot cancellation messages that you can handle on the thread side. It's perfectly fine, honestly, and how I've been doing it forever.
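For readers who haven't seen the pattern, a minimal sketch of a one-shot cancellation message with `std::sync::mpsc` might look like this. The function name, the 5 ms tick and the "work" (incrementing a counter) are placeholders chosen for illustration; the worker only notices the message between ticks, which is the cooperative part.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

// Runs a worker until it is told to stop; returns how many ticks it completed.
fn run_worker_until_cancelled() -> u32 {
    let (cancel_tx, cancel_rx) = mpsc::channel::<()>();
    let (progress_tx, progress_rx) = mpsc::channel::<u32>();

    let worker = thread::spawn(move || {
        let mut ticks = 0u32;
        loop {
            // Waiting on the cancel channel doubles as the worker's timer,
            // so a cancel message is seen within one tick.
            match cancel_rx.recv_timeout(Duration::from_millis(5)) {
                // Explicit cancel, or the controlling side went away.
                Ok(()) | Err(RecvTimeoutError::Disconnected) => break,
                Err(RecvTimeoutError::Timeout) => {
                    ticks += 1; // placeholder for one unit of real work
                    let _ = progress_tx.send(ticks);
                }
            }
        }
        ticks
    });

    // Wait until the worker has done at least one unit of work...
    progress_rx.recv().expect("worker exited before doing any work");
    // ...then send the one-shot cancellation message and wait for it to stop.
    cancel_tx.send(()).expect("worker already gone");
    worker.join().expect("worker panicked")
}
```

`run_worker_until_cancelled()` returns once the worker has seen the message and exited cleanly. The limitation raised in the replies still applies: a worker stuck in a long computation or a blocking call that never returns to the `recv_timeout` line will not see the message.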
I would call that clean shutdown more than cancellation. You can't cancel a long computation, or std::thread::sleep(). Though tbf that's sort of true of async too.
To be clear about what I meant: I was saying that, in principle, it would be possible to design a language or even a library where all interruptible operations (at least timers and networking) can be cancelled from other threads. This can be done using a cancellation token mechanism which avoids even starting the operation on an already-cancelled token, in a way that avoids races (as you might get from a naive check of the token before starting the operation) if another thread cancels this one just as the operation is starting.
Now that I've set (and possibly moved) the goalposts, I can prove my point: C# already does this! You can use async across multiple threads, and cancellation happens with cancellation tokens that are thread safe. Having a version where interruptible calls are blocking rather than async (in the language sense) would actually be easier to implement (using the same async-capable APIs under the hood, e.g. IOCP on Windows).
Well sure, there's nothing to stop you writing a "standard library" that exposes that interface. The default one doesn't though. I expect there are platforms that Rust supports that don't have interruptible timers and networking (whereas C# initially only supported Windows).
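As a rough illustration of what such an interface could look like for the timer case only, here is a sketch of a cancellable blocking sleep built on `Mutex` and `Condvar`. `CancelToken`, `Cancelled` and `cancel_a_long_sleep` are hypothetical names, not part of Rust's standard library, and cancellable networking would need OS-specific support that this does not attempt. The flag is checked and waited on under the same lock, so a cancel that arrives just as the sleep starts is not lost.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
struct Cancelled;

// Hypothetical token: a flag plus a condition variable to wake sleepers.
struct CancelToken {
    cancelled: Mutex<bool>,
    changed: Condvar,
}

impl CancelToken {
    fn new() -> Self {
        CancelToken { cancelled: Mutex::new(false), changed: Condvar::new() }
    }

    // May be called from any thread.
    fn cancel(&self) {
        *self.cancelled.lock().expect("token mutex poisoned") = true;
        self.changed.notify_all();
    }

    // Blocking sleep that returns early with Err(Cancelled) if the token is
    // cancelled before or during the sleep.
    fn sleep(&self, duration: Duration) -> Result<(), Cancelled> {
        let guard = self.cancelled.lock().expect("token mutex poisoned");
        // wait_timeout_while checks the flag before waiting and again on every
        // wake-up, all under the lock, so there is no check-then-sleep race.
        let (guard, _timed_out) = self
            .changed
            .wait_timeout_while(guard, duration, |cancelled| !*cancelled)
            .expect("token mutex poisoned");
        if *guard { Err(Cancelled) } else { Ok(()) }
    }
}

// Starts a 60 s sleep on another thread and cancels it from this one.
// Returns the sleeper's result and whether it finished well before 60 s.
fn cancel_a_long_sleep() -> (Result<(), Cancelled>, bool) {
    let token = Arc::new(CancelToken::new());
    let worker_token = Arc::clone(&token);
    let started = Instant::now();

    let worker = thread::spawn(move || worker_token.sleep(Duration::from_secs(60)));
    token.cancel();

    let outcome = worker.join().expect("worker panicked");
    (outcome, started.elapsed() < Duration::from_secs(30))
}
```

With this sketch, `cancel_a_long_sleep()` should return `(Err(Cancelled), true)` whether the cancel lands before or after the worker starts sleeping, while an uncancelled `token.sleep(...)` returns `Ok(())` after the full duration. It only works because every blocking point goes through the token, which is the "write your own standard library" cost mentioned above.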