+1 this.
IMHO async is an anti-pattern, and probably the final straw that will prevent me from ever finishing learning Rust. Once one learns pass-by-value and copy-on-write semantics (Clojure, PHP arrays), the world starts looking like a spreadsheet instead of spaghetti code. I feel that a Rust-like language could be built with no borrow checker, simply by allocating twice the memory. Since that gets ever-less expensive, I'm just not willing to die on the hill of efficiency anymore. I predict that someday Rust will be relegated to porting scripting languages to a bare-metal runtime, but will not be recommended for new work.
That said, I think that Rust would make a great teaching tool in an academic setting, as the epitome of imperative languages. Maybe something great will come of it, like Swift from Objective-C or Kotlin from Java. And having grown up on C++, I have a soft spot in my heart for solving the hard problems in the fastest way possible. Maybe a voxel game in Rust, I dunno.
> Since that gets ever-less expensive,
That kind of thinking made sense in the 90s when things followed Moore’s law. But DRAM was one of the first things to fail to keep up: https://ourworldindata.org/grapher/historical-cost-of-comput... and it barely gets cheaper anymore. That’s why mobile phones still only have 16 GB of memory despite having 4 GB a decade ago.
And there are all sorts of problems that Rust isn’t necessarily a great fit for. But Rust’s target market is where you’d otherwise use a low-level language like C or C++. If you can just heap-allocate everything and aggressively create copies all over the place, then why would you ever use those languages in the first place?
And for what it’s worth, Rust is finding a lot of success even replacing all the tooling in other language ecosystems like Ruby, Python, and JS, precisely because the tools in those ecosystems that are written in the native language end up being horribly slow. And memory allocation and randomly deep-copying arrays are the kinds of things that add up and make things slow (in addition to GC pauses, slow startups, interpreter costs, etc.).
And you can always choose not to do async in Rust, although personally I’m a huge fan, as it makes it really clear where you’ve sprinkled I/O into places you shouldn’t have.
Before adopting Rust, I also found it silly for high-level tasks where e.g. Clojure or Java would suffice. However, the results of using Rust changed my mind.
I used to write web backends in Clojure, and justified it with the fact that the JVM has some of the best profiling tools available (I still believe this), and the JVM itself exposes lots of knobs to not only fine-tune the GC, but even choose a GC! (This cannot be overstated; garbage collectors tend to be deeply integrated into a language's runtime, and it's amazing to me that the Java platform manages to ship several garbage collectors, each of which is optimal in its own specific situations.)
After rewriting an NLP-heavy web app in Rust, I saw massive performance gains over the original Clojure version, even though both aggressively copy data and the Rust version is full of atomic refcounts (atomic refcounting is not the fastest GC out there...)
The binary emitted by rustc is also much smaller: a ~10 MB static binary vs. GraalVM's ~80 MB native images, which also have longer build times, since classpath analysis and reflection scanning require a lot of work.
What surprised me the most is how high-level Rust feels in practice. I can use pattern matching, async/await, functional programming idioms, etc., and it ends up being fast anyway. Coming from Clojure, Rust syntax trying its best to be expression-oriented is a key differentiator from other languages in its target domain (notably, C++). I sometimes miss TypeScript's anonymous enums, but Rust's type system can express a lot of runtime behavior, and it's partly why many jokingly state "if it compiles, it's likely correct". Then there are the little things, like how Rust's Futures don't immediately start running in the background. In contrast, JavaScript Promises are immediately pushed to a microtask queue, so cancelling a Promise is impossible by design.
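To make the laziness concrete, here's a minimal sketch (assuming tokio for the async main; the function is made up): no work happens until the future is awaited, and dropping it before then is effectively a cancellation.

    async fn fetch_thing() -> u32 {
        println!("doing work");
        42
    }

    #[tokio::main]
    async fn main() {
        let fut = fetch_thing(); // nothing has run yet; the future is inert
        // Dropping `fut` here would cancel it before it ever does anything.
        let value = fut.await;   // only now does "doing work" print
        println!("{value}");
    }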
Overall, it's the little things like this -- and the toolchain (cargo, clippy, rustfmt) -- that have kept me using Rust. I can write high-level code and still compile down to a ~5 MB binary and outperform idiomatic code in other languages I'm familiar with (e.g. Clojure, Java, and TypeScript).
Speaking personally, that is what first attracted me to Rust — that you can write high-level idiomatic code and still get roughly optimal performance.
I also like Clojure, but I have to wonder how that would have compared in Java, which I think is more performant.
It isn’t as dramatic a decrease as other types of storage, but $4,000 to $1,000 per terabyte in a decade is still a big drop.
Not big enough to hand-wave away being careless with RAM. That worked for CPU cycles until ~2010, but the failure of traditional computing paradigms to keep scaling exponentially is a huge reason why good performance engineering is still really important for large-scale tasks.
Author here -- I'd recommend reading my blog post about how cargo-nextest uses Tokio + async Rust to handle very complex state machines: https://sunshowers.io/posts/nextest-and-tokio/
Ah cool, a couple of kudos to you:
1) I learned about Pin in Rust, which prevents values from moving in memory.
2) I learned about the HTML <summary> tag (the turndown arrows in your article that work with JavaScript disabled), hah.
I can see how dealing with stream and resource cleanup in async code could be a chore. It sounds like you were able to do that in a fairly declarative manner, which is what I always strive for as well.
I think my hesitation with async is that I already went down that road early in my programming life with cooperative threads/multitasking on Mac OS 9 and earlier. There always seems to be yet another brittle edge case to deal with, so it can feel infuriating playing whack-a-mole until they're all nailed down.
For example, pinning memory looks a lot like locking handles in Mac OS. Handles were pointers to pointers, so it was a bare hands way to implement a memory defragmenter before runtimes were smart enough to handle it. If apps used handles, then blocks of data could be unlocked, moved somewhere else in memory, and then re-locked. Code had to do an extra hop through each handle to get to the original pointer, which was a frequent source of bugs because one async process might be working on a block, yield, and then have another async process move the handle out from under it.
The lock's state was stored in a flag in the memory manager, basically a small bit of metadata. I haven't investigated, but I suspect that Rust may be able to handle locking more efficiently, perhaps more like reference counting or the borrow checker where it can infer whether a pointer is locked without storing that flag somewhere (but I could be wrong).
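Something like this made-up, std-only sketch is what I'm picturing: an exclusive &mut borrow is checked entirely at compile time, so there's no runtime "locked" flag stored anywhere.

    struct Block {
        data: [u8; 4096],
    }

    fn scramble(block: &mut Block) {
        // While this exclusive borrow is live, the compiler statically rejects
        // any other access to `block`, so nothing can move it out from under us.
        block.data[0] = 0xFF;
    }

    fn main() {
        let mut block = Block { data: [0; 4096] };
        scramble(&mut block);
        // Only after the exclusive borrow ends can the block be read again.
        println!("{}", block.data[0]);
    }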
Apple abandoned handles when it migrated to OS X and Darwin inherited protected memory and better virtual memory from FreeBSD. Although now that I write this out, I'm not sure that they solved in-process fragmentation. I think they just give apps the full 32- or 64-bit address space so that effectively there is always another region available for the next allocation, and let the virtual memory subsystem consolidate 4k memory blocks into contiguous strips internally. The dereferencing step became implicit rather than explicit, as well as hidden from apps, so that whole classes of bugs became unreachable.
Anyway, that's why I prefer the runtime to handle more of this. I want strong guarantees that I can terminate a process and all locks inside it will get freed as well. I can pretty much rely on that even in hacky languages like PHP.
My frustration with all of this is that we could/should have demanded better runtimes. We could have had realtime unixes where task switching and memory allocation were effectively free. Unfortunately the powers that be (Mac OS and Windows) had runtimes that were too entrenched with too many users relying on quirks and so they dragged their feet and never did better. Languages like Rust were forced to get very clever and go to the ends of the earth to work around that. Then when companies like Google and Facebook won the internet lottery, they pulled the ladder up behind them by unilaterally handing down decrees from on high that developers should use bare hands techniques, rather than putting real resources into reforming the fundamentals so that we wouldn't have to.
What I'm trying to say is that your solution is clever and solves a common pattern in about the simplest way possible, but is not as simple as synchronous-blocking unix pipes to child processes in shell scripts. That's in no way a criticism. I have similar feelings about stuff like Docker and Kubernetes after reading about Podman. If we could magically go back and see the initial assumptions that led us down the road we're on, we might have tried different approaches. It's all of those roads not taken that haunt me, because they represent so much of my workload each day.
Thanks for the kind words.
It is not as simple as synchronous pipes, but it also has far better edge case and error handling.
For example, on Unix, if you press ctrl-Z to pause execution, nextest will send SIGTSTP to test processes and also pause its internal timers (resuming them when you type in fg or bg). That kind of bookkeeping is pretty hard to do with linear code, and especially hard to coordinate across subprocesses.
State machines with message passing (as seen in GUI apps) are very helpful at handling this, but they're quite hard to write by hand.
The async keyword in Rust allows you to write state machines that look somewhat like linear code (though with the big cancellation asterisk).
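As a rough sketch of the shape (not nextest's actual code; the names are made up, tokio is assumed, and the pause/resume events arrive over a plain channel instead of real signal handlers):

    use tokio::select;
    use tokio::sync::{mpsc, oneshot};

    #[derive(Debug)]
    enum Signal {
        Pause,
        Resume,
    }

    // One async fn supervises a running test while staying responsive to
    // pause/resume events. rustc compiles this into a state machine, but it
    // reads mostly like linear code.
    async fn supervise(
        mut test_done: oneshot::Receiver<()>,
        mut signals: mpsc::Receiver<Signal>,
    ) {
        loop {
            select! {
                _ = &mut test_done => {
                    println!("test finished");
                    break;
                }
                sig = signals.recv() => match sig {
                    Some(Signal::Pause) => {
                        // Real code would forward SIGTSTP to the child process
                        // and stop its timers here.
                        println!("paused");
                    }
                    Some(Signal::Resume) => println!("resumed"),
                    None => break, // the signal source went away
                },
            }
        }
    }

    #[tokio::main]
    async fn main() {
        let (done_tx, done_rx) = oneshot::channel();
        let (sig_tx, sig_rx) = mpsc::channel(8);
        let sup = tokio::spawn(supervise(done_rx, sig_rx));
        sig_tx.send(Signal::Pause).await.unwrap();
        sig_tx.send(Signal::Resume).await.unwrap();
        done_tx.send(()).unwrap();
        sup.await.unwrap();
    }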
The Rust ecosystem is very invested in making every library that touches the network async. But if the program you are writing doesn't touch the network, you don't have to think about async. Or you can banish the network code onto one thread with an async runtime, and communicate with it via flume queues/channels from normal threaded code running on other threads.
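A minimal sketch of that second option, assuming the tokio and flume crates (all names and the fake "fetch" are made up):

    use std::thread;

    fn main() {
        let (req_tx, req_rx) = flume::unbounded::<String>();
        let (resp_tx, resp_rx) = flume::unbounded::<usize>();

        // One thread owns a single-threaded async runtime; all async/network
        // code stays behind this boundary.
        let net = thread::spawn(move || {
            let rt = tokio::runtime::Builder::new_current_thread()
                .enable_all()
                .build()
                .unwrap();
            rt.block_on(async move {
                // flume receivers can be awaited, so this loop stays async.
                while let Ok(url) = req_rx.recv_async().await {
                    // Imagine an async HTTP fetch here; we just fake a length.
                    let _ = resp_tx.send(url.len());
                }
            });
        });

        // Ordinary threaded code on the main thread: no async in sight.
        req_tx.send("https://example.com".to_string()).unwrap();
        println!("response bytes: {}", resp_rx.recv().unwrap());
        drop(req_tx); // closing the channel lets the runtime loop end
        net.join().unwrap();
    }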
> The Rust ecosystem is very invested in making every library that touches the network async.
Right, and that is one of the absolute worst things about the Rust ecosystem. Most programs don't benefit from async, and should use plain old threads because they are much easier to work with.
There is a very reasonable argument that an entire language feature shouldn't be oriented towards making high-complexity state machines easy to write, since they're relatively rare in production. But speaking purely selfishly, I'm happy I can write something like cargo-nextest using async Rust in a bug-free manner.
In your view, which languages / ecosystems have a better general approach for handling task cancellations than async rust?
Well, synchronous blocking approaches (as opposed to asynchronous nonblocking) provide that stuff for free. It would basically be the functional programming and unix ecosystems. Arguably the Go language's goroutines strike a good balance between cooperative and preemptive threads/multitasking. Although that distinction is not fundamental, because if we had realtime unix runtimes, then spawning an isolated process would have no more overhead than spawning a thread (this is the towering failure of all mainstream OSs today IMHO):
https://kushallabs.com/understanding-concurrency-in-go-green...
So lots of concepts are worth learning, like atomicity, ACID compliance, write-ahead logs (WALs), statically detecting livelocks and deadlocks (or making them unreachable), consensus algorithms like Raft and Paxos, state transfer algorithms like software transactional memory (STM), connectionless state transfer like hash trees and Merkle trees, etc.
The key insight is that manual management of tasks is, for the most part, not tenable by humans. It's better to take a step back and work at a higher level of abstraction. For example, declarative programming works in terms of goals/specifications/tests, so that the runner has more freedom to cancel and restart/retry tasks arbitrarily. That way the user can fire off a workload and wait until all of the tasks match a success criteria, and even treat that process as idempotent so it can all be run again without harm. In this way, trees of success criteria can be composed to manage a task pool.
I'd probably point to CockroachDB as one of the best task-cancellers, since it doesn't have a shutdown procedure. Its process can simply be terminated by the user with control-c, then it reconciles any outstanding transactions the next time it's booted, which just adds some latency. If an entire database can do that, then "this is the way".
> Well, synchronous blocking approaches (as opposed to asynchronous nonblocking) provide that stuff for free.
Not really. The talk describes problems that can show up in any environment where you have concurrency and cancellation. To adapt some examples: a thread that consumes a message from a channel but is killed before it can process it has still resulted in that message being lost. A synchronous task that needs to temporarily violate invariants in some data structure that can't be updated atomically has still left that data structure in an invalid state when it gets killed partway through.
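A tokio-flavoured sketch of the lost-message case (all names made up): the branch below pulls a message off the channel and then awaits again before handling it, so if the shutdown branch wins the race, the whole branch is dropped and the message silently disappears.

    use tokio::select;
    use tokio::sync::{mpsc, oneshot};
    use tokio::time::{sleep, Duration};

    async fn worker(mut rx: mpsc::Receiver<String>, mut shutdown: oneshot::Receiver<()>) {
        loop {
            select! {
                _ = &mut shutdown => {
                    println!("shutting down");
                    return;
                }
                // Not cancel-safe: the message is held across a second await.
                _ = async {
                    if let Some(msg) = rx.recv().await {
                        sleep(Duration::from_millis(50)).await; // pretend work
                        println!("handled {msg}");
                    }
                } => {}
            }
        }
    }

    #[tokio::main]
    async fn main() {
        let (tx, rx) = mpsc::channel(8);
        let (stop_tx, stop_rx) = oneshot::channel();
        let handle = tokio::spawn(worker(rx, stop_rx));
        tx.send("hello".to_string()).await.unwrap();
        sleep(Duration::from_millis(10)).await;
        let _ = stop_tx.send(()); // likely arrives while "hello" is mid-flight: it gets lost
        handle.await.unwrap();
    }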
> Arguably the Go language's goroutines strike a good balance between cooperative and preemptive threads/multitasking.
Goroutines are pretty nice. It's especially nice that Go has avoided the function colouring problem. I'm not convinced that having to litter your code with selects to make your goroutines cancellable is good, though. And if you don't care about being able to cancel tasks, you can write async Rust in a way that ensures they won't be cancelled by accident fairly easily. Unless there's some better way to write cancellable goroutines that I'm not familiar with.
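For instance (a minimal sketch assuming tokio): a spawned task keeps running even if its JoinHandle is dropped, so it can't be cancelled by accident; it only stops if you explicitly call abort() or tear down the runtime.

    use tokio::time::{sleep, Duration};

    #[tokio::main]
    async fn main() {
        let handle = tokio::spawn(async {
            sleep(Duration::from_millis(100)).await;
            println!("still ran to completion");
        });
        drop(handle); // dropping the handle does NOT cancel the task
        sleep(Duration::from_millis(200)).await; // keep the runtime alive long enough
    }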
> The key insight is that manual management of tasks is, for the most part, not tenable by humans. It's better to take a step back and work at a higher level of abstraction.
Of course it's always important to look at systems as a whole. But to build larger systems out of smaller components you need to actually build the small components.
> I'd probably point to CockroachDB as one of the best task-cancellers, since it doesn't have a shutdown procedure. Its process can simply be terminated by the user with control-c, then it reconciles any outstanding transactions the next time it's booted, which just adds some latency. If an entire database can do that, then "this is the way".
I'm not familiar with CockroachDB specifically, but I do think a database should generally have a more involved happy-path shutdown procedure than that. In particular, I would like the database not to begin processing new transactions if it is not going to be able to finish them before it needs to shut down, even if not finishing them wouldn't violate ACID or any of my invariants.
Elixir via the Task module https://hexdocs.pm/elixir/Task.html
Ada has very well thought out and proven tasking features, including clean methods of task cancellation.
> I feel that a Rust-like language could be built with no borrow checker, simply by allocating twice the memory.
If that's what you're looking for, have you considered OCaml?
There is a voxel game in Rust, btw: https://veloren.net/
This reads hella uninformed
You're right, but not in the usual way hah. I started programming in the late 1980s with HyperCard, then used mostly C++ in the 90s, and have seen the rise and fall of various paradigms that felt eternal. I mean at one time, Java felt untouchable.
I think that Rust is making an admirable attempt to attack challenges that have already been solved better in other ways. I just don't have much use for its arsenal.
For example, I wasted 2 years of my life trying to write a NAT-punching peer to peer networking framework for games around 2005, but was first exposed to synchronous blocking vs asynchronous nonblocking networking in the late 90s when I read Beej's Guide to Network Programming:
https://beej.us/guide/bgnet/
I was hopelessly trying to mimic the functionality of libraries like RakNet and Zoidcom without knowing some fundamentals that I wouldn't fully understand for years:
https://www.reddit.com/r/gamedev/comments/93kr9h/recommended...
20 years later, Rust has iroh:
https://github.com/n0-computer/iroh
I realize there is some irony in pointing to a Rust library as a final solution.
But my point is that when developers reached high levels of financial success and power, they didn't go back to address the fundamentals. NAT was always an abomination to me, and as far as I know, they kept it in IPv6. Someone like Google should have provided a way to get around it that's not as heavy as WebRTC. So many developer-years of work have been wasted due to the mistakes of the status quo, so we wander in the desert for years using lackluster paradigms because we don't know that better stuff exists.
Knowing what I know now, I would have created open source C (portable) libraries to solve NAT punching, state transfer with a software transactional memory (STM) or Raft, entity state machines (like in Unity), movement prediction/dead reckoning, etc etc etc to form the basis of a distributed computing network for virtual worlds and let the developer community solve that. Someone will do that in a year or two with AI now I assume.
Ok you kinda got me. I realize after writing this out that I wouldn't use Rust for new work, but it's not so much about the language itself as building upon proven layers to "get real work done". The lower the level of abstraction, the harder that is to do. So it's hard for me to see the problem which Rust is trying to solve.
Lean 4.
It analyzes the code: if it finds RAII/linearity/single ownership, it does memory management exactly like Rust; if not, it falls back to reference counting.
So it does what Rust does, but automagically, without polluting the code.
So copy-on-write, pass-by-value, or doubling the memory are not the only options for improving on Rust.
By allocating twice the memory of ...?
Sorry, I should have elaborated. I believe that copy-on-write with virtual memory (VM) can be used to achieve a runtime that appears to use copy-by-value everywhere with near-zero overhead when the VM block size is small, like 4k.
If we imagine a function passing a block of memory to sub functions which may write bytes to it randomly, then each of those writes may allocate another block. If those allocations are similar in size to the VM block size, then each invocation can potentially double the amount of memory used.
A do-one-thing-and-do-it-well (DOTADIW?) program works in a one-shot fashion where the main process fires off child processes that return and free the memory that was passed by value. Surrounded by pipes, so that data is transmuted by each process and sent to the next one. VM usage may grow large temporarily per-process, but overall we can think of each concurrent process as roughly doubling the amount of memory.
Writing this out, I realized that the worst case might be more like every byte changing in a 4k block, so a 4096 times increase in memory. Which still might be reasonable, since we accept roughly a 200x speed decrease for scripting languages. It might be worth profiling PHP to see how much memory increases when every byte in a passed array is modified. Maybe they use a clever tree or refcount strategy to reduce the amount of storage needed when arrays are modified. Or maybe they just copy the entire array?
Another avenue of research might be determining whether a smarter runtime could work with "virtual" VMs (VVMs?) to use a really small block size, maybe 4 or 8 bytes to match the memory bus. I'd be willing to live with a 4x or 8x increase in memory to avoid borrow checkers, refcounts or garbage collection.
-
Edit: after all these years, I finally looked up how PHP handles copy-on-write, and it does copy the whole array on write unfortunately:
http://hengrui-li.blogspot.com/2011/08/php-copy-on-write-how...
If I were to write something like this today, I'd maybe use "smart" associative arrays of some kind instead of contiguous arrays, so that only the modified section would get copied. Internally that might be a B-Tree with perhaps 8 bytes per leaf to hold N primitives like 1 double, 2 floats, etc. In practice, a larger size like 16-256 bytes per leaf might improve performance at the cost of memory.
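As a std-only sketch of that idea (plain chunks instead of B-tree leaves, and all names and sizes made up), Rc::make_mut gives copy-on-write at whatever granularity you choose, so a write duplicates only the chunk it touches:

    use std::rc::Rc;

    const CHUNK: usize = 4096; // illustrative; the leaves described above would be much smaller

    #[derive(Clone)]
    struct CowBuffer {
        chunks: Vec<Rc<Vec<u8>>>, // cloning the buffer only copies pointers
    }

    impl CowBuffer {
        fn new(len: usize) -> Self {
            let n = (len + CHUNK - 1) / CHUNK;
            Self { chunks: (0..n).map(|_| Rc::new(vec![0u8; CHUNK])).collect() }
        }

        fn write(&mut self, index: usize, value: u8) {
            let chunk = &mut self.chunks[index / CHUNK];
            // make_mut clones this one chunk only if it is still shared;
            // every other chunk keeps pointing at the same memory.
            Rc::make_mut(chunk)[index % CHUNK] = value;
        }

        fn read(&self, index: usize) -> u8 {
            self.chunks[index / CHUNK][index % CHUNK]
        }
    }

    fn main() {
        let original = CowBuffer::new(1 << 20); // 1 MiB in 256 chunks
        let mut copy = original.clone();        // cheap "copy by value"
        copy.write(12345, 42);                  // duplicates exactly one 4k chunk
        assert_eq!(original.read(12345), 0);
        assert_eq!(copy.read(12345), 42);
        println!("only the touched chunk was copied");
    }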
Looks like ZFS deduplication only copies the blocks within the file that changed, not the entire file. Their strategy could be used for a VM so that copy-on-write between processes only copies the 4k blocks that change. Then if it was a realtime unix, functions could be synchronous blocking processes that could be called with little or no overhead.
This is the level of work that would be required to replace Rust with simpler metaphors, and why it hasn't happened yet.
Everything, everywhere, all the time! It’s so simple, why has no one ever thought of just increasing a finite resource!?