I hope this approach gets some visibility in the CPU field. It could obviously be improved with a special CPU instruction which simply races two reads and returns the first one which succeeds. She’s doing an insane amount of work, spawning multiple threads and so on (and burning lots of performance), all to work around the lack of dedicated support for this in silicon.
I actually hope it doesn't!
The results are impressive, but for the vast, vast majority of applications the actual speedup achieved is basically meaningless since it only applies to a tiny fraction of memory accesses.
For the use case Laurie mentioned (i.e. high-frequency trading), yes, absolutely, it's valuable (if you accept that a technology which doesn't actually achieve anything beyond transmuting energy into money is truly valuable).
For the rest of us, the last thing the world needs is a new way to waste memory, especially given its current availability!