An old concept indeed! I think about this Ed Fredkin story a lot... In his words:
"Just a funny story about random numbers: in the early days of computers people wanted to have random numbers for Monte Carlo simulations and stuff like that and so a great big wonderful computer was being designed at MIT’s Lincoln laboratory. It was the largest fastest computer in the world called TX2 and was to have every bell and whistle possible: a display screen that was very fancy and stuff like that. And they decided they were going to solve the random number problem, so they included a register that always yielded a random number; this was really done carefully with radioactive material and Geiger counters, and so on. And so whenever you read this register you got a truly random number, and they thought: “This is a great advance in random numbers for computers!” But the experience was contrary to their expectations! Which was that it turned into a great disaster and everyone ended up hating it: no one writing a program could debug it, because it never ran the same way twice, so ... This was a bit of an exaggeration, but as a result everybody decided that the random number generators of the traditional kind, i.e., shift register sequence generated type and so on, were much better. So that idea got abandoned, and I don’t think it has ever reappeared."
And still today we spend a great deal of effort trying to make our randomly-sampled LLM outputs reproducibly deterministic:
https://thinkingmachines.ai/blog/defeating-nondeterminism-in...
can't you just save the seed?
My understanding is that because GPUs do operations in a highly parallelized fashion, and because floating-point operations aren't associative, the seed isn't enough once you're using GPUs, no. You'd need the seed plus the specific order in which each of the intermediate steps of the calculation was finished by the various streaming multiprocessors.
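To make the order-dependence concrete: floating-point addition is commutative (a + b == b + a), but it is not associative, so regrouping or reordering a sum can change the rounded result. Here's a minimal CPU-only sketch in plain Python. The `sum_in_order` helper and the `order` lists are made up for illustration and only stand in for "whichever partial result happened to finish first"; this is not how any particular GPU kernel schedules its reductions.

```python
# Floating-point addition: commutative, but NOT associative.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c    # group the first two terms first
right = a + (b + c)   # group the last two terms first

def sum_in_order(values, order):
    """Hypothetical helper: accumulate `values` in the index order given by
    `order`, standing in for the order in which parallel partial results
    arrive at a reduction."""
    total = 0.0
    for i in order:
        total += values[i]
    return total

# Same four numbers, two different accumulation orders.
values = [1e16, 1.0, -1e16, 1.0]
run_a = sum_in_order(values, [0, 1, 2, 3])  # the first 1.0 is rounded away next to 1e16
run_b = sum_in_order(values, [0, 2, 1, 3])  # the big terms cancel first, both 1.0s survive

print(left == right)   # False
print(run_a, run_b)    # 1.0 2.0
```

Nothing here involves a seed at all: the inputs are identical and only the order differs, which is why saving the seed alone doesn't pin down the result.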
It's funny because that did actually reappear at some point with RDRAND. But it's still only really used for cryptography; if you just need a random distribution, almost everyone uses a PRNG (a non-cryptographic one is a lot faster still, apart from being deterministic).
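A rough sketch of the debugging trade-off from the TX2 story, in Python. Assumptions: `estimate_pi` is a toy Monte Carlo routine invented for this example, and `random.SystemRandom` (which draws from the OS entropy source via `os.urandom`) is only a stand-in for a hardware source; Python's standard library doesn't expose RDRAND directly, and whether the OS mixes it in depends on the platform.

```python
import random

def estimate_pi(rng, n):
    """Toy Monte Carlo estimate of pi: the fraction of random points in the
    unit square that land inside the quarter circle, times 4."""
    inside = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n

# Seeded PRNG: the same seed replays the exact same sequence, so a run
# that misbehaves can be reproduced and debugged.
first = estimate_pi(random.Random(42), 10_000)
second = estimate_pi(random.Random(42), 10_000)
print(first == second)  # True

# OS entropy source: cannot be seeded, so there is no way to replay a run.
unseeded = estimate_pi(random.SystemRandom(), 10_000)
print(unseeded)  # varies from run to run
```

Both give a perfectly usable estimate; the difference is that only the seeded one can be run the same way twice.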