Hold on a second. A transformer deterministically produces a probability distribution over the token alphabet from the context. Then one samples from this distribution. That sampling is random, and it's meant to be random.
The sampling process isn't inherently random. If you sample with the same sampling parameters and the same values for them, you will always get the same results. You only start seeing "non-deterministic" behavior when you use more complex systems outside your control, like multi-GPU setups and batch processing. A single LLM sampled with prompt caching off and batch processing off will always generate the same results if all inputs are the same.
It's possible to deterministically sample from a probability distribution. For example, just seed your RNG with a constant, or with the SHA256 hash of the context.
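A minimal sketch of that idea in Python, assuming an illustrative `sample_next_token` helper and a toy stand-in for the model's next-token distribution: the RNG is seeded with the SHA-256 hash of the context, so identical contexts always produce identical samples.

```python
import hashlib
import random

def sample_next_token(context: str, tokens: list[str], probs: list[float]) -> str:
    # Derive a seed from the context so the "random" sample is reproducible.
    seed = int.from_bytes(hashlib.sha256(context.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    # Sample one token from the weighted next-token distribution.
    return rng.choices(tokens, weights=probs, k=1)[0]

# Same context -> same sampled token, every time.
print(sample_next_token("The cat sat on the", ["mat", "dog", "moon"], [0.7, 0.2, 0.1]))
print(sample_next_token("The cat sat on the", ["mat", "dog", "moon"], [0.7, 0.2, 0.1]))
```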
Well yes, you can "hack" the pseudorandom number generator, but... that's not really the point when talking about determinism in LLMs, is it? I mean the mathematical idea of the standard LLM is certainly truly random.
> I mean the mathematical idea of the standard LLM is certainly truly random.
Not really; LLMs give you a distribution over possible next tokens, and you are free to sample from that distribution however you want. There is no need to hack the RNG or anything: for example, you can simply take a greedy approach and always output the most likely token, in which case the LLM becomes deterministic (mathematically). This is equivalent to setting the temperature to 0.
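A quick sketch of the greedy case, with a toy next-token distribution standing in for the model's output (names here are illustrative): picking the argmax involves no RNG at all, so the output is fully determined by the context.

```python
def greedy_next_token(token_probs: dict[str, float]) -> str:
    # Argmax over the next-token distribution; no randomness involved.
    return max(token_probs, key=token_probs.get)

probs = {"mat": 0.7, "dog": 0.2, "moon": 0.1}
print(greedy_next_token(probs))  # "mat", on every run
```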