We're going to see increasingly more of these, and it's going to cause a big scandal at one point, that pops the current AI bubble. It's really obvious that you can't use non-deterministic systems this way but companies are hellbent on doing it anyway. This is why I won't take a role to implement "AI" in an existing product.
I don’t understand why people seem to be attacking the “non-determinism” of LLMs. First, I think most people are confusing “probabilistic” with “non-deterministic” which have very distinct meanings in CS/ML. Non-deterministic typically entails following multiple paths at once. Consider regex matching with NFAs or even the particular view of a list as a monad. The only case where LLMs are “non-deterministic” is when using sampling algorithms like beam search where multiple paths are considered simultaneously. But most LLM usage being discussed doesn’t involve beam search.
But even if one assumes people mean “probabilistic”, that’s also an odd critique given how probabilistic software has pretty much eaten the world. Most of my career has been building reliable product using probabilistic models.
Finally, there’s nothing inherently probabilistic or non-deterministic about LLM generation, these are properties of the sampler applied. I did quite a lot of LLM benchmarking in recent years and almost always used greedy sampling both for performance (doing things like GSM8K strong benefits from choosing the maximum likely path) and reproducibility. You can absolutely set up LLM tools that have perfectly reproducible results. LLMs have many issues but their probabilistic nature is not one of them.
I think you're being overly (and incorrectly) pedantic about the meaning of "non-deterministic" -- you're applying the fairly niche definition of the term as used on finite automata, when the people you're refuting are using it in the sense of https://en.wikipedia.org/wiki/Nondeterministic_algorithm: "In computer science and computer programming, a nondeterministic algorithm is an algorithm that, even for the same input, can exhibit different behaviors on different runs, as opposed to a deterministic algorithm." I think this usage of the term is more common than the finite automata sense. Dictionary.com doesn't have non-deterministic, but its (relevant) definition of deterministic is "of or relating to a process or model in which the output is determined solely by the input and initial conditions, thereby always returning the same results": https://www.dictionary.com/browse/deterministic
Under that definition of (non-)deterministic, ironically, an NFA is deterministic, because it always produces the same result for the same input.
I'm not being pedantic, this is foundational comp-sci stuff drawing from the theory of computation (it's what the 'N' stands for in NP complete). That is not a particularly great (or relevant) wikipedia article you link to (you can look at the citations). The one on "Non-deterministic programming"[0] is probably better. But ultimately you can't just dismiss NFAs as these serve as the foundation for computational non-determinism. Automata theory isn't just some niche area of computing it's part of how we actually define what computation is.
We can just go straight to the Sipser (from the chapter 1, all emphasis is Sipser's)[1]:
> Nondeterminism is a useful concept that has had great impact on the theory of computation. So far in our discussion, every step of a computation follows in a unique way from the preceding step. When the machine is in a given state and reads the next input symbol, we know what the next state will be--it is determined. We call this deterministic computation. In a nondeterministic machine several choices may exist for the next state at any point.
> How does an NFA compute? Suppose that we are running an NFA on an input string and come to a state with multiple ways to proceed. For example, say that we are in state q_1 in NFA N_1 and that the next input symbol is a 1. After reading that symbol, the machine splits into multiple copies of itself and follows all the possibilities in parallel. Each copy of the machine proceeds and continues as before.
This is why the list monad also provides a useful way to explore non-determinism that mirrors in functional programming terms what NFAs do in a classical theory of computation framework.
To this point, LLMs can form this type of nondeterministic computing when they follow multiple paths at once doing beam search, but are unquestionably deterministic when doing greedy optimization, and still deterministic when using other single path sampling techniques and a known seed.
[0]. https://en.wikipedia.org/wiki/Nondeterministic_programming
[1]. https://cs.brown.edu/courses/csci1810/fall-2023/resources/ch...
You're just explaining what the word "nondeterministic" means when you put it before "finite automaton", which doesn't have much to do with what "nondeterministic" means in other places.
There was an article on hackernews a few years back (before LLMs took over) about jobs that could be replaced by a sign saying "$default_result" 99% of the time.
Like being a cancer diagnostician. Or an inspector at a border crossing.
Using LLMs is currently a lot like going to a diagnostian that always responds "no, you're healthy". The answer is probably right. But still we pay people a lot to get that last 1%.
> But still we pay people a lot to get that last 1%.
If people paid for docs as an independent product, or had the foresight to evaluate the quality of the docs before making a purchase and use it as part of their criteria (or are able to do that at all), I think attitudes around docs and "docs bots" and their correctness, utility etc. would be a lot different.
"chaotic system" might be more precise here: small variations in the input may result in arbitrary large differences in the output.
It's not entirely unrelated, the fact that the system is non-deterministic means that it necessarily is probabilistic.
A business can reduce temperature to 0 and choose a specific seed, and it's the correct approach in most cases, but still the answers might change!
On the other hand, it's true that there is some probability that is independent of determinism, for example maybe changing the order of some words might yield different answers, this might be a deterministic machines, but there's millions of ways to frame a question, if the answer depends on trivial details of the question formatting, there's a randomness there. Similar to how there is randomness in who will win a chess match between two equally rated players, despite the game being deterministic.
> the system is non-deterministic means that it necessarily is probabilistic.
This is not correct. Both of the examples I gave where specifically chosen because they use non-determinism without any probabilistic framework associated.
Regex matching using non-deterministic finite automata requires absolutely zero usage of probability. You simply need to keep track of multiple paths and store whether or not any are in valid state at the end of processing the string. The list monad as non-determinism is an even more generic model of non-determinism, that again, requires nothing probabilistic in it's reasoning.
Non-deterministic things do often become probabilistic because typically you have to make a choice of paths, and that choice can have a probabilistic nature. But again, NFA regex matching is a perfect example where no "choice" is needed.
[dead]