In computer science it’s been deterministic, but in other scientific disciplines (e.g. medicine) it’s not. Also, a lot of science looks non-deterministic until it isn’t (e.g. medicine is theoretically deterministic, but you have to reason about it experimentally and with probabilities - that doesn’t mean novel drugs aren’t technological advancements).
And while the kinds of errors haven’t changed, their quantity and severity have dropped dramatically in a relatively short span of time.
The problem has always been that every token is suspect.
What matters is the whole answer being correct, and if you compare GPT-3 with where we are today, only 5 years later, the progress in accuracy, knowledge, and intelligence is jaw-dropping.
I have no idea what you're talking about, because they still screw up in exactly the same way as GPT-3.