> The issue with your analogy is that calculators do not hallucinate. They do not make mistakes. An accountant is able to fully offload the mental overhead of arithmetic because the calculator is reliable.
If you've ever done any modeling or serious accounting, you'll find you feel more like a DBA than a "person punching on a calculator". You ask questions and then figure out how to get the answers you want by "querying" Excel cells. Often the querying doesn't even belong in quotes — it's literal.
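Not the commenter's actual workflow, but a minimal sketch of what that kind of "query" looks like once it outgrows a single formula (hypothetical ledger data, with pandas standing in for the Excel sheet):

```python
import pandas as pd

# Hypothetical ledger -- the kind of table that normally lives in a sheet.
ledger = pd.DataFrame({
    "month":    ["Jan", "Jan", "Feb", "Feb", "Mar"],
    "category": ["rent", "travel", "rent", "travel", "rent"],
    "amount":   [1200.0, 340.5, 1200.0, 125.0, 1200.0],
})

# The "query": total Q1 spend per category -- the same question you'd
# answer in Excel with SUMIFS or a pivot table.
q1_by_category = (
    ledger[ledger["month"].isin(["Jan", "Feb", "Mar"])]
    .groupby("category")["amount"]
    .sum()
)
print(q1_by_category)
```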
To me, the analogy of the parent is quite apt.
But the database doesn't hallucinate data; it always does exactly what you ask and gives you reliable numbers, unless you explicitly ask it for a random operation.
I agree databases don't hallucinate, but somehow most databases still end up full of garbage.
Whenever humans are doing the data entry, you shouldn't trust your data. It's not the same as LLM hallucinations, but it's not entirely different either.
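Which is why the standard defense is the same as with model output: validate before you trust. A tiny sketch, with made-up rules and rows:

```python
# Hypothetical sanity checks on human-entered expense rows.
def validate(row):
    errors = []
    if row["amount"] <= 0:
        errors.append("non-positive amount")
    if not row["date"].startswith(("2024", "2025")):
        errors.append("implausible year")
    return errors

rows = [
    {"date": "2025-03-14", "amount": 42.0},
    {"date": "1925-03-14", "amount": -5.0},  # typical fat-finger entries
]
for row in rows:
    print(row, validate(row) or "ok")
```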
I really don't understand the hallucination problem now in 2025. If you know what you're doing, you know what you need to get from the LLM, and you can describe it well enough that it would be hard to screw up, LLMs are incredibly useful. They can nearly one-shot an entire skeleton architecture that I only need to nudge into the right place before adding what I want on top of it. Yes, I run into code from LLMs that I have to tweak, but it has been incredibly helpful for me. I haven't had hallucination problems in a couple of years now...
> I really don't understand the hallucination problem now in 2025
Perhaps this OpenAI paper would be interesting then (published September 4th):
https://arxiv.org/pdf/2509.04664
Hallucination is still absolutely an issue, and it doesn’t go away by reframing it as user error, saying the user didn’t know what they were doing, didn’t know what they needed to get from the LLM, or couldn’t describe it well enough.
That is why you check your results. If you know what the end outcome should be, it doesn't matter if it hallucinates. Even when it does, it has probably already done 90% of the work, which leaves you less to finish yourself.
This only works for classes of problems where checking the answer is easier than doing the calculation: things like making a visualization, writing simple functions, etc. For those, it's definitely easier to use an LLM.
But a lot of software isn't like that. You can introduce subtle bugs along the way, so verifying is at least as hard as writing it in the first place. Likely harder, since writing code is easier than reading it for most people.
Exactly, thank you.
I recognize that an accountant’s job is more than just running a bunch of calculations and getting a result. But part of the job is doing that, and it would be a real PITA if their calculator was stochastic. I would definitely not call it a productivity enhancer.
If my calculator sometimes returned incorrect results I would throw it out. And I say this as an MLE who builds neural nets.
You still make mistakes. Just because you did it yourself doesn't mean it's error-free. The more complex the question, the more error-prone the work.
Thankfully, the more complex the question, the more likely it is that there's more than one way to derive the answer, and you can use that to cross-check.
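For arithmetic-style work that's the old cross-footing trick: derive the same total two independent ways and insist they agree. A minimal sketch with made-up numbers:

```python
# Grand total derived two independent ways: sum the row totals,
# then sum the column totals, and compare.
table = [
    [100.0, 250.0,  75.0],   # Jan
    [110.0, 240.0,  80.0],   # Feb
    [105.0, 260.0,  70.0],   # Mar
]

total_by_rows    = sum(sum(row) for row in table)
total_by_columns = sum(sum(col) for col in zip(*table))

assert abs(total_by_rows - total_by_columns) < 1e-9
print(total_by_rows)
```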