> (to be pedantic "Excel" doesn't make mistakes, people trusting its defaults do)
So what is your point? An expert who has mastered Excel doesn't have to check that Excel calculated things correctly; they just need to check that they gave Excel the right inputs and formulas. That is not true for an LLM: you do have to check that it actually did what you asked, regardless of how good you are at prompting.
The only thing I trust an LLM to do correctly is translation; they are very reliable at that. Other than that, I always verify.
"Just" check that every cell in the million-row xlsx file is correct.
See the issue here?
Excel has no proper built-in validation or test suite, and I'm not sure about third-party ones. The last time I checked, some years back, there was about one, and it didn't do much.
All it takes is one person accidentally or unknowingly entering static data on top of a few formulas in the middle, and nobody will catch it. Or Excel "helps" by changing the SEPT1 gene to "September 1, 2025"[0] - this case got so bad they had to RENAME the gene to make Excel behave. "Just" doing it properly didn't work at scale.
The point I'm trying to get at here is that neither tool is perfect, and both require validation afterwards. With agentic coding we can verify the results: we have the tools for it, and the agent can run them automatically.
In this case Excel is even worse, because one human error can escalate massively and there is no simple way to verify the output: Excel has no equivalent of unit tests or validators.
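To make the gene-name point concrete, here is a minimal sketch of the kind of check Excel itself doesn't give you. The column name, the regex, and the sample data are all hypothetical; it just flags cells in a gene column that look like Excel's silent date coercion (SEPT1 famously comes back as something like "1-Sep"):

```python
import csv
import io
import re

# Hypothetical check: flag cells in a "gene" column that look like a
# coerced date (e.g. "1-Sep" or "2025-09-01") rather than a gene symbol.
DATE_LIKE = re.compile(r"^\d{1,2}-[A-Za-z]{3}$|^\d{4}-\d{2}-\d{2}$")

def flag_coerced_cells(rows, column):
    """Yield (spreadsheet_row_number, value) for date-looking cells."""
    for i, row in enumerate(rows, start=2):  # row 1 is the header
        value = row.get(column, "")
        if DATE_LIKE.match(value):
            yield i, value

# Sample export: SEPT1 has already been mangled into "1-Sep" by Excel.
data = io.StringIO("gene,count\nTP53,10\n1-Sep,7\nBRCA1,3\n")
hits = list(flag_coerced_cells(csv.DictReader(data), "gene"))
# → [(3, "1-Sep")]
```

Nothing fancy, but this is exactly the kind of automated validator the Excel workflow lacks and the agentic-coding workflow can run on every change.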
[0] https://www.progress.org.uk/human-genes-renamed-as-microsoft...
You are describing a garbage in, garbage out problem. However, LLMs introduce a new type of issue: the "valid data in, garbage out" problem. The existence of the former doesn't make the latter less of an issue.
"Just" checking a million rows is trivial, depending on the types of checks you're running. In any case, you would never want a check that yields false positives and false negatives, since that defeats the entire purpose of the check.