I've been stunned by how many smart people talk so casually about LLMs becoming better at math. Do they just forget that a calculator that is wrong 1% of the time is a de facto calculator that doesn't work and should not be used?
> I've been stunned by how many smart people talk so casually about LLMs becoming better at math
Could they be referring to this?
"Advanced version of Gemini with Deep Think officially achieves gold-medal standard at the International Mathematical Olympiad" https://deepmind.google/discover/blog/advanced-version-of-ge...
Doing math is not the same as calculating. LLMs can be very useful in doing math; for calculating they are the wrong tool (and even there they can be very useful, but you ask them to use calculating tools, not to do the calculations themselves—both Claude and ChatGPT are set up to do this).
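To make the tool-use point concrete, here is a minimal sketch of the pattern, assuming the OpenAI Python SDK (>=1.0) with an API key in the environment; the "calculate" tool name and schema are purely illustrative, not the actual setup either vendor ships:

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # A hypothetical calculator tool the model can delegate arithmetic to.
    tools = [{
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Evaluate an arithmetic expression and return the numeric result.",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {"type": "string", "description": "e.g. '123456 * 789'"},
                },
                "required": ["expression"],
            },
        },
    }]

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "What is 123456 * 789?"}],
        tools=tools,
    )

    msg = response.choices[0].message
    if msg.tool_calls:
        # The model asked to use the calculator: the application evaluates the
        # expression (with a real parser, not eval on untrusted input) and sends
        # the result back in a follow-up message for the model to phrase.
        call = msg.tool_calls[0]
        print(call.function.name, call.function.arguments)
    else:
        # The model answered directly, i.e. it did the arithmetic itself.
        print(msg.content)

The point of the pattern is the division of labor: the model decides that a calculation is needed and what to compute, while a deterministic tool does the arithmetic.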
If you're curious, check out how mathematicians like Robert Ghrist or Terence Tao are using LLMs for math research, both have written about it online repeatedly (along with an increasing number of other researchers).
Apart from assisting with research, their ability on e.g. math olympiad problems is periodically measured and objectively rapidly improving, so this isn't just a matter of opinion.
The best math lecturers I had at university sucked at mental calculations. Some almost screwed up 2+2 on the blackboard.
Yes, LLMs suck at calculating stuff. However, they can manipulate equations and such, sometimes impressively so.
You realize that when typing into a calculator, you probably hit a wrong key more than 1% of the time? Which is why you always type important calculations twice?
I've been stunned by how many smart people talk so casually about how because LLMs aren't perfect, they therefore have no value. Do they just forget that nothing in the world is perfect, and the values of things are measured in degrees?
There’s a big difference between mistyping 1% of the time yourself (human error) and a calculator failing 1% of the time (machine error). I am willing to bet there isn’t a company out there (maybe a handful of less scrupulous ones) that has knowingly shipped a calculator that got it wrong 1% of the time, especially in previous decades when countless people were using a dedicated calculator dozens of times a day. Hard to imagine a 1% margin of error being acceptable.
Not to mention now you have the compounded problem of your mistakes plus the calculator’s mistakes.
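To put a rough number on the compounding (assuming, purely for illustration, independent 1% error rates for both you and the calculator): the chance that at least one of you gets it wrong is 1 − (1 − 0.01)(1 − 0.01) = 1 − 0.9801 ≈ 2% per calculation.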
The computer on your desk experiences a nonzero rate of errors just holding values in memory.
Yes, it's not 1%, but the argument is about them being imperfect devices. It's not a horrible thing to start with the presumption that calculators are not perfect.
Yes, but I don’t depend on the output of my computer’s memory in such explicit terms, and errors there don’t have lasting consequences. If my calculator literally gives me the wrong answer 1% of the time, that’s a bigger problem.
There isn't a difference in the big picture. Error is error. Even when we have incredibly reliable things, there's error when they interface with humans. Humans have error interfacing with each other.
But you seem to have missed the main point I was making. See? Another error. They're everywhere! ;)
> But you seem to have missed the main point I was making. See? Another error. They're everywhere! ;)
Ah, but whose error? ;)
> But you seem to have missed the main point I was making. See? Another error. They're everywhere! ;)
You really could’ve done without this bit.