Hacker News

I was pointing out humans and LLMs have this failure mode so in a lot of ways it is no big deal/not some smoking gun that LLMs are useless and dangerous, or at least no more useless and dangerous than humans.

I personally would stay away from calling someone, or an LLM, 'stupid' for making this mistake because of several reasons. First, objectively intelligent high functioning people can and do mistakes similar to this so a blanket judgement of 'stupid' is pretty premature based on a common mistake. Second, everything is a probability, even in people. That is why scams work on security professionals as well as on your grandparents. The probability of a professional may be 1 in 10k while on your grandparents it may be 1/100 but that just means that the professional needs to get a lot more phishing attempts thrown at them before they accidentally bite. Someone/something isn't stupid for making a mistake, or even systemically making a mistake, everyone has blind spots that are unique to them. The bar for 'stupid' needs to be higher.

There are a lot of 'gotcha' articles like this one that point out some big mistake an LLM made or systemic blind spot in current LLMs and then conclude, or at least heavily imply, LLMs are dangerous and broken. If the whole world put me under a microscope and all of my mistakes made the front page of HN there would be no room left for anything other than documentation of my daily failures (the front page would really need to grow to just keep up with the last hour worth of mistakes more than likely).

I totally agree with the language ambiguity point. I think that is a feature and not a bug. It allows creativity to jump in. You say something ambiguous and it helps you find alternative paths to go down. It helps the people you are talking to also discover alternative paths more easily. This is really important in conflicts since it can help smooth over ill intentions since both sides can try to find ways of saying things that bridge their internal feelings with the external reality of dialogue. Finally, we often really don't know enough but we still need to say something and like gradient descent, an ambiguous statement may take us a step closer to a useful answer.

  > I personally would stay away from calling someone, or an LLM, 'stupid' for making this mistake because of several reasons.

I wouldn't. Because there's a difference between calling someone's action stupid and saying that someone is stupid. These are entirely dependent upon the context of the claim. Smart people frequently do stupid stuff. I have a PhD and by some metric that makes me "smart" but you'll also see me do plenty of stupid stuff every single day. Language is fuzzy...

But I think responses like yours are entirely dismissive at what's being attempted to be shown. What's being shown is how easily they are fooled. Another popular example right now being the cup with a sealed top and open bottom (lol "world model"?).

  > There are a lot of 'gotcha' articles

The point isn't about getting some gotcha, it is about a clear and concise example of how these systems fail.

What would not be a clear and concise example is showing something that requires domain subject expertise. That's absolutely useless as an example to everyone that isn't a subject matter expert.

The point of these types of experiments is to make people think "if they're making these types of errors that I can easily tell are foolish then how often are they making errors where I am unable to vet or evaluate the accuracy of its outputs?" This is literally the Gell-Mann Amnesia Effect in action[0].

  > I totally agree with the language ambiguity point. I think that is a feature and not a bug.

So does everybody. But there are limits to natural language and we've been discussing them for quite a long time[1]. There is in fact a reason we invented math and programming languages.

  > Finally, we often really don't know enough but we still need to say something and like gradient descent, an ambiguous statement may take us a step closer to a useful answer.

Was this sentence an illustrative example?

Sometimes I think we don't need to say something. I think we all (myself included) could benefit more by spending a bit longer before we open our mouths, or even not opening them as often. There's times where it is important to speak out but there are also times that it is important to not speak. It is okay to not know things and it is okay to not be an expert on everything.

[0] https://themindcollection.com/gell-mann-amnesia-effect/

[1] https://www.cs.utexas.edu/~EWD/transcriptions/EWD06xx/EWD667...