> You would be surprised, however, at how much detail humans also need to understand each other.

But in this case, the context can be inferred. Why would I ask whether I should walk or drive to the car wash if my car is already at the car wash?

But then, why would you ask whether you should walk or drive if the car were at home? Either way, the answer is obvious, and there is no way to interpret it except as a trick question. Of course, the parsimonious assumption is that the car is at home, so assuming it is at the car wash is a questionable choice to say the least (otherwise there would be two cars in the situation, which the question doesn't mention).

But you're ascribing understanding to the LLM, which is not what it's doing. If the LLM understood you, it would realise it's a trick question and, assuming it was British, reply with "You'd drive it because how else would you get it to the car wash you absolute tit."

Even the higher-level reasoning models, while answering the question correctly, don't grasp the broader context: that the question is obviously a trick question. They still answer earnestly. Granted, it is a tool that is doing what you want (answering a question), but let's not ascribe more understanding than what is clearly observed - or than what we know about how LLMs work would suggest.

> They still answer earnestly.

Gemini at least is putting some snark into its response:

“Unless you've mastered the art of carrying a 4,000-pound vehicle over your shoulder, you should definitely drive. While 150 feet is a very short walk, it's a bit difficult to wash a car that isn't actually at the car wash!”

A marketing plan for the labs comes to mind: find AI tells, fix them, & astroturf on socials that only _your_ frontier model really understands the world.

I think a good rule of thumb is to default to assuming a question is asked in good faith (i.e. it's not a trick question). That goes for human beings and chat/AI models.

In fact, it's particularly true for AI models because the question could have been generated by some kind of automated process. e.g. I write my schedule out and then ask the model to plan my day. The "go 50 metres to car wash" bit might just be a step in my day.

> I think a good rule of thumb is to default to assuming a question is asked in good faith (i.e. it's not a trick question).

Sure, as a default this is fine. But when things don't make sense, the first thing you do is toss those default assumptions (and probably we have some internal ranking of which ones to toss first).

The normal human response to this question would not be to take it as a genuine question. For most of us, it quickly tips over into "this is a trick question".

Rule of thumb for who, humans or chatbots? For a human, who has their own wants and values, I think it makes perfect sense to wonder what on earth made the interlocutor ask that.

Rule of thumb for everyone (i.e. both). If I ask you a question, start by assuming I want the answer to the question as stated unless there is a good reason for you to think it's not meant literally. If you have a lot more context (e.g. you know I frequently ask you trick or rhetorical questions or this is a chit-chat scenario) then maybe you can do something differently.

I think being curious about the motivations behind a question is fine but it only really matters if it's going to affect your answer.

Certainly when dealing with technical problem solving I often find myself asking extremely simple questions and it often wastes time when people don't answer directly, instead answering a completely different question or demanding explanations of why I'm asking for certain information when I'm just trying to help them.

> Rule of thumb for everyone (i.e. both).

That's never been how humans work. Going back to the specific example: the question is so nonsensical on its face that the only logical conclusion is that the asker is taking the piss out of you.

> Certainly when dealing with technical problem solving I often find myself asking extremely simple questions and it often wastes time when people don't answer directly

Context and the nature of the questions matter.

> demanding explanations why I'm asking for certain information when I'm just trying to help them.

Interestingly, they're giving you information with this. The person you're asking doesn't understand the link between your question and the help you're trying to offer. This manifests as a belief that you're wasting their time, and they react accordingly. Serious point: invest in communication skills that help connect their needs with how your questions will help you meet them.

Sure, in a context in which you're solving a technical problem for me, it's fair that I shouldn't worry too much about why you're asking - unless, for instance, I'm trying to learn to solve the problem myself next time.

Which sounds like a very common, very understandable reason to think about motivations.

So even in that situation, it isn't simple.

This probably sucks for people who aren't good at theory-of-mind reasoning. But, perhaps surprisingly, that isn't the case for chatbots. They can be creepily good at it, provided they have the context - they just aren't instruction-tuned to ask short clarifying questions in response to a question, which humans do, and which would resolve most of these gotchas.

Therefore the correct response would be to inquire back to clarify the question being asked.
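To make that last point concrete, here's a minimal sketch of what "inquire back" could look like in practice. It assumes the OpenAI Python SDK and uses a placeholder model name; the system prompt and the car-wash question are purely illustrative, not something any lab actually ships.

```python
# Minimal sketch: nudge a chat model to question an inconsistent premise
# instead of answering it earnestly. Assumes the OpenAI Python SDK (v1.x)
# and OPENAI_API_KEY set in the environment; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()

SYSTEM_PROMPT = (
    "Before answering, check whether the question's premises are consistent. "
    "If something doesn't add up, ask one short clarifying question "
    "instead of answering."
)

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; any chat model would do here
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {
            "role": "user",
            "content": "My car is at the car wash 150 feet away. "
                       "Should I walk or drive there?",
        },
    ],
)

# With the instruction above, the expected behaviour is a clarifying
# question ("Is the car already at the car wash, or at home?") rather
# than an earnest walk-vs-drive recommendation.
print(response.choices[0].message.content)
```

Whether that behaviour should be the default is a product decision, of course - for most automated or scheduled queries you'd still want a direct answer.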