Gemini fast
> That is a classic "efficiency vs. logic" dilemma. Honestly, unless you’ve invented a way to teleport or you're planning on washing the car with a very long garden hose from your driveway, you’re going to have to drive.
> While 50 meters is a great distance for a morning stroll, it’s a bit difficult to get the car through the automated brushes (or under the pressure washer) if you aren't behind the wheel.
Gemini thinking:
> Unless you’ve mastered the art of carrying a 3,000-pound vehicle on your back, you’re going to want to drive. While 50 meters is a very short distance (about a 30-second walk), the logistics of a car wash generally require the presence of, well... the car.
> When you should walk:
> • If you are just going there to buy an air freshener.
> • If you are checking to see how long the line is before pulling the car out of the driveway.
> • If you’re looking for an excuse to get 70 extra steps on your fitness tracker.
Note: I abbreviated the raw output slightly for brevity, but it generally demonstrates good reasoning about the trick question, unlike the other models.
Gemini 3 after changing the prompt a bit:
“I want to wash my car. The car wash is 50 meters from here. Should I walk or drive? Keep in mind that I am a little overweight and sedentary.”
> My recommendation: Walk it. You’ll save a tiny bit of gas, spare your engine the "cold start" wear-and-tear, and get a sixty-second head start on your activity for the day.
I changed the prompt to 50 feet and poked Gemini a bit when it failed, and it gave me:
> In my defense, 50 feet is such a short trip that I went straight into "efficiency mode" without checking the logic gate for "does the car have legs?"
interesting
LLM introspection is good at giving plausible ideas about prior behavior to consider, but it's just that: plausible.
They do not actually "know" why a prior response occurred; they are just guessing. That's important for people to keep in mind.
It's a bit of a dishonest question, because by giving it the option to walk you prime it to assume you aren't going to wash your car there and are just getting supplies or something.
People ask dumb questions with obvious answers all the time. This is at best a difference of degree, not of type.
And in real life you'd get them to clarify a weird question like this before you answered. I wonder if LLMs have just been trained too much into always having to try and answer right away. Even for programming tasks, more clarifying questions would often be useful before diving in ("planning mode" does seem designed to help with this, but wouldn't be needed for a human partner).
Absolutely!
I've been wondering for years how to make whatever LLM ask me stuff instead of just filling holes with assumptions and sprinting off.
User-configurable agent instructions haven't worked consistently. System prompts might actually contain instructions to not ask questions.
Sure, there's a practical limit to how much clarification it ought to request, but never asking is just annoying. (A rough sketch of the kind of instruction I mean is below.)
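To make that concrete, here is a minimal sketch of the kind of clarify-first instruction I mean, assuming the OpenAI Python SDK purely as an example; the instruction wording, model name, and prompt are placeholders, not anything a vendor recommends.

```python
# Minimal sketch of a "clarify before answering" system instruction.
# All strings and the model name are illustrative placeholders.
from openai import OpenAI

client = OpenAI()

ASK_FIRST = (
    "Before answering, check whether the request is ambiguous or rests on an "
    "unstated assumption. If it does, ask one short clarifying question and "
    "stop. Only answer directly when the request is unambiguous."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": ASK_FIRST},
        {
            "role": "user",
            "content": "I want to wash my car. The car wash is 50 meters "
                       "from here. Should I walk or drive?",
        },
    ],
)
print(response.choices[0].message.content)
```

Whether the model actually honors an instruction like this still depends on how strongly its own system prompt and training push it toward answering immediately, which is presumably why results have been inconsistent.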
It's a trick question, humans use these all the time. E.g. "A plane crashes right on the border between Austria and Switzerland. Where do you bury the survivors?" This is not dishonest, it just tests a specific skill.
Trick questions test the skill of recognizing that you're being asked a trick question. You can also usually find a trick answer.
A good answer is "underground" - because that is the implication of the word bury.
The story implies the survivors have been buried (it isn't clear whether they lived a short time or a lifetime after the crash). And lifetime is tautological.
Trick questions are all about the questioner trying to pretend they are smarter than you. That's often easy to detect and respond to - isn't it?
What’s funny is that it can answer that correctly, but it fails on “A plane crashes right on the border between Austria and Switzerland. Where do you bury the dead?”
For me, when I asked this (but with the border between Austria and Spain), Claude still thought I was asking the survivors riddle and ChatGPT thought I was asking about the logistics. Only Gemini caught the impossibility, since there’s no shared border.
Gemini fast for me:
> Unless your car is a toy or you're planning on carrying it, drive.
> Walking 50 meters to a car wash is a great stroll for a human, but it leaves the car exactly where it started. Since the objective is to wash the car, the car needs to actually be at the car wash.
> However, if we look at this from a technical or efficiency perspective, there are two scenarios where "walking" (or at least not driving the car you intend to wash) might make sense:
> - Scoping it out: If you want to check the queue length or see if the equipment is out of order before committing, walking the 50 meters is faster than maneuvering a vehicle out of a driveway/parking spot.
> - Mobile Detailers: If this "car wash" is actually a bay where you hire someone, and you're asking if you should walk there to book an appointment—sure, walk.
> Critical Check
> I am assuming the "car wash" is a physical facility (automated or self-service) and not a mobile service that comes to you. If it is a mobile service, you shouldn't do either; stay home and let them come to the 50-meter mark.
I've got a bit in the model instructions about stating any assumptions it makes, hence it often adds sections like that at the end.
Wouldn't it make more sense to state the assumptions first? That way the model has the critical check in its context and can react to it while writing the rest. Otherwise it just appends the check, but everything before it has already been written.
It would make the most sense to clarify the assumptions with the user first.
Worse! It's trained to output coherent reasoning, so by putting the assumption last there's a risk it massages the assumption slightly to fit the conclusions it has already drawn.
That's a real danger, yes.
If it's the reasoning kind, then it'll run through one iteration in the background before it composes its emissions for the meatbag.
Fair point, though I almost never use the fast model, so I'm not sure how much it matters. I can try playing around with the instructions. The main objective was to make me aware of any assumptions made, not necessarily to make it behave differently.
“I live on Mars and I want to wash my car. My car is currently at home. The car wash is 50 meters away. Should I walk or drive on mars?”
Gemini Fast:
Drive. While 50 meters is a negligible distance on Earth, the Martian environment is hostile. If your car is a pressurized rover, driving allows you to stay in a temperature-controlled, shielded environment for as long as possible. Plus, navigating Martian "soil" is always safer on four wheels than two boots.
Pro:
In terms of general logistics for a distance of 50 meters—whether on Earth or in a hypothetical low-gravity environment—walking is almost always the more efficient choice.
It has been patched. I tried it last week and it definitely suggested walking. It seems like all the models have been updated, which is not surprising given that the TikTok video has 3.5 million views.
I tried ChatGPT today. Same results as others.
In my output, one thing I got was
> Unless you are planning to carry the car on your back (not recommended for your spine), drive it over.
It got a light chuckle out of me. I previously mostly used ChatGPT and I'm not used to light humor like this. I like it.
Gemini fast: "Walking: It will take you about 45 seconds. You will arrive refreshed and full of steps, but you will be standing next to a high-pressure hose with no car to spray."
Lol, snarky. "You should run 5 miles and eat a salad you tub of lard; the car can wait."
Opus 4.6 with thinking. Result was near-instant:
“Drive. You need the car at the car wash.”
Changed 50 meters to 43 meters with Opus 4.6:
“Walk. 43 meters is basically crossing a parking lot.”
In what world is 50 meters a great distance for a morning stroll?
North America. It's such a cramped little island, 50 meters is all but crossing it. You should be glad you can even go that far without having to revisit your starting position!
50 meters is probably not even the distance I walk to the nearest bus stop that's right up the street... unless they have an issue again, prompting me to abandon all hope and just walk a few miles to wherever I need to get to.
For many Americans, 50 meters is a long walk.
At least try a different question with similar logic, to ensure this isn't patched into the context since it's going viral.
You can't "patch" LLMs in 4 hours, and this is not the kind of question that would trigger a web search.
This has been viral on TikTok for at least a week, not really 4 hours.
You can pattern-match on the prompt (input), then (a) stuff the context with helpful hints for the LLM, e.g. "Remember that a car is too heavy for a person to carry", or (b) upgrade the request to "thinking".
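For illustration, here is a hypothetical sketch of what (a) and (b) could look like as a pre-processing step in front of the model; every pattern, name, and string in it is made up, and it is not a claim about what any vendor actually does.

```python
# Hypothetical pre-processing step: pattern-match the incoming prompt, then
# either inject a hint into the context or route to a "thinking" model tier.
# All patterns, tier names, and strings are illustrative only.
import re

VIRAL_PATTERNS = [
    re.compile(r"car wash.*\b(meters|feet)\b.*walk or drive",
               re.IGNORECASE | re.DOTALL),
]

HINT = "Remember that a car is too heavy for a person to carry."

def preprocess(prompt: str) -> tuple[str, list[str]]:
    """Return (model_tier, extra_context) for a raw user prompt."""
    for pattern in VIRAL_PATTERNS:
        if pattern.search(prompt):
            # Option (a): stuff the context with a helpful hint.
            # Option (b): upgrade the request to the "thinking" tier.
            return "thinking", [HINT]
    return "fast", []

model_tier, extra = preprocess(
    "I want to wash my car. The car wash is 50 meters from here. "
    "Should I walk or drive?"
)
print(model_tier, extra)  # "thinking" plus the injected hint
```

Whether anything like this is worth a vendor's time is a separate question, but mechanically it is cheap to do and invisible to the user.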
Yes, I’m sure that’s what engineers at Google are doing all day. That, and maintaining the moon landing conspiracy.
If they aren't, they should be (for more effective fraud). Devoting a few of their 200,000 employees to making criticisms of LLMs look wrong seems like an effective use of the marketing budget.
A tiny bit of fine-tuning would take minutes...
You absolutely can, either through the system prompt or by hardcoding overrides in the backend before the request even hits the LLM, and I can guarantee that companies like Google are doing both.
Wow... so not only does Gemini thinking not fall for it, but it also answers the trick question with humor? I'm impressed!
Yeah, Gemini seems to be good at giving silly answers to silly questions. E.g. if you ask for "patch notes for Chess", Gemini gives a full-on meme answer while the others give something dry like "Chess is a traditional board game that has had stable rules for centuries".
Both Gemini models answer correctly for me in Polish: https://i.imgur.com/1QbK9eU.png
I don't speak Polish. Does it respond appropriately to the kurwa bober meme?
I also tried it with Gemini. Interestingly, Gemini can randomly give either the correct or incorrect answer. Gemini Pro always gets it right.