Meh, I feel that the car wash test is probably the worst of all those LLM test questions. The question is basically logically inconsistent and expects the model to work around the inconsistency.

It seems like a fine question to me. If the question is "logically inconsistent" (IMO it's more that it's vague if you don't say why you're going there), then we want a model to either ask for the clarification that resolves the inconsistency so it can give a correct answer, or give an answer that outlines the different cases. Some models even fail when you say in the prompt that you need to wash your car.

Yeah, I guess it being vague is more what I meant. But even if you told the AI you need to wash the car, why are you asking it in the first place whether you should walk or drive there? The question just doesn't make much sense to me, and it doesn't look like it makes sense to the AIs either.

Riddles are IQ tests, not actual problems that you need to solve.