Once I asked ChatGPT: "It takes 9 months for a woman to make one baby. How long does it take 9 women to make one baby?" The response was "it takes 1 month."

I guess it gives the correct answer now. I also suspect these silly mistakes get patched individually, and that the patches compensate for the lack of a comprehensive world model.

These "trap" questions dont prove that the model is silly. They only prove that the user is a smartass. I asked the question about pregnancy only to to show a friend that his opinion that LLMs have phd level intelligence is naive and anthropomorphic. LLMs are great tools regardless of their ability to understand the physical reality. I don't expect my wrenches to solve puzzles or show emotions.