Another version of this issue is when you push back but you were NOT "right to push back". In other words, the LLM original solution was better than the pushback.
Most of the time my pushbacks are true improvements, but I've seen a couple of instances where the LLM was happy to downgrade their own good solution.
I've had those as well. Sometimes I'm asking clarifying questions because I'm not sure about the solution, and the LLM "interprets" that as pushback (as opposed to curiosity / enquiry), and sycophancy takes over. Sometimes it will simply change the code without ever answering the questions, or it will answer the questions along with it, but incorrectly - or with bad assumptions.
> Answer grounded in truth, with evidence and concrete proof, no guessing or assumptions allowed, no changes to files on disk.
I've used this a bunch as a suffix to try to prevent that, works OK in most cases, but not always obviously, works better in the system/developer prompt if you have access to those. Seems I've used that about ~1000 times since 2025/08 when I started using codex (- transcription duplications, so maybe 1/2 of that?).
> Another version of this issue is when you push back but you were NOT "right to push back". In other words, the LLM original solution was better than the pushback.
Indeed, it's easy to surface this by sending one model a "Review" of their proposal to another, then bounce them back and forward, ask which one is best and both models will almost always say something like "The other proposal/review is better", I'm guessing because somehow they think it comes from the human, and "human is always right" or something.