Nice article and explanations!
On a tangential note, I keep noticing phrases like "why X matters" and "this is crucial here" that remind me of Claude. Lately Claude has been gaslighting me on complex problems with exactly these statements, so seeing them in an article is low-key infuriating at this point. I can't trust Claude anymore on the hardest problems: it sometimes gets the answer right but completely misses the point, introducing huge, complex blocks of code and logic accompanied by precisely "why it matters" and "this is crucial here".
I've seen many Reddit posts about this AI-induced 'psychosis', where people end up believing the words generated for them without applying sufficient critical thought.
This sycophancy is a serious problem: it exploits a weakness in the human psyche (flattery) that may be easier for RLHF to find reward in than genuinely correct responses.
This problem is especially pervasive in companies where less technical individuals (who also happen to be the decision makers) use AI to fight/challenge the technical knowledge of their SMEs. It's super annoying. SMEs hold real gold in the form of niche/tribal knowledge that, by the grace of html Jesus, is not always documented well enough for an AI to absorb it into its pseudo-aggregate data sphere.
Billionaires in psychosis are a whole other level (Shepard tone) of concerning.