Hacker News

Mostly agree, however this kind of quirk could issue entirely from post-training, where the preferences/habits of a tiny number of people (relative to the main training corpus) can have outsize influence of the style of the model's output. See also the "delve" phenomenon.