Would love if OpenAI did more of these types of posts. Off the top of my head, I'd like to understand:

- The sepia tint on images from gpt-image-1

- The obsession with the word "seam" as it pertains to coding

Other LLM phraseology that I cannot unsee is Claude's "___ is the real unlock" (try googling it or searching Twitter!). There's no way this phrase is overrepresented in the training data; I don't remember people saying it that frequently.

It was always funny how easy it was to spot the people using a Studio Ghibli style generated avatar for their Discord or Slack profile, just from that yellow tinge. A simple LUT or tone-mapping adjustment in Krita/Photoshop/etc. would have dramatically reduced it.
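
A minimal sketch of that kind of rebalance, assuming Pillow; the gain factors and filenames are illustrative placeholders, not calibrated values (a real LUT would be measured against the actual cast):

    # Reduce a yellow/sepia cast by rebalancing RGB channels.
    # Yellow = red + green, so pull those down slightly and lift blue.
    from PIL import Image

    def reduce_yellow_cast(path_in, path_out,
                           r_gain=0.95, g_gain=0.97, b_gain=1.05):
        img = Image.open(path_in).convert("RGB")
        r, g, b = img.split()
        r = r.point(lambda v: min(255, int(v * r_gain)))
        g = g.point(lambda v: min(255, int(v * g_gain)))
        b = b.point(lambda v: min(255, int(v * b_gain)))
        Image.merge("RGB", (r, g, b)).save(path_out)

    reduce_yellow_cast("avatar.png", "avatar_fixed.png")  # hypothetical files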

The worst was that you could tell when someone had kept feeding the same image back into chatgpt to make incremental edits in a loop. The yellow filter would seemingly stack until the final result was absolutely drenched in that sickly yellow pallor, making any photorealistic humans look like they were all suffering from advanced stages of jaundice.

For context, an example of what happens when you feed the same image back in repeatedly: https://www.instagram.com/reels/DJFG6EDhIHs/

This is just the model converging on some kind of average found in its training data distribution. Here you can see the same concept starting from Dwayne Johnson and then converging to some kind of digital neo-expressionist doodle: https://www.reddit.com/r/ChatGPT/comments/1kbj71z/i_tried_th...

If there's a hint of sepia in the original image and the training data contains a lot of sepia images, it will get reinforced in this process. The original distracted boyfriend meme certainly has some strong sepia tones in the background, the same way Dwayne Johnson's face looks a tad cartoonish. And in the intermediate steps they both flow towards some averaged human representation, which seems pretty accurate if you consider the real world's ethnic distribution.
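
A toy analogue of that convergence, with made-up numbers: treat each regeneration as blending the input with the model's preferred average, and any starting point drifts to the same fixed point.

    # Each "generation" pulls an image statistic toward a fixed training mean.
    def step(x, mean=0.7, strength=0.2):
        return (1 - strength) * x + strength * mean

    x = 0.05  # arbitrary starting value, e.g. average warmth of the image
    for _ in range(30):
        x = step(x)
    print(round(x, 4))  # ~0.7: the loop has forgotten the original input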

Haha fantastic. I'd love to see a comparison reel of that same image-loop for the entire image gen series (gpt-image-1, gpt-image-1.5, gpt-image-2).

Fixed points are a window to the soul of an LLM

- Lucretius in "De rerum natura", probably

Mirror: https://files.catbox.moe/mu8env.mp4

0 bytes?

catbox has been doing that for videos recently, don't know why. try https://www.vxinstagram.com/reels/DJFG6EDhIHs/

I like how the AI seems forced to change their ethnicity to keep up with the color changes. Absolutely wild.

Enough internet for today

That is so creepy, in a sci-fi other-worlds type of way.

For me, the worst part is how these ghouls manage to ruin everything with their bullshit technology. Once they touch something unique and make it "AI" it just gets ruined. Now whenever I see something resembling that style, I have to assume it's the bullshit AI. And that's just a minor nuisance - now every underdeveloped idiot uses it to "up their game" with consequences we are only going to understand completely in the upcoming years.

It's called the piss filter

All GPTisms are like that. In moderation there's nothing wrong with any of them. But you start noticing them because a lot of people use these things, and c/p the responses verbatim (or now use claws, I guess). So they stand out.

I don't think it's training data overrepresentation, at least not alone. RLHF, and more broadly "alignment", is probably more impactful here. Likely combined with the fact that most people prompt them very briefly, so the models "default" to whatever was the most straightforward way to get a good score.

I've heard plenty of "the system still had some gremlins, but we decided to launch anyway", but not from tens of thousands of people at the same time. That's "the catch", IMO.

Maybe the only solution to GPTisms is infinite context. If I'm talking to my coworker every day I would consciously recognize when I already used a metaphor recently and switch it up. However if my memory got reset every hour, I certainly might tell the same story or use the same metaphor over and over.

> However if my memory got reset every hour, I certainly might tell the same story or use the same metaphor over and over.

All people repeat the same stories and phraseology to some extent, and some people are as bad or worse than LLM chat bots in their predictability. I wonder if the latter have weak long-term memory on the scale of months to years, even if they remember things well from decades ago.

Honestly I think there is more to it: even with infinite context, the LLM needs some kind of intelligence to know what is noise and what is not. Otherwise you resort to "thinking", making it create garbage it then feeds to itself.

Learning a language is a big, complex task, but it is far from real intelligence.

Another possibility is output watermarking. It's possible to watermark LLM generated text by subtly biasing the probability distribution away from the actual target distribution. Given enough text you can detect the watermark quite quickly, which is useful for excluding your own output from pre-training (unless you want it... plenty of deliberate synthetic data in SFT datasets now as this post-mortem makes clear).
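
A rough sketch of how that can work, along the lines of the published "green list" scheme (Kirchenbauer et al., 2023); the parameters here are illustrative, and whether any lab actually runs something like this is speculation:

    # Watermark: hash the previous token to pick a "green" subset of the
    # vocabulary, then nudge the logits toward green tokens at sampling time.
    import hashlib
    import numpy as np

    VOCAB = 50_000
    GAMMA = 0.5   # fraction of the vocabulary marked green
    DELTA = 2.0   # logit boost applied to green tokens

    def green_mask(prev_token):
        # Seeded from the previous token, so a detector can recompute the
        # partition from the text alone, without access to the model.
        seed = int(hashlib.sha256(str(prev_token).encode()).hexdigest(), 16) % 2**32
        return np.random.default_rng(seed).random(VOCAB) < GAMMA

    def watermark_logits(logits, prev_token):
        return logits + DELTA * green_mask(prev_token)

    def detect(tokens):
        # z-score of the green-token count; a large z means "watermarked".
        hits = sum(green_mask(p)[t] for p, t in zip(tokens, tokens[1:]))
        n = len(tokens) - 1
        return (hits - GAMMA * n) / (n * GAMMA * (1 - GAMMA)) ** 0.5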

I was told this was possible many years ago by a researcher at Google and have never really seen much discussion of it since. My guess is the labs do it but keep quiet about it to avoid people trying to erase the watermark.

I think the problem is that humans are not random; they are very biased. When you try to capture this bias with an LLM, you get a biased pseudorandom model.

>with the word "seam" as it pertains to coding

I thought this was an established term when it comes to working with codebases composed of multiple interacting parts.

https://softwareengineering.stackexchange.com/questions/1325...

thanks for this.

> the term originates from Michael Feathers Working Effectively with Legacy Code

I haven’t read the book but, taking the title and Amazon reviews at face value, I feel like this embodies Codex’s coding style as a whole. It treats all code like legacy code.

It's not in the top 10, but it's one of the more well-known and widely recommended books in the software industry. I'd put it in the same bucket as "Clean Code" and maybe even "Domain-Driven Design"; they're kinda from the same school of thought in the software industry. So it's definitely over-represented in training data (I'd guess primarily in the form of articles, blog posts, and educational material reiterating or rephrasing ideas from the book).

FWIW, I found the concept of "seams" from that book useful when working on a legacy monolithic C++ codebase a few years back; TDD is a little trickier than usual there due to peculiarities of the language (and in particular its build model), and it actually makes sense to know the different kinds of "seams" and what they should vs. shouldn't be used for.

It's been a long time since I read it, but it was one of the better books I've read. It changed my approach to thinking about old codebases.

No, it’s not an established term outside the mentioned books, beyond the generic meaning of the word.

I have frequently encountered the term in the context of unit testing and dependency injection.

Other references (and all predate chatgpt):

>Seams are places in your code where you can plug in different functionality

>Art of Unit Testing, 2nd edition page 54

(https://blog.sasworkshops.com/unit-testing-and-seams/)

>With the help of a technique called creating a seam, or subclass and override we can make almost every piece of code testable.

https://www.hodler.co/2015/12/07/testing-java-legacy-code-wi...

> seam; a point in the code where I can write tests or make a change to enable testing

https://danlimerick.wordpress.com/2012/06/11/breaking-hidden...

Maybe it all ultimately traces back to the book mentioned before, but I don't believe it's an obscure term in the circles of java-y enterprise code/DI. In fact the only reason I know the term is because that's how dependency injection was first defined to me (every place you inject introduces a "seam" between the class being injected and the class you're injecting into, which allows for easy testing). I can't remember where exactly I encountered that definition though.
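
A toy version of that definition, with hypothetical names; the constructor parameter is the seam:

    import datetime

    class SystemClock:
        def now(self):
            return datetime.datetime.now().isoformat()

    class FixedClock:
        # Test double plugged in at the seam.
        def __init__(self, value):
            self.value = value
        def now(self):
            return self.value

    class ReportGenerator:
        # Injecting the clock introduces a seam between ReportGenerator and
        # its time source: production passes SystemClock, tests pass FixedClock,
        # and ReportGenerator never knows the difference.
        def __init__(self, clock):
            self.clock = clock
        def header(self):
            return "Report generated at " + self.clock.now()

    assert ReportGenerator(FixedClock("2024-01-01T00:00:00")).header() \
        == "Report generated at 2024-01-01T00:00:00"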

For what it’s worth, there are many areas of programming where dependency injection is almost never used. Game dev, data science, and embedded systems, for example, rarely use dependency injection. It’s definitely most common in enterprise Java code and less common in Python, C, or C++. And even then, not everyone uses the term “seam”.

I can't say it isn't, but I have been writing code since about 2004 and this is the first time I've become aware that this is a thing.

The one phrase that irks me as overly dramatic, and that both GPT and Claude use a lot, is "__ is the real smoking gun!"

I'm a non-native English speaker, so maybe it's a really common idiom to use when debugging?

It probably was found in a bunch of meaningful code commit messages

I'm a British English speaker and find the use of cliched American idioms really quite disgusting. Don't want to think about ballparks, home runs, smoking guns, going all in, touchdowns or hitting it out of the park.

Ironically (or not), I've seen "smoking gun" attributed to Arthur Conan Doyle in a Sherlock Holmes story (it was a "smoking pistol" in that story). Even if that's rubbish, I think that one is common across the English-speaking world. The baseball/American football stuff is a bit different. In the Commonwealth we might say "hit for six" instead of hitting it out of the park. There are a bunch of other ones related to sports more common in England, like snookered, own goal, red card, etc.

That observation about Sherlock Holmes certainly puts the smackdown on me and gets you to home plate.

It actually probably wouldn't be too expensive or difficult to finetune those sayings out of the default behavior if it were made accessible to you. You could even automate most of the relabeling by having the model come up with a list of idioms and appropriate replacement terms, so it calls e.g. cookies biscuits, or removes references to baseball. Absolute bollocks that they don't offer that as a simple option anymore.

Should send over a geezer to give them a slap.

In my user instructions I always have a point to "always use British English" which seems to reduce Americanisms. I am yet to see Claude give me a "back of the net!" though, sadly.

Crikey, you are correct!

My colleagues were joking about smoking guns yesterday after noticing that Claude was obsessed with it.

I like how your co-workers enjoy the language. I had a similar group of colleagues once who did the same thing pre-LLM, but with words from popular culture. Very playful.

In the future these tells will be more identifiable. It will be easier to point back at text and code written in 2026 and say with more confidence "this was written by an LLM". It takes time for patterns to form and time for them to become noticeable. "Smoking gun was so early-2026 Claude." I find thinking of the future looking back at now to be a refreshing perspective on our usage.

> I'm a non-native English speaker, so maybe it's a really common idiom to use when debugging?

No. But it is something goblins say a lot.

Especially sleuth goblins...

Claude, at least 4.5 (I haven't checked recently), has/had an obsession with the number 47 (or numbers containing 47). Ask it to pick a random time or number, or to write prose containing numbers, and the bias was crazy.

Also "something shifted" or "cracked".

Humans tend to be biased towards 47 as well. It's almost halfway between 1 and 100, and prime, so you'll find people picking it when they have to choose a random number.

Then there’s the whole Pomona College thing https://en.wikipedia.org/wiki/47_(number)

The whole blue-seven thing [1] and its variations are very fascinating, but we don't tend to repeatedly pick the same number in the exact same context. That's what made this stand out to me: I had a document where Claude had picked 47 for "random" things dozens of times.

[1] https://en.wikipedia.org/wiki/Blue%E2%80%93seven_phenomenon

I experienced this even second hand when a coworker excitedly told of an encounter with a cold reader, and I knew the answer would be blue 7 before he told me what his guess was. Just his recap of the conversation was enough.

I am biased towards 67

Funny, I didn't know there were 10-year-olds on Hacker News!

Maybe Claude is just a fan of Alias.

I just want to know where the em-dash came from; it's quite rare to see it on the public internet, so it must have been synthetically added to the dataset.

The em-dash is very common in academic journals and professional writing. I remember my English professor in the early 2000s encouraging us to use it; it has a unique role in interrupting a sentence. Thoughtfully used, it conveys a little more editorial effort, since there is no dedicated key for it on the keyboard. It was disappointing to see it become associated with AI output.

Other than the things other comments already mention, let's not forget that Microsoft Word auto-corrects "--" to an em-dash, and so do (apparently; I haven't checked myself) Outlook, Apple Pages, Notes and Mail. There's probably a bunch of other such software (I vaguely recall Wordpress doing annoying auto-typography on me, some 15 years ago or so).

Because on the public internet people don't have arts degrees, which is where em-dash users learn to wield it correctly.

I learned about em-dashes by reading Knuth about 40 years ago.

The very simplified answer is that the models are first trained on everything and then later trained more heavily on golden samples with perfect grammar, spelling, etc.

[deleted]

Although em-dashes are not common on the internet, they are prevalent in books.

Logo_Daedalus tended to use it a lot

https://xcancel.com/Logo_Daedalus

`---` in TeX?

It has been rare. It's common now, even in meaningful human texts. (I know because I detest the correct usage without spaces; it looks wrong.) One of the ways AI is shaping our minds.

ChatGPT has a whole host of weird words that it uses about coding - anything changed is a “pass” done over the code, it loves talking about “chrome” in the UI, it’s always saying “I’m going to do X, not [something stupid that nobody would ever think of doing]”

gpt also loves talking about handwaving, "I'm going to do X, not just a hand-wavy victory lap"

One I noticed with gemini, especially 3 flash: "this is the classic _____".

> The obsession with the word "seam" as it pertains to coding

I quite liked this term when it started using it. And I appreciate the consistent way it talks about coding work, even when working on radically different stacks and codebases.

"Seam" has been stretched by AI from its original legacy-code context to any point in code where something can be plugged in. I actually asked an AI about this a few weeks ago because I was surprised by the consistent, frequent use of "seam".

Frequent words I see from GPT: "shape", "seam", "lane", "gate" (especially as verb), "clean", "honest", "land", "wire", "handoff", "surface" (noun), "(un)bounded", "semantics" (but this one is fair enough), and sometimes "unlock"

It feels like AI really likes to pick the shortest ways to express ideas even if they aren't the most common, which I suppose would make sense if that's actually what's happening.

"is the real" is such a strong Claude tell, whenever I encounter it, it makes me question what i'm reading.

Another one I've noticed more recently is a slight obsession with referring to "framing".

You're absolutely right. I was wrong in the first place

I miss being told “You’re absolutely right!” :’(

One I saw recently was "wires" and "wired" from opus.

It was using it in like every third sentence, and I was like, yeah, I have seen people say "wired" like this, but not really the way it was using it in every sentence.

GPT started to ‘wire in’ stuff around 5.2 or 5.3 and clearly Opus, ahem, picked it up. I remember being a tiny bit shocked when I saw ‘wired’ for the first time in an Anthropic model.

Anthropic distills GPT?

Everybody training models on large amounts of lightly filtered internet text is partially distilling every other model that had its output posted verbatim to the internet.

And OpenAI probably distills Anthropic; who wouldn't?

It's all one big incestuous mess. In a couple of years we'll be talking about AI brainrot.

The number of things that Claude has told me are 'load-bearing' or 'belt-and-suspenders' is... very load-bearing

You are absolutely right to call that out!

for me, doing the heavy lifting is doing the heavy lifting

Fun fact: the word "suffer" comes from Latin sub + ferre, to bear under (a load). This relation (suffer / load-bearing) is consistent across (unrelated) languages.

Also too many "lands" and "hits".

I had the feeling they didn't really answer the question of why the goblins appeared. They simply "retired the 'Nerdy' personality" because they couldn't fix it, and moved on.

Seams, spirals, codexes, recursion, glyphs, resonance, the list goes on and on.

Ask any LLM for 10 random words and most of them will give you the same weird words every time.

If you lower the temperature setting, it really will be the same 10 words every single attempt. :p
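
A quick sketch of why, using plain softmax sampling over made-up logits: as the temperature drops, the distribution sharpens until sampling collapses onto the single most likely token.

    import numpy as np

    def sample(logits, temperature, rng):
        scaled = logits / max(temperature, 1e-8)
        probs = np.exp(scaled - scaled.max())  # numerically stable softmax
        probs /= probs.sum()
        return rng.choice(len(logits), p=probs)

    rng = np.random.default_rng(0)
    logits = np.array([2.0, 1.5, 0.5])
    for t in (1.0, 0.1):
        print(t, [sample(logits, t, rng) for _ in range(10)])
    # At t=1.0 the draws vary; at t=0.1 nearly every draw is token 0.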

They are text completion algorithms with little randomness.

I thought the “why it matters” headline was a funny reference to ChatGPT phraseology

"shape" too, at least with gpt5.5, is coming up constantly.

Whenever Claude finishes some work it almost always says “Clean.” before finishing its closing remarks. It’s at the point where I repeat it out loud along with Claude to highlight the absurdity of the repetition.

With 4.5, I think because I would prompt it/guide it towards an outcome by calling it "the dream: <code example>", it would get almost reverential, shocked with awe, as it got closer to getting it working or when it finally passed for the first time. Which was funny and reasonably context-appropriate, but sometimes felt so over the top that I couldn't tell if it also "liked" the project/idea, or if I had somehow accidentally manipulated it into assigning religious purpose to the task of unix-style streaming RPCs.

I think a lot of the “clean” stuff stems from system prompts telling it to behave in a certain way or giving it requirements that it later responds to conversationally.

Total aside: I actually really dislike that these products keep messing around with the system prompts so much. They clearly don't even have a good way to tell how much it's going to change or bias the results away from things other than whatever they're explicitly trying to correct. And why is the AI company vibe-prompting the behavior out when they could train it out and actually run it against evals?

and "quietly"!

“I’ve got the shape of it now”