People are way worse at detecting LLM written short form content (like comments, blogs, articles etc) then they believe themselves to be...

With CVs/job applications? I guarantee you, if you'd actually do a real blind trial, you'd be wrong so often that you'd be embarrassed.

It does become detectable over time, as you get to know their own writing style etc, but it's bonkas people still think they're able to make these detections on first contact. The only reason you can hold that opinion is because you're never notified of the countless false positives and false negatives you've had.

There is a reason why the LLMs keep doing the same linguistic phrases like it's not x, it's y and numbered lists with Emojis etc... and that's because people have been doing that forever.

It's is RLHF that dominates the style of LLM produced text not the training corpus.

And RLHF tends towards rewarding text that first blush looks good. And for every one person (like me) who is tired of hearing "You're making a really sharp observation here..." There are 10 who will hammer that thumbs up button.

The end result is that the text produced by LLMs is far from representative of the original corpus, and it's not an "average" in the derisory sense people say.

But it's distinctly LLM and I can assure you I never saw emojis in job applications until people started using Chatgpt to right their personal statement.

> There is a reason why the LLMs keep doing the same linguistic phrases like it's not x, it's y and numbered lists with Emojis etc... and that's because people have been doing that forever.

They've been doing some of these patterns for a while in certain places.

We spent the first couple decades of the 2000s to train ever "business leader" to speak LinkedIn/PowerPoint-ese. But a lot of people laughed at it when it popped up outside of LinkedIn.

But the people training the models thought certain "thought leader" styles were good so they have now pushed it much further and wider than ever before.

>They've been doing some of these patterns for a while in certain places.

This exactly. LLMs learned these patterns from somewhere, but they didn't learn them from normal people having casual discussions on sites like Reddit or HN or from regular people's blog posts. So while there is a place where LLM-generated output might fit in, it doesn't in most places where it is being published.

Yeah, even when humans write in this artificial, punched-to-the-max, mic-drop style (as I've seen it described), there's a time and a place.

LLMs default to this style whether it makes sense or not. I don't write like this when chatting with my friends, even when I send them a long message, yet LLMs always default to this style, unless you tell them otherwise.

I think that's the tell. Always this style, always to the max, all the time.

Also with CVs people already use quite limited and establish language, with little variations in professional CVs. I image LLMs can easily replicate that