This article completely misses the point from the start.
The reason em dashes are a giveaway for AI-generated text is simply that there is no em dash key on the keyboard - only an en dash key. The dash I used in that last sentence was an en dash, not an em dash.
Some publishing applications (including Microsoft Word) will automatically convert en dashes to em dashes where appropriate. But most email apps, chat apps, online posts/comments, and practically any application not designed for writing actual printed publications will not do that conversion for you. And without a dedicated key, it is far too cumbersome for most people to bother. They will just leave it as an en dash.
So yes, the em dash is still a reliable indicator of AI-generated content in many contexts.
The keyboard key is usually a hyphen, not an en-dash.
But I agree that LLMs are trained on public documents, and most of those are written in Microsoft Word, which has auto-format enabled by default. That is probably why so many LLMs use em dashes.
Almost nobody, relatively speaking, even knows they exist, let alone goes out of their way to figure out the ALT code combination to use them. Most people can’t get their, they’re, and there right.
> The keyboard key is usually a hyphen, not an en-dash.
You are right. Thanks for catching that.
No, it's a hyphen you used. - vs – vs — (hyphen, en, em). Most Android keyboards make typing the em dash easy, and there are plenty of ways to set it up on desktop.
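If you want to check which of the three a given piece of text actually contains, a rough Python sketch like this will do it. The function name `describe_dashes` is made up for illustration; the code points and names come from Unicode via the standard `unicodedata` module.

```python
import unicodedata

# The three characters discussed above, written as escapes so they are unambiguous:
# U+002D (the keyboard key), U+2013 (en dash), U+2014 (em dash).
DASHES = {"\u002d", "\u2013", "\u2014"}

def describe_dashes(text):
    """Return (char, code point, Unicode name) for each dash-like character in text."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch))
        for ch in text
        if ch in DASHES
    ]

# Hypothetical sample string containing all three characters.
for ch, code_point, name in describe_dashes("- vs \u2013 vs \u2014"):
    print(code_point, name)
```

Note that Unicode calls the keyboard character HYPHEN-MINUS, since the same key doubles as a minus sign.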
Mobile device keyboards typically make it easy to type — by holding down - for a moment.