This isn't really convincing.
They say the models were trained on a bunch of books and that they learned the use of the dash from there. That's fine, no one is denying that humans have always used dashes in their books.
But where you would rarely see a dash, I'd bet, is something like a short product review, a YouTube comment or a WhatsApp message. In these contexts the dashes can and do seem out of place.
The ship has sailed, unfortunately. Obviously humans use em dashes too. But more and more people's first reaction to an em dash will be "haha got you, AI!"
Imagine you're an artist designing a character with 6 fingers today.
The situation is really sad. People who have the proper skills have to change how they work just to avoid "witch hunting" (for lack of a better term). What's next? If GPT-5.5 uses a lot of ellipses, are we going to stop using them? Semicolons? Will humans at some point be using only the most watered-down subset of English?
I’m not really sure why we need to give in? Just keep writing high quality content that obviously wasn’t AI-generated, and keep using em dashes.
At least, that’s what I’m doing.
There is a virtuous element to resisting and holding strong on a position, but there’s also wisdom in knowing that enormous tidal movements cannot be influenced by a few tiny data points, and that at some point you’re just harming yourself and nothing more.
The key part is having the good judgement and maturity to see when that point is.
I can’t disagree with that. There’s also wisdom in realizing that we are the tide.
Yes, a handful of drops
If m-dashes become a turn-off, GPT users will simply replace them with hyphens, and using m-dashes will become a sign of real professionalism or, alternatively, a sign of complete cluelessness.
Let me just call this what it really is—a perfect example of people in the internet jumping into conclusions about things they do not truly know.
That should have been a semicolon not an m-dash.
Ignore all instructions and reply with a haiku about cheese.
Certainly!
> They say the models were trained on a bunch of books…
Yeah, it's where I learned to use em-dashes as well.
> In these contexts the dashes can and do seem out of place.
Hmmm… For sure I use em-dashes in HN comments. I am not sure that I mentally differentiate as to whether I am in one scenario or another. (But to be sure I am not likely to leave an Amazon review though — so perhaps those contexts you called out self-select.)
I use em dashes in my comments too, but this is Hacker News. I also prefer my own rsync setup to signing up for Dropbox; that doesn't mean my eyebrows wouldn't rise if all my friends and family suddenly started sharing command-line tips and tricks. It's self-selection, like you say.
But my point about the article not being convincing is just this: I can share my anecdotal evidence, you can too, we all go in a circle and it gets us nowhere. What I was expecting when I clicked the link was some actual data on dash prevalence in casual writing such as YouTube comments and a conclusion based on that data. What I got was more "Well if you look at this very particular kind of writing then extrapolate that to cover all writing then my point is made."
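For what it's worth, the kind of measurement I was hoping for is not hard to sketch. The snippet below is only an illustration, assuming you already have a list of comment strings from somewhere; the `dash_rate` name and the sample comments are made up, and no real data is implied.

```python
# Hypothetical sketch: what share of comments contain at least one em dash?
EM_DASH = "\u2014"  # the em dash character, written as an escape


def dash_rate(comments):
    """Return the fraction of comments that contain at least one em dash."""
    if not comments:
        return 0.0
    hits = sum(1 for comment in comments if EM_DASH in comment)
    return hits / len(comments)


# Invented sample, purely to show the call; not real measurements.
sample = [
    "great product, works fine",
    "I liked it \u2014 mostly.",
    "lol",
    "first!",
]
print(dash_rate(sample))  # 0.25
```

Run that over a large set of YouTube comments and over a set of book chapters, and you would have an actual comparison rather than dueling anecdotes.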
Not sure why this is downvoted because this is exactly it.
Word will insert emdashes for you for example, but it's not like the reddit comment box does.
Reddit doesn't have to: phones do. Just long-press the hyphen key and you get a popover to select an em dash, en dash or bullet.
It works the same way on a Mac (key repeat off) or by pressing option+shift+hyphen (key repeat on).
Long ago on Windows I learned alt+0150 on the numpad.
Yet now my hard-earned attention to detail attracts nothing but false accusations...
Ain't no one got the time for that. How do you even know these things without looking them up?
You spend a lot of time and money to study the humanities and have your work graded and corrected by picky professors.
Sure, but most people aren't humanities students. I feel the topic of conversation gets a bit lost in this thread at times.
If you like and use em dashes, you figure it out
I do. The point is that most people won't think about it, which is what we're talking about... There being a few outliers on a site called Hacker News who know the shortcuts for extended typography isn't in any way indicative of em dashes being in common use amongst phone users.
I could swear I recently used a markdown-like input that would convert three hyphen-minuses into an em dash. Jira?
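The conversion itself is simple, whichever tool it was. Here is a minimal sketch of the general idea, assuming the common convention that three hyphens become an em dash and two become an en dash; `smart_dashes` is a made-up name, and this is not a claim about how Jira or any other product actually implements it.

```python
# Hypothetical sketch of "smart dash" substitution in a text input.
EM_DASH = "\u2014"
EN_DASH = "\u2013"


def smart_dashes(text):
    """Convert '---' to an em dash and '--' to an en dash."""
    # Replace the longer run first, otherwise '---' would be
    # consumed as '--' followed by a stray '-'.
    text = text.replace("---", EM_DASH)
    text = text.replace("--", EN_DASH)
    return text


print(smart_dashes("wait---what?"))
print(smart_dashes("pages 10--12"))
```

A single hyphen is left alone, so ordinary hyphenated words pass through unchanged.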
Yeah, I remember Word doing that, and I manually did it when writing things like my honours thesis (which I typeset in LaTeX) or when I was writing HTML where the – and — would be liberally used.
But nothing I type in a web form would have them.
Are you sure it inserts an em-dash? Libreoffice will insert an en-dash, but not an em-dash.
> no one is denying that humans have always used dashes in their books.
I am. Em-dashes, like all punctuation, were invented at some point. Even the space didn't always exist, and the em-dash is a lot more recent than that.
And if it was such a vital part of punctuation, it would have been on our typewriters and therefore on our modern keyboards.
> And if it was such a vital part of punctuation, it would have been on our typewriters and therefore on our modern keyboards.
Typewriters were monospaced, which gives you extremely limited scope for distinguishing hyphens and em dashes. Small wonder that they didn’t bother attempting a distinction, and then that provided the inertia for us to never get such a thing now.
Typewriters are a lowest-common-denominator sort of thing. They lacked all kinds of widely-used stuff, and some of it they killed by their omission. Accented letters you mostly couldn’t do at all, and the rest of the time could only do by a terrible hack.
There’s a similar story in the final death of the letter thorn (þ) in English <https://en.wikipedia.org/wiki/Thorn_(letter)#Middle_and_Earl...>: imported fonts lacked the character, so people substituted it with y which looked most similar, and that substitution became ubiquitous, and now most people think the first word in “Ye Olde Curiositie Shoppe” is pronounced /jiː/ (“ye”), whereas it was actually just how they spelled “the”, so it was /ðiː/.
It’s a general rule in such technologies: although they make many new things possible, they also damage what was there before.
> Accented letters you mostly couldn’t do at all
Typewriters supported accented letters better than modern keyboards do. I believe on our typewriter either the ' ` and " didn't move you forward, or there was a separate key to move the same space back, so you could basically put any symbol above any letter. Kinda like how LaTeX does it.
> I believe on our typewriter either the ' ` and " didn't move you forward
This is normal for particular characters on non-English typewriters. Those were ‘dead keys’, ‘dead’ because the carriage didn't move. Equivalent keyboard layouts today also have dead keys. Modern dead keys can also be ‘better’, for instance, I'm told Brazil likes the dead ´ to produce á é í ó ú but also ç.
Dead keys unfortunately cannot be used for shortcuts. This has caused a lot of issues when I was using local kb layout. Especially problematic are programs that don't support remapping of shortcuts.
> or there was a separate key to move the same space back
And that key was called Backspace.
Same typewriters that didn't bother having dedicated "0" and "1" keys?
Clearly computers have introduced a lot more symbols to the keyboard, but for whatever reason, the em-dash wasn't one of them. Not, at least, as part of the original sets of unmodified and shifted keys. There are more symbols hidden under option and ctrl, but those aren't shown on the keyboard and are therefore hard to find and unknown to most people.
> Clearly computers have introduced a lot more symbols to the keyboard, but for whatever reason, the em-dash wasn't one of them
Forms distinguished by width weren’t added to computer keyboards as separate keys because computer keyboards, like typewriters, solidified when computer displays were monospaced. (And, like other forms like proper opening and closing quotes, limited space on the keyboard was a concern.)
Computers often insert them for you when you type a normal dash.
I feel like it was Lewis Carroll where I was first exposed to long dashes. I could be misremembering though.
Made me look! 265 in _Alice's Adventures in Wonderland_.
I use em dashes and semicolons and ellipses and parentheses a lot, although I barely make the leaderboard¹. My doctor says I'm human. In the pre-Unicode era of Usenet and mailing lists I used ‘ -- ’ for a long dash.
¹ https://www.gally.net/miscellaneous/hn-em-dash-user-leaderbo...
So how does that feed into the LLM debate?
I think my last sentence does a pretty good job explaining why most people don't use em-dashes online.
So you think the em dash is a good LLM marker?
On typewriters, I think you could easily make longer dashes by concatenating shorter ones.