Indeed, I skimmed a bit and misread “unable to break” to mean that you wanted a line-break opportunity but the renderer didn’t allow for it when a letter is directly following an em dash. But it’s the other way around, you want a line break in the source after an em dash to not translate into a space in the rendering. This would likewise be possible to handle by regex replacement as a pre-rendering step.
More generally, I see markup languages and the details of how they are rendered as largely orthogonal. You don’t necessarily need to invent a different markup language in order to adjust the rendering.
> More generally, I see markup languages and the details of how they are rendered as largely orthogonal. You don’t necessarily need to invent a different markup language in order to adjust the rendering.
There’s not much to a markup language beyond how it’s rendered. If you don’t ever want to render it to something other than plain text, just write plain text however you desire. The reason for choosing a particular markup language is to express intended semantics (for plain-text and rendered use), and to render it. The semantics aspect is legitimate, so I won’t say the language and rendering are identical or parallel, but they’re definitely nothing like orthogonal. If you’re using a CommonMark pipeline, any preprocessing you do means you’re not actually writing in CommonMark, but an incompatible variant of it. You may well deem it worthwhile, but it’s no longer the same markup language.