With that approach you could no longer look at a single code point and decide whether it's, e.g., a space. You would always have to look back at earlier code points to see whether you are currently in the emoji set. That would bring its own set of issues for tools like grep.
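To make the grep problem concrete, here's a toy, entirely made-up modal scheme (shift markers switch you into and out of the emoji set): classifying a position means scanning backwards for the last marker, so you can't tell what a code unit is by looking at it alone.

    # Hypothetical modal encoding, for illustration only:
    # 0x0E shifts into the emoji set, 0x0F shifts back out.
    SHIFT_IN, SHIFT_OUT = 0x0E, 0x0F

    def in_emoji_set(data: bytes, i: int) -> bool:
        # To classify position i we must scan backwards for the last shift marker.
        for j in range(i - 1, -1, -1):
            if data[j] == SHIFT_IN:
                return True
            if data[j] == SHIFT_OUT:
                return False
        return False  # default: not in the emoji set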
But what if instead of emojis we took the CJK set and made it more compositional? Instead of >100k characters with different glyphs we could have defined a number of brush-stroke characters and compositional characters (like "three of the previous character in a triangle formation"). We could still have distinct code points for the most common couple thousand characters, just like ä can be encoded as one code point or two (a plus combining umlaut dots).
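For what it's worth, the ä part of that is directly observable today; a minimal Python sketch showing the one-code-point and two-code-point encodings of the same letter:

    import unicodedata

    # Precomposed: U+00E4 (ä). Decomposed: U+0061 (a) + U+0308 (combining diaeresis).
    nfc = unicodedata.normalize("NFC", "a\u0308")
    nfd = unicodedata.normalize("NFD", "\u00e4")
    print([hex(ord(c)) for c in nfc])  # ['0xe4']
    print([hex(ord(c)) for c in nfd])  # ['0x61', '0x308']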
Alas, in the 90s this would have been seen as too much complexity.
Seeing your handle I am at risk of explaining something you may already know, but this exists! And it was standardized in 1993, though I don't know when Unicode picked it up.
Ideographic Description Characters: https://www.unicode.org/charts/PDF/U2FF0.pdf
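Concretely, those are layout operators you put in front of components; a small Python sketch (the decompositions below are just common illustrative examples, not pulled from any normative database):

    import unicodedata

    # U+2FF0 (⿰) = left to right, U+2FF1 (⿱) = above to below.
    ids_examples = {
        "\u597d": "\u2ff0\u5973\u5b50",  # 好 ≈ ⿰ 女 子
        "\u68ee": "\u2ff1\u6728\u6797",  # 森 ≈ ⿱ 木 林
    }
    for char, ids in ids_examples.items():
        names = ", ".join(unicodedata.name(c) for c in ids)
        print(char, "≈", ids, "(", names, ")")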
The fine people over at Wenlin actually have a renderer that generates characters from this sort of programmatic definition, their Character Description Language: https://guide.wenlininstitute.org/wenlin4.3/Character_Descri... In many cases, they are the first to digitally render new characters that don't yet have font support.
Another interesting bit: the Cantonese linguist community I regularly interface with generally doesn't mind unification. It's treated the same way as the difference between a "single-storey a" (the one you write by hand) and a "two-storey a" (the one in this font). Sinitic languages fractured into families in part because the graphemes don't explicitly encode the phonetics (plus physical distance), and the graphemes themselves fractured because somebody's uncle had terrible handwriting.
I'm in Hong Kong, so we use 説 (U+8AAC, normalized to U+8AAA) while Taiwan would use 說 (U+8AAA). This is a case my linguist friends consider a mistake, but it happened early enough that it was only retroactively normalized. Same word, same meaning, grapheme distinct by regional divergence. (I think we actually have three code points that normalize to U+8AAA because of radical variations.)
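You can see the split directly; the "normalized" here is variant-mapping data (presumably the Unihan variant fields), not the NFC/NFKC normalization forms, which leave the two code points distinct. A quick Python check:

    import unicodedata

    for ch in ("\u8aac", "\u8aaa"):  # 説 (HK form), 說 (TW form)
        print(f"U+{ord(ch):04X}", ch, unicodedata.name(ch))
        # Unicode normalization forms do not unify these:
        assert unicodedata.normalize("NFKC", ch) == ch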
The argument basically reduces to "should we encode distinct graphemes, or distinct meanings?" Unicode has never been fully consistent on either side of that. The latest example: we're getting ready to do Seal Script as separate, non-unified code points. https://www.unicode.org/roadmaps/tip/
In Hong Kong, some old government files just don't work unless you have the font with the specific author's Private Use Area mapping (or happen to know the source encoding and can re-encode it). I've regularly had to pull up an old Windows install in a VM to grab data from old code pages.
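For the re-encoding case (when you do know, or can guess, the source encoding), plain Python is usually enough; big5hkscs below is just a guess at what an old Hong Kong file might be, and it won't help with the Private Use Area ones:

    # Hypothetical file name; the real trick is knowing the original code page.
    raw = open("old_gov_file.txt", "rb").read()
    text = raw.decode("big5hkscs")  # or "cp950", "big5", ... whatever fits
    with open("old_gov_file.utf8.txt", "w", encoding="utf-8") as out:
        out.write(text)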
In short: it's a beautiful mess.