Hacker News

It made much more sense than UTF-16 or any of the existing multi-byte character sets, and the need for more than 256 characters had been apparent for decades. Given its simplicity, it made perfect sense almost immediately.

blindriver a day ago

No, it didn't. Not at the time. Like I said, processing and storage were a pain back around 2000. Windows supported UCS-2 (the predecessor to UTF-16), which was fixed-width 16-bit and faster to decode and encode, and since most of the world was on Windows at the time, it made more sense to use UCS-2. Also, the world was only beginning to be more connected, so UTF-8 seemed like overkill.

NOW, in hindsight, it makes more sense to use UTF-8, but 20 years ago it wasn't clear that it was worth it.

acdha an hour ago

The need was clear even 30 years ago when UTF-16 was standardized in 1996. UCS-2 was known at the time to be inadequate but there was a period from the mid-80s to early 90s where western developers tried to pretend that they could only support a tiny fraction of Asian languages like Chinese (>50k characters, even if Han unification was uncontroversial), scholarly and technical usage, etc. The language used in 1988 was “Unicode aims in the first instance at the characters published in modern text (e.g. in the union of all newspapers and magazines printed in the world in 1988)” with the idea that other characters could be punted into a private registry.

Once enough people accepted that this approach was impractical, UCS-2 was replaced with UTF-16 and surrogate codes. At that point it was clear that UTF-8 was better in almost every scenario because neither had an advantage for random access and UTF-8 was usually substantially smaller.

1. https://unicode.org/history/unicode88.pdf

gpvos a day ago

Maybe if you were entrenched in the Windows world.

Storage-wise, UTF-8 is usually better since so much data is ASCII with maybe the occasional accented character. The speed issue only really matters to Windows NT since that was UCS-2 inside, but it wasn't a problem for many.
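
For anyone who wants to check the storage and random-access points discussed in this thread, here is a minimal Python sketch. The sample strings and the `encoded_sizes` helper are made up for illustration and are not from any comment above; it only uses the standard `str.encode` method.

```python
# Illustrative sample strings (assumptions, not taken from the thread).
samples = {
    "ascii": "Hello, world",
    "accented": "caf\u00e9 au lait",          # precomposed e-acute (U+00E9)
    "cjk": "\u4f60\u597d\u4e16\u754c",        # four Chinese characters
    "emoji": "\U0001F600",                    # one code point outside the BMP
}


def encoded_sizes(text):
    """Return (code points, UTF-8 bytes, UTF-16 bytes) for text."""
    # "utf-16-le" is used so that no 2-byte byte-order mark is prepended.
    return (
        len(text),
        len(text.encode("utf-8")),
        len(text.encode("utf-16-le")),
    )


for name, text in samples.items():
    points, utf8, utf16 = encoded_sizes(text)
    print(f"{name}: {points} code points, {utf8} UTF-8 bytes, {utf16} UTF-16 bytes")
```

On these samples the ASCII-heavy strings come out at roughly half the size in UTF-8, the CJK string is smaller in UTF-16, and the single emoji takes 4 bytes in both, since UTF-16 needs a surrogate pair for it. That last case is why UTF-16 cannot be indexed as a fixed-width encoding either.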