I suppose it's still possible to extend to 31 bits in the future, once UTF-16 has become obsolete enough. How big is the need for it right now?

Interestingly, in theory UTF-8 could be extended to 36 bits: the FLAC format uses an encoding similar to UTF-8 but extended to allow up to 36 bits (which takes seven bytes) to encode the frame or sample number in each frame header: https://www.ietf.org/rfc/rfc9639.html#section-9.1.5

This means that the coded number in a FLAC file can go up to 2^36-1, or 68,719,476,735. The full 36 bits are only available when the number counts samples (variable block size streams); when it counts frames, the RFC caps it at 31 bits. At a 48kHz sample rate there are 48,000 samples per second, meaning a FLAC file at 48kHz can (in theory) be about 1.43 million seconds long, or roughly 16.6 days.
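
For the curious, here is a rough Python sketch of what such a UTF-8-style code extended to seven bytes could look like, plus the duration arithmetic. The function name `encode_extended` is made up for illustration, and this is only my reading of the scheme, not code taken from the FLAC reference implementation.

```python
def encode_extended(value: int) -> bytes:
    """Encode an integer below 2**36 in a UTF-8-style variable-length code.

    Hypothetical helper: byte lengths 1-4 match standard UTF-8, while
    lengths 5-7 continue the same pattern up to 36 payload bits.
    """
    if not 0 <= value < 1 << 36:
        raise ValueError("value must fit in 36 bits")
    if value < 0x80:
        return bytes([value])  # single byte: 0xxxxxxx

    # An n-byte sequence (n = 2..7) carries 5*n + 1 payload bits:
    # 11, 16, 21, 26, 31, 36.
    n = 2
    while value >= 1 << (5 * n + 1):
        n += 1

    # Lead byte: n one-bits, a zero bit, then the top payload bits.
    lead_prefix = (0xFF << (8 - n)) & 0xFF
    out = [lead_prefix | (value >> (6 * (n - 1)))]

    # Continuation bytes: 10xxxxxx, six payload bits each.
    for i in range(n - 2, -1, -1):
        out.append(0x80 | ((value >> (6 * i)) & 0x3F))
    return bytes(out)


# Duration arithmetic for a 36-bit sample counter at 48 kHz.
max_samples = (1 << 36) - 1
seconds = max_samples / 48_000
days = seconds / 86_400

print(encode_extended(0x20AC).hex())       # same bytes as "€".encode("utf-8")
print(encode_extended(max_samples).hex())  # the largest seven-byte value
print(round(seconds), round(days, 2))
```

For values up to 21 bits this produces the same bytes as ordinary UTF-8 (surrogates aside, which this sketch does not reject); everything longer is the "extended" part.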

So if Unicode ever needs to encode 68.7 billion characters, well, extended seven-byte UTF-8 will be ready and waiting. :-D

See my comment on how Perl stores up to 2^63-1 in a UTF-8-like format: https://news.ycombinator.com/item?id=45227396 .

The problem is that now there are a bunch of UTF-8 tools that won't handle code points beyond 21 bits.
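
As a small illustration of that, here is how Python 3 (just one example of such a tool) reacts to anything past U+10FFFF. The helper names are made up for this sketch.

```python
# Old-style five-byte form of 0x200000, the first value needing more than 21 bits.
five_byte = bytes([0xF8, 0x88, 0x80, 0x80, 0x80])


def strict_utf8_accepts(data: bytes) -> bool:
    """Return True if Python's built-in UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def chr_accepts(code_point: int) -> bool:
    """Return True if Python can represent the code point as a str."""
    try:
        chr(code_point)
        return True
    except ValueError:
        return False


print(strict_utf8_accepts(five_byte))  # False: 0xF8 is not a valid start byte
print(chr_accepts(0x10FFFF))           # True: the current Unicode maximum
print(chr_accepts(0x110000))           # False: beyond the Unicode range
```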

Fair enough, it will take some time to weed those out.