You’re reading it backwards. He is not praising that behavior, he is complaining about it. He is saying that bots _should_ parse smiling face emoji as smiling face emoji, but they currently don’t, because as text they get passed as gross Unicode that has a lot of ambiguity and just happens to ultimately get rendered as a face for end users.
Wouldn’t the training or whatever make that unicode sequence effectively a smiley face?
Yes, but the same face gets represented by many unique strings, and those strings may or may not be tokenized into a single clean “smiley face” token.
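To make the "many unique strings" point concrete, here is a rough Python sketch. The four sequences are my own picks for illustration, and `describe` is a made-up helper. It only shows code points and UTF-8 byte counts; whether a given model's tokenizer merges any of these into one token depends on its vocabulary, which this does not test.

```python
import unicodedata

# Illustrative picks: sequences most readers would all call "a smiley face".
SMILEYS = {
    "white smiling face (text style)": "\u263A",
    "white smiling face + emoji variation selector": "\u263A\uFE0F",
    "slightly smiling face": "\U0001F642",
    "smiling face with smiling eyes": "\U0001F60A",
}


def describe(s):
    """Return the code points of s and the length of its UTF-8 encoding."""
    code_points = [f"U+{ord(ch):04X}" for ch in s]
    return code_points, len(s.encode("utf-8"))


def same_after_nfc(a, b):
    """Check whether two strings become identical under NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    for label, s in SMILEYS.items():
        code_points, n_bytes = describe(s)
        print(f"{label}: {' '.join(code_points)} ({n_bytes} UTF-8 bytes)")

    # Normalization does not collapse the text-style and emoji-style forms.
    print(same_after_nfc("\u263A", "\u263A\uFE0F"))
```

So even the "same" glyph with and without the variation selector is a different byte string, and a byte-level tokenizer starts from those bytes, not from the rendered face.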
Don't ask ChatGPT about seahorse emoji.
That's caused by the sampler and chatbot UI not being part of the LLM. It doesn't get to see its own output before it's sent out.
Don't ask humans either, apparently.