What counts as “correct”, though? Even as a human, I read that as “correctly”. Using odd look-alike characters for the letter c doesn’t change the word.

I would even say that OCR can rеаd the sеntеnсе ϲоrrесtlу, while a tokenizer can't.
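
To make that concrete, here is a rough Python sketch of why the look-alike version is a different string to anything working on code points or bytes, even though it renders nearly the same. The specific substitutions (Greek lunate sigma plus a few Cyrillic letters) are my own pick for illustration, not necessarily the exact characters used in the sentence above, and `non_ascii_letters` is just a made-up helper name.

```python
import unicodedata

plain = "correctly"
# Same word with look-alikes swapped in, written as escapes so the
# substitution is visible (illustrative choice of characters):
# U+03F2 Greek lunate sigma, U+043E Cyrillic o, U+0435 Cyrillic ie,
# U+0441 Cyrillic es, U+0443 Cyrillic u.
spoofed = "\u03f2\u043err\u0435\u0441tl\u0443"


def non_ascii_letters(text):
    """Return (char, code point, Unicode name) for each non-ASCII character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch))
        for ch in text
        if ord(ch) > 127
    ]


# The two strings look alike on screen but do not compare equal.
print(plain == spoofed)

# Their UTF-8 byte sequences differ too, and bytes are what a
# byte-level tokenizer starts from.
print(len(plain.encode("utf-8")), len(spoofed.encode("utf-8")))

# List the characters that are not the ASCII letters they resemble.
for row in non_ascii_letters(spoofed):
    print(row)
```

OCR works from the rendered glyphs, so it would most likely map both versions back to the same Latin letters, whereas a tokenizer only ever sees the underlying code points.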

Qwen3 8B understood it perfectly after 14 seconds of thinking.

Yeah, OCR would be much more likely to read that sentence the way a human would.