TTS is generally not multilingual. One might think well-annotated phonetic descriptions of voices would suffice, but that's not quite how languages work, nor how TTS works.
(but somehow LLMs handle multilingual input perfectly fine! that's a bit strange, if you think about it)