I had a strange call with a support rep recently.

They sounded a tinge strange, like they’d almost crossed the uncanny valley, only to succumb in the final 3% stretch.

I was suspicious, but their ability to understand my complex request and the relatively low latency made an LLM -> TTS pipeline or an e2e voice model seem unlikely.

This post finally solved the mystery.