You're mostly hinting at a very obvious shortcoming of synthesized speech: it's sequential. The phenomenon is most obvious if you look at screen readers that use speech synthesis. It's a fundamental problem of the medium, and one that some devs will discover independently now that TTS is seeing a new surge.