Any idea what factors play into latency in TTS models?
Mostly model size and input size. Some attention-based models are O(N^2) in the input length N, so latency grows much faster than linearly on long inputs.
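To make the O(N^2) point concrete, here is a rough sketch that counts the arithmetic in a single self-attention head. The function name `attention_flops` and the head dimension of 64 are assumptions chosen for illustration, not values from any particular TTS model, and the count ignores the softmax and projection layers.

```python
def attention_flops(seq_len: int, dim: int = 64) -> int:
    """Approximate FLOPs for one attention head (hypothetical helper).

    Counts only the two matrix products that scale with seq_len^2:
    the score matrix Q @ K^T and the weighted sum A @ V.
    """
    # Q @ K^T: seq_len x seq_len dot products, each of length dim
    scores = 2 * seq_len * seq_len * dim
    # A @ V: seq_len x seq_len weights applied to dim-sized value vectors
    weighted_sum = 2 * seq_len * seq_len * dim
    return scores + weighted_sum


if __name__ == "__main__":
    for n in (100, 200, 400):
        print(n, attention_flops(n))
```

Under these assumptions, doubling the input length quadruples the attention cost, which is why long inputs hurt latency more than the model size alone would suggest.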