My problem with TTS is that I've been struggling to find models that handle less common use cases, like mixed Spanish/English speech or non-ideal audio conditions. Still haven't found anything great, to be honest.
Regarding the less-than-ideal audio conditions, there are already models with impressive noise cancellation, like https://github.com/Rikorose/DeepFilterNet. If you run one in series ahead of your speech model, maybe you get better results?
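To make the "in series" idea concrete, here is a rough sketch of chaining a cleanup stage ahead of a downstream model. Everything in it is a placeholder I made up for illustration: `noise_gate` is a toy stand-in for a real denoiser (it is not DeepFilterNet's API), and `count_active` is a toy stand-in for whatever speech model comes next. Only the chaining pattern is the point.

```python
from typing import Callable, List, Sequence

Samples = List[float]
Stage = Callable[[Samples], Samples]


def noise_gate(samples: Sequence[float], threshold: float = 0.05) -> Samples:
    """Hypothetical denoiser stand-in: zero out samples quieter than threshold."""
    return [s if abs(s) >= threshold else 0.0 for s in samples]


def run_in_series(samples: Sequence[float], stages: Sequence[Stage]) -> Samples:
    """Feed the output of each stage into the next one, in order."""
    out = list(samples)
    for stage in stages:
        out = stage(out)
    return out


def count_active(samples: Sequence[float]) -> int:
    """Hypothetical downstream-model stand-in: count non-silent samples."""
    return sum(1 for s in samples if s != 0.0)


# Made-up audio samples: low-level noise mixed with louder "speech".
noisy = [0.01, -0.02, 0.6, -0.5, 0.03, 0.4]
cleaned = run_in_series(noisy, [noise_gate])
print(count_active(cleaned))
```

In a real setup you would swap the toy stages for the actual denoiser and speech model, and it is worth comparing results with and without the cleanup step, since there is no guarantee that denoising first helps.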
Hi. Our model at http://www.Gradium.ai has no problem with 'code-switching' between Spanish and English, and we have excellent background noise suppression. Please feel free to give it a try and let me know what you think!
Looks interesting! How did you train it and how many hours of material did you use?