Last time I tried whisper, it hallucinated an elaborate conversation from sounds of slapping and moaning and it took minutes to spit every single line of it.
Last time I tried whisper, it hallucinated an elaborate conversation from sounds of slapping and moaning and it took minutes to spit every single line of it.
Parakeet has been trained to detect non-voice sounds and exclude that from identification, so you might have better luck with that family.
If I remember correctly, the whisper documentation actually recommends to trim non-speech portions as the models halucinate heavily during those portions.