Not a fan of high resource use or reliance on proprietary vendors/services. DeepSpeech/Vosk were pre-AI and still worked well on local devices, but they were a huge pain to set up and use. Does anyone know of better versions of those? It looks like one successor was Coqui STT, which then evolved into Coqui TTS, which still seems to be maintained. Kaldi seems older but is also still maintained.

edit: nvm, these overviews explain the different options: https://www.gladia.io/blog/best-open-source-speech-to-text-m... and https://www.gladia.io/blog/thinking-of-using-open-source-whi...

Sorry for the delayed response, and thank you for sharing these articles! I agree. I hope we get much better open-source STT options in the future.