Hey folks, I'm the developer behind muesli, a one-stop app for all your speech-to-text needs, whether voice dictation or meeting transcription. It runs on-device on your Apple Neural Engine using CoreML-based STT models (Parakeet, Whisper, Cohere transcribe). Everything is open source and we're at 160 stars, au naturel. I'd love for folks to use it and contribute further to its development.
How are you handling the on-device speech pipeline, especially the model size, latency, and accuracy tradeoffs on consumer hardware?
Github: https://github.com/pHequals7/muesli
Looking to add on-device CUA and support for more models (MSFT VibeVoice, IBM Granite, etc.).
I've been looking for something like this: an OSS, on-device version where I can store transcripts as Markdown files in my file system.
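For anyone curious what that could look like, here's a minimal sketch of writing each transcript out as a timestamped Markdown file with a bit of front matter. The `save_transcript` helper, the front-matter fields, and the filename scheme are all hypothetical illustrations, not muesli's actual storage format:

```python
from datetime import datetime
from pathlib import Path


def save_transcript(text: str, title: str, out_dir: str = "transcripts") -> Path:
    """Write a transcript to a timestamped Markdown file with YAML front matter.

    Illustrative sketch only; not how muesli itself stores transcripts.
    """
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)

    # Timestamp keeps filenames unique and naturally sorted by date.
    stamp = datetime.now().strftime("%Y-%m-%d-%H%M%S")
    slug = title.lower().replace(" ", "-")
    path = folder / f"{stamp}-{slug}.md"

    body = (
        "---\n"
        f"title: {title}\n"
        f"date: {stamp}\n"
        "source: dictation\n"
        "---\n\n"
        f"{text}\n"
    )
    path.write_text(body, encoding="utf-8")
    return path
```

Plain Markdown plus front matter keeps the transcripts greppable and compatible with tools like Obsidian without any extra database.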
Finally, something that works locally and feels polished!
Let me know if you face any issues, and I'm always looking for more collaborators!
Love the sly name!