How are you balancing accuracy vs. time-to-word-on-live-transcript? Is this something you're actively balancing yourselves, or something you could let an end user tune?

I often find myself using otter.ai: while it's inferior to Whisper in many ways, and anything but on-device, it shows words on the live transcript with minimal delay rather than waiting for a moment of silence or for a multi-second buffer to fill. That's vital when I'm using live transcription both to drive async summarization/notes and operationally during the same call, so I can speed-read to catch up on a question that was just posed to me while I was multitasking (or researching a prior question!).

It sometimes boggles my mind that we treat keypress-to-character-on-screen latency as sacrosanct, yet are fine waiting for a phrase, a paragraph, or even an entire conversation to finish before visualizing its transcription. Being able to control this would be incredible.
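
To make the knob concrete, here's a minimal sketch of what a tunable partial-decode loop could look like. Everything in it is assumed rather than taken from Hyprnote or otter.ai: `read_chunk` stands in for whatever audio-capture callback the app uses, `decode` for any local ASR call (e.g. a Whisper wrapper), and `emit_interval` is the hypothetical user-facing latency setting.

```python
import time
from collections import deque

def live_partials(read_chunk, decode, emit_interval=0.3, window_s=8.0, sr=16_000):
    """Yield an updated partial transcript every `emit_interval` seconds.

    Instead of waiting for a silence boundary or a full multi-second
    buffer, re-decode a sliding window of the most recent audio on a
    fixed timer, so the UI can repaint the live line each time.
    """
    buf = deque(maxlen=int(window_s * sr))  # keep the last `window_s` seconds of samples
    last_emit = time.monotonic()
    while True:
        buf.extend(read_chunk())            # pull whatever audio has arrived
        now = time.monotonic()
        if now - last_emit >= emit_interval:
            yield decode(list(buf))         # full re-decode of the window
            last_emit = now

# Hypothetical usage: repaint the live line on every partial.
#   for partial in live_partials(mic.read, asr.transcribe, emit_interval=0.2):
#       redraw(partial)
```

Lowering `emit_interval` trades compute for exactly the latency tunability described above: earlier words on screen, at the cost of re-decoding the same audio more often.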

It's more of an AI model problem than an app-logic one: decoding more frequently requires more computation, though techniques like speculative decoding can help.
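
A rough back-of-envelope, with assumed numbers, of why more frequent emission gets expensive: if every partial emit fully re-decodes a fixed sliding window, compute scales with window length divided by emit interval. Speculative decoding and similar tricks attack the per-decode cost rather than this ratio.

```python
# Assumed: each emit fully re-decodes a fixed sliding window of audio.
window_s = 8.0  # seconds of audio re-decoded per emit
for emit_interval_s in (2.0, 0.5, 0.1):
    # seconds of audio decoded per second of wall-clock time
    load = window_s / emit_interval_s
    print(f"emit every {emit_interval_s:4.1f}s -> {load:5.1f}x realtime decode load")
```

Going from an emit every 2 s to one every 100 ms turns a 4x realtime decode load into an 80x one, which is why this is hard to do on-device.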

Doing it locally is hard, but we expect to ship it very soon. Please join our Discord (https://hyprnote.com/discord) if you're interested in hearing from us.