This is local, but I've found that external inference is fast enough, as long as you're okay with the possible lack of privacy. My PC isn't beefy enough to run Whisper locally without impacting my workflow, so I use Groq via a shell script. It records until I tell it to stop, then either copies the transcription to the clipboard or types it at the last position the cursor was in.
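
For anyone curious what that kind of workflow looks like, here is a rough sketch in Python rather than shell. It is not the actual script; everything specific in it is an assumption: `arecord` for capture, `curl` against Groq's OpenAI-compatible transcription endpoint with the `whisper-large-v3` model, `xclip` for the clipboard, `xdotool` for typing at the cursor, and an API key in a `GROQ_API_KEY` environment variable. The helper names are made up for illustration.

```python
import json
import os
import signal
import subprocess

# Assumptions, not taken from the comment above: Groq's OpenAI-compatible
# transcription endpoint and a Whisper model name.
GROQ_URL = "https://api.groq.com/openai/v1/audio/transcriptions"
MODEL = "whisper-large-v3"


def record_cmd(wav_path):
    # arecord (ALSA) keeps recording until interrupted; 16 kHz mono suits speech.
    return ["arecord", "-q", "-f", "S16_LE", "-r", "16000", "-c", "1", wav_path]


def transcribe_cmd(wav_path, api_key):
    # Multipart upload via curl; the JSON response is expected to carry a "text" field.
    return [
        "curl", "-sS", GROQ_URL,
        "-H", f"Authorization: Bearer {api_key}",
        "-F", f"file=@{wav_path}",
        "-F", f"model={MODEL}",
    ]


def output_cmd(mode):
    # "clipboard" copies the text; "type" types it wherever the cursor currently is.
    if mode == "clipboard":
        return ["xclip", "-selection", "clipboard"]
    if mode == "type":
        return ["xdotool", "type", "--clearmodifiers", "--file", "-"]
    raise ValueError(f"unknown mode: {mode}")


def parse_transcript(response_body):
    # Pull the transcription out of the JSON response and trim whitespace.
    return json.loads(response_body)["text"].strip()


def dictate(mode="clipboard", wav_path="/tmp/dictation.wav"):
    # Record until the user says stop, transcribe remotely, then deliver the text.
    recorder = subprocess.Popen(record_cmd(wav_path))
    input("Recording... press Enter to stop. ")
    recorder.send_signal(signal.SIGINT)  # arecord finalizes the WAV file on SIGINT
    recorder.wait()
    body = subprocess.run(
        transcribe_cmd(wav_path, os.environ["GROQ_API_KEY"]),
        capture_output=True, text=True, check=True,
    ).stdout
    subprocess.run(output_cmd(mode), input=parse_transcript(body), text=True, check=True)
```

Calling `dictate()` runs the whole loop. In practice a global hotkey would probably replace the Enter prompt, so that the "type" mode lands in the window you were working in rather than in the terminal.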

What computer are you using? You really should give Parakeet a try; I find it runs in a few hundred milliseconds even on a Skylake i5 from 10 years ago.