Try Moonshine with a browser GUI:
uv tool install rift-local && rift-local serve --open
This opens RIFT[1], my web frontend for local transcription, which includes a copy button. You can also compare against the Web Speech API and other models (including cloud APIs).