Also look at Vibe:
It even supports speaker differentiation/recognition, and it is open source for macOS, Windows, and Linux:
https://github.com/thewh1teagle/vibe
It uses Whisper, but it also calls other tools directly and puts everything under one nice GUI.