I use the baked in Apple transcription and haven't had any issues. But what I do is usually pretty simple.

What makes the others vastly better?

I’ve rarely had macOS TTS produce a sentence I didn’t have to edit

Whisper models I barely bother checking anymore