Does it do speaker diarization? That's the one thing that I wish Whisper did out of the box. (I know WhisperX exists, but I haven't had a chance to try it yet.)

EDIT: Ah, I see this was already answered.