Having used Whisper and seen how useless the output can get because of its 30-second chunks, I would stay far away from software that works on an even shorter duration.
A short duration effectively means that the transcription will start producing nonsense as soon as a sentence is cut in the middle at a chunk boundary.
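
To make the point concrete, here is a rough sketch of what fixed-length chunking does to sentence boundaries. It is not Whisper's code: `split_fixed` is a hypothetical helper, and the word timings (one word every 0.5 s) are made up for illustration.

```python
def split_fixed(words, window_s):
    """Group (start_time, word) pairs into fixed windows of window_s seconds."""
    chunks = {}
    for start, word in words:
        # Each word lands in the window that contains its start time.
        chunks.setdefault(int(start // window_s), []).append(word)
    return [" ".join(chunks[k]) for k in sorted(chunks)]


# Assumed timings: one word every 0.5 s (illustrative, not measured).
text = "the quick brown fox jumps over the lazy dog. it then runs far away."
words = [(i * 0.5, w) for i, w in enumerate(text.split())]

for window_s in (30, 2):
    print(window_s, split_fixed(words, window_s))
```

With the 30-second window both sentences stay in one chunk; with the 2-second window every chunk starts or ends mid-sentence, so a model that sees each chunk on its own has almost no context to work with.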