Is there a well-established independent benchmark where I can easily (by looking at a couple of graphs) compare all the popular transcription models, especially the self-hosted ones?

Not that I am aware of, unfortunately.