Cracking non-English or accented / mispronounced English is the white whale of speech-to-text, I think. I don't know about you, but in our day-to-day chats there's a lot of jargon, randomly inserted English words, etc. And when people do speak English, it's often what I call expat-English, which is what you get when non-native speakers only speak the language with other non-native speakers.
Add poor microphone quality (using a laptop to broadcast a presentation to a room audience isn't very good) and you get a perfect storm of untranscribable presentations or meetings.
All I want from e.g. Teams is a good transcript and, more importantly, a clever summary. Because when you think about it, if you wrote down every word spoken in a meeting, that's pages and pages of content that nobody would want to read in full.