Try ottex with Gemini 3 Flash as the transcription model. I'm bilingual as well and frequently switch between languages. Gemini handles this perfectly, even when I speak two languages in a single transcription.