An interesting side note about the Gemini voice model: you can use it in AI Studio and type messages instead of using the microphone.

On the backend, Google runs TTS on your typed text to feed the model, which then speaks back to you through your speakers.