Voice-to-voice models can call tools, and since they output audio directly, there's no need for a separate TTS step.
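A rough sketch of what that loop could look like, in case it helps. Everything here is hypothetical: `FakeSpeechToSpeechModel`, `ToolCall`, `AudioChunk`, `run_turn`, and the `get_weather` tool are made-up stand-ins, not any real vendor API. The only point is that the model either asks for a tool or returns audio, so nothing in the pipeline converts text to speech.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union


@dataclass
class ToolCall:
    """Hypothetical event: the model asks the app to run a tool."""
    name: str
    arguments: dict


@dataclass
class AudioChunk:
    """Hypothetical event: raw audio produced by the model itself."""
    pcm: bytes


Event = Union[ToolCall, AudioChunk]


class FakeSpeechToSpeechModel:
    """Stand-in for a voice-to-voice model (assumption, not a real API)."""

    def respond(self, audio_in: bytes, tool_results: List[str]) -> Event:
        # First pass: no tool results yet, so request a tool.
        if not tool_results:
            return ToolCall(name="get_weather", arguments={"city": "Paris"})
        # Second pass: a tool result is available, so answer in audio.
        return AudioChunk(pcm=b"\x00\x01" * 4)


def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would call an external service."""
    return f"Sunny in {city}"


TOOLS: Dict[str, Callable[..., str]] = {"get_weather": get_weather}


def run_turn(model: FakeSpeechToSpeechModel, audio_in: bytes) -> AudioChunk:
    """Audio in, audio out; tool calls are resolved in between."""
    tool_results: List[str] = []
    while True:
        event = model.respond(audio_in, tool_results)
        if isinstance(event, ToolCall):
            # Run the requested tool and feed the result back to the model.
            tool_results.append(TOOLS[event.name](**event.arguments))
            continue
        # The model returned audio directly; there is no TTS stage here.
        return event


reply = run_turn(FakeSpeechToSpeechModel(), b"fake-microphone-bytes")
```

In a real setup the fake model would be replaced by a streaming session with whatever speech-to-speech service is in use, and `reply.pcm` would go straight to the speaker.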