Are there any non-Whisper-based voice models/tech/APIs?

Yes, we currently support the OpenAI, ElevenLabs, and Deepgram APIs, all of which (presumably) use non-Whisper models under the hood. Speaches also supports models other than Whisper. Hopefully we'll add Parakeet support later too!