Is it $0.003 per minute of audio uploaded, or per "compute minute"?

For example, fal.ai has a Whisper API endpoint priced at "$0.00125 per compute second", which (at 10-25x realtime) is far cheaper than all the competitors.
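For anyone converting between the two units, here is a rough sketch of the arithmetic in Python. The function name is made up for illustration, and it assumes billing is strictly per compute second (no minimum charge or startup overhead) and that the 10-25x realtime figure actually holds for your audio.

```python
def cost_per_audio_minute(price_per_compute_second: float, realtime_factor: float) -> float:
    """Dollar cost to transcribe one minute of audio.

    realtime_factor is the number of seconds of audio processed per
    compute second (e.g. 10 means 10x realtime). Assumes pure
    per-compute-second billing with no minimums or overhead.
    """
    compute_seconds = 60.0 / realtime_factor  # compute time for 60 s of audio
    return price_per_compute_second * compute_seconds


# Quoted price of $0.00125 per compute second at both ends of the 10-25x range.
for factor in (10, 25):
    cost = cost_per_audio_minute(0.00125, factor)
    print(f"{factor}x realtime: ${cost:.5f} per audio minute")
```

The result can then be compared directly against providers that quote a price per minute of input audio.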

It can actually go much lower. Gemini costs around $0.01/hour of transcription last time I checked.

I think the point is having it for real-time; this is for conversations rather than transcribing audio files.

That quote was for the non-realtime model.

Both AWS and Mistral prices above are per minute of input audio.