They wouldn't do full transcription; it'd be keyword spotting of useful nouns ("baby", "pain", "desk", etc.).

The iPhone already does this when you wake it up with Siri.

I really doubt that’s what the iPhone does.