I started looking into this with a Pi 5. It seemed like it was not quite performant enough. But I'm not an expert with these things and maybe someone else could make it work. We definitely have the technology to pull this off in this form factor. It would just be really expensive (maybe $500) and might also get a little hot.
If I were building it to be 'local only', I would run the inference on a remote host in my house.
Having a microcontroller in the phone is nice because it is WAY less likely to break. I love being able to flash a simple firmware and change things without fighting it too much.
Oh! Also, I do all the 'WebRTC/AI dev' in the browser. Only when I get it working how I like do I switch over to the microcontroller stuff.