The ESP32-S3 has wake word support: https://components.espressif.com/components/espressif/esp-sr...

The rest is just some vibe coding…

If it's possible via vibe coding, then there are already a few projects out there that do exactly that.