imho, unless you are google, doing it this way is more elegant than spinning off your own browser.

about privacy concerns - if you limit it to your work (and if your company is cool with the data leakage risks), you can still do the kinds of things the video shows.

i do wonder if there would be more use cases if the underlying models also supported audio. not for user input, but for audio playing in the browser.