I've made something (probably) very similar for a quick GB vs US pronunciation check. It also leeches off Google's snapshot of what I believe is a licensed copy of the Oxford collection [0], the same way the shell script does, but mine "runs in the browser's URL bar" instead. It's a super tiny data URI HTML document, intended to be bookmarked with a keyword (say, "say"):

    data:text/html;charset=utf-8,<title>US-GB pronunciation 2.0.2</title><body onload=x='https://ssl.gstatic.com/dictionary/static/sounds/20160317/' text=snow bgcolor=black><button onfocus=click() onclick=a.src=x+i.value+'--_us_1.mp3';a.play()>US</button><input id=i placeholder=(shift+)tab value="%s"><button onfocus=click() onclick=a.src=x+i.value+'--_gb_1.mp3';a.play()>GB</button><audio id=a onplay=i.focus()></audio>
So when I do

    Alt+D, "say something", Enter
then hitting Tab plays it in British English and Shift+Tab plays it in US English. It uses the older 2016 batch because I totally adore the US voice in it: just listen to "music" [1] and tell me it isn't pure ASMR.
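
For anyone who wants the same URL scheme outside the bookmark, here is a minimal sketch of how the two buttons assemble the clip address. The base URL and the `--_us_1.mp3` / `--_gb_1.mp3` suffixes come straight from the data URI above; the helper name `soundUrl`, the trimming, and the `encodeURIComponent` call are my own assumptions and are not in the original one-liner.

```javascript
// Base URL copied from the bookmark above (the 2016 batch).
const BASE = 'https://ssl.gstatic.com/dictionary/static/sounds/20160317/';

// Hypothetical helper: joins base + word + '--_' + accent + '_1.mp3',
// mirroring what each button's onclick handler concatenates.
// Trimming and URL-encoding are assumptions added for safety.
function soundUrl(word, accent) {
  if (accent !== 'us' && accent !== 'gb') {
    throw new Error('accent must be "us" or "gb"');
  }
  return BASE + encodeURIComponent(word.trim()) + '--_' + accent + '_1.mp3';
}

console.log(soundUrl('something', 'gb'));
// prints https://ssl.gstatic.com/dictionary/static/sounds/20160317/something--_gb_1.mp3
```

Whether a given word actually has a clip in that batch is something only the request itself can tell you; the sketch just builds the string.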

(I'm afraid it's just a matter of time before they put a stop to our mischief, though.)

[0] oxfordlearnersdictionaries.com uses the same collection.

[1] https://ssl.gstatic.com/dictionary/static/sounds/20160317/mu...

Dubiously, I clicked. Yeah, I could listen to her read the dictionary as I waft off to sleep...

Ha ha, really glad to hear that. (The fact is, I am kind of a freak/junkie about human voices, and that particular one ranks really high on my list of irresistible, tingle-inducing specimens. So happy to hear I am not alone.)

Have you found any you like in the AI world for text to speech? I know ElevenLabs and OpenAI have voices, but I'm hoping to build something that can be run locally.

Would be nice if there were enough words spoken in sentences with that voice to create an AI voice clone.