It’s a neat idea, but giving a 2B model full JS execution privileges on a live page is a bit sketchy from a security standpoint. Plus, why tie inference to the browser lifecycle at all? If Chrome crashes or the tab gets discarded, your agent's state is just gone. A local background daemon with a "dumb" extension client seems way more predictable and robust fwiw
> but giving a 2B model full JS execution privileges on a live page is a bit sketchy from a security standpoint.
Every webpage I've ever visited has full JS execution privileges and I trust half of them less than an LLM
Note that no webpage has full JS execution privileges on other parts of the web.
At least in this case (not so sure about the Prompt API case mentioned in another thread) the agent is "in" the page. And that means the agent is constrained by the same same-origin/CORS limits that already constrain the page's own JS.
If you think about it, everything we've done to stop malicious webpages from fiddling with your state on other sites via XHRs is already exactly the set of constraints we'd want to stop models working with webpages from doing the same thing.
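To make that boundary concrete, here's a rough sketch (the `isSameOrigin` helper is illustrative, not a real browser API): two URLs share an origin only when scheme, host, and port all match, and anything cross-origin needs the target server's explicit CORS opt-in before page JS can touch the response.

```javascript
// Same-origin policy in a nutshell: two URLs share an origin only when
// scheme, host, and port all match. (Illustrative helper, not a real API.)
function isSameOrigin(a, b) {
  const ua = new URL(a);
  const ub = new URL(b);
  return ua.protocol === ub.protocol && ua.host === ub.host; // host includes the port
}

console.log(isSameOrigin("https://example.com/page", "https://example.com/api")); // true
console.log(isSameOrigin("https://example.com", "http://example.com"));           // false

// From page JS (agent included), a cross-origin request like
//   fetch("https://your-bank.example/api/transfer", { credentials: "include" })
// only yields a readable response if your-bank.example opts in via
// Access-Control-Allow-Origin; otherwise the browser blocks it.
```

The same check applies whether the fetch was written by the page author or emitted by a model running in the page, which is the point above.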
There's IndexedDB, OPFS, etc. Plenty of ways to store stuff in a browser that will survive a restart. Background daemons don't work unless you install and start them yourself, and that's a lot of installation friction. The whole point of a browser app is that you don't have to install stuff.
And what you call sketchy is what billions of people default to every day when they use web applications.
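For the persistence point, a minimal sketch, assuming the agent state is JSON-serializable (the database and store names here are just examples): the page writes its state to IndexedDB, and it's still there after a tab discard or browser restart.

```javascript
// Hedged sketch: persisting agent state in IndexedDB so it survives a tab
// discard or browser restart. DB/store/key names are illustrative.
const DB_NAME = "agent-state";
const STORE = "snapshots";

function openDb() {
  return new Promise((resolve, reject) => {
    const req = indexedDB.open(DB_NAME, 1);
    req.onupgradeneeded = () => req.result.createObjectStore(STORE);
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });
}

async function saveState(state) {
  const db = await openDb();
  return new Promise((resolve, reject) => {
    const tx = db.transaction(STORE, "readwrite");
    tx.objectStore(STORE).put(JSON.stringify(state), "latest");
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
  });
}

async function loadState() {
  const db = await openDb();
  return new Promise((resolve, reject) => {
    const req = db.transaction(STORE, "readonly").objectStore(STORE).get("latest");
    req.onsuccess = () => resolve(req.result ? JSON.parse(req.result) : null);
    req.onerror = () => reject(req.error);
  });
}

// Usage, from the page or extension:
//   await saveState({ step: 3, history: chatHistory });
//   const restored = await loadState();
```

OPFS would work the same way for larger blobs (e.g. model weights or transcripts), via `navigator.storage.getDirectory()`.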
I was thinking the same thing: better to run models in a local service rather than in the web browser. I use Ollama and LM Studio, switching between them depending on what I am working on. It should be straightforward to convert this open source project to use a different back end.
That said, this looks like a cool project. Writing projects like this that use local models is so valuable, both for tool building and self-education. I am writing my own “Emacs native” agentic coding harness and I am learning a lot.
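Swapping the back end could look roughly like this, a sketch against Ollama's documented `/api/generate` endpoint (the model name is just an example, and the function name is mine):

```javascript
// Hedged sketch: calling a local Ollama server instead of an in-browser model.
// Endpoint and payload shape follow Ollama's /api/generate REST API.
async function generateLocal(prompt, model = "llama3.2") {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, prompt, stream: false }),
  });
  if (!res.ok) throw new Error(`Ollama returned ${res.status}`);
  const data = await res.json();
  return data.response; // the completion text
}
```

One wrinkle worth noting: calling this from a page or extension runs into exactly the CORS constraints discussed upthread, so Ollama has to be told to allow the origin (its `OLLAMA_ORIGINS` setting), or the extension needs host permissions.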