Seems like this might be a great way to do web software testing. We’ve had Selenium and Puppeteer for a long time but they are a bit brittle with respect to the web design. Change something about the design and there’s a high likelihood that a test will break. Seems like this might be able to be smarter about adapting to changes. That’s also a great use for a smaller model like this.

Yeah. I think that's an interesting use case. Especially if I can kick it off or schedule it when I'm not actively working. Inference speed (especially with tool calling involved) won't be great on my machines, but if I schedule nightly usability tests of dev sites while I sleep, that could be really cool.

You’re right about inference speed being a concern. I was assuming it’s a small model but even then, one of the browser automation frameworks is going to be faster.