Anthropic blog post outlining the research process: https://www.anthropic.com/news/developing-computer-use
Computer use API documentation: https://docs.anthropic.com/en/docs/build-with-claude/compute...
Computer Use Demo: https://github.com/anthropics/anthropic-quickstarts/tree/mai...
In their "Developing a computer use model" post they mention: > On one evaluation created to test developers’ attempts to have models use computers, OSWorld, Claude currently gets 14.9%. That’s nowhere near human-level skill (which is generally 70-75%), but it’s far higher than the 7.7% obtained by the next-best AI model in the same category.
Which model does "next-best AI model in the same category" refer to here?
This needs to be brought up. Was looking for the demo and ended up on the contact form
Thanks for these. Wonder how many people will use this at work to pretend that they are doing work while they listen to a podcast.
This is cover for the people whose screens are recorded. Run this on the monitored laptop to make you look busy, then do the actual work on laptop 2, some of which might actually require thinking, so no mouse movements.