Claude Computer Use
by Anthropic
Claude takes the mouse and keyboard. A vision-based agent that controls a real desktop, not just a browser.
Agents
Added 17 May 2026
Overview
Computer Use is the Anthropic capability that lets Claude control a desktop the same way a human does: it looks at a screenshot, decides where to click, types, scrolls, and reads the next screenshot to plan its next move. It ships as an API capability, and a reference Docker image makes it easy to spin up in a sandbox.
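The screenshot, decide, act cycle described above can be sketched in a few lines. This is a minimal illustration, not Anthropic's API: Action, Desktop, run_loop, FakeDesktop, and scripted_decide are all hypothetical names, and the decide callable stands in for the model call.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Protocol


@dataclass
class Action:
    """One step chosen by the model. All names here are hypothetical."""
    kind: str  # "click", "type", "scroll", or "done"
    payload: dict = field(default_factory=dict)


class Desktop(Protocol):
    """Whatever the sandbox exposes for looking and acting (assumed interface)."""
    def screenshot(self) -> bytes: ...
    def perform(self, action: Action) -> None: ...


def run_loop(desktop: Desktop,
             decide: Callable[[bytes, List[Action]], Action],
             max_steps: int = 20) -> List[Action]:
    """Screenshot, decide, act; repeat until "done" or the step limit is hit."""
    history: List[Action] = []
    for _ in range(max_steps):
        shot = desktop.screenshot()      # look at the current screen
        action = decide(shot, history)   # stand-in for the model call
        if action.kind == "done":
            break
        desktop.perform(action)          # click, type, or scroll
        history.append(action)
    return history


class FakeDesktop:
    """In-memory stand-in so the sketch runs without a real display."""
    def __init__(self) -> None:
        self.performed: List[Action] = []

    def screenshot(self) -> bytes:
        return f"frame-{len(self.performed)}".encode()

    def perform(self, action: Action) -> None:
        self.performed.append(action)


def scripted_decide(shot: bytes, history: List[Action]) -> Action:
    """Replays a fixed plan; a real deployment would ask the model instead."""
    plan = [
        Action("click", {"x": 120, "y": 48}),
        Action("type", {"text": "invoice 1042"}),
        Action("done"),
    ]
    return plan[len(history)]
```

Calling run_loop(FakeDesktop(), scripted_decide) replays the scripted plan and returns the two actions performed before "done". The max_steps cap is there so a confused agent cannot loop forever.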
Best for
Teams stuck with legacy or no-API systems that an agent still needs to drive
Use cases
- Drive any desktop application that has no API
- Automate enterprise software that engineering can't touch
- Run agent-driven QA across native apps, not just web
- Bridge legacy systems into modern agent workflows
Notes
Why it matters
Computer Use is the credible answer to “what about the systems that will never get APIs?” It is reasonable to treat it as the universal fallback for browser and desktop automation when no API or deterministic script is available.
How teams use it in production
Always inside a sandbox. Always with a budget. Always with a human-in-the-loop checkpoint for irreversible actions such as payments or outbound messages.
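The budget and the human checkpoint can be expressed as a small guard that every proposed action passes through before it runs. This is a sketch under assumptions: Guard, BudgetExceeded, and the IRREVERSIBLE labels are hypothetical, the budget is counted in steps rather than tokens or money, and the sandbox itself (for example the container the agent runs in) sits outside this code.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical labels for actions that cannot be undone.
IRREVERSIBLE = {"pay", "send"}


class BudgetExceeded(Exception):
    """Raised when the agent has used up its allotted steps."""


@dataclass
class Guard:
    max_steps: int                   # step budget (assumed unit)
    approve: Callable[[str], bool]   # asks a human; True means go ahead
    steps_used: int = 0

    def check(self, action_kind: str) -> bool:
        """Return True if the action may run; raise once the budget is spent."""
        if self.steps_used >= self.max_steps:
            raise BudgetExceeded(f"step budget of {self.max_steps} exhausted")
        self.steps_used += 1
        if action_kind in IRREVERSIBLE:
            # Irreversible actions wait for explicit human approval.
            return self.approve(action_kind)
        return True
```

In this sketch a rejected action still counts against the budget, so an agent that keeps proposing the same blocked payment eventually stops instead of retrying indefinitely.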
What to watch
As model speed and accuracy improve, the line between Computer Use and deterministic browser automation blurs. Eventually the agent picks the right tool per step.
Pros
- Truly universal: if a human can do it, Claude can attempt it
- Strong reasoning over screenshots, not just bounding boxes
- Reference sandbox lowers the barrier to a secure setup
- Improving fast with each model release
Cons
- Slow and deliberate at each step, so poorly suited to high-volume work
- Sandboxing is mandatory for anything serious
- Error recovery still requires careful prompt scaffolding