Run Anthropic Computer Use in Your Web Browser
We’ve integrated Anthropic Computer Use into the Ui.Vision browser extension. This gives everyone the option to use/demo Anthropic Computer Use directly in their browser - no Docker required. You only need to install the open-source Ui.Vision browser extension for Chrome, Edge, or Firefox.
Step-by-Step Instructions
- Install the Ui.Vision browser extension for Chrome, Edge, or Firefox
- Install the XModule native app (available for Windows, Mac, and Linux). This app allows the browser extension to simulate mouse clicks and keyboard typing
- Enter your Anthropic API key into the Ui.Vision extension. That’s all - you’re ready to go!
Computer Use Tutorial
To get started, we recommend trying one of the pre-made Computer Use macros. You’ll find them at the top of the macro list in Ui.Vision. Macros that use Computer Use start with “CU_”.
Computer Use (inside Ui.Vision) playing TicTacToe in the browser:
(Soon:) If you want to have a chat-like conversation with Claude Computer Use, open the sidebar and select the AI tab. Then start prompting.
Desktop Automation
Since Ui.Vision can handle both browser and desktop automation, you can also demo Anthropic Claude Computer Use directly on your desktop. To do so, go to the Ui.Vision settings and switch from browser automation to desktop automation.
Anthropic Computer Use Pricing
The Ui.Vision browser extension is open-source and free for everyone to use. The only costs involved are the Anthropic API credits. For example, running the PlayTicTacToe demo macro will cost approximately $0.30 in API usage credits.
Computer Use Insights from Our Experience
Here are some insights we’ve gained from using Computer Use:
-
Computer Use appears to be trained to move the mouse before clicking. However, this is often unnecessary. Tell it “No mouse movements, only clicks.” This simple tweak can halve the automation time and API cost!
-
Keep the browser viewport (for browser automation) or the desktop screen size (for desktop automation) as small as possible. The smaller the screenshot area, the faster and cheaper the API response. Smaller screen sizes also improve the accuracy of the returned x,y values.
-
Shorter prompts often work better than longer ones. Ironically, if you ask Claude for prompting help, it will write long and often overly detailed prompts.
Prompts can be as simple as “Fill out this form with random data” or “Search for a flight.” To get started, check out the new Computer Use demo macros for example prompts that work well.
In general, we’ve found that - at least for now - Computer Use works better for browser automation than desktop automation. Google’s Project Jarvis team seems to have reached the same conclusion.