trycua/cua
Summary
trycua/cua is an open-source toolkit and Python library for building, benchmarking, and deploying computer-use agents, featuring macOS background automation and cross-platform agent-ready sandboxes.
Cached at: 05/13/26, 12:12 PM
Source: https://github.com/trycua/cua
Choose Your Path
Cua Driver - Background computer-use on macOS
Drive any native macOS app in the background — agents click, type, and verify without stealing the cursor, focus, or Space, even on non-AX surfaces like Chromium web content and canvas-based tools (Blender, Figma, DAWs, game engines). Use with the CLI or MCP server for Claude Code, Cursor, and custom clients. Every session records as a replayable trajectory.
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/cua-driver/scripts/install.sh)"
Full tool reference, architecture notes, and the Claude Code skill ship with the package: libs/cua-driver/README.md.
Cua - Agent-Ready Sandboxes for Any OS
Build agents that see screens, click buttons, and complete tasks autonomously. One API for any VM or container image — cloud or local.
pip install cua
# Requires Python 3.11 or later
import asyncio

from cua import Sandbox, Image


async def main():
    # Same API regardless of OS or runtime
    async with Sandbox.ephemeral(Image.linux()) as sb:  # or .macos() .windows() .android()
        result = await sb.shell.run("echo hello")
        screenshot = await sb.screenshot()
        await sb.mouse.click(100, 200)
        await sb.keyboard.type("Hello from Cua!")
        await sb.mobile.gesture((100, 500), (100, 200))  # multi-touch gestures


asyncio.run(main())
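Because every OS shares the same calls, a workflow can be written once against the sandbox object and dry-run without starting a VM. The sketch below is illustrative only: `type_and_capture` and `RecordingSandbox` are hypothetical names that are not part of cua, and the stub's return value is a placeholder because the real return type of `screenshot()` is not stated here.

```python
import asyncio
from types import SimpleNamespace


async def type_and_capture(sb, text, click_at=(100, 200)):
    """Click, type, then screenshot, using only calls shown in the snippet above."""
    x, y = click_at
    await sb.mouse.click(x, y)
    await sb.keyboard.type(text)
    return await sb.screenshot()


class RecordingSandbox:
    """Hypothetical stand-in that records calls instead of driving a real sandbox."""

    def __init__(self):
        self.calls = []
        self.mouse = SimpleNamespace(click=self._click)
        self.keyboard = SimpleNamespace(type=self._type)

    async def _click(self, x, y):
        self.calls.append(("click", x, y))

    async def _type(self, text):
        self.calls.append(("type", text))

    async def screenshot(self):
        self.calls.append(("screenshot",))
        return b""  # placeholder value, not the real screenshot type


fake = RecordingSandbox()
shot = asyncio.run(type_and_capture(fake, "Hello from Cua!"))
print(fake.calls)
```

Passing a real `sb` from `Sandbox.ephemeral(...)` in place of the stub should exercise the same three calls, assuming the API behaves as shown in the snippet.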
| | Linux container | Linux VM | macOS | Windows | Android | BYOI (.qcow2, .iso) |
|---|---|---|---|---|---|---|
| Cloud (cua.ai) | ✅ | ✅ | ✅ | ✅ | ✅ | 🔜 soon |
| Local (QEMU) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
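The support matrix can also be read as a small lookup table. The following is a sketch that simply transcribes the table; `SUPPORT` and `runtimes_for` are hypothetical names, and the "soon" cell is treated as not yet available.

```python
TARGETS = [
    "Linux container",
    "Linux VM",
    "macOS",
    "Windows",
    "Android",
    "BYOI (.qcow2, .iso)",
]

# Transcribed from the table above; "soon" is recorded as False (assumption).
SUPPORT = {
    "Cloud (cua.ai)": dict(zip(TARGETS, [True, True, True, True, True, False])),
    "Local (QEMU)": dict(zip(TARGETS, [True, True, True, True, True, True])),
}


def runtimes_for(target):
    """Return the runtimes that currently list the given target as supported."""
    return [runtime for runtime, row in SUPPORT.items() if row[target]]


print(runtimes_for("macOS"))
print(runtimes_for("BYOI (.qcow2, .iso)"))
```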
Get Started | Examples | API Reference
CuaBot - Co-op computer-use for any agent
cuabot gives any coding agent a seamless sandbox for computer-use. Individual windows appear natively on your desktop with H.265 video, a shared clipboard, and audio.
npx cuabot # Setup onboarding
# Run any agent in a sandbox
cuabot claude # Claude Code
cuabot openclaw # OpenClaw in the sandbox
# Run any GUI workflow in a sandbox
cuabot chromium
cuabot --screenshot
cuabot --type "hello"
cuabot --click <x> <y> [button]
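When scripting these commands from Python, it can help to build the argument list explicitly so that quoting is handled for you. This is a minimal sketch that only mirrors the flags shown above; `cuabot_type` and `cuabot_click` are hypothetical helpers, and the accepted values for the optional button argument are not documented here.

```python
import shlex


def cuabot_type(text):
    """Argument list for `cuabot --type "<text>"`."""
    return ["cuabot", "--type", text]


def cuabot_click(x, y, button=None):
    """Argument list for `cuabot --click <x> <y> [button]`."""
    argv = ["cuabot", "--click", str(x), str(y)]
    if button is not None:
        argv.append(button)  # valid button names are an open assumption
    return argv


# shlex.join renders a copy-pasteable shell command with safe quoting.
print(shlex.join(cuabot_type("hello")))
print(shlex.join(cuabot_click(100, 200)))
```

The returned lists can be passed directly to `subprocess.run(argv, check=True)` on a machine where cuabot is installed.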
Built-in support for agent-browser and agent-device (iOS, Android) out of the box.
Get Started | Installation | First spotted at ClawCon
Cua-Bench - Benchmarks & RL Environments
Evaluate computer-use agents on OSWorld, ScreenSpot, Windows Arena, and custom tasks. Export trajectories for training.
# Install and create base image
cd cua-bench
uv tool install -e . && cb image create linux-docker
# Run benchmark with agent
cb run dataset datasets/cua-bench-basic --agent cua-agent --max-parallel 4
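To launch several runs with different settings, the command above can be generated from a small config object. This is a sketch under the assumption that only the flags shown above are needed; `BenchRun` is a hypothetical class and not part of cua-bench.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchRun:
    """Hypothetical description of one `cb run dataset` invocation."""

    dataset: str
    agent: str = "cua-agent"
    max_parallel: int = 4

    def argv(self):
        if self.max_parallel < 1:
            raise ValueError("max_parallel must be at least 1")
        return [
            "cb", "run", "dataset", self.dataset,
            "--agent", self.agent,
            "--max-parallel", str(self.max_parallel),
        ]


run = BenchRun("datasets/cua-bench-basic")
print(" ".join(run.argv()))
```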
Get Started | Partner With Us | Registry | CLI Reference
Lume - macOS Virtualization
Create and manage macOS/Linux VMs with near-native performance on Apple Silicon using Apple’s Virtualization.framework.
# Install Lume
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh)"
# Pull & start a macOS VM
lume run macos-sequoia-vanilla:latest
Get Started | FAQ | CLI Reference
Packages
| Package | Description |
|---|---|
| cuabot | Multi-agent computer-use sandbox CLI |
| cua-agent | AI agent framework for computer-use tasks |
| cua-sandbox | SDK for creating and controlling sandboxes |
| cua-computer-server | Driver for UI interactions and code execution in sandboxes |
| cua-bench | Benchmarks and RL environments for computer-use |
| lume | macOS/Linux VM management on Apple Silicon |
| lumier | Docker-compatible interface for Lume VMs |
Resources
- Documentation — Guides, examples, and API reference
- Blog — Tutorials, updates, and research
- Discord — Community support and discussions
- GitHub Issues — Bug reports and feature requests
Contributing
We welcome contributions! See our Contributing Guidelines for details.
License
MIT License — see LICENSE for details.
Third-party components have their own licenses:
- Kasm (MIT)
- OmniParser (CC-BY-4.0)
- Optional cua-agent[omni] includes ultralytics (AGPL-3.0)
Trademarks
Apple, macOS, Ubuntu, Canonical, and Microsoft are trademarks of their respective owners. This project is not affiliated with or endorsed by these companies.