trycua/cua

GitHub Trending (daily) Tools

Summary

trycua/cua is an open-source toolkit and Python library for building, benchmarking, and deploying computer-use agents, featuring macOS background automation and cross-platform agent-ready sandboxes.

Open-source infrastructure for Computer-Use Agents. Sandboxes, SDKs, and benchmarks to train and evaluate AI agents that can control full desktops (macOS, Linux, Windows).

Cached at: 05/13/26, 12:12 PM

Source: https://github.com/trycua/cua

Build, benchmark, and deploy agents that use computers

cua.ai | Discord | Twitter | Documentation

Choose Your Path

Cua Driver — The background computer-use agent
Cua & Cua Sandbox | Cua Bench | Lume

Cua Driver - Background computer-use on macOS

Drive any native macOS app in the background: agents click, type, and verify without stealing the cursor, focus, or Space, even on non-AX surfaces (those not exposed through the macOS Accessibility API) such as Chromium web content and canvas-based tools (Blender, Figma, DAWs, game engines). Use it through the CLI or the MCP server for Claude Code, Cursor, and custom clients. Every session is recorded as a replayable trajectory.

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/cua-driver/scripts/install.sh)"

Full tool reference, architecture notes, and the Claude Code skill ship with the package: libs/cua-driver/README.md.


Cua - Agent-Ready Sandboxes for Any OS

Build agents that see screens, click buttons, and complete tasks autonomously. One API for any VM or container image — cloud or local.

pip install cua
# Requires Python 3.11 or later

import asyncio
from cua import Sandbox, Image

async def main():
    # Same API regardless of OS or runtime
    async with Sandbox.ephemeral(Image.linux()) as sb:   # or .macos() .windows() .android()
        result = await sb.shell.run("echo hello")
        screenshot = await sb.screenshot()
        await sb.mouse.click(100, 200)
        await sb.keyboard.type("Hello from Cua!")
        await sb.mobile.gesture((100, 500), (100, 200))  # multi-touch gestures

asyncio.run(main())  # `async with` only works inside a coroutine
Linux container | Linux VM | macOS | Windows | Android | BYOI (.qcow2, .iso)
Cloud (cua.ai) 🔜 soon
Local (QEMU)
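
The snippet above shows the individual calls; a minimal sketch of how they might be combined into an observe-then-act loop follows. `FakeSandbox`, `FakeMouse`, and `FakeKeyboard` are hypothetical in-memory stand-ins, not part of cua: they copy only the method names that appear above (`screenshot`, `mouse.click`, `keyboard.type`) so the example runs without a VM. The tuple-based action format and the `run_actions` helper are likewise assumptions made for illustration.

```python
import asyncio


class FakeMouse:
    def __init__(self, log: list) -> None:
        self.log = log

    async def click(self, x: int, y: int) -> None:
        self.log.append(("click", x, y))


class FakeKeyboard:
    def __init__(self, log: list) -> None:
        self.log = log

    async def type(self, text: str) -> None:
        self.log.append(("type", text))


class FakeSandbox:
    """Hypothetical stand-in that records calls instead of driving a desktop."""

    def __init__(self) -> None:
        self.log: list = []
        self.mouse = FakeMouse(self.log)
        self.keyboard = FakeKeyboard(self.log)

    async def screenshot(self) -> bytes:
        self.log.append(("screenshot",))
        return b""  # a real sandbox would return image data here

    async def __aenter__(self) -> "FakeSandbox":
        return self

    async def __aexit__(self, *exc) -> None:
        return None


async def run_actions(sb, actions) -> None:
    """Take a screenshot before each action, then dispatch the action."""
    for kind, *args in actions:
        await sb.screenshot()            # observe
        if kind == "click":
            await sb.mouse.click(*args)  # act
        elif kind == "type":
            await sb.keyboard.type(*args)
        else:
            raise ValueError(f"unknown action: {kind}")


async def demo() -> list:
    async with FakeSandbox() as sb:
        await run_actions(sb, [("click", 100, 200), ("type", "Hello from Cua!")])
        return sb.log


log = asyncio.run(demo())
```

Because `run_actions` depends only on the method names, the same loop could in principle be pointed at a real sandbox object in place of the fake, provided the real methods accept the same arguments.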

Get Started | Examples | API Reference


CuaBot - Co-op computer-use for any agent

cuabot gives any coding agent a seamless sandbox for computer-use. Individual windows appear natively on your desktop with H.265 video, a shared clipboard, and audio.

npx cuabot                 # Setup onboarding
# Run any agent in a sandbox
cuabot claude              # Claude Code
cuabot openclaw            # OpenClaw in the sandbox

# Run any GUI workflow in a sandbox
cuabot chromium
cuabot --screenshot
cuabot --type "hello"
cuabot --click <x> <y> [button]

Supports agent-browser and agent-device (iOS, Android) out of the box.
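
When scripting the CLI from Python, one option is to build each command as an argument list so that typed text never passes through shell quoting. The sketch below uses only the flags shown in the listing above (`--type`, `--click`); the helper names are hypothetical, and the accepted values for the optional button argument are not listed here, so it is passed through unchanged.

```python
import shlex
import subprocess
from typing import Optional


def cuabot_argv(*args: object) -> list:
    """Build an argv list for cuabot; each item becomes its own argument."""
    return ["cuabot", *map(str, args)]


def click_argv(x: int, y: int, button: Optional[str] = None) -> list:
    """Mirror `cuabot --click <x> <y> [button]` from the listing above."""
    argv = cuabot_argv("--click", x, y)
    if button is not None:
        argv.append(button)
    return argv


def run(argv: list) -> subprocess.CompletedProcess:
    """Execute the command; requires cuabot to be installed and on PATH."""
    return subprocess.run(argv, check=True, capture_output=True, text=True)


type_cmd = cuabot_argv("--type", "hello world")
click_cmd = click_argv(100, 200)
printable = shlex.join(type_cmd)  # shell-safe string, useful for logging
```

Nothing is executed until `run(...)` is called, so the commands can be inspected or logged first.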

Get Started | Installation | First spotted at ClawCon

Cua-Bench - Benchmarks & RL Environments

Evaluate computer-use agents on OSWorld, ScreenSpot, Windows Arena, and custom tasks. Export trajectories for training.

# Install and create base image
cd cua-bench
uv tool install -e . && cb image create linux-docker

# Run benchmark with agent
cb run dataset datasets/cua-bench-basic --agent cua-agent --max-parallel 4
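
The `--max-parallel 4` flag presumably caps how many tasks run at once; how Cua-Bench implements that is not described here. As a general illustration of bounded concurrency, here is a minimal sketch using `asyncio.Semaphore`. `run_task`, the task names, and the result dictionary are hypothetical placeholders, not Cua-Bench APIs.

```python
import asyncio


async def run_task(name: str) -> dict:
    """Hypothetical placeholder for one benchmark episode."""
    await asyncio.sleep(0.01)
    return {"task": name, "success": True}


async def run_all(tasks, max_parallel: int):
    """Run every task, keeping at most max_parallel in flight at a time."""
    sem = asyncio.Semaphore(max_parallel)
    in_flight = 0
    peak = 0

    async def guarded(name: str) -> dict:
        nonlocal in_flight, peak
        async with sem:  # waits here while max_parallel tasks are running
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await run_task(name)
            finally:
                in_flight -= 1

    # gather returns results in the same order as the inputs
    results = await asyncio.gather(*(guarded(t) for t in tasks))
    return results, peak


names = [f"task-{i}" for i in range(10)]
results, peak = asyncio.run(run_all(names, max_parallel=4))
success_rate = sum(r["success"] for r in results) / len(results)
```

The `peak` counter is there only to make the cap observable; a real runner would drop it.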

Get Started | Partner With Us | Registry | CLI Reference


Lume - macOS Virtualization

Create and manage macOS/Linux VMs with near-native performance on Apple Silicon using Apple's Virtualization.framework.

# Install Lume
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh)"

# Pull & start a macOS VM
lume run macos-sequoia-vanilla:latest

Get Started | FAQ | CLI Reference


Packages

Package | Description
cuabot | Multi-agent computer-use sandbox CLI
cua-agent | AI agent framework for computer-use tasks
cua-sandbox | SDK for creating and controlling sandboxes
cua-computer-server | Driver for UI interactions and code execution in sandboxes
cua-bench | Benchmarks and RL environments for computer-use
lume | macOS/Linux VM management on Apple Silicon
lumier | Docker-compatible interface for Lume VMs

Resources

  • Documentation — Guides, examples, and API reference
  • Blog — Tutorials, updates, and research
  • Discord — Community support and discussions
  • GitHub Issues — Bug reports and feature requests

Contributing

We welcome contributions! See our Contributing Guidelines for details.

License

MIT License — see LICENSE for details.

Third-party components have their own licenses:

  • Kasm (MIT)
  • OmniParser (CC-BY-4.0)
  • Optional cua-agent[omni] includes ultralytics (AGPL-3.0)

Trademarks

Apple, macOS, Ubuntu, Canonical, and Microsoft are trademarks of their respective owners. This project is not affiliated with or endorsed by these companies.


Thank you to all our GitHub Sponsors!

coderabbit-cli
