@steipete: People freaking out over my AI spend. What nobody sees: Part of what excites me so much about working on OpenClaw is th…

X AI KOLs Timeline 05/15/26, 09:48 PM News

ai-agents automation codex crabbox open-source devops

Summary

A developer describes running roughly 100 Codex agents in the cloud to automate PR review, issue de-duplication, security scanning, and more for the OpenClaw project, and points to Crabbox, a tool for remote agent workspaces.

Original Article

Cached at: 05/17/26, 01:25 AM

People freaking out over my AI spend. What nobody sees: Part of what excites me so much about working on OpenClaw is that I’m trying to answer the question:

How would we build software in the future if tokens don’t matter?

We constantly run ~100 codex instances in the cloud, reviewing every PR and every issue. If a fix lands on main, @clawsweeper will eventually find that 6-month-old issue and close it with an exact reference.

We run codex on every commit to review for security issues (as they are far too easy to miss).

We run codex to de-duplicate issues and find clusters and send reports for the most pressing issues.

We have agents that can recreate complex setups, spin up ephemeral http://crabbox.sh machines, log into e.g. Telegram, record a video, and post the before/after of a fix on the PR.

There are codex instances that watch new issues and, if one fits our documented vision well, automatically create a PR for it (which another codex then reviews).

We have codex running that scans comments for spam and blocks people.

We have codex instances running that verify performance benchmarks and report regressions into Discord.

We have agents that listen in on our meetings and proactively start work, e.g. creating PRs for new features while we are still discussing them.

We build http://clawpatch.ai to split all our projects into functional units to review and find bugs and regressions.

We do the same split for security with Vercel’s deepsec and Codex Security to find regressions and vulnerabilities.

All that automation allows us to run this project extremely lean.

🦀 Crabbox Docs

Source: https://crabbox.sh/

Local loop, remote box

Keep your editor and git workflow. Crabbox rsyncs your dirty checkout to a leased remote box and streams the run back.

Brokered, not BYO creds

A Cloudflare Worker holds provider credentials and serializes lease state. Your CLI only carries a bearer token.

Cost-aware leases

TTL-bounded machines, monthly spend caps, and per-user / per-org / per-provider usage from the broker.

Reuse what’s warm

`crabbox warmup` keeps a box hot. Reuse it with `--id` across runs, SSH, and CI hydration.

Many providers, one loop

Brokered Hetzner / AWS / Azure, delegated E2B / Daytona / Blacksmith / Semaphore, or static SSH targets - Linux, Windows, and macOS.

Plays with Actions

`actions hydrate` reuses your repository’s GitHub Actions setup steps so local runs land in the same hydrated workspace.

Desktop in your browser

`crabbox webvnc` streams a Linux, macOS, or Windows desktop into the browser. Drive UI tests, capture screenshots, hand the live session to a teammate - no VPN.

Proof for every run

`crabbox artifacts collect` bundles screenshots, video, JUnit summaries, logs, and lease metadata. Drop it on a PR as before/after evidence instead of scraping log output.

Warm a box, sync the diff, run the suite.

# What Crabbox is

Crabbox is a shared agent workspace control plane for software maintainers and AI agents. The goal is to keep the local developer story unchanged - edit, save, run - while moving compute, tests, and review evidence onto owned or provider-backed remote capacity.

A `crabbox run` command leases a brokered cloud machine, reuses a static SSH host, or delegates to a sandbox provider, syncs your tracked and nonignored local files, executes the command remotely, streams stdout and stderr back, and releases or unclaims the target. Behind the scenes a small Cloudflare-hosted broker owns cloud provider credentials, lease state, cleanup, usage, and cost guardrails so individual machines and CLIs never need to.
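As a rough illustration of the "tracked and nonignored local files" sync step, the sketch below builds a file list with `git ls-files` and composes an rsync command that reads that list from stdin. It assumes "nonignored" means untracked files that are not excluded by ignore rules; the function names and the rsync destination layout are hypothetical, and this is not Crabbox's actual sync code.

```python
import os
import subprocess
from pathlib import Path


def parse_manifest(raw: bytes) -> list[str]:
    """Split NUL-separated `git ls-files -z` output into sorted, unique paths."""
    return sorted({os.fsdecode(p) for p in raw.split(b"\0") if p})


def list_sync_files(repo: Path) -> list[str]:
    """Tracked files plus untracked files that are not ignored (assumed meaning)."""
    raw = subprocess.run(
        ["git", "ls-files", "-z", "--cached", "--others", "--exclude-standard"],
        cwd=repo, check=True, capture_output=True,
    ).stdout
    # Files deleted locally but still in the index would need filtering here.
    return parse_manifest(raw)


def rsync_argv(host: str, user: str, workdir: str) -> list[str]:
    """Hypothetical rsync invocation that reads a NUL-separated list on stdin."""
    return ["rsync", "-az", "--from0", "--files-from=-", "./", f"{user}@{host}:{workdir}/"]


print(parse_manifest(b"b.txt\0a/c.py\0b.txt\0"))
```

Feeding the manifest to rsync via `--files-from` keeps ignored build output and dependency directories off the wire, which is one plausible reason to sync from a Git file list rather than the whole directory.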

# How it fits together

    your laptop                Cloudflare Worker            cloud provider
    -------------              ------------------           --------------
    crabbox CLI    -- HTTPS --> Fleet Durable Object  -->   Hetzner / AWS / Azure / GCP
       |                         lease + cost state              |
       |                                                         |
       +------------ SSH + rsync to leased runner <--------------+

The CLI is a Go binary. The broker is a Cloudflare Worker plus a single Durable Object. Brokered Linux runners are vanilla Ubuntu boxes prepared by cloud-init with SSH, Git, rsync, curl, jq, and `/work/crabbox`; AWS can also broker managed Windows/WSL2 and EC2 Mac desktop targets, while Azure can broker native Windows SSH/sync/run, desktop/VNC, and Windows WSL2 targets. Static hosts are existing SSH machines selected with `provider: ssh`. Project runtimes come from Actions hydration or repo-owned setup. Runners hold no broker credentials - they are leaf nodes.

# A run, end to end

1. CLI loads config from flags, env, repo, user, defaults.
2. CLI mints a per-lease SSH key and slug, then calls `POST /v1/leases` on the broker.
3. Worker checks active-lease and monthly spend caps, reserves worst-case TTL cost, provisions a server, returns host / port / user / workdir / expiry / slug.
4. CLI waits for `crabbox-ready`, seeds remote Git when possible, rsyncs the Git file-list manifest, runs sync guardrails and sanity checks, hydrates the configured base ref.
5. CLI runs the command over SSH, streams output, records run events, sends heartbeats/touches.
6. CLI releases the lease unless `--keep` is set; kept leases still auto-release after idle timeout, and the broker frees reserved cost when the lease closes.
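The first step of the run loads config from flags, env, repo, user, and defaults. A minimal sketch of that kind of layering, assuming the listed order is also the precedence order (earlier layers win), can use Python's `collections.ChainMap`. The keys, values, and the `resolve_config` helper are hypothetical and do not reflect Crabbox's real config schema.

```python
from collections import ChainMap


def resolve_config(flags, env, repo, user, defaults):
    """Return a view where earlier layers win: flags > env > repo > user > defaults."""
    return ChainMap(flags, env, repo, user, defaults)


# Hypothetical keys and values, for illustration only.
config = resolve_config(
    flags={"provider": "ssh"},
    env={"provider": "aws", "ttl_minutes": 30},
    repo={"ttl_minutes": 60, "workdir": "/work/crabbox"},
    user={},
    defaults={"provider": "hetzner", "ttl_minutes": 15, "workdir": "/work/crabbox"},
)

print(config["provider"], config["ttl_minutes"], config["workdir"])
```

A lookup walks the layers in order and stops at the first one that defines the key, so a flag overrides an environment value without the lower layers having to be copied or merged.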

See How Crabbox Works for the full picture, including warm-machine reuse and the brokered vs direct provider paths. See Source Map when you need to trace a documented behavior back to code.
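The broker-side checks in the run above (active-lease cap, monthly spend cap, worst-case TTL reservation, release on close) can be sketched as a small in-memory model. The class, field names, and rates are hypothetical, and billing actual usage on close is an assumption; the real broker is a Cloudflare Worker with a Durable Object, not Python.

```python
from dataclasses import dataclass, field


@dataclass
class Broker:
    max_active_leases: int
    monthly_cap: float                      # spend cap for the current month
    spent: float = 0.0                      # cost already billed this month
    reserved: dict[str, float] = field(default_factory=dict)  # lease id -> reserved cost

    def request_lease(self, lease_id: str, hourly_rate: float, ttl_hours: float) -> bool:
        """Admit the lease only if both caps hold after reserving worst-case cost."""
        if len(self.reserved) >= self.max_active_leases:
            return False
        worst_case = hourly_rate * ttl_hours
        if self.spent + sum(self.reserved.values()) + worst_case > self.monthly_cap:
            return False
        self.reserved[lease_id] = worst_case
        return True

    def close_lease(self, lease_id: str, hours_used: float, hourly_rate: float) -> None:
        """Free the reservation and bill only actual usage (an assumption)."""
        worst_case = self.reserved.pop(lease_id)
        self.spent += min(hours_used * hourly_rate, worst_case)


broker = Broker(max_active_leases=2, monthly_cap=10.0)
first = broker.request_lease("a", hourly_rate=2.0, ttl_hours=2.0)   # reserves 4.0
second = broker.request_lease("b", hourly_rate=2.0, ttl_hours=4.0)  # 4.0 + 8.0 exceeds the cap
broker.close_lease("a", hours_used=0.5, hourly_rate=2.0)            # bills 1.0, frees the 4.0 reservation
third = broker.request_lease("b", hourly_rate=2.0, ttl_hours=4.0)   # 1.0 + 8.0 fits under the cap
print(first, second, third)  # True False True
```

Reserving the full TTL cost up front means concurrent leases cannot collectively exceed the cap even if every one of them runs to expiry; closing a lease early returns the unused headroom.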

# Install

    brew install openclaw/tap/crabbox

Verify with `crabbox --version`.

# Quick start

    # log in once per machine - stores a broker token in user config
    crabbox login

    # one-shot run on a fresh leased box
    crabbox run -- pnpm test

    # keep a warm box around for repeated runs; output includes an ID and slug
    crabbox warmup
    crabbox run --id blue-lobster -- pnpm test:changed
    crabbox ssh --id blue-lobster
    crabbox stop blue-lobster

`crabbox doctor` validates local config, network reachability, and SSH key availability before you commit to a long workflow. `crabbox usage` summarizes recent spend by user, org, provider, and server type.

# OpenClaw plugin

The repository root is also a native OpenClaw plugin package. Once installed in OpenClaw, it exposes Crabbox operations as agent tools:

- `crabbox_run`
- `crabbox_warmup`
- `crabbox_status`
- `crabbox_list`
- `crabbox_stop`

The plugin shells out to the configured `crabbox` binary with argv arrays, so local Crabbox config, broker login, repo claims, and sync behavior stay owned by the CLI. Configure `plugins.entries.crabbox.config.binary` if the binary is not on `PATH`.
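To illustrate why invoking a binary with an argv array matters, here is a small Python sketch. `build_run_argv` composes the `crabbox run --id ... -- ...` form shown in the quick start, and `echo_last_arg` spawns a child process without a shell to show that metacharacters arrive as literal text. Both helpers are hypothetical and written in Python for illustration; the plugin's own implementation is not shown in these docs.

```python
import subprocess
import sys


def build_run_argv(binary: str, lease_id: str, command: list[str]) -> list[str]:
    """Compose `crabbox run --id <id> -- <command...>` as an argv list."""
    return [binary, "run", "--id", lease_id, "--", *command]


def echo_last_arg(arg: str) -> str:
    """Spawn a child without a shell and return the argument it received."""
    child = [sys.executable, "-c", "import sys; print(sys.argv[1])", arg]
    return subprocess.run(child, check=True, capture_output=True, text=True).stdout.strip()


print(build_run_argv("crabbox", "blue-lobster", ["pnpm", "test:changed"]))
print(echo_last_arg("a; echo pwned"))
```

Because no shell parses the arguments, a string such as `a; echo pwned` reaches the child as a single argument instead of being split into two commands, which is the usual motivation for argv arrays when an agent supplies the command text.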

Run history and inspection are intentionally handled by the Crabbox CLI and repo skill, not extra plugin tools. Use `crabbox history`, `crabbox events --after --limit`, `crabbox attach`, `crabbox logs`, `crabbox results`, and `crabbox usage` from a shell-capable agent.

# Where to read next

Pick whichever matches your intent:

- Start here: Getting started, How Crabbox Works, Concepts and glossary.
- Get the mental model: Architecture, Orchestrator.
- Use the CLI: CLI, Commands, Features, Configuration, Jobs, Actions hydration, Browser portal, Telemetry.
- Pick or add a target: Provider reference, Providers feature overview, Provider authoring, Provider backends, AWS, Azure, Google Cloud, Hetzner, Proxmox, Static SSH, Blacksmith Testbox, Namespace Devbox, Semaphore, Sprites, Daytona, Islo, E2B, Modal, Tensorlake, Interactive desktop and VNC.
- Operate it: Operations, Observability, Troubleshooting, Performance.
- Set it up or audit it: Infrastructure, Security, Source Map, MVP Plan.

# About these docs

Markdown in this directory is the user-facing documentation source. Implementation truth stays in code; Source Map lists the files behind each documented behavior. The GitHub Pages site at https://openclaw.github.io/crabbox/ is generated from these Markdown files by `scripts/build-docs-site.mjs` and deployed by `.github/workflows/pages.yml`. Pages must be enabled on the repository or organization for the workflow to publish.

Build the docs site locally:

    npm run docs:check
    open dist/docs-site/index.html