@leerob: https://x.com/leerob/status/2065469795529588940
Summary
Cursor AI describes its recursive agent system for scaling training of its Composer model, using a fleet of agents that self-manage and alert humans when issues arise. The system enables parallel experiments and accelerates research, treating researcher time as the scarcest resource.
Cached at: 06/13/26, 01:04 AM
Building recursive agent systems
At Cursor, we run thousands of agents to help us train the next version of Composer.
We give them research tasks, and if they aren’t succeeding or run into issues, they DM us on Slack or page us via PagerDuty.
Scaling training for Composer
We’ve built an org chart of agents that work together.
As we’ve scaled training for Composer, we’ve wanted to run thousands more experiments. This was possible before, but it was slow, and keeping track of every experiment’s status was hard. To speed things up and parallelize work, we built an always-running agent system (yes, it’s a loop).
An agent system for research
Here’s how the system works:
- The main agent runs on a massive remote machine with all the tools you’d use locally, plus a file on disk acting as an “inbox” for the fleet.
- It SSHes into machines running hundreds of child agents and collects their statuses into the inbox.
- On every loop, it checks fleet health, keeps healthy tasks running in the background, and surfaces anything broken to the team on Slack.
- Like all infra, the agents occasionally hit transient issues or need to be poked, so the main agent can control the whole fleet, quitting or restarting processes as needed.
This “fleet manager” builds on our previously published research on long-running agents. We’ve given the manager many different skills that encode tacit knowledge for how to run ML experiments, review and monitor results, and more.
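The loop described above can be sketched in a few lines. This is a minimal illustration, not Cursor's implementation: the post does not describe the inbox format, the status values, or how restarts are issued, so `ChildStatus`, `run_once`, the JSON-lines inbox layout, and the `"running"`/`"stalled"`/`"failed"` states are all assumptions. SSH, restarts, and Slack are passed in as plain callables so the sketch stays self-contained.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Callable


@dataclass
class ChildStatus:
    """Hypothetical status record for one child agent."""
    host: str
    task: str
    state: str  # assumed values: "running", "stalled", "failed"


def run_once(
    hosts: list[str],
    poll: Callable[[str], list[ChildStatus]],  # stands in for an SSH status check
    restart: Callable[[ChildStatus], None],    # stands in for a remote restart
    notify: Callable[[str], None],             # stands in for a Slack message
    inbox: Path,
) -> list[ChildStatus]:
    """One pass of the manager loop: collect, record, repair, report."""
    statuses = [s for host in hosts for s in poll(host)]

    # Append every status to the on-disk inbox, one JSON object per line.
    with inbox.open("a", encoding="utf-8") as f:
        for s in statuses:
            f.write(json.dumps(asdict(s)) + "\n")

    unhealthy = [s for s in statuses if s.state != "running"]
    for s in unhealthy:
        if s.state == "stalled":
            restart(s)  # assumed policy: a stalled task just needs a poke
        else:
            notify(f"{s.task} on {s.host} is {s.state}")  # anything else goes to a human
    return unhealthy
```

An always-running manager would call `run_once` repeatedly with a sleep between passes. Splitting "restart" from "notify" is one possible policy for separating transient issues from ones that need a person; the actual rules are not given in the post.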
Researchers with superpowers
Training a great model means trying a bunch of ideas for creating useful RL data.
A single laptop is not enough here; you really want an army of computers in the cloud to run experiments in parallel. And since we aren’t compute-constrained, we rolled out this infra for everyone in ML.
Cursor (@cursor_ai), May 19, replying to @cursor_ai: “Together with SpaceXAI, we’re training a significantly larger model from scratch, using 10x more total compute. With Colossus 2’s million H100-equivalents and our combined data and training techniques, we expect this to be a major leap in model capability.”
Researcher time is our scarcest resource and we’ve found a way to scale their leverage by orders of magnitude. Imagine if you had a human manager with 10,000 direct reports. Obviously that wouldn’t work well, but this human → agent “org” kind of does!
If you have a problem that is verifiable, where throwing more tokens at it will solve it faster or better, it’s worth considering building a system like this. It has enabled us to run swarms of agents that crawl through Composer’s data and recursively improve the model for future versions.
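One way to read "verifiable, where throwing more tokens at it helps" is as a verifier-gated search: run many independent attempts in parallel and keep only a result that passes a trusted check. The sketch below illustrates that reading only. `best_verified`, `attempt`, and `verify` are hypothetical names, and nothing here is taken from Cursor's system.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional


def best_verified(
    attempt: Callable[[int], str],  # hypothetical: one agent attempt, keyed by seed
    verify: Callable[[str], bool],  # hypothetical: cheap, trusted checker
    n_attempts: int,
    max_workers: int = 8,
) -> Optional[str]:
    """Run n_attempts in parallel; return the first result (in seed order) that verifies."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input seeds.
        candidates = list(pool.map(attempt, range(n_attempts)))
    for candidate in candidates:
        if verify(candidate):
            return candidate
    return None  # no attempt passed the check
```

Raising `n_attempts` spends more compute for a better chance of a passing result, which only pays off when `verify` is reliable. Without a trustworthy checker, more attempts just produce more unvetted output.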
And if this sounds exciting, we’re hiring!