Tag
This paper proposes WRIT, a pipeline for synthesizing multi-turn agent training trajectories that balance write-intensive and read-heavy complexity. The method generates diverse tasks and simulations, enabling small models to achieve strong performance with reduced inference cost.
This paper introduces GUI-RobustEval, a benchmark for error recovery in GUI agents, and Robustness-driven Trajectory Synthesis (RoTS), which generates training data, achieving state-of-the-art results on OSWorld.
EnvFactory automates the creation of executable tool environments and natural multi-turn trajectories for training LLMs with agentic reinforcement learning, achieving superior performance on benchmarks like BFCLv3 and MCP-Atlas with fewer environments than prior work.