Tag
This paper introduces ISE, a three-stage synthesis paradigm for generating multi-turn OS-agent trajectories with grounded execution, and shows that fine-tuning on the resulting ISE-Trace dataset significantly improves agent performance on ClawEval.
This paper investigates execution-grounded automated AI research by building an automated executor that implements LLM-generated ideas and runs the corresponding experiments. It shows that execution-guided evolutionary search can find methods that significantly outperform baselines on both pre-training and post-training tasks.