Tag
This paper introduces SENTINEL, a failure-driven reinforcement learning framework for training tool-using language model agents. It uses a Controller-Proposer-Solver loop to generate targeted training tasks from failed trajectories, improving performance on benchmarks.
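To make the loop concrete, the following is a minimal toy sketch of one failure-driven round. It is an illustration only, not the paper's implementation: the summary names the three roles but does not define them, so the division of labor here (the Solver attempts tasks, the Controller selects failed trajectories, the Proposer derives new tasks from them) is an assumption inferred from the role names. `Task`, `Trajectory`, `solve`, `control`, `propose`, `run_round`, and the integer `skill` and `difficulty` values are all hypothetical, and the reinforcement learning update itself is omitted.

```python
from dataclasses import dataclass


@dataclass
class Task:
    prompt: str
    difficulty: int  # hypothetical scalar stand-in for how hard a task is


@dataclass
class Trajectory:
    task: Task
    success: bool


def solve(task: Task, skill: int) -> Trajectory:
    # Hypothetical Solver: a toy agent that succeeds only when its skill
    # level is at least the task's difficulty.
    return Trajectory(task=task, success=skill >= task.difficulty)


def control(trajectories: list[Trajectory]) -> list[Trajectory]:
    # Hypothetical Controller: keep only the failed trajectories.
    return [t for t in trajectories if not t.success]


def propose(failure: Trajectory) -> Task:
    # Hypothetical Proposer: derive a slightly easier variant of the failed task.
    task = failure.task
    return Task(
        prompt=f"variant of: {task.prompt}",
        difficulty=max(1, task.difficulty - 1),
    )


def run_round(tasks: list[Task], skill: int) -> tuple[list[Trajectory], list[Task]]:
    # One round: solve every task, collect failures, propose targeted new tasks.
    trajectories = [solve(task, skill) for task in tasks]
    failures = control(trajectories)
    new_tasks = [propose(failure) for failure in failures]
    return failures, new_tasks


if __name__ == "__main__":
    seed_tasks = [Task("look up a record", 1), Task("chain two tool calls", 3)]
    failures, new_tasks = run_round(seed_tasks, skill=2)
    print(len(failures), new_tasks)
```

In a real system the proposed tasks would feed back into the next training round; the sketch stops after a single round so that the data flow between the three roles stays visible.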