Tag
This paper introduces RePro, a framework that trains LLM agents to self-generate progress signals through a forward-then-reflect rollout paradigm, achieving up to 12% absolute success rate gains on WebShop, ALFWorld, and Sokoban benchmarks.