language-agents


Milestone-Guided Policy Learning for Long-Horizon Language Agents

arXiv cs.CL · 5d ago

This paper introduces BEACON, a milestone-guided policy learning framework designed to improve credit assignment and sample efficiency for long-horizon language agents. It reports significant performance improvements over GRPO and GiGPO on benchmarks such as ALFWorld, WebShop, and ScienceWorld.
