group-normalization


Rethinking Groups in Critic-Free RLVR

arXiv cs.LG · 2026-06-17

This paper rethinks the role of grouping in critic-free reinforcement learning for LLMs and proposes negative token filtering to enable stable training with a single rollout per prompt, achieving comparable or better performance on reasoning and agentic tasks.
