This paper investigates reward hacking in rubric-based reinforcement learning, analyzing the divergence between training verifiers and evaluation metrics. It introduces a diagnostic for the 'self-internalization gap' and demonstrates that stronger verification reduces but does not eliminate reward hacking.
AgentV-RL introduces an Agentic Verifier framework that improves reward modeling through bidirectional verification: a forward agent and a backward agent, each augmented with tools, achieving a 25.2% improvement over state-of-the-art outcome reward models (ORMs). The approach addresses error propagation and grounding issues in verifiers for complex reasoning tasks by combining multi-turn deliberative verification with reinforcement learning.
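The bidirectional idea can be illustrated with a minimal sketch, assuming a toy task where problems are linear equations like "x + 3 = 7". The function names and the task format are illustrative assumptions, not the AgentV-RL interface.

```python
# Toy sketch of bidirectional verification (illustrative, not AgentV-RL's code):
# a forward agent checks the derivation steps, a backward agent checks the
# final answer by substituting it back into the original problem.

def forward_verify(steps):
    """Forward agent: check that each derivation step is well-formed (placeholder check)."""
    return all(s.strip() for s in steps)

def backward_verify(equation, answer):
    """Backward agent: substitute the candidate answer back into the equation."""
    lhs, rhs = equation.split("=")
    return eval(lhs.replace("x", f"({answer})")) == eval(rhs)

def verdict(equation, steps, answer):
    # Reward signal: both directions must agree that the solution is valid.
    return forward_verify(steps) and backward_verify(equation, answer)
```

Combining both directions is what lets such a verifier catch errors a single forward pass would propagate: a fluent but wrong derivation still fails the backward substitution check.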
OpenAI trained a system that uses verifiers to solve grade-school math word problems, reaching roughly 90% of child-level accuracy and nearly doubling the performance of fine-tuned GPT-3. The approach addresses language models' weakness in multistep reasoning by training verifiers to score candidate solutions and select the best one.
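The select-the-best-candidate step amounts to best-of-n reranking, which can be sketched as follows. Both `generate_candidates` and `verifier_score` are hypothetical stand-ins for a sampled language model and a trained verifier, not the actual OpenAI implementation.

```python
# Sketch of verifier-based best-of-n selection: sample n candidate solutions,
# score each with a verifier, and return the highest-scoring one.

def generate_candidates(problem, n):
    # Stand-in for sampling n candidate solutions from a language model.
    return [f"candidate {i} for {problem}" for i in range(n)]

def verifier_score(problem, candidate):
    # Stand-in for a trained verifier estimating solution correctness in [0, 1].
    return (hash((problem, candidate)) % 100) / 100.0  # placeholder score

def best_of_n(problem, n=100):
    candidates = generate_candidates(problem, n)
    return max(candidates, key=lambda c: verifier_score(problem, c))
```

The design choice here is that the verifier only has to rank finished solutions, a discrimination task, which is typically easier than generating a correct multistep solution outright.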