evaluation-skills


Beyond Rubrics: Exploration-Guided Evaluation Skills for Reward Modeling

arXiv cs.CL · 2026-06-08

Eval-Skill is an exploration-guided method that synthesizes reusable evaluation skills for reward modeling, achieving significant gains on RewardBench 2 over existing backbones.
