rubric-reward

Improving Heart-Focused Medical Question Answering in LLMs via Variance-Aware Rubric Rewards with GRPO

arXiv cs.CL · 2026-06-05

This paper proposes a variance-aware rubric reward framework, trained with GRPO (Group Relative Policy Optimization), to improve LLM performance on heart-focused medical question answering. It reports significant accuracy and F1 gains on a HealthBench subset.
