annotator-calibration


PEBS: Per-rater Empirical-Bayes Shrinkage for RLHF Reward-Model Calibration

arXiv cs.LG · 2d ago

Introduces PEBS, a per-rater empirical-Bayes shrinkage estimator for calibrating reward models in RLHF, reducing within-user RMSE by over 8.5% on PRISM and over 9.6% on PluriHarms.
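For readers unfamiliar with the term, the sketch below shows generic empirical-Bayes shrinkage of per-rater offsets. It is not the PEBS estimator itself, whose details the summary does not give. The function name `shrink_rater_offsets`, the input layout (residuals grouped by rater), and the known noise variance `noise_var` are all assumptions made for illustration.

```python
from statistics import mean


def shrink_rater_offsets(residuals_by_rater, noise_var):
    """Shrink each rater's mean residual toward the grand mean.

    residuals_by_rater: dict mapping a rater id to a list of residuals
        (rating minus reward-model prediction). Hypothetical layout.
    noise_var: within-rater noise variance, assumed known here.
    """
    # Raw per-rater offsets and how many ratings back each one.
    raw = {r: mean(v) for r, v in residuals_by_rater.items() if v}
    counts = {r: len(residuals_by_rater[r]) for r in raw}
    grand = mean(raw.values())

    # Method-of-moments estimate of the between-rater variance:
    # observed spread of raw offsets minus average sampling variance,
    # floored at zero.
    k = len(raw)
    total_var = sum((m - grand) ** 2 for m in raw.values()) / max(k - 1, 1)
    avg_sampling_var = mean(noise_var / counts[r] for r in raw)
    tau2 = max(total_var - avg_sampling_var, 0.0)

    shrunk = {}
    for r, m in raw.items():
        # Raters with few ratings get a small weight and are pulled
        # harder toward the grand mean.
        weight = tau2 / (tau2 + noise_var / counts[r]) if tau2 > 0 else 0.0
        shrunk[r] = grand + weight * (m - grand)
    return shrunk
```

With this weighting, a rater with many ratings keeps most of their raw offset, while a rater with a single rating is moved most of the way to the population mean.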
