Tag
Introduces PEBS, a per-rater empirical-Bayes shrinkage estimator for calibrating reward models in RLHF, which reduces within-user RMSE by over 8.5% on PRISM and over 9.6% on PluriHarms.