Tag
This paper presents the first study of probability calibration as a mitigation for evaluator preference coupling in LLM agent feedback loops, showing that calibrated evaluator judgments reduce coupling coefficients by 20-49% and divergence by 45-67%.