rationalization-bias

Tag

Cards List
#rationalization-bias

Faithful or Fabricated? A Causal Framework for Rationalization Bias in LLM Judges

arXiv cs.CL · 2026-05-26 Cached

This paper introduces a causal framework to quantify rationalization bias in LLM judges, where verdicts and explanations are influenced by non-evidential cues rather than underlying texts. It proposes cue interventions, anchoring metrics, and the Proof-Before-Preference mitigation protocol, demonstrating improved cue invariance.

0 favorites 0 likes
← Back to home

Submit Feedback