causal-probing

#causal-probing

Faithful or Fabricated? A Causal Framework for Rationalization Bias in LLM Judges

arXiv cs.CL · 2026-05-26

This paper introduces a causal framework for quantifying rationalization bias in LLM judges, in which verdicts and explanations are shaped by non-evidential cues rather than by the texts being judged. It proposes cue interventions, anchoring metrics, and the Proof-Before-Preference mitigation protocol, and reports improved cue invariance.

Causal Probing for Internal Visual Representations in Multimodal Large Language Models

arXiv cs.AI · 2026-05-08

This paper proposes a causal framework for probing internal visual representations in Multimodal Large Language Models, revealing differences in how entities and abstract concepts are encoded. The study highlights that increasing model depth is crucial for encoding abstract concepts and uncovers a disconnect between perception and reasoning in current MLLMs.
