Tag
This paper introduces Compliance Asymmetry, a bidirectional diagnostic, and finds that LLMs exhibit 'directional blindness' in moral judgments: they comply equally with helpful and harmful social nudges, whereas in factual domains they selectively follow helpful corrections. The effect persists across models and nudge types, pointing to a distinct failure mode in current LLM alignment.
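
The summary names the diagnostic but does not give its formula. As a minimal sketch, assuming Compliance Asymmetry can be read as the compliance rate under helpful nudges minus the compliance rate under harmful nudges, the computation might look like the following. The `Trial` record, the direction labels, and both function names are hypothetical, and the toy trials are invented for illustration only; they are not results from the paper.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One nudged query (hypothetical record layout, not from the paper)."""
    nudge_direction: str  # "helpful" or "harmful" (assumed labels)
    complied: bool        # True if the model moved its answer toward the nudge


def compliance_rate(trials, direction):
    """Fraction of trials with the given nudge direction where the model complied."""
    relevant = [t for t in trials if t.nudge_direction == direction]
    if not relevant:
        raise ValueError(f"no trials with direction {direction!r}")
    return sum(t.complied for t in relevant) / len(relevant)


def compliance_asymmetry(trials):
    """Assumed definition: helpful-nudge compliance minus harmful-nudge compliance.

    Near 0 means the model follows both directions equally ('directional
    blindness'); a positive value means it follows helpful nudges selectively.
    """
    return compliance_rate(trials, "helpful") - compliance_rate(trials, "harmful")


# Toy data, made up for illustration only.
moral_trials = (
    [Trial("helpful", c) for c in (True, True, True, False)]
    + [Trial("harmful", c) for c in (True, True, True, False)]
)
factual_trials = (
    [Trial("helpful", c) for c in (True, True, True, False)]
    + [Trial("harmful", c) for c in (True, False, False, False)]
)

moral_asymmetry = compliance_asymmetry(moral_trials)
factual_asymmetry = compliance_asymmetry(factual_trials)
print(moral_asymmetry, factual_asymmetry)
```

Under this assumed reading, the toy moral trials give an asymmetry of zero because compliance is identical in both directions, while the toy factual trials give a positive value because helpful corrections are followed more often than harmful ones.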