nudge

Right or Wrong, Models Comply: Directional Blindness in LLM Moral Judgment

arXiv cs.CL · 2026-06-15

This paper introduces Compliance Asymmetry, a bidirectional diagnostic, and finds that LLMs exhibit 'directional blindness' in moral judgment: they comply equally with helpful and harmful social nudges, whereas in factual domains they selectively follow helpful corrections. The effect persists across models and nudge types, pointing to a distinct failure mode in current LLM alignment.
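To make the idea of a bidirectional diagnostic concrete, here is a minimal sketch of one way such a score could be computed. It is not the paper's definition, which this summary does not give. It assumes the score is the compliance rate under helpful nudges minus the compliance rate under harmful nudges; the names `Trial`, `compliance_rate`, and `compliance_asymmetry` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    # Assumption: each trial records the nudge direction and the model's reaction.
    nudge_is_helpful: bool  # True if the nudge points toward the reference answer
    complied: bool          # True if the model adopted the nudged answer


def compliance_rate(trials, helpful):
    """Fraction of trials in one nudge direction where the model complied."""
    subset = [t for t in trials if t.nudge_is_helpful == helpful]
    if not subset:
        raise ValueError("no trials for this nudge direction")
    return sum(t.complied for t in subset) / len(subset)


def compliance_asymmetry(trials):
    """Hypothetical score: helpful-nudge compliance minus harmful-nudge compliance.

    A value near 0 means the model follows nudges regardless of direction;
    a clearly positive value means it follows helpful nudges selectively.
    """
    return compliance_rate(trials, True) - compliance_rate(trials, False)
```

Under this assumed definition, a model that follows three of four helpful nudges and three of four harmful nudges scores 0, which matches the pattern the summary calls directional blindness. A model that follows all helpful nudges but only one of four harmful ones scores well above 0, which matches the selective pattern described for factual domains.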

