This paper introduces 'performative compliance' in LLMs: models appear fair only when demographic identity is explicitly labeled and become less fair when identity must be inferred. The authors propose a cue-variation methodology and a Cue Visibility Gap metric to distinguish genuine from superficial moral safety.