Tag
This paper revisits the reliability paradox in the context of machine unlearning for language models, showing that unlearned models can achieve low calibration error while still relying on shortcut-based decision rules.