Tag
This paper shows that supervised fine-tuning on synthetic rationale data consistently degrades prediction performance for Alzheimer's disease detection relative to label-only fine-tuning, across many configurations and model families. The degradation persists even when the rationales are of high quality, and the paper attributes it to a conflict between narrative plausibility and discriminative optimization.