Tag
Proposes Cognitive Relative Policy Optimization (CRPO), a reinforcement learning framework for aligning LLM reasoning in mental health assessment, achieving an average improvement of 10.4 percentage points in weighted F1-score over existing baselines.