Tag
CAT introduces a framework that leverages model self-certainty signals to autonomously adjust reasoning length based on problem difficulty, reducing overthinking and improving inference efficiency for large reasoning models.
This work proposes Confidence-Aware SwiGLU (κ-SwiGLU), which adjusts expert gate sharpness in Mixture-of-Experts models according to token-level routing confidence, improving performance with minimal computational overhead.