reasoning-reliability

Confidence-Aware Alignment Makes Reasoning LLMs More Reliable

arXiv cs.AI · 2d ago

This paper introduces CASPO, a framework for aligning token-level confidence with step-wise logical correctness in large reasoning models using iterative Direct Preference Optimization. It also proposes Confidence-aware Thought (CaT) for dynamically pruning uncertain reasoning branches during inference to improve reliability and efficiency.
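
To make the idea of confidence-based pruning concrete, here is a minimal sketch. It is not the paper's CaT procedure: the summary above does not give the confidence measure, the threshold, or the branch structure, so everything here is an assumption. The names `step_confidence` and `prune_branches` are hypothetical, confidence is taken to be the geometric mean of token probabilities within a branch, and the 0.5 threshold is arbitrary.

```python
import math
from typing import Dict, List


def step_confidence(token_logprobs: List[float]) -> float:
    """Assumed confidence measure: geometric mean of token probabilities.

    Equivalent to exp(mean(log p_i)). Returns 0.0 for an empty branch.
    """
    if not token_logprobs:
        return 0.0
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def prune_branches(
    branches: Dict[str, List[float]], threshold: float = 0.5
) -> Dict[str, List[float]]:
    """Keep only branches whose assumed confidence meets the threshold."""
    return {
        name: logprobs
        for name, logprobs in branches.items()
        if step_confidence(logprobs) >= threshold
    }


# Hypothetical per-token log-probabilities for two candidate reasoning branches.
branches = {
    "A": [math.log(0.9), math.log(0.8)],  # fairly confident tokens
    "B": [math.log(0.3), math.log(0.2)],  # uncertain tokens
}
kept = prune_branches(branches, threshold=0.5)
print(sorted(kept))
```

In this toy setup branch A survives and branch B is dropped. A real system would score branches during decoding and would need a calibrated confidence signal, which is what the alignment stage described above is meant to provide.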
