hypothesis-lock-in

Calibration Drift Under Reasoning: How Chain-of-Thought Budgets Induce Overconfidence in Large Language Models

arXiv cs.CL · 5d ago

This paper identifies Calibration Drift Under Reasoning (CDUR), in which larger chain-of-thought reasoning budgets make LLMs systematically overconfident in incorrect answers. It proposes a Hypothesis Lock-In model and a calibration-aware stopping rule (CABStop) to mitigate the issue.
