confidence-estimation

#confidence-estimation

Margin-Adaptive Confidence Ranking for Reliable LLM Judgement

arXiv cs.LG · 19h ago

This paper introduces a margin-based confidence ranking method for LLM-as-a-judge systems. It trains a dedicated estimator so that confidence is monotonically related to the risk of disagreeing with human judgment, and it reports generalization guarantees and improved ranking accuracy across datasets.
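
To make the monotonicity idea concrete, here is a minimal sketch of a pairwise hinge loss with a margin that grows with the risk gap. It is an illustration under stated assumptions, not the paper's objective: the function name `margin_ranking_loss`, the `base_margin` parameter, and the linear scaling of the margin are all hypothetical, since the summary does not specify them.

```python
def margin_ranking_loss(confidences, risks, base_margin=0.1):
    """Mean pairwise hinge loss (hypothetical helper, not the paper's loss).

    For every pair where one item has lower human-disagreement risk than
    the other, the lower-risk item should receive higher confidence by at
    least a margin. The margin scales with the risk gap (an assumption).
    """
    total, pairs = 0.0, 0
    for c_safe, r_safe in zip(confidences, risks):
        for c_risky, r_risky in zip(confidences, risks):
            if r_safe < r_risky:
                margin = base_margin * (r_risky - r_safe)
                # Penalize only when the confidence gap falls short of the margin.
                total += max(0.0, margin - (c_safe - c_risky))
                pairs += 1
    return total / pairs if pairs else 0.0


# Confidence falls as risk rises: no ordering is violated.
well_ordered = margin_ranking_loss([0.9, 0.5, 0.1], [0.0, 0.5, 1.0])
# Confidence rises with risk: every pair is violated.
mis_ordered = margin_ranking_loss([0.1, 0.5, 0.9], [0.0, 0.5, 1.0])
print(well_ordered, mis_ordered)
```

A loss of zero means every pair is ordered consistently with risk and separated by at least its margin; an estimator trained against such a loss is pushed toward a confidence ranking that agrees with the risk ranking.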


#confidence-estimation

LLMs Know When They Know, but Do Not Act on It: A Metacognitive Harness for Test-time Scaling

arXiv cs.LG · 3d ago

This paper proposes a metacognitive harness that separates monitoring from reasoning in LLMs, using pre-solve feeling-of-knowing and post-solve judgment-of-learning signals to control when to trust, retry, or aggregate answers, improving accuracy on text, code, and multimodal benchmarks without parameter updates.
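
The trust, retry, or aggregate control described above might be sketched as follows. This is only an illustration under assumptions: the function `metacognitive_solve`, the single `trust_threshold`, the retry budget, and majority voting as the aggregation rule are all hypothetical, and the three callables stand in for model calls that the summary does not detail.

```python
from collections import Counter


def metacognitive_solve(question, solve, feeling_of_knowing,
                        judgment_of_learning, trust_threshold=0.8,
                        max_attempts=3):
    """Hypothetical control loop that keeps monitoring apart from reasoning.

    feeling_of_knowing(question) -> score in [0, 1], read before solving.
    judgment_of_learning(question, answer) -> score in [0, 1], read after.
    """
    fok = feeling_of_knowing(question)  # pre-solve monitoring signal
    answers = []
    for _ in range(max_attempts):
        answer = solve(question)
        answers.append(answer)
        jol = judgment_of_learning(question, answer)  # post-solve signal
        # Trust: both signals are high, so stop early (assumed rule).
        if fok >= trust_threshold and jol >= trust_threshold:
            return answer, "trust"
        # Otherwise retry until the attempt budget is spent.
    # Aggregate: majority vote over all attempts (assumed rule).
    return Counter(answers).most_common(1)[0][0], "aggregate"


# Stub components for demonstration only.
attempts = iter(["5", "4", "4"])
result = metacognitive_solve(
    "2 + 2",
    solve=lambda q: next(attempts),
    feeling_of_knowing=lambda q: 0.9,
    judgment_of_learning=lambda q, a: 0.3,
)
print(result)
```

Because the loop only calls the model and reads its signals, a harness of this shape needs no parameter updates, which matches the summary's claim.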

