BaLoRA introduces a Bayesian extension to Low-Rank Adaptation (LoRA) that provides calibrated uncertainty estimates and improves prediction accuracy, narrowing the gap with full fine-tuning.
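The paper's exact posterior parameterization is not reproduced below; as a rough illustration of the Bayesian-LoRA idea, this sketch places a mean-field Gaussian posterior over one LoRA factor and draws Monte Carlo samples at inference for predictive uncertainty (class and variable names such as `BayesianLoRALinear` are hypothetical).

```python
import torch
import torch.nn as nn

class BayesianLoRALinear(nn.Module):
    """Frozen base linear layer plus a low-rank update x @ A @ B,
    where B carries a mean-field Gaussian posterior (illustrative sketch)."""

    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False
        d_in, d_out = base.in_features, base.out_features
        self.A = nn.Parameter(torch.randn(d_in, rank) * 0.01)          # deterministic factor
        self.B_mu = nn.Parameter(torch.zeros(rank, d_out))              # posterior mean
        self.B_logvar = nn.Parameter(torch.full((rank, d_out), -6.0))   # posterior log-variance

    def forward(self, x, sample: bool = True):
        if sample:
            B = self.B_mu + torch.randn_like(self.B_mu) * torch.exp(0.5 * self.B_logvar)
        else:
            B = self.B_mu
        return self.base(x) + (x @ self.A) @ B

# Monte Carlo predictive uncertainty: average predictions over sampled adapters.
layer = BayesianLoRALinear(nn.Linear(16, 4), rank=4)
x = torch.randn(2, 16)
preds = torch.stack([layer(x, sample=True) for _ in range(32)])
mean, std = preds.mean(0), preds.std(0)   # predictive mean and epistemic spread
```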
This paper presents a framework for error attribution in multi-agent systems using conformal prediction, providing statistical guarantees for identifying decisive errors in agent trajectories. The approach enables automated recovery and debugging by isolating errors within contiguous prediction sets.
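The paper's nonconformity score is not reproduced here; the sketch below shows a generic split-conformal construction of a step-level prediction set for the decisive error, assuming each trajectory step already carries a model-assigned error probability (all names and data are illustrative, and the contiguity of the returned set is imposed by simply taking the spanning interval).

```python
import numpy as np

def conformal_threshold(cal_step_probs, cal_error_steps, alpha=0.1):
    """Calibrate on trajectories with annotated decisive-error steps.
    Nonconformity = 1 - probability the model assigned to the true error step."""
    scores = np.array([1.0 - p[i] for p, i in zip(cal_step_probs, cal_error_steps)])
    n = len(scores)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return np.quantile(scores, level, method="higher")

def prediction_set(step_probs, q):
    """Steps whose nonconformity falls below the calibrated threshold,
    collapsed to a contiguous span."""
    keep = np.where(1.0 - step_probs <= q)[0]
    return (int(keep.min()), int(keep.max())) if keep.size else None

rng = np.random.default_rng(0)
cal_probs = [rng.dirichlet(np.ones(10)) for _ in range(200)]   # per-step error probabilities
cal_errors = [int(p.argmax()) for p in cal_probs]              # annotated decisive-error step
q = conformal_threshold(cal_probs, cal_errors, alpha=0.1)
print(prediction_set(rng.dirichlet(np.ones(10)), q))
```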
This paper proposes Online Localized Conformal Prediction (OLCP) to address covariate heterogeneity in online learning and time-series settings. It introduces OLCP-Hedge for bandwidth selection and demonstrates valid long-run coverage with narrower prediction sets compared to existing baselines.
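As a minimal sketch of the localized idea, assuming absolute residuals as conformity scores, the calibration quantile below is computed from past residuals weighted by a Gaussian kernel in covariate space, so interval widths adapt to local heterogeneity; the OLCP-Hedge bandwidth-selection step is omitted and all names are illustrative.

```python
import numpy as np

def weighted_quantile(scores, weights, level):
    """Smallest score s such that the normalized weight of {scores <= s} reaches level."""
    order = np.argsort(scores)
    cum = np.cumsum(weights[order]) / weights.sum()
    idx = min(np.searchsorted(cum, level), len(scores) - 1)
    return scores[order][idx]

def localized_interval(x_new, past_x, past_residuals, h=0.5, alpha=0.1):
    """Prediction-interval half-width at x_new from kernel-weighted past residuals."""
    w = np.exp(-0.5 * ((past_x - x_new) / h) ** 2)   # Gaussian kernel in covariate space
    return weighted_quantile(past_residuals, w, 1 - alpha)

rng = np.random.default_rng(1)
past_x = rng.uniform(0, 10, 500)
past_residuals = np.abs(rng.normal(0, 0.2 + 0.1 * past_x))   # heteroscedastic noise
for x in (1.0, 9.0):
    print(x, localized_interval(x, past_x, past_residuals))   # wider where noise is larger
```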
This paper proposes Distribution-Aligned Adversarial Distillation (DisAAD), which trains a lightweight proxy model, roughly 1% of the original model's size, to estimate uncertainty in black-box LLMs, achieving reliable quantification without access to internal parameters or repeated sampling.
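The adversarial distillation stage itself is not shown here; assuming a proxy already distilled from the black-box model, the sketch scores a black-box answer by its mean negative log-likelihood under the proxy, with `distilgpt2` standing in for the small proxy purely for illustration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any small causal LM stands in for the distilled proxy in this sketch.
tok = AutoTokenizer.from_pretrained("distilgpt2")
proxy = AutoModelForCausalLM.from_pretrained("distilgpt2").eval()

def proxy_uncertainty(prompt: str, answer: str) -> float:
    """Mean negative log-likelihood of the black-box answer under the proxy.
    Higher values mean the proxy finds the answer surprising -> higher uncertainty."""
    full = tok(prompt + answer, return_tensors="pt")
    prompt_len = tok(prompt, return_tensors="pt").input_ids.shape[1]
    with torch.no_grad():
        logits = proxy(**full).logits
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = full.input_ids[0, 1:]
    answer_nll = -logprobs[range(len(targets)), targets][prompt_len - 1:]
    return answer_nll.mean().item()

print(proxy_uncertainty("Q: What is the capital of France? A:", " Paris"))
```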
MIT CSAIL researchers introduce RLCR, a method using Brier scores in reinforcement learning to train AI models to output calibrated confidence estimates, significantly reducing overconfidence without sacrificing accuracy.
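One natural form of such a reward combines the binary correctness signal with a Brier penalty on the model's verbalized confidence; the sketch below illustrates that shape without asserting it matches the paper's exact formula.

```python
def rlcr_style_reward(correct: bool, confidence: float) -> float:
    """Correctness reward minus a Brier penalty (confidence - correctness)^2,
    so the reward-maximizing policy reports its true probability of being correct."""
    y = 1.0 if correct else 0.0
    return y - (confidence - y) ** 2

# A wrong answer asserted with 95% confidence is penalized hard;
# the same wrong answer hedged at 10% confidence loses far less reward.
print(rlcr_style_reward(False, 0.95), rlcr_style_reward(False, 0.10))
print(rlcr_style_reward(True, 0.95), rlcr_style_reward(True, 0.50))
```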
Researchers introduce SHADE, a hybrid estimator that combines Good-Turing coverage with graph-spectral cues to quantify semantic uncertainty and detect LLM hallucinations when only a few black-box samples are available.
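How SHADE combines its two signals is not reproduced; the sketch shows the ingredients separately: a Good-Turing estimate of the probability mass on unseen answers from a handful of samples, and a semantic-cluster count read off the near-zero eigenvalues of a similarity-graph Laplacian (toy data, illustrative names).

```python
import numpy as np
from collections import Counter

def good_turing_missing_mass(answers):
    """Estimated mass on answers never sampled: (# answers seen exactly once) / n."""
    counts = Counter(answers)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(answers)

def spectral_cluster_count(similarity, tol=1e-6):
    """Number of semantic clusters = number of near-zero eigenvalues
    of the Laplacian of the (symmetrized) pairwise-similarity graph."""
    W = (similarity + similarity.T) / 2
    L = np.diag(W.sum(axis=1)) - W
    return int(np.sum(np.linalg.eigvalsh(L) < tol))

answers = ["Paris", "Paris", "Paris", "Lyon", "Marseille"]    # 5 black-box samples
sim = np.array([[1, 1, 1, 0, 0],
                [1, 1, 1, 0, 0],
                [1, 1, 1, 0, 0],
                [0, 0, 0, 1, 0],
                [0, 0, 0, 0, 1]], dtype=float)                # entailment-style similarity
print(good_turing_missing_mass(answers))   # 0.4: two answers were seen only once
print(spectral_cluster_count(sim))         # 3 disconnected semantic clusters
```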
This paper introduces FRANQ, a method for detecting hallucinations in Retrieval-Augmented Generation (RAG) systems that applies different uncertainty quantification techniques depending on whether an output is faithful to the retrieved context, thereby separating faithfulness from factuality. The authors construct a new dataset annotated for both properties and demonstrate that FRANQ outperforms existing approaches in detecting factual errors across multiple datasets and LLMs.
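FRANQ's specific estimators are not reproduced here; the sketch only illustrates the routing idea, with a hypothetical entailment probability deciding whether contextual support or a hypothetical self-consistency estimate is used as the factuality score.

```python
def franq_style_score(entail_prob, consistency_score, tau=0.5):
    """Route between two uncertainty estimators based on faithfulness.

    entail_prob       : hypothetical NLI probability that the retrieved context entails the claim
    consistency_score : hypothetical self-consistency estimate of parametric factuality
    Faithful claims are scored by the strength of contextual support;
    unfaithful claims fall back to the model's own factuality estimate.
    """
    if entail_prob >= tau:        # claim is faithful to the retrieved context
        return entail_prob
    return consistency_score      # otherwise rely on parametric knowledge

# Supported claim scored by entailment; unsupported claim scored by self-consistency.
print(franq_style_score(entail_prob=0.97, consistency_score=0.60))
print(franq_style_score(entail_prob=0.03, consistency_score=0.20))
```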
This paper proposes a conformal prediction framework for LLMs that leverages internal representations rather than output-level statistics, introducing Layer-Wise Information (LI) scores as nonconformity measures to improve validity-efficiency trade-offs under distribution shift. The method demonstrates stronger robustness to calibration-deployment mismatch compared to text-level baselines across QA benchmarks.
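The Layer-Wise Information score itself is not reproduced; the sketch shows the generic pattern of a representation-level nonconformity score inside split conformal prediction, with a simple logistic probe over hidden-state features standing in for the LI score (synthetic data, illustrative names).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Stand-ins: hidden-state features for calibration/test questions and answer labels.
X_cal, y_cal = rng.normal(size=(300, 32)), rng.integers(0, 4, 300)
X_test = rng.normal(size=(5, 32))

probe = LogisticRegression(max_iter=1000).fit(X_cal, y_cal)   # representation-level probe

# Nonconformity = 1 - probe probability of the true answer on the calibration split.
cal_probs = probe.predict_proba(X_cal)
scores = 1.0 - cal_probs[np.arange(len(y_cal)), y_cal]
alpha = 0.1
n = len(scores)
qhat = np.quantile(scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n), method="higher")

# Prediction set: every candidate answer whose score stays under the threshold.
test_probs = probe.predict_proba(X_test)
prediction_sets = [np.where(1.0 - p <= qhat)[0] for p in test_probs]
print(prediction_sets)
```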
MIT researchers developed a new method for identifying overconfident LLMs by measuring cross-model disagreement across similar models, rather than relying solely on self-consistency metrics. This approach better captures epistemic uncertainty and more accurately identifies unreliable predictions in high-stakes applications.
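The exact disagreement metric is not reproduced; the sketch contrasts a self-consistency score with a cross-model disagreement score over peer models' majority answers, using made-up samples to show how a perfectly self-consistent model can still be flagged as overconfident.

```python
from collections import Counter

def self_consistency(samples):
    """Fraction of one model's samples that agree with its own majority answer."""
    counts = Counter(samples)
    return counts.most_common(1)[0][1] / len(samples)

def cross_model_disagreement(focal_samples, peer_samples):
    """Fraction of peer models whose majority answer differs from the focal model's."""
    focal = Counter(focal_samples).most_common(1)[0][0]
    peer_majorities = [Counter(s).most_common(1)[0][0] for s in peer_samples]
    return sum(m != focal for m in peer_majorities) / len(peer_majorities)

model_a = ["42", "42", "42", "42"]                     # perfectly self-consistent
peers = [["41", "41", "42"], ["43", "41", "41"], ["41"]]
print(self_consistency(model_a))                        # 1.0: looks reliable in isolation
print(cross_model_disagreement(model_a, peers))         # 1.0: every peer disagrees
```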
OpenAI researchers demonstrate that GPT-3 can learn to express calibrated uncertainty about its answers in natural language without using model logits, introducing the CalibratedMath benchmark suite to evaluate this capability. The approach shows robust generalization under distribution shift and represents the first evidence of models expressing well-calibrated verbal uncertainty about their own predictions.
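The paper's calibration metrics are not reproduced exactly; the sketch below evaluates verbalized confidences against graded correctness with a Brier score and a simple binned calibration error, on made-up data.

```python
import numpy as np

def brier_and_ece(confidences, correct, n_bins=10):
    """Brier score and a simple expected calibration error for verbalized confidences."""
    conf = np.asarray(confidences, dtype=float)
    corr = np.asarray(correct, dtype=float)
    brier = np.mean((conf - corr) ** 2)
    bins = np.clip((conf * n_bins).astype(int), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():   # gap between stated confidence and observed accuracy, weighted by bin size
            ece += mask.mean() * abs(conf[mask].mean() - corr[mask].mean())
    return brier, ece

# Stated confidences ("I'm 90% sure...") paired with correctness on held-out math items.
conf = [0.9, 0.8, 0.6, 0.3, 0.95, 0.5]
corr = [1,   1,   0,   0,   1,    1]
print(brier_and_ece(conf, corr))
```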