Tag
Introduces a conformal prediction method for ordinal classification using the ranked probability score as a nonconformity function, producing median-centered contiguous prediction sets and achieving a favorable balance between set width and ordinal miscoverage.
This paper formulates orchestration of coding agents as cost-sensitive sequential hypothesis testing using a Bayesian controller that dynamically decides when to gather evidence, refine, verify, or stop. Experiments across six generators and nine benchmarks show Bayesian control is most valuable when verification is costly and critics are informative but imperfect.
The paper introduces CALIBER, a method for calibrating confidence in reasoning language models by eliciting confidence estimates both before and after reasoning, with supervision targets matched to the information state. It achieves significant reductions in Expected Calibration Error (up to 52.5%) and strong Brier scores and AUROC across multiple benchmarks.
This paper proposes a human-on-the-loop orchestration framework for AI-assisted legal discovery, introducing a taxonomy of agentic failures and a four-layer verification architecture to reduce privilege-waiver risk.
Proposes REEF-GP, a post-hoc uncertainty quantification framework that fits a Gaussian process to the residuals of a frozen neural operator using its internal embeddings, enabling geometry-aware and calibrated uncertainties at low cost.
A deep learning framework for probabilistic CO2 column retrieval from OCO-2 spectra using Laplace approximations and normalizing flows, achieving faster inference and better uncertainty quantification than traditional methods.
This paper presents a generalized adaptive sequential sampling method for constructing polynomial chaos expansion surrogate models to improve uncertainty quantification in multi-output engineering structures, balancing variance contribution and spatial exploration.
This paper diagnoses and repairs shape-prior shortcuts in learning-based long-range single-shot fringe projection profilometry, using mechanistic interpretability and conformal uncertainty quantification. The proposed PhiCalNet architecture achieves a 3.3x reduction in object MAE by replacing depth regression with wrapped-phase output and a differentiable calibration layer.
This paper proposes a framework for strategic decision support for AI agents, formulating an optimization problem to minimize support usage while controlling missed-support error. The authors develop an online algorithm and calibration method, demonstrating effectiveness across information gathering, human-AI collaboration, and tool use scenarios.
This paper proposes structure-preserving neural surrogates for partial differential equations that integrate Gaussian process regression to provide tractable uncertainty quantification, enabling real-time simulation with closed-form error estimates.
Proposes the first application of split conformal prediction to neural operator-based physics simulation, providing distribution-free prediction intervals with finite-sample coverage guarantees and adaptive-width intervals using MC Dropout uncertainty.
Co-GLANCE is a real-time onboard perception and decision-making system for heterogeneous robot teams that distills vision-language model capabilities into efficient models and uses conformal prediction with selective abstention to quantify and resolve perceptual uncertainty, outperforming cloud-based VLM baselines by 25-36% while achieving 350x lower latency.
This paper proposes Global-Local Uncertainty (GLU), an unsupervised single-pass score that fuses token-level local entropy with hidden-state geometric global entropy for uncertainty quantification in LLMs, showing that the two are near-orthogonal and together capture confident-but-wrong failures.
This paper proposes an uncertainty-aware multi-fidelity framework based on conditional normalizing flows to improve the predictive accuracy of reduced-order models (ROMs) for complex multiscale systems. The method learns a probabilistic mapping from low-fidelity to high-fidelity coefficients and is demonstrated on a vortex merging problem, showing improved accuracy with uncertainty quantification.
This paper introduces GNOVA, a GRU-Neural ODE Variational Autoencoder framework for reconstructing and forecasting Alzheimer's disease cognitive trajectories from routine clinical data without expensive neuroimaging or biomarkers, achieving low error and providing uncertainty estimates on the ADNI dataset.
A comprehensive survey on uncertainty quantification in symbolic regression, reviewing frequentist, Bayesian, and model selection approaches to address the lack of reliability support in real-world decision processes.
This paper proposes TRUST, a method that integrates uncertainty quantification into reinforcement learning reward design to improve LLM agents' tool-calling decisions, enhancing decision quality and maintaining reliable uncertainty estimates.
This paper introduces A4D, a framework that maps visual observations into a shared latent space structured around affordances (e.g., 'movable') for robot planning. It achieves 94% inference accuracy on existing affordances, outperforming the state of the art by 15%, and enables 100x faster inference with superior generalization to unseen object functionalities.
This paper proposes an extended family of structured spatial priors combining total variation (TV) with ℓ_p norms for Bayesian T1 mapping, enabling uncertainty quantification. The method is evaluated on synthetic and real MRI datasets, showing improved spatial coherence and reduced uncertainty.
ADAPTOOD is a novel framework that uses data uncertainty to quantify distribution shift severity and guide fine-tuning of ECG time series models for out-of-distribution settings. It combines uncertainty estimation with low-rank model updates and adaptive hyperparameter optimization, achieving up to 7% higher accuracy and 12.9% higher precision than existing OOD adaptation methods.