uncertainty

#uncertainty

Built an agent that says "I haven't seen this before" instead of guessing — here's what changed once it had memory

Reddit r/AI_Agents · 3h ago

A post about building an AI agent that recognizes and states its uncertainty instead of guessing, and about how adding memory changed its decision-making.

Sam Altman unsure about gpt 5.6 release outside of US

Reddit r/singularity · 7h ago

Sam Altman expresses uncertainty about releasing GPT-5.6 outside the US, raising questions about geographic availability.

Uncertainty-aware reinforcement learning for chemical language models

arXiv cs.LG · 2d ago

Proposes two complementary approaches to incorporate predictive uncertainty into reinforcement learning for chemical language models, improving robustness and increasing true hit rate by 0.25 in de novo molecular design.

Why do AI systems still struggle to interpret uncertainty in human conversation?

Reddit r/artificial · 2026-06-19

A post on why AI systems have difficulty interpreting uncertainty and ambiguity in human conversation, highlighting ongoing challenges in natural language understanding.

Optimizing Lithium Production Decisions under Geological, Demand, and Pricing Uncertainties: A POMDP Framework for Multi-Objective Decision Making

arXiv cs.AI · 2026-06-18

This paper proposes a POMDP framework for multi-objective decision making in lithium production, addressing geological, demand, and pricing uncertainties to optimize mine opening and extraction method selection. The approach outperforms human-inspired heuristics by dynamically adapting to shifting price regimes through belief state planning.

Decision-Driven Geosteering Under Uncertainty: A Unified Framework for Sequential Decision Optimization

arXiv cs.LG · 2026-06-17

Presents an uncertainty-aware geosteering framework integrating particle filtering for probabilistic subsurface interpretation with reinforcement learning for sequential decision-making, evaluated on an industrial simulator.

Quantifying Consistency in LLM Logical Reasoning via Structural Uncertainty

arXiv cs.AI · 2026-06-17

This paper introduces structural uncertainty, a framework that evaluates LLM reasoning consistency by measuring the stability of self-preference rankings among sampled reasoning solutions, complementing traditional answer-dispersion methods for identifying unreliable reasoning.

Minimal Oversight: Uncertainty-Aware Governance for Delegated AI Systems

arXiv cs.AI · 2026-06-16

The paper proposes the Minimum Sufficient Oversight Principle (MSO) for governing delegated AI systems, deriving mathematical solutions for autonomy allocation and trust calibration, and introduces concepts like water-filling allocation and masking pathology.

Have we trusted the agent recommendations too early?

Reddit r/AI_Agents · 2026-06-11

An opinion piece questioning whether we rely too heavily on confident agent recommendations (human or AI) when underlying data is often messy and incomplete, suggesting that agents should express uncertainty.

WorldKernel: A World Model is the Coupling Kernel of Admissible Possible Worlds

arXiv cs.AI · 2026-06-10

The paper identifies a failure mode where predictors collapse to a point on unidentified counterfactual couplings and proposes a framework using a positive semidefinite coupling kernel to bound counterfactuals, showing that prediction cannot represent uncertainty over cross-world couplings and that enforcing kernel constraints yields tractable bounds.

Calibrating Overconfidence Without Sacrificing Confidence: Probe-Conditioned Head Intervention for LLMs

arXiv cs.LG · 2026-06-10

The paper introduces Probe-Conditioned Head Intervention (PCHI), an inference-time method for LLMs that selectively reduces overconfidence on wrong answers without significantly reducing confidence on correct ones, by conditionally rescaling attention head outputs when the model is likely wrong but confident.

Using Probabilistic Programs to Train Inductive Reasoning in Large Language Models

arXiv cs.CL · 2026-06-10

This paper introduces Program-based Posterior Training (PPT), a method that uses LLM-generated probabilistic programs to create distributional targets for fine-tuning inductive reasoning, improving estimation accuracy and calibration on held-out tasks and human-alignment benchmarks.

An Efficient Method for the Optimal Control of Microgrids Under Uncertainties using Local Reduction

Hugging Face Daily Papers · 2026-06-10

Proposes and compares two mathematical formulations for robust microgrid sizing and power scheduling under uncertainties, using a local reduction algorithm that achieves high feasibility rates in Monte Carlo simulations.

How Language Models Fail: Token-Level Signatures of Committed and Persistent Reasoning Failures

arXiv cs.CL · 2026-06-08

This paper characterizes two distinct processes by which language models fail in reasoning—committed failure and persistent uncertainty—using token-level uncertainty signals, and demonstrates implications for self-consistency and failure detection strategies.

Performance Variation in Deep Reinforcement Learning

arXiv cs.LG · 2026-06-08

This paper identifies limitations of conventional uncertainty estimates for deep reinforcement learning and proposes percentile-based statistics and visualization to better assess run-to-run performance variation. Case studies demonstrate the method on PPO, SAC, TD-MPC, DQN, and Rainbow algorithms.

Faithful uncertainty in LLM agents: calibration vs utility tradeoff in practice[D]

Reddit r/MachineLearning · 2026-06-04

A practitioner discusses the calibration vs. utility tradeoff in LLM agents, sharing experience with a verifier-based pipeline that reduces hallucinated tool calls by ~60% but introduces latency costs and drops easy correct answers.

Uncertainty-Aware Clarification in LLM Agents with Information Gain

arXiv cs.AI · 2026-06-03

Proposes a goal-oriented clarification framework using Information Gain Reward to train LLM agents to ask effective clarification questions under underspecified user instructions, improving task success rate by 3.7% with minimal interaction overhead.

On the evolution of the concept of probability as a mirror of the evolution of reason

arXiv cs.AI · 2026-06-02

This paper argues that probability theory is a historically evolving form of rationality, tracing its development from combinatorial games to Bayesian inference and contrasting it with fuzzy logic and deep learning.

Emergence of Exploration in Policy Gradient Reinforcement Learning via Retrying

arXiv cs.LG · 2026-06-02

This paper introduces ReMax, a new objective for reinforcement learning that induces exploration as an emergent property by evaluating policies based on expected maximum return over multiple samples, without explicit exploration bonuses. The authors derive a policy gradient formulation and propose RePPO, a PPO variant that achieves efficient exploration on MinAtar and Craftax benchmarks.

Where is this AI going ?

Reddit r/ArtificialInteligence · 2026-05-30

The author reflects on mixed signals in the AI industry, noting high spending without proportional productivity gains and Anthropic's move to restrict Claude Code access while raising massive funding, questioning the direction of AI's revolutionary claims.
