When Should Models Change Their Minds? Contextual Belief Management in Large Language Models
Summary
This paper frames long-horizon information handling in LLMs as Contextual Belief Management (CBM), introduces the BeliefTrack benchmark to measure it, and reports that reinforcement learning with belief-state rewards reduces failure rates by 70.9% on average, while representation-level steering reduces them by 46.1% across two tasks.
Source: https://huggingface.co/papers/2605.30219
Abstract
Language models struggle to manage long-term information. Contextual belief management, which involves updating, preserving, and filtering information, can be improved with reinforcement learning and representation-level steering.
Long-horizon interactions require language models to manage accumulating information: when to update their state, when to preserve their state, and what to ignore. We study this challenge as Contextual Belief Management (CBM): maintaining a predicted belief state aligned with formal evidence while isolating task-irrelevant noise. To make CBM measurable, we introduce BeliefTrack, a closed-world benchmark spanning Rule Discovery and Circuit Diagnosis, where a finite belief space and symbolic verifiers enable exact turn-level evaluation. BeliefTrack diagnoses three failures: Failed Stay, Failed Update, and Failed Isolation. Across multiple LLMs, vanilla models exhibit severe CBM failures, while explicit belief-tracking prompts provide limited gains. In contrast, reinforcement learning with belief-state rewards reduces failure rates by 70.9% on average. Further probing reveals latent belief-state dynamics behind these failures, and representation-level steering reduces failure rates by 46.1% across two tasks. Code is coming soon at https://github.com/zjunlp/CBM.
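The three failure types named in the abstract can be illustrated with a small turn-level classifier. This is only a sketch under stated assumptions, not the BeliefTrack implementation: the abstract does not define the failures formally, so the definitions below are a plausible reading of their names. The `Turn` record, its fields, the `is_noise` flag, and the functions `classify_turn` and `failure_rates` are all hypothetical. A belief state is modeled as a `frozenset` of candidate hypotheses, standing in for the finite belief space.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Hypothetical labels; the names follow the abstract, the definitions are assumed.
FAILURES = ("failed_stay", "failed_update", "failed_isolation")


@dataclass(frozen=True)
class Turn:
    """One interaction turn (hypothetical record, not the benchmark's schema)."""
    is_noise: bool          # assumed flag: the turn carries only task-irrelevant content
    gold_before: frozenset  # verifier's belief state before the turn
    gold_after: frozenset   # verifier's belief state after the turn
    pred_before: frozenset  # model's predicted belief state before the turn
    pred_after: frozenset   # model's predicted belief state after the turn


def classify_turn(t: Turn) -> Optional[str]:
    """Return an assumed failure label for the turn, or None if no failure."""
    changed = t.pred_after != t.pred_before
    if t.is_noise:
        # Assumed: noise should leave the predicted belief untouched.
        return "failed_isolation" if changed else None
    if t.gold_after == t.gold_before:
        # Assumed: relevant evidence that does not move the gold belief
        # should not move the prediction either.
        return "failed_stay" if changed else None
    # Assumed: when the gold belief moves, the prediction must match it.
    return "failed_update" if t.pred_after != t.gold_after else None


def failure_rates(turns: Iterable[Turn]) -> dict:
    """Fraction of all turns that fall into each assumed failure type."""
    turns = list(turns)
    counts = {name: 0 for name in FAILURES}
    for t in turns:
        label = classify_turn(t)
        if label is not None:
            counts[label] += 1
    total = len(turns)
    return {name: (c / total if total else 0.0) for name, c in counts.items()}
```

Comparing whole belief sets for exact equality mirrors the abstract's point that a closed world with a finite belief space allows exact turn-level evaluation; a softer overlap metric would be a different design choice.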
Similar Articles
@HuggingPapers: When should LLMs update, preserve, or ignore information? Contextual Belief Management is what long-horizon reasoning w…
Introduces BeliefTrack, a benchmark for contextual belief management in LLMs; reinforcement learning with belief-state rewards reduces failure rates by over 70% on average.
OmniToM: Benchmarking Theory of Mind in LLMs via Explicit Belief Modeling
OmniToM introduces a benchmark that evaluates large language models' theory of mind by requiring explicit belief structure extraction and labeling, revealing a bottleneck in tracking actor-specific beliefs despite strong performance on endpoint QA tasks.
When Correct Beliefs Collapse: Epistemic Resilience of LLMs under Clinical Pressure
This paper investigates how large language models maintain correct beliefs under adversarial pressure in clinical settings, proposing R-FT fine-tuning to improve epistemic resilience while balancing corrigibility, and demonstrating significant robustness gains on medical benchmarks.
Can LLMs Take Retrieved Information with a Grain of Salt?
This paper investigates how large language models adapt to the certainty of retrieved information, identifying systematic limitations in handling uncertainty. It proposes an interaction strategy that reduces obedience errors by 25% without modifying model weights.
Belief Engine: Configurable and Inspectable Stance Dynamics in Multi-Agent LLM Deliberation
The paper introduces the Belief Engine, an auditable belief-update layer for LLM agents that makes stance changes in multi-agent deliberation configurable and inspectable by treating belief as an evidential state with explicit update rules.