Preregistered Belief Revision Contracts
Summary
This paper introduces Preregistered Belief Revision Contracts (PBRC), a protocol-level mechanism for multi-agent systems (including LLM-based agents) that separates open communication from admissible belief changes by publicly fixing evidence triggers and revision operators. The work addresses dangerous conformity effects in agent deliberation and provides formal guarantees that social-only pressure cannot drive false consensus.
Cached at: 04/20/26, 08:29 AM
# Preregistered Belief Revision Contracts

Source: https://arxiv.org/html/2604.15558

###### Abstract

Deliberative multi-agent systems, including recent LLM-based agent societies, allow agents to exchange messages and revise beliefs over time. While this interaction is meant to improve performance, it can also create dangerous conformity effects: agreement, confidence, prestige, or majority size may be treated as if they were evidence, producing high-confidence convergence to false conclusions. To address this, we introduce **PBRC** (*Preregistered Belief Revision Contracts*), a protocol-level mechanism that strictly separates open communication from admissible epistemic change. A PBRC contract publicly fixes first-order evidence triggers, admissible revision operators, a priority rule, and a fallback policy. Crucially, a non-fallback step is accepted only when it cites a preregistered trigger and provides a nonempty witness set of externally validated evidence tokens. This ensures that every substantive belief change is both enforceable by a router and auditable after the fact. In this paper, we first prove that under evidential contracts with conservative fallback, social-only rounds cannot increase confidence and cannot generate purely conformity-driven "wrong-but-sure" cascades. Second, we show that auditable trigger protocols admit evidential PBRC normal forms that preserve belief trajectories and canonicalized audit traces. Third, we demonstrate that sound enforcement yields epistemic accountability: any change of top hypothesis is attributable to a concrete validated witness set. Fourth, for token-invariant contracts, we prove that enforced trajectories depend only on token-exposure traces; under flooding dissemination, these traces are characterized exactly by truncated reachability, giving tight diameter bounds for universal evidence closure.
Finally, we introduce a companion contractual dynamic doxastic logic to specify trace invariants, and provide simulations illustrating cascade suppression, auditability, and robustness–liveness trade-offs.

**Keywords:** belief revision · dynamic doxastic logic · protocol semantics · auditability · graph reachability · multi-agent deliberation · large language models.

## 1 Introduction

Belief revision in multi-agent settings becomes fragile when agents are allowed to treat one another's assertions as epistemic reasons. This issue has reappeared with unusual force in recent LLM-based multi-agent systems, where agents deliberate by exchanging messages, critiques, and self-reports of confidence. Empirical work reports conformity, peer-pressure effects, and topology-sensitive "wrong-but-sure" cascades: the population can become more confident precisely when it is moving toward the wrong answer [[Weng et al. (2025)](https://arxiv.org/html/2604.15558#bib.bibx49), [Han et al. (2026)](https://arxiv.org/html/2604.15558#bib.bibx17), [Song et al. (2025)](https://arxiv.org/html/2604.15558#bib.bibx42), [Ashery et al. (2024)](https://arxiv.org/html/2604.15558#bib.bibx5), [Ashery et al. (2025)](https://arxiv.org/html/2604.15558#bib.bibx6)]. These findings motivate a basic question for both logic and system design: how can a deliberation protocol preserve open communication while refusing purely social pressure as a warrant for belief change?

The central failure mode is not communication itself, but the lack of a public distinction between *persuasion* and *evidence*. Agreement, prestige, fluency, rapport, or majority size can become de facto triggers for revision even though none of them is an externally checkable reason for changing belief. Our starting point is therefore intentionally modest. We do not propose a new underlying revision operator, an aggregation rule, or a controller that chooses whose answer to trust.
Instead, we introduce a protocol layer that governs *when* revision is admissible. The layer should be explicit enough to audit, strong enough to enforce, and abstract enough to sit on top of different revision operators and different evidence channels.

This paper introduces **PBRC** (*Preregistered Belief Revision Contracts*). Before interaction, an agent publicly preregisters (i) first-order triggers over validated evidence tokens, (ii) the revision operators that may be used when those triggers fire, (iii) a priority rule, and (iv) a fallback policy. During deliberation, a non-fallback step is accepted only if it cites a preregistered trigger and supplies a *nonempty witness set* of externally validated tokens that makes the trigger checkable by a router or auditor. PBRC is thus not a voting rule or meta-judge; it is a contract semantics for admissible belief change. LLM-based agent societies provide the motivating application throughout, but the formal development is general and applies to any finite-hypothesis, token-mediated deliberation protocol.

### 1.1 Contributions

We make five contributions.

1. **Protocol semantics for evidence-gated revision** ([Section 4](https://arxiv.org/html/2604.15558#S4)). We formalize PBRC contracts as public tuples of first-order triggers, revision operators, priority, and fallback, together with witness-carrying certificates and explicit router semantics. This yields a clean separation between message exchange and admissible epistemic change.
2. **Social-only safety guarantees** ([Section 6](https://arxiv.org/html/2604.15558#S6)). Under evidential contracts with conservative fallback, social-only rounds cannot amplify confidence and cannot generate purely conformity-driven "wrong-but-sure" cascades ([Theorems 1](https://arxiv.org/html/2604.15558#ThmTheorem1) and [2](https://arxiv.org/html/2604.15558#ThmTheorem2)). The results isolate the exact structural role of evidence-gating and argmax-preserving fallback.
3. **Normal forms, enforcement, and accountability** ([Section 8](https://arxiv.org/html/2604.15558#S8)). We prove that auditable trigger protocols admit evidential PBRC normal forms preserving both belief trajectories and canonicalized audit traces ([Theorem 7](https://arxiv.org/html/2604.15558#ThmTheorem7)). We also show that sound enforcement projects arbitrary trigger protocols onto their explicit evidence-gated behavior. This yields gate transparency for compliant contracts and epistemic accountability: any top-hypothesis change under enforcement is attributable to a concrete nonempty witness set of validated tokens ([Theorem 17](https://arxiv.org/html/2604.15558#ThmTheorem17)).
4. **Token-trace factorization and tight topology results** ([Section 8](https://arxiv.org/html/2604.15558#S8)). For token-invariant contracts, enforced belief dynamics depend only on validated token exposure traces, not on rhetorical presentation ([Theorem 10](https://arxiv.org/html/2604.15558#ThmTheorem10)). Under flooding dissemination, these traces are characterized exactly by truncated reachability. We prove both necessity and sufficiency of reachability equivalence for trace equivalence, together with tight diameter bounds for universal evidence closure ([Theorems 13](https://arxiv.org/html/2604.15558#ThmTheorem13) and [14](https://arxiv.org/html/2604.15558#ThmTheorem14)).
5. **Robustness analysis and a specification logic** ([Sections 9](https://arxiv.org/html/2604.15558#S9) and [10](https://arxiv.org/html/2604.15558#S10)). We formalize forgery, replay, collusion, and omission adversaries; derive freshness and multi-attestation robustness conditions; and prove a completeness-style failure taxonomy that localizes first wrong-top transitions to a small set of auditable failure modes ([Theorem 22](https://arxiv.org/html/2604.15558#ThmTheorem22)). We also introduce **CDDL**, a contractual dynamic doxastic logic with full PDL iteration for specifying and verifying invariants over audited runs ([Theorem 23](https://arxiv.org/html/2604.15558#ThmTheorem23)), with complete soundness and completeness proofs in [Appendix A](https://arxiv.org/html/2604.15558#A1).

Simulations and benchmark protocols ([Section 13](https://arxiv.org/html/2604.15558#S13)) serve as empirical illustrations of the logical claims rather than as their foundation.

**Scope.** PBRC blocks *social-only* belief cascades under integrity assumptions on token validity and labeling. It does not by itself repair shared bad evidence or semantic mislabeling, prevent evidence-generation steering unless query policies are also contracted, or guarantee liveness under withholding and denial-of-service.

### 1.2 What PBRC is not

PBRC is not a stacking ensemble, voting rule, meta-judge, or controller that averages opinions. It never treats social consensus as a substitute for evidence. Its only task is to constrain each agent's admissible transition: belief changes without a preregistered, validated trigger are inadmissible. This separation matters conceptually and technically: the results below concern admissibility, enforceability, auditability, and information flow, while remaining agnostic about the choice of underlying revision operator.

The paper is organized as follows. [Section 3](https://arxiv.org/html/2604.15558#S3) states the operational model; [Section 4](https://arxiv.org/html/2604.15558#S4) defines PBRC contracts, certificates, and enforcement. [Sections 6](https://arxiv.org/html/2604.15558#S6)–[8](https://arxiv.org/html/2604.15558#S8) develop the logical and semantic core: social-only guarantees, minimality, normal forms, token-sufficiency, and topology-to-trace factorization. [Sections 9](https://arxiv.org/html/2604.15558#S9)–[12](https://arxiv.org/html/2604.15558#S12) treat adversaries, implementation guidance, and verification cost.
[Section 13](https://arxiv.org/html/2604.15558#S13) provides empirical illustrations of the protocol's qualitative behavior and overhead trade-offs.

## 2 Related Work

Empirical studies of LLM multi-agent systems (MAS) document systematic conformity and peer-pressure effects whose strength depends on the interaction protocol, peer reliability, rapport, and network structure. BenchForm provides a benchmarked characterization of conformity across interaction protocols [[Weng et al. (2025)](https://arxiv.org/html/2604.15558#bib.bibx49)]. KAIROS studies peer-pressure under heterogeneous peer reliability and rapport [[Song et al. (2025)](https://arxiv.org/html/2604.15558#bib.bibx42)]. Recent results also emphasize that topology and self–social weighting can modulate conformity and "wrong-but-sure" cascades [[Han et al. (2026)](https://arxiv.org/html/2604.15558#bib.bibx17)]. Complementary work adapts classic social-psychology paradigms and reports that AI agents exhibit conformity patterns aligned with Social Impact Theory [[Bellina et al. (2026)](https://arxiv.org/html/2604.15558#bib.bibx9)], connecting to foundational human-group evidence on conformity and social impact [[Asch (1951)](https://arxiv.org/html/2604.15558#bib.bibx4), [Latané (1981)](https://arxiv.org/html/2604.15558#bib.bibx25)]. Our contribution is orthogonal to measuring conformity: PBRC specifies an enforceable admissibility layer that prevents purely social rhetoric from being accepted as belief change unless accompanied by verifiable evidence artifacts.

Repeated interaction among LLM agents can also yield emergent conventions and collective biases [[Ashery et al. (2024)](https://arxiv.org/html/2604.15558#bib.bibx5), [Ashery et al. (2025)](https://arxiv.org/html/2604.15558#bib.bibx6)]. These phenomena motivate mechanisms that distinguish coordination signals from epistemic justification.
PBRC targets the epistemic side: contracts restrict which belief transitions are admissible and require certificates with validated evidence tokens, so convention formation cannot by itself justify epistemic flips absent evidence.

Belief revision and belief-change formalisms provide the canonical language for rational change of epistemic states. AGM belief revision axiomatizes rational postulates for theory change [[Alchourrón et al. (1985)](https://arxiv.org/html/2604.15558#bib.bibx2), [Gärdenfors (1988)](https://arxiv.org/html/2604.15558#bib.bibx14), [Hansson (n.d.)](https://arxiv.org/html/2604.15558#bib.bibx18)]. Belief update distinguishes revising by new information from updating after world change [[Katsuno and Mendelzon (1991)](https://arxiv.org/html/2604.15558#bib.bibx21)], and iterated revision studies sequential change [[Darwiche and Pearl (1997)](https://arxiv.org/html/2604.15558#bib.bibx11)]. Ranking-theoretic approaches provide alternative representations of epistemic states and revision dynamics [[Spohn (2012)](https://arxiv.org/html/2604.15558#bib.bibx43)]. Dynamic epistemic logic (DEL) models informational actions and announcements [[van Ditmarsch et al. (n.d.)](https://arxiv.org/html/2604.15558#bib.bibx47), [van Ditmarsch et al. (2007)](https://arxiv.org/html/2604.15558#bib.bibx46)], and dynamic logics of belief change connect AGM-style ideas with dynamic and plausibility semantics in both single- and multi-agent settings [[van Benthem and Smets (2015)](https://arxiv.org/html/2604.15558#bib.bibx45)]. PBRC does not propose a new underlying revision operator; instead it constrains which transitions are admissible under social interaction via preregistered, token-witnessable triggers, and it supports auditing via checkable certificates.
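
The gating discipline just described (a non-fallback step must cite a preregistered trigger and carry a nonempty witness set of validated tokens) can be illustrated with a small Python sketch. This is only an illustration under stated assumptions, not the paper's formal router semantics: the names `Contract`, `Certificate`, and `route` are hypothetical, token validation is reduced to membership in a set of validated token IDs, the priority rule is omitted, and the fallback is taken to be the identity map as one simple instance of a conservative fallback.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Optional, Tuple

Belief = Dict[str, float]   # hypothesis -> probability (finite hypothesis set)
Tokens = FrozenSet[str]     # IDs of evidence tokens (hypothetical encoding)

@dataclass(frozen=True)
class Certificate:
    trigger_id: str         # the preregistered trigger this step cites
    witness: Tokens         # tokens claimed to make that trigger fire

@dataclass(frozen=True)
class Contract:
    # trigger_id -> (trigger predicate over tokens, revision operator)
    rules: Dict[str, Tuple[Callable[[Tokens], bool],
                           Callable[[Belief, Tokens], Belief]]]
    fallback: Callable[[Belief], Belief]

def route(contract: Contract, belief: Belief,
          cert: Optional[Certificate], validated: Tokens) -> Tuple[Belief, bool]:
    """Return (next_belief, accepted). Assumed simplification of a router:
    accept a revision only if the certificate cites a preregistered trigger,
    its witness set is nonempty and fully validated, and the trigger holds
    on the witness set. Otherwise apply the fallback."""
    if cert is not None and cert.trigger_id in contract.rules:
        trigger, operator = contract.rules[cert.trigger_id]
        if cert.witness and cert.witness <= validated and trigger(cert.witness):
            return operator(belief, cert.witness), True
    return contract.fallback(belief), False

# Hypothetical contract: one trigger that fires on a specific validated token.
contract = Contract(
    rules={"lab_result": (lambda w: "tok:lab-42" in w,
                          lambda b, w: {"A": 0.2, "B": 0.8})},
    fallback=lambda b: dict(b),  # identity: belief is left unchanged
)
belief = {"A": 0.7, "B": 0.3}
validated = frozenset({"tok:lab-42"})
good = Certificate("lab_result", frozenset({"tok:lab-42"}))

# Social-only round: no certificate, so only the fallback applies.
social_only, accepted_social = route(contract, belief, None, validated)
# Empty witness set: rejected even though the trigger is preregistered.
empty, accepted_empty = route(
    contract, belief, Certificate("lab_result", frozenset()), validated)
# Witness token not validated: rejected.
forged, accepted_forged = route(contract, belief, good, frozenset())
# Preregistered trigger with a validated, nonempty witness set: accepted.
revised, accepted_revised = route(contract, belief, good, validated)
```

In this sketch, messages without a valid certificate never change the belief state, so the only way to move the top hypothesis from `A` to `B` is the accepted step whose witness set can be logged and checked afterwards.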
Relatedly, belief merging and judgment aggregation study how to combine information from multiple sources under inconsistency [[Konieczny and Pérez (2002)](https://arxiv.org/html/2604.15558#bib.bibx23), [Konieczny and Pérez (2011)](https://arxiv.org/html/2604.15558#bib.bibx24)]; PBRC is compatible with such operators but adds an enforceable evidence gate for when they may be applied.

Combining belief modalities with dynamic/program operators has a substantial history in dynamic doxastic logic and related systems [[Leitgeb and Segerberg (2007)](https://arxiv.org/html/2604.15558#bib.bibx26), [Schmidt and Tishkovsky (2008)](https://arxiv.org/html/2604.15558#bib.bibx39)]; see [[Fagin et al. (1995)](https://arxiv.org/html/2604.15558#bib.bibx13), [van Benthem (2011)](https://arxiv.org/html/2604.15558#bib.bibx44)] for broader background on multi-agent epistemic/doxastic logic and logical dynamics. Our novelty is not the presence of KD45 + PDL per se, but the use of program structure to encode PBRC contracts and to specify invariants over enforced, certificate-carrying belief transitions ([Section 10](https://arxiv.org/html/2604.15558#S10)).

Compliance monitoring in MAS is often framed in terms of commitment-based protocols and social/interactional commitments [[Singh (1999)](https://arxiv.org/html/2604.15558#bib.bibx41), [Yolum and Singh (2002)](https://arxiv.org/html/2604.15558#bib.bibx52)]. PBRC enables compliance monitoring for epistemic transitions: belief changes are admissible only when accompanied by verifiable evidence tokens and witness sets that can be checked by an external router or auditor. This certificate discipline is reminiscent of proof-carrying mechanisms, where untrusted producers attach checkable artifacts enabling efficient validation by a consumer [[Necula (1997)](https://arxiv.org/html/2604.15558#bib.bibx35)]. Tamper-evident audit logs are a standard accountability primitive [[Kelsey et al.
(1999)](https://arxiv.org/html/2604.15558#bib.bibx22)]; PBRC certificates are designed to be stored in such append-only logs. If one additionally wants to bind correct execution of updates (not just admissibility), verifiable computation and succinct proof systems provide relevant primitives [[Gennaro et al. (2010)](https://arxiv.org/html/2604.15558#bib.bibx15), [Parno et al. (2013)](https://arxiv.org/html/2604.15558#bib.bibx36)] ([Section 4.4](https://arxiv.org/html/2604.15558#S4.SS4)).

Trust and reputation models are another common response to unreliable peers in MAS [[Sabater and Sierra (2005)]...
Similar Articles
Belief Engine: Configurable and Inspectable Stance Dynamics in Multi-Agent LLM Deliberation
The paper introduces the Belief Engine, an auditable belief-update layer for LLM agents that makes stance changes in multi-agent deliberation configurable and inspectable by treating belief as an evidential state with explicit update rules.
Agent-BRACE: Decoupling Beliefs from Actions in Long-Horizon Tasks via Verbalized State Uncertainty
This paper introduces Agent-BRACE, a method that decouples LLM agents into belief state and policy models to handle long-horizon tasks in partially observable environments. By verbalizing state uncertainty, it achieves significant performance improvements over baselines while maintaining constant context window size.
Belief Memory: Agent Memory Under Partial Observability
This paper introduces BeliefMem, a novel memory paradigm for LLM agents that stores multiple candidate conclusions with probabilities to handle partial observability and reduce self-reinforcing errors. Empirical evaluations show it outperforms deterministic baselines on LoCoMo and ALFWorld benchmarks.
When Should Models Change Their Minds? Contextual Belief Management in Large Language Models
This paper introduces Contextual Belief Management (CBM) for LLMs to handle long-term information, proposes the BeliefTrack benchmark for evaluation, and demonstrates that reinforcement learning and representation-level steering significantly reduce belief management failures.
Recall Isn't Enough: Bounding Commitments in Personalized Language Systems
Introduces Contract-Bounded Evidence Activation (CBEA) with Lexicographic Commitment Validation (LCV) to prevent runtime control failures in personalized language systems where systems make incorrect commitments despite having relevant context. Achieves zero failures within validator scope at 0.49–0.60 availability, significantly outperforming baselines.