From Context-Aware to Conflict-Aware: Generalizing Contrastive Decoding for Knowledge Conflict in LLMs

arXiv cs.AI Papers

Summary

The paper generalizes contrastive decoding to a conflict-aware paradigm that dynamically allocates authority between external context and parametric priors, proposes the TriState-Bench evaluation protocol, and introduces Adaptive Regime Routing (ARR) to resolve asymmetry between correction and resistance.

arXiv:2606.10298v1 Announce Type: new Abstract: When large language models generate from retrieved or augmented contexts, conflicts between external context and parametric priors remain a central reliability bottleneck. Existing contrastive decoding methods follow a \emph{context-aware} paradigm that unilaterally amplifies context over parametric priors, overwriting correct priors when the context is erroneous. We generalize this to the \textbf{conflict-aware} paradigm that dynamically allocates authority between prior and context based on conflict signals, rather than presupposing context trustworthiness. We show that the affine combination of prior and context logits yields a \textbf{power family} with an inherent \textbf{regime asymmetry}: extrapolation amplifies errors unboundedly when the prior is correct, interpolation under-corrects when the context is correct, and no static regime covers both. Existing contrastive decoding methods are instances of this family, mostly extrapolative. To evaluate both conflict directions, we propose TriState-Bench, a model-aware evaluation protocol that calibrates per-model prior knowledge to measure three conflict states: correction, resistance, and agreement. To resolve the asymmetry, we propose Adaptive Regime Routing (ARR), which routes between regimes at each step, lifting resistance EM from below 6 to 16--33 without sacrificing correction or agreement. Our code is available at https://github.com/keith-Jiang/conflict-aware-decoding.

Cached at: 06/10/26, 06:14 AM

# From Context-Aware to Conflict-Aware: Generalizing Contrastive Decoding for Knowledge Conflict in LLMs
Source: [https://arxiv.org/html/2606.10298](https://arxiv.org/html/2606.10298)
Runze Jiang¹,², Taiqiang Wu³, Yan Wang², Bingyu Zhu²†, Longtao Huang²

¹Peking University, ²Alibaba Group, ³The University of Hong Kong. †Corresponding authors

###### Abstract

When large language models generate from retrieved or augmented contexts, conflicts between external context and parametric priors remain a central reliability bottleneck. Existing contrastive decoding methods follow a *context-aware* paradigm that unilaterally amplifies context over parametric priors, overwriting correct priors when the context is erroneous. We generalize this to the **conflict-aware** paradigm that dynamically allocates authority between prior and context based on conflict signals, rather than presupposing context trustworthiness. We show that the affine combination of prior and context logits yields a **power family** with an inherent **regime asymmetry**: extrapolation amplifies errors unboundedly when the prior is correct, interpolation under-corrects when the context is correct, and no static regime covers both. Existing contrastive decoding methods are instances of this family, mostly extrapolative. To evaluate both conflict directions, we propose TriState-Bench, a model-aware evaluation protocol that calibrates per-model prior knowledge to measure three conflict states: correction, resistance, and agreement. To resolve the asymmetry, we propose Adaptive Regime Routing (ARR), which routes between regimes at each step, lifting resistance EM from below 6 to 16–33 without sacrificing correction or agreement. Our code is available at [https://github.com/keith-Jiang/conflict-aware-decoding](https://github.com/keith-Jiang/conflict-aware-decoding).

## 1 Introduction

When large language models generate from retrieved or augmented contexts, conflicts between external context and parametric priors remain a central reliability bottleneck. Although LLMs encode substantial factual knowledge in their parameters (Petroni et al., [2019](https://arxiv.org/html/2606.10298#bib.bib15); Roberts et al., [2020](https://arxiv.org/html/2606.10298#bib.bib17)), this parametric memory is often incomplete, outdated, or incorrect (Mallen et al., [2023](https://arxiv.org/html/2606.10298#bib.bib13); Kasai et al., [2023](https://arxiv.org/html/2606.10298#bib.bib4)), motivating the use of retrieval-augmented generation (Lewis et al., [2020](https://arxiv.org/html/2606.10298#bib.bib7)) and web search (Nakano et al., [2021](https://arxiv.org/html/2606.10298#bib.bib14)) at inference time. When external context disagrees with the parametric prior, a *knowledge conflict* arises (Xu et al., [2024](https://arxiv.org/html/2606.10298#bib.bib22)).

![Refer to caption](https://arxiv.org/html/2606.10298v1/x1.png)

Figure 1: The unified power-family framework (upper) and comparison among proposed ARR and existing methods (lower).

To address this issue, contrastive decoding methods contrast the output distributions with and without context (Shi et al., [2024](https://arxiv.org/html/2606.10298#bib.bib18); Wang et al., [2025](https://arxiv.org/html/2606.10298#bib.bib20); Yuan et al., [2024](https://arxiv.org/html/2606.10298#bib.bib24); Khandelwal et al., [2025](https://arxiv.org/html/2606.10298#bib.bib5)). These methods follow the *context-aware* paradigm, which implicitly assumes that the context is always more reliable than the prior and unidirectionally amplifies the context-over-prior logit increment. However, when the context is incorrect, this unidirectional amplification indiscriminately pushes the distribution toward it, overriding the prior's correct probability structure and biasing generation toward wrong answers.

Therefore, we generalize the problem from the context-aware paradigm to the **conflict-aware** paradigm: rather than presuming the context to be trustworthy, we dynamically allocate authority between the prior and the context at each decoding step based on conflict signals. This extends intervention to both sides and handles the two opposing conflict states, correction and resistance.

Under this paradigm, a single-scalar affine combination of $p_{\text{pri},t}$ and $p_{\text{ctx},t}$ in logit space yields the minimal parameterization: a **power family** $q_{\tau,t}(y) \propto p_{\text{pri},t}(y)^{1-\tau}\, p_{\text{ctx},t}(y)^{\tau}$, of which existing methods are all instances. The family partitions at $\tau = 1$ into two regimes: *interpolation* ($\tau \in (0,1)$), the unique optimum of a KL-constrained problem that bounds trust reallocation; and *extrapolation* ($\tau > 1$), a penalized objective that suppresses prior-preferred tokens. A **regime asymmetry** emerges: extrapolation overrides a correct prior, while interpolation underweights a correct context, so no single static regime handles both. Existing contrastive methods lie predominantly on the extrapolation side, structurally lacking resistance coverage.

To expose and address this asymmetry, we approach it from both the evaluation and decoding sides. On the evaluation side, we propose **TriState-Bench**, a model-aware evaluation protocol that dynamically assigns each question to one of three conflict states (correction, resistance, or agreement), measuring corrective ability, prior preservation, and generation stability separately. On the decoding side, we propose **Adaptive Regime Routing (ARR)**, a theory-informed instantiation of the conflict-aware paradigm that routes between regimes per step from conflict signals in $p_{\text{pri},t}$ and $p_{\text{ctx},t}$. Across four model families, ARR covers both conflict directions, lifting resistance EM from below 6 to 16–33 without loss on correction or agreement.

Our contributions can be summarized as follows:

- **Power family and regime asymmetry.** We propose a power family as the minimal parameterization of the conflict-aware paradigm, subsuming existing contrastive decoding methods, and identify an asymmetry between interpolation and extrapolation regimes.
- **TriState-Bench.** The first model-aware tri-state benchmark for knowledge conflict, measuring correction, resistance, and agreement.
- **Adaptive Regime Routing (ARR).** A theory-informed instantiation of the conflict-aware paradigm that dynamically routes between two regimes per step from conflict signals.

## 2 Related Work

#### Knowledge Conflict in LLMs

Knowledge conflict arises when contextual input contradicts parametric knowledge stored in the weights (Petroni et al., [2019](https://arxiv.org/html/2606.10298#bib.bib15); Roberts et al., [2020](https://arxiv.org/html/2606.10298#bib.bib17); Xu et al., [2024](https://arxiv.org/html/2606.10298#bib.bib22)). Existing mitigation falls into two families: training-time fine-tuning for context faithfulness (Li et al., [2023a](https://arxiv.org/html/2606.10298#bib.bib8); Zhou et al., [2023](https://arxiv.org/html/2606.10298#bib.bib26)) and inference-time decoding adjustments. The latter further divides into contrastive decoding, which reweights the prior distribution $p_{\text{pri},t}$ against the contextual distribution $p_{\text{ctx},t}$ (Shi et al., [2024](https://arxiv.org/html/2606.10298#bib.bib18); Wang et al., [2025](https://arxiv.org/html/2606.10298#bib.bib20); Yuan et al., [2024](https://arxiv.org/html/2606.10298#bib.bib24); Khandelwal et al., [2025](https://arxiv.org/html/2606.10298#bib.bib5)), and hidden-state intervention, which modifies intermediate representations or attention patterns (Li et al., [2025](https://arxiv.org/html/2606.10298#bib.bib9); Zhao et al., [2025](https://arxiv.org/html/2606.10298#bib.bib25)). We focus on contrastive decoding methods; hidden-state methods operate on different internal objects and fall outside our scope.

#### Contrastive Decoding under Knowledge Conflict

Contrastive decoding (Li et al., [2023b](https://arxiv.org/html/2606.10298#bib.bib10)) was originally proposed to contrast expert and amateur models. Subsequent work adapts it to knowledge conflict by contrasting the same model's output distributions with and without context, denoted $p_{\text{ctx},t}$ and $p_{\text{pri},t}$ respectively. Methods in this line progress from static to adaptive weighting and from single-signal to multi-signal gating: CAD (Shi et al., [2024](https://arxiv.org/html/2606.10298#bib.bib18)) amplifies the context-over-prior logit difference with a fixed $\alpha$; AdaCAD (Wang et al., [2025](https://arxiv.org/html/2606.10298#bib.bib20)) replaces the fixed weight with a Jensen–Shannon-based step-wise coefficient; COIECD (Yuan et al., [2024](https://arxiv.org/html/2606.10298#bib.bib24)) introduces token-level conflict detection; and CoCoA (Khandelwal et al., [2025](https://arxiv.org/html/2606.10298#bib.bib5)) extends the scalar signal to a multi-signal gate. These methods are uniformly context-aware; we generalize to a conflict-aware paradigm and subsume them as special cases of a unified power family.

#### Evaluation of Knowledge Conflict

Existing benchmarks evaluate knowledge conflict along different axes. NQ-Swap (Longpre et al., [2021](https://arxiv.org/html/2606.10298#bib.bib11)) and NQ-Synth (Wang et al., [2025](https://arxiv.org/html/2606.10298#bib.bib20)) both reduce conflict to a single faithfulness axis: the former substitutes gold entities to test whether the model follows context over its parametric answer, while the latter replaces the context answer with the model's own output as a context-agrees-with-prior control. ClashEval (Wu et al., [2024](https://arxiv.org/html/2606.10298#bib.bib21)) moves to a bidirectional view, separately measuring prior-biased and context-biased errors. Two gaps remain: NQ-Swap and NQ-Synth, which only reward following the context, miss the resistance state entirely, and ClashEval, though bidirectional, targets end-to-end LLM behavior rather than isolating decoding methods. Moreover, all three assign conflict labels statically, without conditioning on what the model actually believes. Our protocol addresses both gaps (Section [5](https://arxiv.org/html/2606.10298#S5)).

## 3 Preliminaries and Generalized Framework

### 3.1 Task Setup and Notation

Given a query $x$, an external context $c$, and a decoding step $t$, we consider the same model's distributions without and with context:

$$
\begin{aligned}
p_{\text{pri},t}(\cdot) &= p_{\theta}(\cdot \mid x, y_{<t}), \\
p_{\text{ctx},t}(\cdot) &= p_{\theta}(\cdot \mid x, c, y_{<t}).
\end{aligned}
\tag{1}
$$

Their discrepancy reflects the influence of $c$ on the model's knowledge state.

### 3.2 Background

#### Context-aware Decoding (CAD).

CAD applies a PMI-style adjustment that amplifies the context-over-prior increment, with a single contrastive strength $\alpha$ shared across all tokens:

$$
q_t^{\text{CAD}}(y) \propto p_{\text{ctx},t}(y) \left[ \frac{p_{\text{ctx},t}(y)}{p_{\text{pri},t}(y)} \right]^{\alpha}. \tag{2}
$$

#### COIECD.

COIECD identifies a conflict token set $\mathcal{C}_t \subseteq V$ via the Stable Entropy Hypothesis and switches the base distribution between conflicting and non-conflicting tokens. With $g_t(y) = \log p_{\text{ctx},t}(y) - \log p_{\text{pri},t}(y)$, the score is

$$
s_t^{\text{COIECD}}(y) =
\begin{cases}
\log p_{\text{pri},t}(y) + \alpha\, g_t(y), & y \in \mathcal{C}_t, \\
\log p_{\text{ctx},t}(y) + \alpha\, g_t(y), & y \notin \mathcal{C}_t,
\end{cases}
\tag{3}
$$

which is then softmax-normalized to obtain $q_t^{\text{COIECD}}$.

#### AdaCAD.

AdaCAD replaces the static hyperparameter $\alpha$ in CAD with a dynamic weight based on the Jensen–Shannon divergence, $\alpha_t^{\text{JSD}} = \operatorname{JSD}\left( p_{\text{pri},t} \,\Vert\, p_{\text{ctx},t} \right)$:

$$
q_t^{\text{AdaCAD}}(y) \propto p_{\text{ctx},t}(y) \left[ \frac{p_{\text{ctx},t}(y)}{p_{\text{pri},t}(y)} \right]^{\alpha_t^{\text{JSD}}}. \tag{4}
$$

#### CoCoA.

CoCoA introduces three conflict signals (Rényi divergence, entropy gap, and contextual peakedness) and fuses them into an adaptive gating weight $\lambda_t$ to construct the final distribution:

$$
q_t^{\text{CoCoA}}(y) \propto p_{\text{ctx},t}(y)^{\lambda_t}\, p_{\text{pri},t}(y)^{1-\lambda_t}. \tag{5}
$$

#### Summary.

All four methods are *context-aware*: they presume the context trustworthy, reducing the design goal to *how to leverage the context more strongly*. However, a trustworthy context is only a special case. In general, the relative reliability of prior and context is not known at decoding time: sometimes the context carries correct evidence while the prior reflects outdated or incorrect memory (*correction state*, $\mathcal{S}_{\text{cor}}$); sometimes the prior is more reliable while the context is noisy, incorrect, or mismatched to the query (*resistance state*, $\mathcal{S}_{\text{res}}$); sometimes the two already agree, in which case excessive contrast disrupts stable generation (*agreement state*, $\mathcal{S}_{\text{agr}}$). We term this broader setting the *conflict-aware* paradigm. The output distribution at each decoding step is then determined by dynamically arbitrating between $p_{\text{pri},t}$ and $p_{\text{ctx},t}$ based on conflict signals $\mathcal{S}_t$:

$$
q_t(\cdot) = \mathcal{F}_t\bigl( p_{\text{pri},t}(\cdot),\, p_{\text{ctx},t}(\cdot),\, \mathcal{S}_t \bigr). \tag{6}
$$

### 3.3 A Generalized Power-Family View

We instantiate $\mathcal{F}_t$ with the simplest parameterization: an affine combination of $\log p_{\text{pri},t}$ and $\log p_{\text{ctx},t}$ in logit space, normalized to a power-family distribution:

$$
q_{\tau,t}(y) = \frac{1}{Z_{\tau,t}}\, p_{\text{pri},t}(y)^{1-\tau}\, p_{\text{ctx},t}(y)^{\tau}, \quad Z_{\tau,t} = \sum_{y' \in V} p_{\text{pri},t}(y')^{1-\tau}\, p_{\text{ctx},t}(y')^{\tau}. \tag{7}
$$

Here $\tau \in [0,1]$ interpolates between the prior and context distributions, and $\tau > 1$ extrapolates beyond the context by applying a negative exponent to the prior. The four methods in Section [3.2](https://arxiv.org/html/2606.10298#S3.SS2) are all special cases that differ only in their choice of $\tau$, each occupying a fixed one-sided position (Figure [1](https://arxiv.org/html/2606.10298#S1.F1), Table [1](https://arxiv.org/html/2606.10298#S3.T1); extended discussion and derivations in Appendix [A](https://arxiv.org/html/2606.10298#A1)).

| Method | Functional form | $\tau$ | Regime |
| --- | --- | --- | --- |
| CAD | $p_{\text{ctx},t}^{1+\alpha}\, p_{\text{pri},t}^{-\alpha}$ | $\tau = 1+\alpha$ | Extrapolation |
| AdaCAD | $p_{\text{ctx},t}^{1+\alpha_t^{\text{JSD}}}\, p_{\text{pri},t}^{-\alpha_t^{\text{JSD}}}$ | $\tau_t = 1+\alpha_t^{\text{JSD}}$ | Extrapolation |
| COIECD | $p_{\text{pri},t}^{1-\lambda_t}\, p_{\text{ctx},t}^{\lambda_t}$ | $\tau_t \in \lbrace \alpha,\, 1+\alpha \rbrace$ | Extrapolation |
| CoCoA\* | $p_{\text{pri},t}^{1-\lambda_t}\, p_{\text{ctx},t}^{\lambda_t}$ | $\tau_t = \lambda_t + \gamma$ | Extrapolation |

Table 1: Existing contrastive decoding methods cast into the unified power family $q_{\tau,t}(y) \propto p_{\text{pri},t}(y)^{1-\tau}\, p_{\text{ctx},t}(y)^{\tau}$.
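The power family of Eq. 7 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' released implementation: the function names `log_softmax` and `power_family` and the toy three-token distributions are hypothetical, and both distributions are assumed strictly positive so that the logarithms are finite.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    return shifted - np.log(np.sum(np.exp(shifted), axis=-1, keepdims=True))

def power_family(log_p_pri, log_p_ctx, tau):
    """Return q_tau proportional to p_pri^(1 - tau) * p_ctx^tau (Eq. 7)."""
    combined = (1.0 - tau) * log_p_pri + tau * log_p_ctx
    return np.exp(log_softmax(combined))

# Toy 3-token vocabulary; the numbers are illustrative only.
p_pri = np.array([0.7, 0.2, 0.1])
p_ctx = np.array([0.2, 0.7, 0.1])
log_p_pri, log_p_ctx = np.log(p_pri), np.log(p_ctx)

q_prior = power_family(log_p_pri, log_p_ctx, tau=0.0)  # recovers p_pri
q_ctx = power_family(log_p_pri, log_p_ctx, tau=1.0)    # recovers p_ctx
q_cad = power_family(log_p_pri, log_p_ctx, tau=1.5)    # CAD with alpha = 0.5

# Direct CAD form (Eq. 2) for comparison: p_ctx * (p_ctx / p_pri)^alpha, renormalized.
alpha = 0.5
cad_direct = p_ctx * (p_ctx / p_pri) ** alpha
cad_direct = cad_direct / cad_direct.sum()
```

In this toy setting, `q_cad` and `cad_direct` coincide, which is the $\tau = 1 + \alpha$ row of Table 1.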

## 4 Regime Structure and Conflict Asymmetry

### 4.1 Interpolation vs. Extrapolation

###### Theorem 1 (Interpolation as a KL-Constrained Optimum).

For any $\epsilon \in \bigl[ 0,\, \mathbb{D}_{\mathrm{KL}}(p_{\text{pri},t} \,\Vert\, p_{\text{ctx},t}) \bigr]$, consider the constrained optimization problem

$$
\min_{q \in \Delta(V)}\ \mathbb{D}_{\mathrm{KL}}\bigl( q \,\Vert\, p_{\text{pri},t} \bigr) \quad \text{s.t.}\ \mathbb{D}_{\mathrm{KL}}\bigl( q \,\Vert\, p_{\text{ctx},t} \bigr) \leq \epsilon. \tag{8}
$$

The problem admits a unique optimum $q^{\star}$, given in closed form by

$$
q^{\star}(y) = \frac{1}{Z_{\tau,t}}\, p_{\text{pri},t}(y)^{1-\tau}\, p_{\text{ctx},t}(y)^{\tau}, \tag{9}
$$

where $\tau \in [0,1]$ is in monotone bijection with $\epsilon$. The endpoints $\tau = 0$ and $\tau = 1$ recover $p_{\text{pri},t}$ and $p_{\text{ctx},t}$, respectively.
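A small numerical check can make the trade-off in Theorem 1 concrete. The sketch below is illustrative only: the helper names `power_family` and `kl` and the toy distributions are assumptions, not part of the paper. It sweeps $\tau$ over $[0,1]$ and records the two KL terms along the interpolation path.

```python
import numpy as np

def power_family(p_pri, p_ctx, tau):
    """q_tau proportional to p_pri^(1 - tau) * p_ctx^tau, for strictly positive inputs."""
    logits = (1.0 - tau) * np.log(p_pri) + tau * np.log(p_ctx)
    logits = logits - logits.max()
    q = np.exp(logits)
    return q / q.sum()

def kl(q, p):
    """KL(q || p) in nats, assuming q and p are strictly positive."""
    return float(np.sum(q * np.log(q / p)))

# Toy distributions; the numbers are illustrative only.
p_pri = np.array([0.6, 0.3, 0.1])
p_ctx = np.array([0.1, 0.3, 0.6])

taus = np.linspace(0.0, 1.0, 11)
kl_to_ctx = [kl(power_family(p_pri, p_ctx, t), p_ctx) for t in taus]
kl_to_pri = [kl(power_family(p_pri, p_ctx, t), p_pri) for t in taus]
```

On this example the distance to the context shrinks from $\mathbb{D}_{\mathrm{KL}}(p_{\text{pri},t} \Vert p_{\text{ctx},t})$ to zero as $\tau$ goes from 0 to 1, while the distance to the prior grows, which is the monotone correspondence between $\tau$ and the budget $\epsilon$ stated in the theorem.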

###### Theorem 2 (Extrapolation as a KL-Penalized Optimum).

For any $\eta \in [0,1)$, consider the penalized optimization problem

$$
\min_{q \in \Delta(V)}\ \mathbb{D}_{\mathrm{KL}}\bigl( q \,\Vert\, p_{\text{ctx},t} \bigr) - \eta\, \mathbb{D}_{\mathrm{KL}}\bigl( q \,\Vert\, p_{\text{pri},t} \bigr). \tag{10}
$$

The problem admits a unique optimum $q^{\star}$, given in closed form by

$$
q^{\star}(y) = \frac{1}{Z_{\tau,t}}\, p_{\text{pri},t}(y)^{1-\tau}\, p_{\text{ctx},t}(y)^{\tau}, \tag{11}
$$

where $\tau = \frac{1}{1-\eta} \in [1, +\infty)$ is in monotone bijection with $\eta$. At $\eta = 0$, $\tau = 1$ recovers $p_{\text{ctx},t}$; as $\eta \uparrow 1$, $\tau \to +\infty$. The exponent $1-\tau$ is now negative: the objective actively pushes $q$ away from $p_{\text{pri},t}$ while keeping it close to $p_{\text{ctx},t}$.
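The closed form in Theorem 2 can be checked by completing the objective into a single KL term. The following is a short derivation sketch written for this summary, not the Appendix B proof; it assumes $p_{\text{pri},t}$ and $p_{\text{ctx},t}$ are strictly positive on $V$, and the auxiliary distribution $r$ is notation introduced here.

```latex
% Sketch only: assumes strictly positive p_pri, p_ctx on V and eta in [0, 1).
\begin{aligned}
J(q) &= \mathbb{D}_{\mathrm{KL}}(q \,\|\, p_{\text{ctx},t})
        - \eta\, \mathbb{D}_{\mathrm{KL}}(q \,\|\, p_{\text{pri},t}) \\
     &= (1-\eta) \sum_{y} q(y) \log q(y)
        - \sum_{y} q(y) \bigl[ \log p_{\text{ctx},t}(y) - \eta \log p_{\text{pri},t}(y) \bigr]. \\[4pt]
\text{Define}\quad r(y) &= \frac{1}{Z}\,
        p_{\text{ctx},t}(y)^{\frac{1}{1-\eta}}\, p_{\text{pri},t}(y)^{-\frac{\eta}{1-\eta}},
        \qquad Z = \sum_{y'} p_{\text{ctx},t}(y')^{\frac{1}{1-\eta}}\, p_{\text{pri},t}(y')^{-\frac{\eta}{1-\eta}}. \\[4pt]
\text{Then}\quad J(q) &= (1-\eta)\, \mathbb{D}_{\mathrm{KL}}(q \,\|\, r) - (1-\eta) \log Z .
\end{aligned}
```

Since $1-\eta > 0$ and $Z$ does not depend on $q$, the objective is minimized uniquely at $q = r$. Setting $\tau = \frac{1}{1-\eta}$ gives $1 - \tau = -\frac{\eta}{1-\eta}$, so $r$ is exactly $q_{\tau,t}$ with $Z = Z_{\tau,t}$.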

The two theorems mathematically justify why the power family is the right parameterization and reveal each regime's structure (Appendix [B](https://arxiv.org/html/2606.10298#A2)). Interpolation ($\tau \in [0,1]$) yields a *bounded trust reallocation*: $q_{\tau,t}$ is the unique optimum balancing prior-closeness against context-movement, with values always sandwiched between $p_{\text{pri},t}$ and $p_{\text{ctx},t}$. Extrapolation ($\tau > 1$) advances past $p_{\text{ctx},t}$ and imposes a *negative-exponent suppression on prior-favored tokens*. The two meet at $\tau = 1$, the limit of interpolation ($\epsilon \to 0$) and the start of extrapolation ($\eta \to 0$).

### 4.2 Regime Asymmetry

###### Definition 3 (Pairwise Log-Odds).

For any $a, b \in V$, the pairwise log-odds under the power family $q_{\tau,t}$ is defined as

$$
\ell_{a,b}(\tau) := \log \frac{q_{\tau,t}(a)}{q_{\tau,t}(b)} = (1-\tau)\, \ell^{\text{pri}}_{a,b} + \tau\, \ell^{\text{ctx}}_{a,b}, \tag{12}
$$

where $\ell^{\text{pri}}_{a,b} = \log\bigl[ p_{\text{pri},t}(a) / p_{\text{pri},t}(b) \bigr]$ and $\ell^{\text{ctx}}_{a,b} = \log\bigl[ p_{\text{ctx},t}(a) / p_{\text{ctx},t}(b) \bigr]$.

###### Proposition 4 (Pairwise Reversal Threshold).

Let $\Delta_{a,b} := \ell^{\text{ctx}}_{a,b} - \ell^{\text{pri}}_{a,b}$. If $\Delta_{a,b} \neq 0$, there exists a unique *pairwise reversal threshold*

$$
\tau^{\star}_{a,b} := -\frac{\ell^{\text{pri}}_{a,b}}{\Delta_{a,b}} \tag{13}
$$

such that $\ell_{a,b}(\tau^{\star}_{a,b}) = 0$. As $\tau$ crosses $\tau^{\star}_{a,b}$, the preference of $q_{\tau,t}$ between $a$ and $b$ reverses. If $\Delta_{a,b} = 0$, the pairwise log-odds is constant and no reversal occurs.

###### Corollary 5 (Conflict-State Geometry of the Crossover Point).

Let $a$ be the ground-truth token and $b$ a competing token.

- Correction ($\mathcal{S}_{\mathrm{cor}}$): $\ell^{\mathrm{pri}}_{a,b} < 0 < \ell^{\mathrm{ctx}}_{a,b}$, so $\tau^{\star}_{a,b} \in (0,1)$: the flip from incorrect to correct occurs within interpolation.
- Resistance ($\mathcal{S}_{\mathrm{res}}$): $\ell^{\mathrm{pri}}_{a,b} > 0 > \ell^{\mathrm{ctx}}_{a,b}$, so $\tau^{\star}_{a,b} \in (0,1)$ but the flip is reversed: increasing $\tau$ moves $q_{\tau}$ from correct to incorrect, and extrapolation ($\tau > 1$) further amplifies the error.
- Agreement ($\mathcal{S}_{\mathrm{agr}}$): $\ell^{\mathrm{pri}}_{a,b},\, \ell^{\mathrm{ctx}}_{a,b} > 0$; for $\tau \in [0,1]$, $\ell_{a,b}(\tau)$ stays positive. The family is automatically stable.
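The reversal threshold of Proposition 4 and the correction and resistance cases of Corollary 5 can be reproduced on a two-token toy example. This is a sketch under assumptions: the helper names and the probability values are hypothetical, with token 0 standing in for the ground truth $a$ and token 1 for the competitor $b$.

```python
import math

def log_odds(p, a, b):
    """Pairwise log-odds log[p(a) / p(b)] under a single distribution."""
    return math.log(p[a] / p[b])

def reversal_threshold(p_pri, p_ctx, a, b):
    """tau* = -l_pri / Delta (Eq. 13); returns None when Delta is zero."""
    l_pri = log_odds(p_pri, a, b)
    delta = log_odds(p_ctx, a, b) - l_pri
    if abs(delta) < 1e-12:
        return None
    return -l_pri / delta

def family_log_odds(p_pri, p_ctx, a, b, tau):
    """Pairwise log-odds under q_tau (Eq. 12)."""
    return (1.0 - tau) * log_odds(p_pri, a, b) + tau * log_odds(p_ctx, a, b)

# Correction: prior prefers the wrong token, context prefers the right one.
cor_pri, cor_ctx = [0.2, 0.8], [0.8, 0.2]
# Resistance: prior prefers the right token, context prefers the wrong one.
res_pri, res_ctx = [0.8, 0.2], [0.2, 0.8]

tau_cor = reversal_threshold(cor_pri, cor_ctx, 0, 1)
tau_res = reversal_threshold(res_pri, res_ctx, 0, 1)

# In the resistance case, extrapolation is more wrong than the context alone.
res_at_ctx = family_log_odds(res_pri, res_ctx, 0, 1, tau=1.0)
res_extrapolated = family_log_odds(res_pri, res_ctx, 0, 1, tau=1.5)
```

Both thresholds fall inside $(0,1)$ here, but the sign of the log-odds moves in opposite directions as $\tau$ increases, and in the resistance case `res_extrapolated` lies below `res_at_ctx`.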

###### Corollary 6 (Length Distortion under Extrapolation).

Let $c$ be a continuation token and $s$ a stop token. By Proposition [4](https://arxiv.org/html/2606.10298#Thmcorollary4), $\ell_{c,s}(\tau) = \ell^{\mathrm{ctx}}_{c,s} + (\tau - 1)\, \Delta_{c,s}$. For $\tau > 1$, the log-odds extends past $p_{\mathrm{ctx}}$ with unbounded displacement:

- $\Delta_{c,s} > 0$: *over-generation* (continuation amplified beyond context).
- $\Delta_{c,s} < 0$: *early stopping* (stopping amplified beyond context).
- $\Delta_{c,s} = 0$: no overshoot.
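A toy illustration of the early-stopping case: with a two-outcome vocabulary of continue and stop, the stop probability under $q_{\tau}$ keeps rising once $\tau$ exceeds 1. The function name `stop_probability` and the probabilities 0.3 and 0.5 are assumptions chosen so that $\Delta_{c,s} < 0$.

```python
import math

def stop_probability(p_pri_stop, p_ctx_stop, tau):
    """P(stop) under q_tau for a two-outcome {continue, stop} toy vocabulary."""
    log_stop = (1.0 - tau) * math.log(p_pri_stop) + tau * math.log(p_ctx_stop)
    log_cont = (1.0 - tau) * math.log(1.0 - p_pri_stop) + tau * math.log(1.0 - p_ctx_stop)
    m = max(log_stop, log_cont)  # subtract the max for numerical stability
    e_stop, e_cont = math.exp(log_stop - m), math.exp(log_cont - m)
    return e_stop / (e_stop + e_cont)

# The context favors stopping more than the prior does, so Delta_{c,s} < 0.
taus = [0.0, 1.0, 2.0, 4.0]
stop_probs = [stop_probability(0.3, 0.5, t) for t in taus]
```

The first two entries recover the prior and context stop probabilities; the later entries exceed the context's value, which is the overshoot described in the corollary.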

The two corollaries reveal a two-level asymmetry. At the *answer level*, interpolation confines log-odds to the convex hull of the two endpoints: it corrects the prior in $\mathcal{S}_{\mathrm{cor}}$, never overshoots either source, and is stable in $\mathcal{S}_{\mathrm{agr}}$. Extrapolation extends log-odds past $p_{\mathrm{ctx}}$ without bound, amplifying the wrong preference in $\mathcal{S}_{\mathrm{res}}$. At the *generation level*, this yields over-generation ($\Delta_{c,s} > 0$) or early termination ($\Delta_{c,s} < 0$). No static $\tau > 1$ is therefore universally safe: the strength that corrects in $\mathcal{S}_{\mathrm{cor}}$ catastrophically amplifies errors in $\mathcal{S}_{\mathrm{res}}$ and distorts generation length.

## 5 TriState-Bench: Model-Aware Tri-State Evaluation Protocol

To observe the conflict states in Corollary [5](https://arxiv.org/html/2606.10298#Thmcorollary5), we construct TriState-Bench: a model-independent fact repository $\mathcal{F}$ paired with per-model tri-state labels.

#### Step 1: Fact Repository (Model-independent).

As shown in Figure [2](https://arxiv.org/html/2606.10298#S5.F2), each fact $f_i \in \mathcal{F}$ contains a ground-truth answer $a_i$ with alias set $\mathcal{A}_i$, an answer type drawn from three categories (person, location, and scientific term), three question variants $q_i^{(1..3)}$ targeting the same $a_i$, and a correct/incorrect context pair $c_i^{+} / c_i^{-}$ linked through an alternate entity $\tilde{a}_i$. Facts are bucket-sampled from DBpedia, grounded by Wikipedia, and rewritten by an LLM (details in Appendix [C.1](https://arxiv.org/html/2606.10298#A3.SS1)).

#### Step 2: Prior Calibration (Per-model).

The target model $M$ decodes the three questions of each fact $f_i$ without context, combining a greedy hard gate with stochastic sampling. A fact is assigned to $\mathcal{F}_{\mathrm{right}}^{M}$ if all three questions are matched, to $\mathcal{F}_{\mathrm{wrong}}^{M}$ if all are missed, or excluded as uncertain otherwise. The precise calibration rules are given in Appendix [C.2](https://arxiv.org/html/2606.10298#A3.SS2).

#### Step 3: Benchmark Assembly (Per-model).

Given the prior labels from Step 2, we revisit the fact repository and assemble tri-state samples by pairing each fact with the corresponding pre-generated context.

- Correction $\mathcal{S}_{\mathrm{cor}}$ (prior wrong, context right): $f_i \in \mathcal{F}_{\text{wrong}}^{M}$ paired with $c_i^{+}$, targeting correction capability.
- Resistance $\mathcal{S}_{\mathrm{res}}$ (prior right, context wrong): $f_i \in \mathcal{F}_{\text{right}}^{M}$ paired with $c_i^{-}$, targeting prior preservation.
- Agreement $\mathcal{S}_{\mathrm{agr}}$ (prior right, context right): the same $f_i \in \mathcal{F}_{\text{right}}^{M}$ paired with $c_i^{+}$, targeting generation stability.

The doubly-wrong case (prior wrong, context wrong) is excluded as it lies outside the power family's scope. Finally, we construct 6,471 facts and sample 400 per benchmark under the inference budget.
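The labeling and assembly logic of Steps 2 and 3 can be sketched as follows. This is a simplified illustration, not the released pipeline: the `Fact` dataclass, the function names, and the boolean `match_table` are hypothetical, and the per-question match flags stand in for the greedy-plus-sampling calibration rules of Appendix C.2, which are not reproduced here.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class Fact:
    fact_id: str
    questions: List[str]  # three question variants targeting the same answer
    context_pos: str      # c_i^+ : context supporting the ground-truth answer
    context_neg: str      # c_i^- : context supporting the alternate entity

def prior_label(matches: List[bool]) -> str:
    """Map per-question match flags to a prior label (simplified rule)."""
    if all(matches):
        return "right"
    if not any(matches):
        return "wrong"
    return "uncertain"

def assemble(facts: List[Fact],
             match_table: Dict[str, List[bool]]) -> Dict[str, List[Tuple[str, str]]]:
    """Pair each calibrated fact with the context that defines its conflict state."""
    bench: Dict[str, List[Tuple[str, str]]] = {
        "correction": [], "resistance": [], "agreement": []
    }
    for fact in facts:
        label = prior_label(match_table[fact.fact_id])
        if label == "wrong":
            bench["correction"].append((fact.fact_id, fact.context_pos))
        elif label == "right":
            bench["resistance"].append((fact.fact_id, fact.context_neg))
            bench["agreement"].append((fact.fact_id, fact.context_pos))
        # "uncertain" facts are excluded, as is the doubly-wrong pairing.
    return bench

# Toy usage with placeholder strings.
facts = [
    Fact("f1", ["q1a", "q1b", "q1c"], "pos1", "neg1"),
    Fact("f2", ["q2a", "q2b", "q2c"], "pos2", "neg2"),
    Fact("f3", ["q3a", "q3b", "q3c"], "pos3", "neg3"),
]
match_table = {
    "f1": [True, True, True],
    "f2": [False, False, False],
    "f3": [True, False, True],
}
bench = assemble(facts, match_table)
```

Because the labels depend on `match_table`, which is computed per model, the same fact repository yields a different benchmark for each model.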

![Refer to caption](https://arxiv.org/html/2606.10298v1/x2.png)

Figure 2: Overview of the TriState-Bench pipeline.

## 6 Adaptive Regime Routing (ARR)

The regime asymmetry established in Section [4.2](https://arxiv.org/html/2606.10298#S4.SS2) imposes two design requirements on $\mathcal{F}_t$ in Eq. [6](https://arxiv.org/html/2606.10298#S3.E6):

- **Bidirectional routing with a directional gate.** Corollary [5](https://arxiv.org/html/2606.10298#Thmcorollary5) shows that interpolation is needed for $\mathcal{S}_{\mathrm{res}}$ and $\mathcal{S}_{\mathrm{agr}}$, while extrapolation is needed for $\mathcal{S}_{\mathrm{cor}}$. The method must route $\tau$ to both sides of 1, which in turn requires a gate that resolves which side is trustworthy, not merely that conflict exists.
- **Bounded strength.** Corollary [6](https://arxiv.org/html/2606.10298#Thmcorollary6) shows that unbounded $\tau$ causes length distortion (over-generation or early termination). The contrastive strength must remain bounded.

ARR instantiates the conflict-aware paradigm under these requirements with a binary gate, a bounded divergence measure, and a compositional routing rule.

#### Gate.

A binary signal $d_t \in \lbrace 0, 1 \rbrace$ routes between interpolation and extrapolation. Candidate gate signals fall into two categories: *divergence magnitude* signals that detect conflict existence without resolving direction, and *confidence asymmetry* signals that additionally indicate which side is more committed. Since the gate must distinguish $\mathcal{S}_{\mathrm{cor}}$ from $\mathcal{S}_{\mathrm{res}}$, directionality is essential. The gate thresholds a directional signal $g_t$ at zero:

$$
d_t = \mathbb{1}\left[\, g_t > 0 \,\right]. \tag{14}
$$

We adopt the max-probability gap as the gate signal in this work:

$$
g_t = \max_{y} p_{\mathrm{ctx},t}(y) - \max_{y} p_{\mathrm{pri},t}(y). \tag{15}
$$

Intuitively, a higher top-1 probability indicates that the distribution commits more mass to a single token. When the context side is more committed than the prior, it signals that the context has formed a concentrated prediction, characteristic of $\mathcal{S}_{\mathrm{cor}}$, where the context points toward the correct answer. Empirical validation against alternative signals is provided in Section [7.3](https://arxiv.org/html/2606.10298#S7.SS3).

#### Strength.

A bounded signal $s_t \in [0,1]$ controls the contrastive magnitude. We use the normalized Jensen–Shannon divergence:

$$
s_t = \mathrm{JSD}\bigl( p_{\mathrm{ctx},t} \,\Vert\, p_{\mathrm{pri},t} \bigr) / \log 2. \tag{16}
$$

Unlike a fixed contrastive weight, $\tau$ varies continuously with per-token conflict: strong contrast where the two distributions sharply disagree, near-zero adjustment where they align, removing the need for a single hyperparameter to cover all positions. We adopt JSD rather than KL divergence because the gate already resolves directionality; the strength need only quantify how far both sides deviate from their midpoint without encoding a directional preference. JSD's symmetric formulation naturally fills this role.

#### Routing Rule.

The gate and strength compose into a mixing coefficient and decoding distribution:

$$
\tau_t = 1 + (2 d_t - 1)\, s_t, \tag{17}
$$

$$
q_{\tau_t,t}(y) \propto p_{\mathrm{pri},t}(y)^{1-\tau_t}\, p_{\mathrm{ctx},t}(y)^{\tau_t}. \tag{18}
$$

$d_t = 1$ extrapolates ($\tau_t \in [1,2]$); $d_t = 0$ interpolates ($\tau_t \in [0,1]$). Unlike existing methods restricted to $\tau \geq 1$, ARR can also pull back toward the prior ($\tau_t < 1$).
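Putting Eqs. 14 to 18 together, one ARR decoding step can be sketched as below. This is a minimal NumPy illustration under assumptions, not the authors' released code: the function names `jsd` and `arr_step` and the toy distributions are hypothetical, and both input distributions are assumed strictly positive so that the log-space combination is finite.

```python
import numpy as np

def jsd(p, q):
    """Jensen-Shannon divergence in nats, treating 0 * log 0 as 0."""
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0
        return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def arr_step(p_pri, p_ctx):
    """One ARR step: returns (tau_t, q_t) for strictly positive p_pri, p_ctx."""
    g = p_ctx.max() - p_pri.max()            # max-probability gap, Eq. 15
    d = 1 if g > 0 else 0                    # binary gate, Eq. 14
    s = jsd(p_ctx, p_pri) / np.log(2.0)      # normalized strength in [0, 1], Eq. 16
    tau = 1.0 + (2 * d - 1) * s              # routing rule, Eq. 17
    logits = (1.0 - tau) * np.log(p_pri) + tau * np.log(p_ctx)
    logits = logits - logits.max()           # stabilize before exponentiating
    q = np.exp(logits)
    return tau, q / q.sum()                  # power-family distribution, Eq. 18

# Context more committed than the prior: the gate opens and ARR extrapolates.
tau_ext, q_ext = arr_step(np.array([0.5, 0.3, 0.2]), np.array([0.1, 0.8, 0.1]))
# Prior more committed than the context: the gate closes and ARR interpolates.
tau_int, q_int = arr_step(np.array([0.9, 0.05, 0.05]), np.array([0.3, 0.5, 0.2]))
# Identical distributions: zero strength, so tau_t = 1 and q_t equals p_ctx.
p_same = np.array([0.6, 0.3, 0.1])
tau_same, q_same = arr_step(p_same, p_same)
```

In a real decoder, `p_pri` and `p_ctx` would come from two forward passes of the same model, without and with the context, at every generation step.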

## 7 Experiments and Results

### 7.1 Experimental Setup

#### Datasets and Metrics.

We evaluate on two groups of benchmarks. The first group consists of four established short-form QA datasets: Natural Questions (NQ; Kwiatkowski et al., [2019](https://arxiv.org/html/2606.10298#bib.bib6)), TriviaQA (Joshi et al., [2017](https://arxiv.org/html/2606.10298#bib.bib3)), HotpotQA (Yang et al., [2018](https://arxiv.org/html/2606.10298#bib.bib23)), and the tabular dataset TabMWP (Lu et al., [2022](https://arxiv.org/html/2606.10298#bib.bib12)). Together, they represent a context-faithful QA setting where the gold context is trustworthy and parametric priors play a minimal role (NQ, TriviaQA), a multi-hop setting that requires aggregating evidence across documents (HotpotQA), and a structured-context setting that requires numerical reasoning over tables (TabMWP).

The second group is our TriState-Bench, which evaluates the three conflict states: correction ($\mathcal{S}_{\mathrm{cor}}$), resistance ($\mathcal{S}_{\mathrm{res}}$), and agreement ($\mathcal{S}_{\mathrm{agr}}$). Each state contains 400 benchmark samples. Figure [3](https://arxiv.org/html/2606.10298#S7.F3) reports the share of $\mathcal{P}_{\mathrm{right}}$, $\mathcal{P}_{\mathrm{uncertain}}$, and $\mathcal{P}_{\mathrm{wrong}}$ for each model on the same fact pool. The composition varies noticeably: $\mathcal{P}_{\mathrm{wrong}}$ ranges from 11.1% to 20.3%, and Person-related facts dominate both correct and incorrect priors, while Location facts are disproportionately well-known and Scientific Terms disproportionately unknown. These patterns confirm the necessity of our per-model prior calibration.

![Refer to caption](https://arxiv.org/html/2606.10298v1/x3.png)

Figure 3: Distribution of prior knowledge states across four LLMs. (a) Prior state distribution showing the proportion of facts where each model holds correct ($\mathcal{P}_{\text{right}}$), uncertain ($\mathcal{P}_{\text{uncertain}}$), or incorrect ($\mathcal{P}_{\text{wrong}}$) priors. (b–c) Entity-type composition of $\mathcal{P}_{\text{right}}$ and $\mathcal{P}_{\text{wrong}}$.

We report Exact Match (EM) and token-level F1 on all benchmarks. For additional details of QA datasets, refer to Appendix [D.1](https://arxiv.org/html/2606.10298#A4.SS1).
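For readers unfamiliar with the two metrics, the sketch below shows one common way to compute them. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles); the section does not state the exact normalization used, so the helper names and normalization rules here are illustrative assumptions.

```python
import re
import string
from collections import Counter

PUNCTUATION = set(string.punctuation)

def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace (assumed SQuAD-style)."""
    text = "".join(ch for ch in text.lower() if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(pred, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(pred) == normalize(gold))

def token_f1(pred, gold):
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens, gold_tokens = normalize(pred).split(), normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

# A truncated answer fails EM but still earns partial F1.
print(exact_match("Greenland", "Greenland shark"), token_f1("Greenland", "Greenland shark"))
```

Under these definitions, both over-long and truncated outputs score zero EM while keeping a nonzero F1, which is why the two metrics are reported side by side.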

#### Source of Context\.

For NQ, TriviaQA, and HotpotQA we use the provided gold context; for TabMWP, the table is the context and the problem statement is the query. For TriState-Bench, the gold context is a web-retrieved evidence block grounded to the canonical answer, and the corrupted context swaps the gold span for a type-matched distractor. Examples of $(x, c)$ pairs are shown in Table [5](https://arxiv.org/html/2606.10298#A4.T5); prompts are listed in Appendix [H](https://arxiv.org/html/2606.10298#A8).

#### Models\.

We conduct experiments on four open-weight families, each in base and instruction-tuned variants: Llama2-13B (Touvron et al., [2023](https://arxiv.org/html/2606.10298#bib.bib19)), Llama3-8B (Grattafiori et al., [2024](https://arxiv.org/html/2606.10298#bib.bib1)), Mistral-7B (Jiang et al., [2023](https://arxiv.org/html/2606.10298#bib.bib2)), and Qwen2.5-7B (Qwen et al., [2025](https://arxiv.org/html/2606.10298#bib.bib16)). Across this set, we first test whether the regime asymmetry of Corollary [5](https://arxiv.org/html/2606.10298#Thmcorollary5) manifests empirically, and then whether ARR remains robust across families and instruction tuning.

#### Baselines\.

We compare ARR against five test-time decoding baselines: Greedy Decoding; CAD (Shi et al., [2024](https://arxiv.org/html/2606.10298#bib.bib18)) ($\alpha = 1$); COIECD (Yuan et al., [2024](https://arxiv.org/html/2606.10298#bib.bib24)) ($\lambda = 0.25$, inner $\alpha = 1$); AdaCAD (Wang et al., [2025](https://arxiv.org/html/2606.10298#bib.bib20)) (JSD-driven $\alpha_t$); and CoCoA (Khandelwal et al., [2025](https://arxiv.org/html/2606.10298#bib.bib5)) (Rényi order $0.5$, peakedness weight $z = 5.0$, entropy-gap weight $\gamma = 1.0$). To further probe how regime choice affects each conflict state, we also include Greedy-no-ctx, which decodes without context as a prior-only reference, together with sweeps of simple interpolation ($\tau \in \{0.25, 0.5, 0.75, 1.0\}$) and tuned CAD ($\alpha \in \{0.25, 0.5, 0.75, 1.0\}$).

### 7\.2Main Results

| Model | Method | NQ EM | NQ F1 | TabMWP EM | TabMWP F1 | HotpotQA EM | HotpotQA F1 | TriviaQA EM | TriviaQA F1 | TriState EM | TriState F1 | Avg. EM | Avg. F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Llama3-8B | Greedy | 31.61 | 47.68 | 29.50 | 35.31 | 18.56 | 33.06 | 54.40 | 64.46 | 42.92 | 51.49 | 35.40 | 46.40 |
| | CAD | 14.76 | 30.74 | 12.60 | 16.40 | 5.44 | 15.65 | 18.55 | 27.46 | 14.58 | 28.70 | 13.19 | 23.79 |
| | COIECD | 22.26 | 39.81 | 19.10 | 23.04 | 12.05 | 25.93 | 33.95 | 43.39 | 22.33 | 34.87 | 21.94 | 33.41 |
| | AdaCAD | 28.33 | 44.93 | 29.70 | 35.34 | 16.84 | 31.08 | 51.25 | 61.63 | 34.42 | 44.76 | 32.11 | 43.55 |
| | CoCoA | 22.23 | 38.99 | 25.10 | 30.66 | 10.56 | 23.59 | 39.00 | 48.86 | 20.75 | 33.70 | 23.53 | 35.16 |
| | ARR (Ours) | 36.37 | 51.58 | 30.00 | 35.16 | 19.03 | 33.11 | 56.10 | 66.50 | 61.67 | 68.14 | 40.63 | 50.90 |
| Mistral-7B | Greedy | 39.25 | 50.31 | 7.90 | 11.18 | 19.54 | 28.22 | 56.45 | 64.71 | 59.75 | 65.17 | 36.58 | 43.92 |
| | CAD | 15.84 | 22.76 | 5.70 | 7.46 | 4.62 | 7.21 | 23.85 | 32.98 | 31.92 | 37.15 | 16.39 | 21.51 |
| | COIECD | 35.98 | 46.01 | 10.20 | 12.30 | 16.30 | 23.10 | 41.75 | 48.01 | 52.92 | 57.88 | 31.43 | 37.46 |
| | AdaCAD | 36.86 | 46.52 | 3.70 | 6.40 | 18.87 | 26.87 | 54.90 | 63.29 | 58.83 | 64.09 | 34.63 | 41.43 |
| | CoCoA | 13.86 | 29.24 | 6.20 | 8.21 | 1.19 | 9.27 | 23.85 | 32.98 | 29.33 | 41.76 | 14.89 | 24.29 |
| | ARR (Ours) | 36.90 | 49.36 | 6.10 | 10.05 | 18.80 | 28.41 | 57.65 | 65.87 | 64.17 | 70.28 | 36.72 | 44.79 |
| Qwen2.5-7B | Greedy | 60.35 | 70.87 | 37.50 | 39.68 | 28.98 | 40.14 | 39.15 | 47.63 | 58.83 | 64.71 | 44.96 | 52.61 |
| | CAD | 21.85 | 32.99 | 13.60 | 15.71 | 3.69 | 12.63 | 7.00 | 16.72 | 20.75 | 37.38 | 13.38 | 23.09 |
| | COIECD | 53.22 | 65.81 | 36.50 | 39.63 | 18.39 | 29.35 | 25.80 | 35.43 | 41.00 | 51.92 | 34.98 | 44.43 |
| | AdaCAD | 57.21 | 68.71 | 37.80 | 40.27 | 24.13 | 34.56 | 33.50 | 42.19 | 54.75 | 61.67 | 41.48 | 49.48 |
| | CoCoA | 33.72 | 50.36 | 29.90 | 33.18 | 6.82 | 17.66 | 14.65 | 25.72 | 19.17 | 34.40 | 20.85 | 32.26 |
| | ARR (Ours) | 48.38 | 61.20 | 37.00 | 39.54 | 29.98 | 40.00 | 44.90 | 53.55 | 63.25 | 68.76 | 44.70 | 52.61 |
| Llama2-13B | Greedy | 46.54 | 59.60 | 21.10 | 23.41 | 23.77 | 34.44 | 59.00 | 67.89 | 60.67 | 65.40 | 42.22 | 50.15 |
| | CAD | 30.41 | 48.12 | 14.80 | 17.82 | 10.57 | 19.74 | 38.05 | 49.57 | 46.33 | 54.61 | 28.03 | 37.97 |
| | COIECD | 44.94 | 60.42 | 18.70 | 22.12 | 21.05 | 32.21 | 49.20 | 60.06 | 55.00 | 60.59 | 37.78 | 47.08 |
| | AdaCAD | 47.80 | 60.86 | 21.70 | 24.51 | 23.61 | 34.49 | 58.45 | 67.45 | 60.08 | 64.96 | 42.33 | 50.45 |
| | CoCoA | 43.02 | 58.36 | 18.50 | 21.55 | 18.50 | 29.08 | 51.55 | 61.92 | 55.75 | 61.34 | 37.46 | 46.45 |
| | ARR (Ours) | 43.43 | 57.19 | 21.20 | 22.88 | 23.08 | 33.69 | 58.95 | 67.62 | 64.92 | 69.37 | 42.32 | 50.15 |

Table 2: Performance comparison across benchmarks. Each benchmark reports EM and F1. The Avg. column is the arithmetic mean over all five benchmarks (NQ, TabMWP, HotpotQA, TriviaQA, and TriState-Bench). Bold marks the best value within each model block.

#### Main results\.

Table [2](https://arxiv.org/html/2606.10298#S7.T2) reports performance on the four standard QA benchmarks together with TriState-Bench. The standard QA columns test whether a conflict-aware decoder remains safe when the resistance subset $\mathcal{S}_{\mathrm{res}}$ is rare; TriState-Bench measures aggregate conflict-resolution ability.

On standard QA, existing context-aware baselines regress substantially relative to Greedy decoding: CAD drops Llama3-8B NQ from $31.61$ to $14.76$ EM, and CoCoA reduces Qwen2.5-7B HotpotQA from $28.98$ to $6.82$ EM. Both failures instantiate the over-generation mode predicted by Corollary [6](https://arxiv.org/html/2606.10298#Thmcorollary6): contrastive extrapolation pushes log-odds past $p_{\mathrm{ctx},t}$ even when the context is already correct. ARR matches or exceeds Greedy on most dataset–model pairs. On average it is the best contrastive decoder for Llama3-8B, Mistral-7B, and Qwen2.5-7B, and it sits within $0.01$ Avg. EM of AdaCAD on Llama2-13B ($42.32$ vs. $42.33$); on Qwen2.5-7B it remains within $0.26$ EM of Greedy. No other context-aware method avoids regression against Greedy across all four models. On TriState-Bench, ARR leads every baseline for all four base models.
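As a quick sanity check on how the Avg. column is built, the snippet below recomputes it for two rows of the Llama3-8B block. The EM values are copied from Table 2; the dictionary layout and the `avg_em` helper are illustrative only.

```python
# EM scores copied from Table 2 (Llama3-8B block), in column order:
# NQ, TabMWP, HotpotQA, TriviaQA, TriState-Bench.
EM = {
    "Greedy": [31.61, 29.50, 18.56, 54.40, 42.92],
    "ARR": [36.37, 30.00, 19.03, 56.10, 61.67],
}

def avg_em(method):
    """Arithmetic mean over the five benchmarks, rounded to two decimals as in the table."""
    scores = EM[method]
    return round(sum(scores) / len(scores), 2)

for method in EM:
    print(method, avg_em(method))
```

Because TriState-Bench counts as one of five equally weighted benchmarks, ARR's large TriState gain on Llama3-8B accounts for most of its lead in the average.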

#### TriState\-Bench decomposition\.

Table [3](https://arxiv.org/html/2606.10298#S7.T3) decomposes TriState-Bench into correction ($\mathcal{S}_{\mathrm{cor}}$), resistance ($\mathcal{S}_{\mathrm{res}}$), and agreement ($\mathcal{S}_{\mathrm{agr}}$). All context-aware baselines collapse on $\mathcal{S}_{\mathrm{res}}$: across the four models, every contrastive method scores below $3.5$ EM, with CAD below $1$ and CoCoA at or below $1.5$. This is the empirical signature of regime asymmetry: a strength $\tau > 1$ that corrects on $\mathcal{S}_{\mathrm{cor}}$ simultaneously amplifies the wrong preference on $\mathcal{S}_{\mathrm{res}}$. ARR is the only method that recovers $\mathcal{S}_{\mathrm{res}}$, lifting EM to $15.75$–$33.25$ across models, an order of magnitude above the strongest baseline. The gain stems from switching the gate to $\tau < 1$ whenever $p_{\mathrm{ctx},t}$ is less committed than $p_{\mathrm{pri},t}$ (Theorem [1](https://arxiv.org/html/2606.10298#Thmcorollary1)). Crucially, this does not sacrifice $\mathcal{S}_{\mathrm{cor}}$ or $\mathcal{S}_{\mathrm{agr}}$: ARR is best or near-best on both subsets across all four models. More results appear in Tables [6](https://arxiv.org/html/2606.10298#A5.T6), [7](https://arxiv.org/html/2606.10298#A5.T7) and Appendix [F](https://arxiv.org/html/2606.10298#A6).

| Model | Method | $\mathcal{S}_{\mathrm{cor}}$ EM | $\mathcal{S}_{\mathrm{cor}}$ F1 | $\mathcal{S}_{\mathrm{res}}$ EM | $\mathcal{S}_{\mathrm{res}}$ F1 | $\mathcal{S}_{\mathrm{agr}}$ EM | $\mathcal{S}_{\mathrm{agr}}$ F1 |
|---|---|---|---|---|---|---|---|
| Llama3-8B | Greedy | 59.50 | 70.79 | 4.50 | 9.78 | 64.75 | 73.92 |
| | CAD | 21.75 | 41.07 | 0.75 | 7.11 | 21.25 | 37.92 |
| | COIECD | 30.25 | 47.35 | 1.75 | 6.29 | 35.00 | 50.99 |
| | AdaCAD | 51.50 | 64.19 | 3.25 | 9.67 | 48.50 | 60.43 |
| | CoCoA | 31.25 | 47.88 | 1.00 | 7.68 | 30.00 | 45.52 |
| | ARR (Ours) | 75.00 | 82.68 | 33.25 | 38.59 | 76.75 | 83.15 |
| Mistral-7B | Greedy | 91.25 | 94.75 | 5.25 | 12.99 | 82.75 | 87.76 |
| | CAD | 70.50 | 78.48 | 0.25 | 5.47 | 25.00 | 27.51 |
| | COIECD | 85.00 | 89.46 | 1.00 | 8.61 | 72.75 | 75.56 |
| | AdaCAD | 91.00 | 94.25 | 2.25 | 10.06 | 83.25 | 87.95 |
| | CoCoA | 55.50 | 70.10 | 0.50 | 6.98 | 32.00 | 48.19 |
| | ARR (Ours) | 88.00 | 92.73 | 22.50 | 30.66 | 82.00 | 87.46 |
| Qwen2.5-7B | Greedy | 89.50 | 93.32 | 1.00 | 10.04 | 86.00 | 90.76 |
| | CAD | 25.75 | 49.65 | 0.25 | 6.30 | 36.25 | 56.18 |
| | COIECD | 65.75 | 76.63 | 0.50 | 7.19 | 56.75 | 71.96 |
| | AdaCAD | 81.25 | 87.50 | 0.75 | 8.94 | 82.25 | 88.56 |
| | CoCoA | 26.50 | 46.41 | 0.25 | 6.19 | 30.75 | 50.60 |
| | ARR (Ours) | 85.75 | 91.09 | 15.75 | 22.92 | 88.25 | 92.27 |
| Llama2-13B | Greedy | 86.50 | 90.79 | 1.75 | 9.58 | 93.75 | 95.83 |
| | CAD | 66.00 | 75.74 | 0.50 | 7.75 | 72.50 | 80.35 |
| | COIECD | 80.00 | 84.91 | 1.00 | 8.62 | 84.00 | 88.25 |
| | AdaCAD | 85.25 | 89.83 | 1.75 | 9.65 | 93.25 | 95.40 |
| | CoCoA | 80.25 | 85.44 | 1.50 | 9.13 | 85.50 | 89.46 |
| | ARR (Ours) | 84.25 | 89.16 | 17.00 | 23.21 | 93.75 | 95.81 |

Table 3: Performance on the three TriState-Bench subsets. $\mathcal{S}_{\mathrm{cor}}$ (Correction): gold context, prior incorrect; $\mathcal{S}_{\mathrm{res}}$ (Resistance): corrupted context, prior correct; $\mathcal{S}_{\mathrm{agr}}$ (Agreement): gold context, prior correct. ARR consistently dominates the resistance subset $\mathcal{S}_{\mathrm{res}}$ where all baselines collapse, while staying competitive on $\mathcal{S}_{\mathrm{cor}}$ and $\mathcal{S}_{\mathrm{agr}}$. Bold marks the best value within each model block.

### 7\.3Gate Validation

The gate (Eq. [14](https://arxiv.org/html/2606.10298#S6.E14)) routes between interpolation and extrapolation based on whether the context is more committed than the prior. We validate this choice by comparing four candidate signals, grouped by whether they carry directional information:

#### Magnitude\-only \(detect conflict existence\)\.

- A: $\arg\max p_{\text{ctx}} \neq \arg\max p_{\text{pri}}$ (top-1 token differs).
- B: $\text{JSD}(p_{\text{ctx}}\,\|\,p_{\text{pri}}) > 0.5$ (distributions are sharply divergent).

#### Confidence\-asymmetry \(resolve conflict direction\)\.

- C: $H(p_{\text{pri}}) > H(p_{\text{ctx}})$ (context sharpens the distribution).
- D: $\max p_{\text{ctx}} > \max p_{\text{pri}}$ (context raises top-1 confidence).

![Refer to caption](https://arxiv.org/html/2606.10298v1/x4.png)

Figure 4: Gate accuracy of each candidate signal. Confidence-asymmetry signals (C, D) consistently outperform magnitude-only signals (A, B), confirming that directionality is necessary for regime separation.

Magnitude-only signals fire whenever $p_{\text{pri},t}$ and $p_{\text{ctx},t}$ diverge but cannot determine which side is more reliable; signal A, in particular, barely exceeds the chance rate of 0.5. Confidence-asymmetry signals additionally resolve direction: a positive value indicates that the context concentrates more mass on its top prediction than the prior does. Because $\mathcal{S}_{\mathrm{cor}}$ and $\mathcal{S}_{\mathrm{res}}$ differ precisely in this asymmetry (Corollary [5](https://arxiv.org/html/2606.10298#Thmcorollary5)), only directional signals can reliably separate the two regimes. Figure [4](https://arxiv.org/html/2606.10298#S7.F4) confirms this: both C and D surpass B by a clear margin, while A hovers near chance, indicating that detecting conflict alone is insufficient without resolving its direction.
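The four candidate signals are simple functions of the two next-token distributions, and can be sketched as follows. This is an illustration rather than the paper's evaluation code: the function names are hypothetical, and signal B assumes the JSD is normalized by $\log 2$ (as in Eq. 16) so that the $0.5$ threshold sits on a $[0, 1]$ scale.

```python
import math

def entropy(p):
    """Shannon entropy in nats; zero-probability tokens contribute nothing."""
    return -sum(x * math.log(x) for x in p if x > 0)

def normalized_jsd(p, q):
    """Jensen-Shannon divergence divided by log 2 (assumed normalization for signal B)."""
    def kl_to_mid(a, b):
        return sum(x * math.log(x / (0.5 * (x + y))) for x, y in zip(a, b) if x > 0)
    return 0.5 * (kl_to_mid(p, q) + kl_to_mid(q, p)) / math.log(2.0)

def argmax(p):
    return max(range(len(p)), key=p.__getitem__)

def gate_signals(p_pri, p_ctx):
    """Return the four boolean candidate signals A-D for one decoding step."""
    return {
        "A": argmax(p_ctx) != argmax(p_pri),        # magnitude-only: top-1 differs
        "B": normalized_jsd(p_ctx, p_pri) > 0.5,    # magnitude-only: sharp divergence
        "C": entropy(p_pri) > entropy(p_ctx),       # directional: context is sharper
        "D": max(p_ctx) > max(p_pri),               # directional: context more confident
    }

# Hypothetical step where the context is confident and disagrees with the prior.
print(gate_signals([0.70, 0.20, 0.10], [0.05, 0.90, 0.05]))
```

Signals A and B are symmetric in their two arguments, so swapping prior and context leaves them unchanged; C and D flip, which is the directional information the gate relies on.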

### 7\.4Ablation Study

| Method | Trad QA | TriState | Avg. |
|---|---|---|---|
| Greedy | 41.5/49.6 | 58.8/64.7 | 45.0/52.6 |
| CAD | 11.5/19.5 | 20.8/37.4 | 13.4/23.1 |
| COIECD | 33.5/42.6 | 41.0/51.9 | 35.0/44.4 |
| AdaCAD | 38.2/46.4 | 54.8/61.7 | 41.5/49.5 |
| CoCoA | 21.3/31.7 | 19.2/34.4 | 20.9/32.3 |
| ARR-JS (Ours) | 40.1/48.6 | 63.3/68.8 | 44.7/52.6 |
| ARR-KL | 35.8/44.1 | 62.1/68.0 | 41.0/48.9 |
| ARR-D | 37.8/46.3 | 59.9/65.8 | 42.2/50.2 |

Table 4: Ablation results on strength on Qwen2.5-7B (EM/F1).

Having validated the gate signal in Section [7.3](https://arxiv.org/html/2606.10298#S7.SS3), we fix $d_t = \mathbb{1}[\max p_{\mathrm{ctx},t} > \max p_{\mathrm{pri},t}]$ and vary the strength function $s_t$ among three choices: normalized JSD (ARR-JS, our default), $1 - \exp(-\mathrm{KL})$ (ARR-KL), and a constant $s_t \equiv 0.5$ (ARR-D). As shown in Table [4](https://arxiv.org/html/2606.10298#S7.T4), all three variants surpass every baseline on TriState-Bench; even ARR-D reaches $59.9$ EM, above AdaCAD ($54.8$). The gain on $\mathcal{S}_{\mathrm{res}}$ therefore stems from the gate direction rather than the strength formula, confirming that explicitly routing between interpolation and extrapolation matters more than the magnitude of the adjustment. On Trad QA, however, the formula does matter: ARR-JS stays within $1.4$ EM of Greedy ($40.1$ vs. $41.5$), the smallest degradation among all contrastive methods, whereas ARR-KL and ARR-D drop by $5.7$ and $3.7$ EM respectively. Balancing TriState-Bench gains with minimal Trad QA regression, we adopt ARR-JS as the default.
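The three strength variants differ only in how $s_t$ is computed before it enters $\tau_t = 1 + (2d_t - 1)\,s_t$. The sketch below makes that concrete. It is an illustration under assumptions: the function names are hypothetical, and the direction of the KL term in ARR-KL is not specified in this section, so $\mathrm{KL}(p_{\mathrm{ctx}} \,\|\, p_{\mathrm{pri}})$ is assumed.

```python
import math

def kl(p, q, eps=1e-12):
    """KL(p || q) in nats; eps guards against zeros in q."""
    return sum(x * math.log(x / (y + eps)) for x, y in zip(p, q) if x > 0)

def normalized_jsd(p, q):
    m = [0.5 * (x + y) for x, y in zip(p, q)]
    return 0.5 * (kl(p, m, 0.0) + kl(q, m, 0.0)) / math.log(2.0)

def strength(p_pri, p_ctx, kind):
    """Strength s_t in [0, 1] for the three ablation variants."""
    if kind == "js":       # ARR-JS: normalized Jensen-Shannon divergence
        s = normalized_jsd(p_ctx, p_pri)
    elif kind == "kl":     # ARR-KL: 1 - exp(-KL); KL direction is an assumption
        s = 1.0 - math.exp(-kl(p_ctx, p_pri))
    elif kind == "const":  # ARR-D: constant strength
        s = 0.5
    else:
        raise ValueError(f"unknown strength kind: {kind}")
    return min(1.0, max(0.0, s))

def mixing_coefficient(p_pri, p_ctx, kind):
    """tau_t = 1 + (2 d_t - 1) s_t with the max-probability gate."""
    d = 1.0 if max(p_ctx) > max(p_pri) else 0.0
    return 1.0 + (2.0 * d - 1.0) * strength(p_pri, p_ctx, kind)

p_pri, p_ctx = [0.90, 0.10], [0.60, 0.40]   # hypothetical: prior more committed
for kind in ("js", "kl", "const"):
    print(kind, round(mixing_coefficient(p_pri, p_ctx, kind), 3))
```

With the constant variant, $\tau_t$ takes only the two values $0.5$ and $1.5$, so ARR-D isolates the effect of the gate direction from the effect of the strength magnitude.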

### 7\.5Case Study: Cross\-Model Commonalities and Divergences

![Refer to caption](https://arxiv.org/html/2606.10298v1/x5.png)

Figure 5: Blue/red shading separates the interpolation ($\tau \in [0,1]$) and extrapolation ($\tau > 1$) regimes. $\mathcal{S}_{\mathrm{res}}$ decays monotonically with $\tau$; $\mathcal{S}_{\mathrm{cor}}$ peaks near $\tau \approx 1$ and collapses under extrapolation, with model-dependent severity.

Figure [5](https://arxiv.org/html/2606.10298#S7.F5) plots EM against $\tau$ on the full TriState-Bench for Llama3-8B, Mistral-7B and Qwen2.5-7B (more results in Figure [7](https://arxiv.org/html/2606.10298#A7.F7); detailed cases in Figure [8](https://arxiv.org/html/2606.10298#A7.F8)). The empirical patterns divide into two cross-model commonalities and one class of model-specific divergences.

#### Commonality 1: threshold reversal within interpolation\.

On every model, $\mathcal{S}_{\mathrm{cor}}$ EM rises sharply within the interpolation regime rather than climbing gradually. Llama3-8B jumps from 2% at $\tau = 0$ to 71% at $\tau = 0.75$; Mistral-7B from 0% to 66%; Qwen2.5-7B from 0% to 73%; Llama3-Instruct from 0% to 92%. Symmetrically, $\mathcal{S}_{\mathrm{res}}$ EM drops steeply over the same interval (Llama3-8B: 92% $\to$ 18%; Mistral: 72% $\to$ 16%; Qwen: 74% $\to$ 4%; Instruct: 82% $\to$ 5%).

This pattern is the aggregate signature of Corollary [5](https://arxiv.org/html/2606.10298#Thmcorollary5). Each sample carries a pairwise reversal threshold $\tau^{\star}_{a,b} \in (0,1)$ at which the decoded distribution flips its preference between the correct token $a$ and the distractor $b$. As $\tau$ sweeps upward, progressively more $\mathcal{S}_{\mathrm{cor}}$ samples cross their individual $\tau^{\star}$ and switch from wrong to right, while progressively more $\mathcal{S}_{\mathrm{res}}$ samples cross theirs and switch from right to wrong. The steepness of the curves reflects the concentration of $\tau^{\star}$ values: on Llama3-Instruct, most thresholds cluster near $\tau \approx 0.5$, producing an almost step-function rise. No single $\tau$ simultaneously resolves all samples, confirming that interpolation exposes an irreducible trade-off between correction power and resistance preservation.
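The reversal threshold can be written out explicitly from the power family $q_{\tau,t}(y) \propto p_{\mathrm{pri},t}(y)^{1-\tau}\,p_{\mathrm{ctx},t}(y)^{\tau}$. The following is a short derivation sketch rather than the paper's statement of Corollary 5; it assumes that $\ell^{\mathrm{prior}}_{a,b}$ and $\ell^{\mathrm{ctx}}_{a,b}$ denote the pairwise log-odds $\log\frac{p(a)}{p(b)}$ under the prior and the context-conditioned distribution, respectively.

$$\log\frac{q_{\tau,t}(a)}{q_{\tau,t}(b)} = (1-\tau)\,\ell^{\mathrm{prior}}_{a,b} + \tau\,\ell^{\mathrm{ctx}}_{a,b}, \qquad \tau^{\star}_{a,b} = \frac{\ell^{\mathrm{prior}}_{a,b}}{\ell^{\mathrm{prior}}_{a,b} - \ell^{\mathrm{ctx}}_{a,b}}.$$

The normalizer cancels in the ratio, so the decoded log-odds are affine in $\tau$ and vanish exactly at $\tau^{\star}_{a,b}$. Whenever the two log-odds have opposite signs, which is the case in both conflict states, the numerator and denominator share a sign and the numerator is smaller in magnitude, so $\tau^{\star}_{a,b}$ lies strictly inside $(0,1)$.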

#### Commonality 2: structural collapse of $\mathcal{S}_{\mathrm{res}}$ under extrapolation\.

Once $\tau$ crosses 1, $\mathcal{S}_{\mathrm{res}}$ EM drops to near zero on every model and stays flat across the entire extrapolation interval: Llama3-8B holds 0.75% from $\tau = 1.25$ to $\tau = 2$; Mistral-7B drops from 1.75% to 0.25%; Qwen2.5-7B from 0.50% to 0.25%; Llama3-Instruct at 0.25% throughout. This is not gradual degradation but a structural cliff: the transition from $\tau = 0.75$ to $\tau = 1.25$ collapses $\mathcal{S}_{\mathrm{res}}$ by 15–20$\times$ on every model.

Corollary [5](https://arxiv.org/html/2606.10298#Thmcorollary5) explains why. In $\mathcal{S}_{\mathrm{res}}$, the prior supports the correct token ($\ell^{\mathrm{prior}}_{a,b} > 0$) while the context favors the distractor ($\ell^{\mathrm{ctx}}_{a,b} < 0$), placing the reversal threshold at $\tau^{\star} \in (0,1)$. By $\tau = 1$ most samples have already flipped; extrapolation then drives the pairwise log-odds further negative without bound, locking in the wrong answer irreversibly. The collapse is not a failure of any particular method but a structural consequence of the power family: *any* member with $\tau > 1$ operates past the point of no return for resistance-state samples.

The canonical instance reproduces on all four models. On *What is the capital of France?* with misleading context *Lyon*, the prior-only output ($\tau = 0$) returns *Paris*, but greedy decoding ($\tau = 1$) and every extrapolation method (CAD, AdaCAD, CoCoA, COIECD) return *Lyon*. The same flip recurs on *first president of the United States* (Washington $\to$ Adams), *largest ocean* (Pacific $\to$ Atlantic), and *painter of the Mona Lisa* (Leonardo $\to$ Raphael).
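A toy version of the *Paris*/*Lyon* case shows the flip numerically. The probabilities below are invented for illustration and are not measurements from any of the evaluated models; only the power-family formula is taken from the paper.

```python
import math

VOCAB = ["Paris", "Lyon", "<other>"]
P_PRI = [0.80, 0.05, 0.15]   # hypothetical prior: the model knows the answer
P_CTX = [0.20, 0.70, 0.10]   # hypothetical distribution given the misleading context

def power_family(p_pri, p_ctx, tau):
    """q(y) proportional to p_pri(y)^(1 - tau) * p_ctx(y)^tau."""
    logits = [(1.0 - tau) * math.log(a) + tau * math.log(b) for a, b in zip(p_pri, p_ctx)]
    top = max(logits)
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def top_token(tau):
    q = power_family(P_PRI, P_CTX, tau)
    return VOCAB[q.index(max(q))]

def log_odds(tau):
    """Log-odds of the correct token (Paris) against the distractor (Lyon)."""
    q = power_family(P_PRI, P_CTX, tau)
    return math.log(q[0] / q[1])

for tau in (0.0, 0.5, 1.0, 2.0):
    print(f"tau={tau}: top={top_token(tau)}, log-odds={log_odds(tau):.3f}")
```

In this toy setting the answer flips from *Paris* to *Lyon* between $\tau = 0.5$ and $\tau = 1$, and at $\tau = 2$ the log-odds are more negative than under the context distribution alone, so extrapolation deepens the error rather than merely preserving it.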

#### Divergence: one mechanism, three generation\-mode failures\.

Corollary [6](https://arxiv.org/html/2606.10298#Thmcorollary6) predicts that extrapolation degrades generation through three routes, depending on where the prior concentrates mass: on continuation tokens, on stop tokens, or on low-probability noise tokens. Figure [9](https://arxiv.org/html/2606.10298#A7.F9) illustrates representative cases for each failure mode.

- Llama3-8B: over-generation. $\mathcal{S}_{\mathrm{agr}}$ EM drops from 82% at $\tau = 0.75$ to 21% at $\tau = 2$. On *Which country has the most pyramids?* (gold: *Sudan*), interpolation at $\tau = 0.75$ outputs a clean *Sudan*, while extrapolation at $\tau = 2$ copies verbatim from the context: *“Sudan has the most pyramids of any country in the world, with approximately 200 to 255 known pyramids…”*. Once the correct entity has been emitted, the context marginally favors continuation over EOS; as $\tau$ crosses into extrapolation, the negative exponent amplifies this preference, and Llama3-8B’s weak prior on EOS provides no counterforce, yielding runaway continuation that worsens with $\tau$ (Figure [6](https://arxiv.org/html/2606.10298#S7.F6)).
- Qwen2.5-7B: early stopping. $\mathcal{S}_{\mathrm{agr}}$ EM drops from 87% at $\tau = 0.75$ to 36% at $\tau = 2$. Extrapolation truncates multi-token answers at their first constituent: *Greenland shark* $\to$ *Greenland*, *Alexander Fleming* $\to$ *Alexander*, *Sargasso Sea* $\to$ *Sargasso*. Qwen’s prior places strong mass on EOS and sentence-final punctuation; as $\tau$ increases past 1, the same negative exponent amplifies that mass, triggering premature termination.
- Mistral-7B: distribution collapse. $\mathcal{S}_{\mathrm{agr}}$ EM drops from 81% at $\tau = 0.75$ to 25% at $\tau = 2$. As $\tau$ grows, outputs degenerate into repeated underscore strings (*“\_\_\_\_\_\_\_\_…”*), training-corpus residue such as scraped boilerplate (*“© BrainMass Inc. brainmass.com October 10, 2019, 1:00 am ad1”*), or bare instruction fragments (*“Instructions:”*). The negative exponent amplifies tokens to which the prior assigns negligible mass; Mistral’s residual probability on filler and template fragments from its training corpus is enough for extrapolation to redirect generation toward precisely these tokens.

All three failures trace to one mechanism: as $\tau$ enters the extrapolation regime ($\tau > 1$), the power family drives pairwise log-odds past the context endpoint while the negative exponent reshapes the prior’s tail. Which extreme manifests depends on where each model’s prior concentrates its residual mass: continuation tokens (Llama3), stop tokens (Qwen), or low-probability noise (Mistral). Corollary [6](https://arxiv.org/html/2606.10298#Thmcorollary6) guarantees that extrapolation degrades generation quality; the prior’s structural bias determines the failure mode.
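The effect of the negative exponent on the prior's tail can be seen in a three-token toy example. The numbers are hypothetical and chosen only to make the mechanism visible; they illustrate the noise-token route described for Mistral-7B and are not drawn from any model.

```python
import math

TOKENS = ["<eos>", "continuation", "noise"]
P_CTX = [0.90, 0.09, 0.01]       # hypothetical: the context strongly prefers stopping
P_PRI = [0.98, 0.0199, 0.0001]   # hypothetical: the prior gives the noise token negligible mass

def power_family(p_pri, p_ctx, tau):
    """q(y) proportional to p_ctx(y)^tau / p_pri(y)^(tau - 1)."""
    logits = [(1.0 - tau) * math.log(a) + tau * math.log(b) for a, b in zip(p_pri, p_ctx)]
    top = max(logits)
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

for tau in (1.0, 1.5, 2.0):
    q = power_family(P_PRI, P_CTX, tau)
    winner = TOKENS[q.index(max(q))]
    print(f"tau={tau}: top={winner}, q={[round(v, 3) for v in q]}")
```

For $\tau > 1$ the prior enters with exponent $1 - \tau < 0$, so each token is effectively divided by a power of its prior probability. A token that both distributions consider unlikely can therefore overtake the context's preferred token once its prior mass is small enough, which is what happens to the noise token at $\tau = 2$ in this example.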

![Refer to caption](https://arxiv.org/html/2606.10298v1/x6.png)

Figure 6: Generation length of Llama3-8B across all methods in correction, resistance, and agreement states.

## 8Conclusion

We generalize existing contrastive decoding methods into a power family $q_{\tau,t} \propto p_{\mathrm{pri},t}^{1-\tau}\,p_{\mathrm{ctx},t}^{\tau}$ and show that the family partitions at $\tau = 1$ into interpolation and extrapolation: two structurally distinct regimes. A regime asymmetry emerges: extrapolation amplifies errors unboundedly when the prior is already correct, interpolation under-corrects when the context is correct, and no single static regime covers both directions. To address this asymmetry, Adaptive Regime Routing (ARR) routes between regimes at each decoding step via a confidence-asymmetry gate, shifting the design question from how much to amplify the context to which side deserves authority, without introducing additional hyperparameters. To make the gains from routing separately measurable, TriState-Bench conditions conflict states on each model’s actual prior knowledge, enabling correction, resistance, and agreement to be independently observed. However, the current gate relies on a single statistical scalar to estimate context credibility. Extending ARR to multi-source and multi-turn settings, where credibility signals accumulate across retrieval rounds, is a natural next step.

## Limitations

This work has three limitations\.*Logit access*: routing signals are derived from the output distribution, so ARR does not apply to black\-box APIs that expose only generated text \(e\.g\., GPT\-4, Claude\)\.*Coarse credibility estimation*: the gate compresses context trustworthiness into a binary decision from a single scalar \(max\-probability gap\), without modeling source reliability or evidential consistency\. This work focuses on establishing the structural necessity of bidirectional regime switching; a more expressive credibility estimator that enables finer\-grained routing is a direct next step\.*Language coverage*: experiments are conducted exclusively on English QA; cross\-lingual and code\-mixed settings remain unevaluated\.

## Acknowledgments

This work was supported by Alibaba Group through the Alibaba Research Intern Program\.

## References

- Grattafiori et al\. \(2024\)Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al\-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Alex Vaughan, et al\.The llama 3 herd of models\.*arXiv preprint arXiv:2407\.21783*, 2024\.
- Jiang et al\. \(2023\)Albert Q\. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie\-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, and William El Sayed\.Mistral 7b, 2023\.URL[https://arxiv\.org/abs/2310\.06825](https://arxiv.org/abs/2310.06825)\.
- Joshi et al\. \(2017\)Mandar Joshi, Eunsol Choi, Daniel S Weld, and Luke Zettlemoyer\.Triviaqa: A large scale distantly supervised challenge dataset for reading comprehension\.In*Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics \(Volume 1: Long Papers\)*, pp\. 1601–1611, 2017\.
- Kasai et al\. \(2023\)Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Akari Asai, Xinyan Yu, Dragomir Radev, Noah A Smith, Yejin Choi, Kentaro Inui, et al\.Realtime qa: What’s the answer right now?*Advances in neural information processing systems*, 36:49025–49043, 2023\.
- Khandelwal et al\. \(2025\)Anant Khandelwal, Manish Gupta, and Puneet Agrawal\.Cocoa: Confidence\-and context\-aware adaptive decoding for resolving knowledge conflicts in large language models\.In*Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing*, pp\. 6846–6866, 2025\.
- Kwiatkowski et al\. \(2019\)Tom Kwiatkowski, Jennimaria Palomaki, Olivia Redfield, Michael Collins, Ankur Parikh, Chris Alberti, Danielle Epstein, Illia Polosukhin, Jacob Devlin, Kenton Lee, et al\.Natural questions: a benchmark for question answering research\.*Transactions of the Association for Computational Linguistics*, 7:453–466, 2019\.
- Lewis et al\. \(2020\)Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen\-tau Yih, Tim Rocktäschel, et al\.Retrieval\-augmented generation for knowledge\-intensive nlp tasks\.*Advances in neural information processing systems*, 33:9459–9474, 2020\.
- Li et al\. \(2023a\)Daliang Li, Ankit Singh Rawat, Manzil Zaheer, Xin Wang, Michal Lukasik, Andreas Veit, Felix Yu, and Sanjiv Kumar\.Large language models with controllable working memory\.In*Findings of the association for computational linguistics: ACL 2023*, pp\. 1774–1793, 2023a\.
- Li et al\. \(2025\)Gaotang Li, Yuzhong Chen, and Hanghang Tong\.Taming knowledge conflicts in language models\.*arXiv preprint arXiv:2503\.10996*, 2025\.
- Li et al\. \(2023b\)Xiang Lisa Li, Ari Holtzman, Daniel Fried, Percy Liang, Jason Eisner, Tatsunori B Hashimoto, Luke Zettlemoyer, and Mike Lewis\.Contrastive decoding: Open\-ended text generation as optimization\.In*Proceedings of the 61st annual meeting of the association for computational linguistics \(volume 1: Long papers\)*, pp\. 12286–12312, 2023b\.
- Longpre et al\. \(2021\)Shayne Longpre, Kartik Perisetla, Anthony Chen, Nikhil Ramesh, Chris DuBois, and Sameer Singh\.Entity\-based knowledge conflicts in question answering\.In*Proceedings of the 2021 conference on empirical methods in natural language processing*, pp\. 7052–7063, 2021\.
- Lu et al\. \(2022\)Pan Lu, Liang Qiu, Kai\-Wei Chang, Ying Nian Wu, Song\-Chun Zhu, Tanmay Rajpurohit, Peter Clark, and Ashwin Kalyan\.Dynamic prompt learning via policy gradient for semi\-structured mathematical reasoning\.*arXiv preprint arXiv:2209\.14610*, 2022\.
- Mallen et al\. \(2023\)Alex Mallen, Akari Asai, Victor Zhong, Rajarshi Das, Daniel Khashabi, and Hannaneh Hajishirzi\.When not to trust language models: Investigating effectiveness of parametric and non\-parametric memories\.In*Proceedings of the 61st annual meeting of the association for computational linguistics \(volume 1: Long papers\)*, pp\. 9802–9822, 2023\.
- Nakano et al\. \(2021\)Reiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, et al\.Webgpt: Browser\-assisted question\-answering with human feedback\.*arXiv preprint arXiv:2112\.09332*, 2021\.
- Petroni et al\. \(2019\)Fabio Petroni, Tim Rocktäschel, Sebastian Riedel, Patrick Lewis, Anton Bakhtin, Yuxiang Wu, and Alexander Miller\.Language models as knowledge bases?In*Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing \(EMNLP\-IJCNLP\)*, pp\. 2463–2473, 2019\.
- Qwen et al\. \(2025\)Qwen, An Yang, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chengyuan Li, Dayiheng Liu, Fei Huang, Haoran Wei, Huan Lin, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Yang, Jiaxi Yang, Jingren Zhou, Junyang Lin, Kai Dang, Keming Lu, Keqin Bao, Kexin Yang, Le Yu, Mei Li, Mingfeng Xue, Pei Zhang, Qin Zhu, Rui Men, Runji Lin, Tianhao Li, Tianyi Tang, Tingyu Xia, Xingzhang Ren, Xuancheng Ren, Yang Fan, Yang Su, Yichang Zhang, Yu Wan, Yuqiong Liu, Zeyu Cui, Zhenru Zhang, and Zihan Qiu\.Qwen2\.5 technical report, 2025\.URL[https://arxiv\.org/abs/2412\.15115](https://arxiv.org/abs/2412.15115)\.
- Roberts et al\. \(2020\)Adam Roberts, Colin Raffel, and Noam Shazeer\.How much knowledge can you pack into the parameters of a language model?In*Proceedings of the 2020 conference on empirical methods in natural language processing \(EMNLP\)*, pp\. 5418–5426, 2020\.
- Shi et al\. \(2024\)Weijia Shi, Xiaochuang Han, Mike Lewis, Yulia Tsvetkov, Luke Zettlemoyer, and Wen\-tau Yih\.Trusting your evidence: Hallucinate less with context\-aware decoding\.In*Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies \(Volume 2: Short Papers\)*, pp\. 783–791, 2024\.
- Touvron et al. (2023) Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, et al. Llama 2: Open foundation and fine-tuned chat models. *arXiv preprint arXiv:2307.09288*, 2023.
- Wang et al. (2025) Han Wang, Archiki Prasad, Elias Stengel-Eskin, and Mohit Bansal. Adacad: Adaptively decoding to balance conflicts between contextual and parametric knowledge. In *Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)*, pp. 11636–11652, 2025.
- Wu et al. (2024) Kevin Wu, Eric Wu, and James Zou. Clasheval: Quantifying the tug-of-war between an llm’s internal prior and external evidence. *Advances in neural information processing systems*, 37:33402–33422, 2024.
- Xu et al. (2024) Rongwu Xu, Zehan Qi, Zhijiang Guo, Cunxiang Wang, Hongru Wang, Yue Zhang, and Wei Xu. Knowledge conflicts for llms: A survey. In *Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing*, pp. 8541–8565, 2024.
- Yang et al. (2018) Zhilin Yang, Peng Qi, Saizheng Zhang, Yoshua Bengio, William Cohen, Ruslan Salakhutdinov, and Christopher D Manning. Hotpotqa: A dataset for diverse, explainable multi-hop question answering. In *Proceedings of the 2018 conference on empirical methods in natural language processing*, pp. 2369–2380, 2018.
- Yuan et al. (2024) Xiaowei Yuan, Zhao Yang, Yequan Wang, Shengping Liu, Jun Zhao, and Kang Liu. Discerning and resolving knowledge conflicts through adaptive decoding with contextual information-entropy constraint. In *Findings of the Association for Computational Linguistics: ACL 2024*, pp. 3903–3922, 2024.
- Zhao et al. (2025) Yu Zhao, Alessio Devoto, Giwon Hong, Xiaotang Du, Aryo Pradipta Gema, Hongru Wang, Xuanli He, Kam-Fai Wong, and Pasquale Minervini. Steering knowledge selection behaviours in llms via sae-based representation engineering. In *Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)*, pp. 5117–5136, 2025.
- Zhou et al. (2023) Wenxuan Zhou, Sheng Zhang, Hoifung Poon, and Muhao Chen. Context-faithful prompting for large language models. In *Findings of the Association for Computational Linguistics: EMNLP 2023*, pp. 14544–14556, 2023.

## Appendix A Background Methods: Detailed Formulations

### A.1 CAD

The CAD decoding distribution can be derived from a pointwise mutual information style adjustment:

$$
\begin{aligned}
q_t^{\text{CAD}}(y) &\propto p_{\text{ctx},t}(y)\left[\frac{p_{\text{ctx},t}(y)}{p_{\text{pri},t}(y)}\right]^{\alpha} \\
&= \exp\!\Big[(1+\alpha)\log p_{\text{ctx},t}(y)-\alpha\log p_{\text{pri},t}(y)\Big],
\end{aligned}
\tag{19}
$$

followed by softmax normalization over the vocabulary $V$. The case $\alpha=0$ reduces to standard decoding from $p_{\text{ctx},t}$.
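As a minimal sketch of Eq. (19), the adjustment can be written in a few lines of pure Python. The function names and the toy three-token logits below are illustrative assumptions, not part of the paper's released code.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]

def cad_distribution(ctx_logits, pri_logits, alpha):
    """CAD, Eq. (19): q(y) is proportional to p_ctx(y) * (p_ctx(y) / p_pri(y)) ** alpha."""
    log_ctx = log_softmax(ctx_logits)
    log_pri = log_softmax(pri_logits)
    adjusted = [(1 + alpha) * c - alpha * p for c, p in zip(log_ctx, log_pri)]
    # Softmax normalization over the (toy) vocabulary.
    return [math.exp(v) for v in log_softmax(adjusted)]

# Hypothetical 3-token vocabulary; the values are illustrative only.
ctx = [2.0, 1.0, 0.1]   # logits with context
pri = [0.5, 1.5, 0.2]   # logits without context
q0 = cad_distribution(ctx, pri, alpha=0.0)  # alpha = 0 recovers softmax(ctx)
q1 = cad_distribution(ctx, pri, alpha=1.0)  # amplifies tokens the context favours
```

In this toy case token 0 has the largest context-to-prior ratio, so its probability grows as $\alpha$ increases.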

### A.2 COIECD

COIECD first identifies conflict tokens via a contextual information-entropy constraint, and then applies different decoding logits to conflict and non-conflict tokens.

#### Conflict detection.

Let the information content of token $y_t$ be $I(y_t)=-\log p(y_t\mid x,c,y_{<t})$, and let the prior conditional entropy be $\mathcal{H}_1(y_t)=\mathcal{H}(y_t\mid x,y_{<t})$. Building on the Stable Entropy Hypothesis and the Locally Typical Set, COIECD assumes that for non-conflict tokens the information-entropy shift is bounded by a constant $\gamma$, i.e., $|I(y_t)-\mathcal{H}_1(y_t)|<\gamma$. Normalizing the shift via softmax gives $p_\delta(y_t)=\operatorname{softmax}(I(y_t)-\mathcal{H}_1(y_t))$, and a scaling factor $\lambda\in(0,1]$ defines the upper and lower bounds $u_{p_\delta}=\lambda\max_w p_\delta(w)$ and $l_{p_\delta}=\frac{1}{\lambda}\min_w p_\delta(w)$ (with $l_{p_\delta}=0$ when only one token violates the bound). The non-conflict set is then

$$
\mathcal{C}(y_{<t})=\{\,y\in V: l_{p_\delta}\leq p_\delta(y_t)\leq u_{p_\delta}\,\}. \tag{20}
$$

#### Adaptive decoding.

Let $p_1(y_t)=p(y_t\mid x,y_{<t})$, $p_2(y_t)=p(y_t\mid x,c,y_{<t})$, and the contrastive object $g(y_t)=\log p_2(y_t)-\log p_1(y_t)$. Conflict tokens use $p_2$ as the base distribution, non-conflict tokens use $p_1$, and $g$ is added on top as a uniform adjustment:

$$
\log\pi(y_t\mid x,c,y_{<t})=
\begin{cases}
\log p_1(y_t)+\alpha\,g(y_t), & y_t\in\mathcal{C}(y_{<t}),\\
\log p_2(y_t)+\alpha\,g(y_t), & y_t\notin\mathcal{C}(y_{<t}).
\end{cases}
\tag{21}
$$

Sampling is then performed via $y_t\sim\operatorname{softmax}[\log\pi]$. The original paper uses $\lambda=0.25$ and $\alpha=1$ on QA tasks.

### A.3 AdaCAD: Jensen-Shannon Divergence

AdaCAD uses Jensen–Shannon divergence as the step-wise contrast strength. Given two distributions $P,Q$ with $M=\frac{1}{2}(P+Q)$,

$$
\operatorname{JSD}(P\,\|\,Q)=\frac{1}{2}\,\operatorname{KL}(P\,\|\,M)+\frac{1}{2}\,\operatorname{KL}(Q\,\|\,M). \tag{22}
$$

At each decoding step, AdaCAD computes $\alpha_t^{\text{JSD}}=\operatorname{JSD}(p_{\text{pri},t}\,\|\,p_{\text{ctx},t})$ as a dynamic replacement for the fixed $\alpha$ in CAD: when the conflict is large, $\alpha_t^{\text{JSD}}$ approaches its upper bound and the contrastive adjustment is amplified; when the conflict is small, $\alpha_t^{\text{JSD}}$ shrinks toward zero and the adjustment is attenuated, recovering near-standard decoding from $p_{\text{ctx},t}$.
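The step-wise signal in Eq. (22) can be sketched as follows. The choice of natural logarithms (so that the upper bound is $\ln 2$) and the name `adacad_alpha` are assumptions made for illustration; with base-2 logarithms the bound would be 1.

```python
import math

def kl(p, q):
    """KL(p || q) in nats; terms with p(y) = 0 contribute zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    """Jensen-Shannon divergence, Eq. (22), in nats (bounded above by ln 2)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def adacad_alpha(p_pri, p_ctx):
    """Hypothetical helper: the dynamic alpha_t is the JSD between prior and context."""
    return jsd(p_pri, p_ctx)

# Identical distributions: no conflict, alpha_t = 0, i.e. plain decoding from p_ctx.
agree = adacad_alpha([0.7, 0.2, 0.1], [0.7, 0.2, 0.1])
# Nearly disjoint distributions: strong conflict, alpha_t close to ln 2.
clash = adacad_alpha([0.98, 0.01, 0.01], [0.01, 0.98, 0.01])
```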

### A.4 CoCoA: Multi-Signal Adaptive Gating

CoCoA replaces a single JSD signal with three conflict signals.

#### Rényi divergence

(order $\beta$, sensitive to long-tail probabilities):

$$
D_t^{\beta}\;=\;\frac{1}{\beta-1}\log\sum_{y\in V}p_{\text{pri},t}(y)^{\beta}\,p_{\text{ctx},t}(y)^{1-\beta}. \tag{23}
$$

#### Entropy gap

(the change in uncertainty after the context is introduced):

$$
\Delta\mathcal{H}_t\;=\;\mathcal{H}(p_{\text{pri},t})-\mathcal{H}(p_{\text{ctx},t}). \tag{24}
$$

#### Contextual peakedness

(whether the context distribution gives a clear top prediction):

$$
m_t\;=\;p_{\text{ctx},t}(y_t^{(1)})-p_{\text{ctx},t}(y_t^{(2)}), \tag{25}
$$

where $y_t^{(1)},y_t^{(2)}$ are the top-1 and top-2 tokens under $p_{\text{ctx},t}$.

#### Adaptive gating.

CoCoA first combines Rényi divergence and entropy gap into a conflict score $s_t=\sigma(D_t^{\beta}+\gamma\Delta\mathcal{H}_t+\delta)$, and then fuses contextual peakedness into the gating weight

$$
\lambda_t\;=\;\sigma\!\left(z\log m_t+\log\frac{1-s_t}{s_t}\right),\qquad z>1. \tag{26}
$$

The final distribution is normalized as $q_t^{\text{CoCoA}}(y)\propto p_{\text{ctx},t}(y)^{\lambda_t}\,p_{\text{pri},t}(y)^{1-\lambda_t}$. The original hyperparameters are $\beta=0.5$, $z=5$, $\gamma=1$, and $\delta=10^{-8}$.
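The three signals and the gate of Eqs. (23) to (26) can be sketched as below. This follows the paper-form equations only, not the public CoCoA code; the `floor` guard against $\log 0$ and all function names are assumptions introduced for the example.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def renyi(p_pri, p_ctx, beta):
    """Renyi divergence of order beta, Eq. (23)."""
    total = sum(a ** beta * b ** (1 - beta) for a, b in zip(p_pri, p_ctx))
    return math.log(total) / (beta - 1)

def cocoa_gate(p_pri, p_ctx, beta=0.5, z=5.0, gamma=1.0, delta=1e-8, floor=1e-12):
    """Return (s_t, lambda_t) from Eqs. (24)-(26)."""
    d = renyi(p_pri, p_ctx, beta)
    dh = entropy(p_pri) - entropy(p_ctx)              # entropy gap, Eq. (24)
    top1, top2 = sorted(p_ctx, reverse=True)[:2]
    m = max(top1 - top2, floor)                       # peakedness, Eq. (25); floor is an assumption
    score_arg = d + gamma * dh + delta
    s = sigmoid(score_arg)
    # log((1 - s) / s) equals -score_arg, which avoids log(0) when s saturates.
    lam = sigmoid(z * math.log(m) - score_arg)
    return s, lam

def cocoa_mix(p_pri, p_ctx, lam):
    """Paper-form mixture: q is proportional to p_ctx^lam * p_pri^(1 - lam)."""
    w = [c ** lam * p ** (1 - lam) for p, c in zip(p_pri, p_ctx)]
    total = sum(w)
    return [x / total for x in w]

p_pri = [0.7, 0.2, 0.1]   # illustrative distributions
p_ctx = [0.1, 0.8, 0.1]
s_t, lam_t = cocoa_gate(p_pri, p_ctx)
q = cocoa_mix(p_pri, p_ctx, lam_t)
```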

### A.5 Mapping to the Power Family

#### A.5.1 CAD.

For any fixed $\alpha\geq 0$, setting $\tau=1+\alpha$ makes the power-family member $q_{\tau,t}$ pointwise equal to the CAD adjusted distribution $q_t^{\text{CAD}}$ on $V$.

Let $z_i^{\text{pri}}$ and $z_i^{\text{ctx}}$ denote the raw logits assigned to token $i$ by the model on the two forward passes, with softmax normalizations

$$
p_{\text{pri},t}(i)=\frac{e^{z_i^{\text{pri}}}}{Z^{\text{pri}}},\quad p_{\text{ctx},t}(i)=\frac{e^{z_i^{\text{ctx}}}}{Z^{\text{ctx}}}, \tag{27}
$$

where $Z^{\text{pri}}=\sum_j e^{z_j^{\text{pri}}}$ and $Z^{\text{ctx}}=\sum_j e^{z_j^{\text{ctx}}}$. Writing $\ell_i^{\text{CAD}}\triangleq(1+\alpha)\,z_i^{\text{ctx}}-\alpha\,z_i^{\text{pri}}$, the original CAD formula reads

$$
q_t^{\text{CAD}}(i)=\frac{\exp\!\big(\ell_i^{\text{CAD}}\big)}{\sum_{j\in V}\exp\!\big(\ell_j^{\text{CAD}}\big)}. \tag{28}
$$

Taking the logarithm of the power-family form,

$$
\log q_{\tau,t}(i)=(1-\tau)\log p_{\text{pri},t}(i)+\tau\log p_{\text{ctx},t}(i)-\log Z_{\tau,t}, \tag{29}
$$

and substituting the softmax forms of $p_{\text{pri},t}$ and $p_{\text{ctx},t}$ yields

$$
\log q_{\tau,t}(i)=(1-\tau)\,z_i^{\text{pri}}+\tau\,z_i^{\text{ctx}}-C,\quad C\triangleq(1-\tau)\log Z^{\text{pri}}+\tau\log Z^{\text{ctx}}+\log Z_{\tau,t}, \tag{30}
$$

where $C$ does not depend on $i$. By the shift-invariance of softmax, $\operatorname{softmax}(z+c\mathbf{1})=\operatorname{softmax}(z)$, we obtain

$$
q_{\tau,t}(i)=\operatorname{softmax}\!\big((1-\tau)\,z^{\text{pri}}+\tau\,z^{\text{ctx}}\big)_i. \tag{31}
$$

Substituting $\tau=1+\alpha$ gives

$$
q_{\tau,t}(i)=\operatorname{softmax}\!\big((1+\alpha)\,z^{\text{ctx}}-\alpha\,z^{\text{pri}}\big)_i, \tag{32}
$$

which equals $q_t^{\text{CAD}}(i)$, i.e., the two distributions agree pointwise on the vocabulary.
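The equivalence can also be checked numerically. The sketch below builds the power-family member both from raw logits, as in Eq. (31), and from normalized probabilities, and compares it with the CAD form of Eq. (28); the helper names and the logit values are illustrative assumptions.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def power_family_from_logits(z_pri, z_ctx, tau):
    """Eq. (31): softmax of the affine combination of raw logits."""
    return softmax([(1 - tau) * a + tau * b for a, b in zip(z_pri, z_ctx)])

def power_family_from_probs(z_pri, z_ctx, tau):
    """Same member from probabilities: q is proportional to p_pri^(1-tau) * p_ctx^tau."""
    p_pri, p_ctx = softmax(z_pri), softmax(z_ctx)
    w = [a ** (1 - tau) * b ** tau for a, b in zip(p_pri, p_ctx)]
    total = sum(w)
    return [x / total for x in w]

def cad_from_logits(z_pri, z_ctx, alpha):
    """Eq. (28): softmax of (1 + alpha) * z_ctx - alpha * z_pri."""
    return softmax([(1 + alpha) * b - alpha * a for a, b in zip(z_pri, z_ctx)])

z_pri = [0.5, 1.5, 0.2, -1.0]   # illustrative raw logits
z_ctx = [2.0, 1.0, 0.1, -0.5]
alpha = 0.5
q_logits = power_family_from_logits(z_pri, z_ctx, tau=1 + alpha)
q_probs = power_family_from_probs(z_pri, z_ctx, tau=1 + alpha)
q_cad = cad_from_logits(z_pri, z_ctx, alpha)
max_gap = max(abs(a - b) for a, b in zip(q_logits, q_cad))
```

The normalizers $Z^{\text{pri}}$ and $Z^{\text{ctx}}$ only shift all logits by a constant, which is why working with raw logits or with probabilities gives the same distribution.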

#### A.5.2 COIECD.

COIECD is the most special of the four methods: rather than landing at a single point or a continuous interval on the path, it jumps along the path. The reason follows from its two-step structure in Appendix [A.2](https://arxiv.org/html/2606.10298#A1.SS2).

Recall the adaptive decoding logit

$$
\log\pi(y_t\mid x,c,y_{<t})=
\begin{cases}
\log p_1(y_t)+\alpha\,g(y_t), & y_t\in\mathcal{C}(y_{<t}),\\
\log p_2(y_t)+\alpha\,g(y_t), & y_t\notin\mathcal{C}(y_{<t}),
\end{cases}
\tag{33}
$$

where $g(y_t)=\log p_{\text{ctx},t}(y_t)-\log p_{\text{pri},t}(y_t)$. Although the two branches share the same contrastive object $g$, their base distributions differ: the non-conflict branch uses $p_{\text{pri},t}$ as the base, while the conflict branch uses $p_{\text{ctx},t}$. For each token group, COIECD is therefore equivalent to a member of the power family at a different $\tau$:

$$
\tau_t=\begin{cases}\alpha, & y_t\in\mathcal{C}(y_{<t}),\\ 1+\alpha, & y_t\notin\mathcal{C}(y_{<t}).\end{cases} \tag{34}
$$

That is, at every decoding step COIECD partitions $V$ into two groups corresponding to the path members at $\tau=\alpha$ and $\tau=1+\alpha$, respectively, and merges them through softmax normalization. This is the only method that jumps along the path rather than sliding along it: the other three assign a single $\tau$ to all tokens at a given step, whereas COIECD assigns different $\tau$ values to different tokens within the same step.

The value of $\alpha$ controls the magnitude of the jump. The original paper uses $\alpha=1$ on QA tasks, corresponding to $\tau\in\{1,2\}$, i.e., a jump between context-only decoding and moderate extrapolation.
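A small check of the per-token mapping in Eq. (34), assuming the non-conflict mask is already given (conflict detection itself is not reproduced here). The function names, the mask, and the toy distributions are hypothetical.

```python
import math

def coiecd_logits(log_p1, log_p2, non_conflict, alpha):
    """Eq. (33): base is log p1 (prior) for non-conflict tokens, log p2 (context) otherwise."""
    out = []
    for lp1, lp2, is_non_conflict in zip(log_p1, log_p2, non_conflict):
        g = lp2 - lp1                          # contrastive object g
        base = lp1 if is_non_conflict else lp2
        out.append(base + alpha * g)
    return out

def power_logit(lp_pri, lp_ctx, tau):
    """Unnormalized log-probability of the power-family member at tau."""
    return (1 - tau) * lp_pri + tau * lp_ctx

def coiecd_tau(is_non_conflict, alpha):
    """Eq. (34): tau = alpha on the non-conflict set, 1 + alpha elsewhere."""
    return alpha if is_non_conflict else 1 + alpha

p_pri = [0.7, 0.2, 0.1]            # illustrative distributions
p_ctx = [0.1, 0.8, 0.1]
mask = [False, False, True]        # hypothetical non-conflict membership per token
alpha = 1.0
lp1 = [math.log(p) for p in p_pri]
lp2 = [math.log(p) for p in p_ctx]
direct = coiecd_logits(lp1, lp2, mask, alpha)
via_tau = [power_logit(a, b, coiecd_tau(nc, alpha)) for a, b, nc in zip(lp1, lp2, mask)]
taus = sorted({coiecd_tau(nc, alpha) for nc in mask})
```

With $\alpha=1$ the set of per-token values is $\{1, 2\}$, matching the jump described above.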

#### A.5.3 AdaCAD.

AdaCAD has the same form as CAD, except that the fixed $\alpha$ is replaced by the step-wise signal $\alpha_t^{\text{JSD}}=\operatorname{JSD}(p_{\text{pri},t}\,\|\,p_{\text{ctx},t})$, corresponding to $\tau_t=1+\alpha_t^{\text{JSD}}$.

#### A.5.4 CoCoA.

The paper form and the public code give two different correspondences. The paper form

$$
q_t^{\text{CoCoA}}(y)\propto p_{\text{ctx},t}(y)^{\lambda_t}\,p_{\text{pri},t}(y)^{1-\lambda_t} \tag{35}
$$

aligns directly with the power family at $\tau_t=\lambda_t\in[0,1]$, falling in the *interpolation* regime. The public code, however, adds an independently weighted PMI bias $\gamma\big(\log p_{\text{ctx},t}-\log p_{\text{pri},t}\big)$ on top of this paper-form mixture; after rearrangement, this is equivalent to the power-family member at $\tau_t=\lambda_t+\gamma$. The default code setting uses $\gamma=1$ and hard-codes $\lambda_t=0.5$, giving $\tau_t=1.5$, which falls in the *extrapolation* regime. This is why CoCoA is starred in Table [1](https://arxiv.org/html/2606.10298#S3.T1) and discussed alongside the extrapolation methods.

## Appendix B Theoretical Proofs

### B.1 Theorem 1

###### Proof.

The feasible set $\mathcal{C}_\epsilon=\{q\in\Delta:\mathbb{D}_{\mathrm{KL}}(q\|p_{\mathrm{ctx}})\leq\epsilon\}$ is non-empty (it contains $p_{\mathrm{ctx}}$), closed (as a sublevel set of a continuous function), and convex (as a sublevel set of a convex function); being a closed subset of the compact simplex $\Delta$, it is compact. The objective $q\mapsto\mathbb{D}_{\mathrm{KL}}(q\|p_{\mathrm{pri}})$ is continuous and strictly convex on $\Delta$, so by Weierstrass it attains a minimum on $\mathcal{C}_\epsilon$, and by strict convexity the minimizer $q^\star$ is unique.

To see that $q^\star$ has full support, suppose $q^\star(y_0)=0$ for some $y_0\in V$. Pick any $y_1$ with $q^\star(y_1)>0$ and set $q_t=q^\star+t(e_{y_0}-e_{y_1})$ for small $t>0$. This $q_t$ remains in $\Delta$: all coordinates stay non-negative for $t<q^\star(y_1)$, and the sum is preserved. For the constraint, the $y_0$-coordinate contributes $t\log t-t\log p_{\mathrm{ctx}}(y_0)$ to $\mathbb{D}_{\mathrm{KL}}(q_t\|p_{\mathrm{ctx}})$, whose right derivative at $t=0^+$ is $-\infty$ (since $\log t+1\to-\infty$ while the remaining coordinates contribute finite derivatives). So $\mathbb{D}_{\mathrm{KL}}(q_t\|p_{\mathrm{ctx}})<\mathbb{D}_{\mathrm{KL}}(q^\star\|p_{\mathrm{ctx}})\leq\epsilon$ for sufficiently small $t$, and $q_t$ is feasible. The same $-\infty$ right derivative applies to the objective $\mathbb{D}_{\mathrm{KL}}(q_t\|p_{\mathrm{pri}})$, so the objective strictly decreases along $q_t$, contradicting optimality of $q^\star$. Hence $q^\star(y)>0$ for all $y\in V$.

We next show that the KL constraint is active at $q^\star$. Suppose $\mathbb{D}_{\mathrm{KL}}(q^\star\|p_{\mathrm{ctx}})<\epsilon$. Since $\epsilon<\mathbb{D}_{\mathrm{KL}}(p_{\mathrm{pri}}\|p_{\mathrm{ctx}})$, the point $p_{\mathrm{pri}}$ is infeasible, so $q^\star\neq p_{\mathrm{pri}}$. Consider $q_t=(1-t)q^\star+t\,p_{\mathrm{pri}}$ for small $t>0$. By convexity of the constraint, $\mathbb{D}_{\mathrm{KL}}(q_t\|p_{\mathrm{ctx}})\leq(1-t)\mathbb{D}_{\mathrm{KL}}(q^\star\|p_{\mathrm{ctx}})+t\,\mathbb{D}_{\mathrm{KL}}(p_{\mathrm{pri}}\|p_{\mathrm{ctx}})$, which is still below $\epsilon$ for small $t$ by continuity. But by strict convexity of the objective, $\mathbb{D}_{\mathrm{KL}}(q_t\|p_{\mathrm{pri}})<(1-t)\mathbb{D}_{\mathrm{KL}}(q^\star\|p_{\mathrm{pri}})+t\cdot 0<\mathbb{D}_{\mathrm{KL}}(q^\star\|p_{\mathrm{pri}})$, contradicting optimality. So $\mathbb{D}_{\mathrm{KL}}(q^\star\|p_{\mathrm{ctx}})=\epsilon$.

Now restrict to $X_+=\{q\in\mathbb{R}^{|V|}:q(y)>0\}$, where both KL divergences are $C^\infty$. Slater’s condition is satisfied by $\bar{q}=p_{\mathrm{ctx}}\in X_+$ (which gives $\mathbb{D}_{\mathrm{KL}}(\bar{q}\|p_{\mathrm{ctx}})=0<\epsilon$). Since $q^\star\in X_+$ and satisfies the first-order constraint qualification, KKT is necessary and sufficient. Write the Lagrangian

$$
L(q,\eta,\nu)=\mathbb{D}_{\mathrm{KL}}(q\|p_{\mathrm{pri}})+\eta\big[\mathbb{D}_{\mathrm{KL}}(q\|p_{\mathrm{ctx}})-\epsilon\big]+\nu\Big(\sum_y q(y)-1\Big) \tag{36}
$$

with $\eta\geq 0$. Differentiating with respect to $q(y)$ and setting to zero:

$$
\log q^\star(y)+1-\log p_{\mathrm{pri}}(y)+\eta\big[\log q^\star(y)+1-\log p_{\mathrm{ctx}}(y)\big]+\nu=0. \tag{37}
$$

Collecting terms,

$$
(1+\eta)\log q^\star(y)=\log p_{\mathrm{pri}}(y)+\eta\log p_{\mathrm{ctx}}(y)-(1+\eta)-\nu. \tag{38}
$$

The right-hand side separates into a $y$-dependent part and a constant $C=-(1+\eta)-\nu$. Exponentiating and normalizing,

$$
q^\star(y)\propto p_{\mathrm{pri}}(y)^{1/(1+\eta)}\,p_{\mathrm{ctx}}(y)^{\eta/(1+\eta)}. \tag{39}
$$

Set $\tau=\eta/(1+\eta)$, so that $1-\tau=1/(1+\eta)$. Then $q^\star=q_\tau$ with $q_\tau(y)\propto p_{\mathrm{pri}}(y)^{1-\tau}p_{\mathrm{ctx}}(y)^{\tau}$. Since the constraint is active, $\eta>0$ (otherwise $q^\star=p_{\mathrm{pri}}$, which is infeasible), hence $\tau\in(0,1)$.

It remains to show that $\tau$ is uniquely determined by $\epsilon$. Define $\phi(\tau):=\mathbb{D}_{\mathrm{KL}}(q_\tau\|p_{\mathrm{ctx}})$. Write $r(y)=\log[p_{\mathrm{ctx}}(y)/p_{\mathrm{pri}}(y)]$ and note that $q_\tau(y)=\exp(\log p_{\mathrm{pri}}(y)+\tau r(y)-A(\tau))$, where $A(\tau)=\log\sum_y p_{\mathrm{pri}}(y)e^{\tau r(y)}$. Standard exponential-family identities give $A'(\tau)=\mathbb{E}_{q_\tau}[r]$ and $A''(\tau)=\mathrm{Var}_{q_\tau}[r]$. A direct computation yields

$$
\phi(\tau)=(\tau-1)A'(\tau)-A(\tau),\qquad\phi'(\tau)=(\tau-1)\,\mathrm{Var}_{q_\tau}[r]. \tag{40}
$$

When $p_{\mathrm{pri}}\neq p_{\mathrm{ctx}}$, $r$ is non-constant under any $q_\tau$ (since $q_\tau$ has full support), so $\mathrm{Var}_{q_\tau}[r]>0$. Hence $\phi'(\tau)<0$ for $\tau\in(0,1)$, meaning $\phi$ is strictly decreasing on $[0,1]$. Since $\phi(0)=\mathbb{D}_{\mathrm{KL}}(p_{\mathrm{pri}}\|p_{\mathrm{ctx}})$ and $\phi(1)=0$, the intermediate value theorem gives a unique $\tau^\star\in(0,1)$ with $\phi(\tau^\star)=\epsilon$ for each $\epsilon\in(0,\mathbb{D}_{\mathrm{KL}}(p_{\mathrm{pri}}\|p_{\mathrm{ctx}}))$. ∎
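Because $\phi$ is strictly decreasing on $[0,1]$, the $\tau^\star$ matching a given budget $\epsilon$ can be found by bisection. The sketch below is only an illustration of that monotonicity argument; `solve_tau`, the iteration count, and the toy distributions are assumptions.

```python
import math

def power_family(p_pri, p_ctx, tau):
    """q_tau(y) is proportional to p_pri(y)^(1-tau) * p_ctx(y)^tau."""
    w = [a ** (1 - tau) * b ** tau for a, b in zip(p_pri, p_ctx)]
    total = sum(w)
    return [x / total for x in w]

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def solve_tau(p_pri, p_ctx, eps, iters=60):
    """Bisection on phi(tau) = KL(q_tau || p_ctx), which decreases from phi(0) to phi(1) = 0."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if kl(power_family(p_pri, p_ctx, mid), p_ctx) > eps:
            lo = mid      # still too far from the context: move toward tau = 1
        else:
            hi = mid
    return (lo + hi) / 2

p_pri = [0.7, 0.2, 0.1]                 # illustrative distributions
p_ctx = [0.1, 0.8, 0.1]
eps = 0.5 * kl(p_pri, p_ctx)            # any value in (0, KL(p_pri || p_ctx))
tau_star = solve_tau(p_pri, p_ctx, eps)
residual = kl(power_family(p_pri, p_ctx, tau_star), p_ctx) - eps
```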

### B.2 Theorem 2

###### Proof.

Write $F_\eta$ in expanded form:

$$
F_\eta(q)=(1-\eta)\sum_y q(y)\log q(y)-\sum_y q(y)\log p_{\mathrm{ctx}}(y)+\eta\sum_y q(y)\log p_{\mathrm{pri}}(y). \tag{41}
$$

Since $0<\eta<1$, the coefficient $1-\eta$ on the negative-entropy term is positive. The function $x\mapsto x\log x$ is strictly convex on $[0,1]$, and the remaining terms are linear in $q$, so $F_\eta$ is strictly convex on $\Delta$. As $\Delta$ is compact, there exists a unique minimizer $q^\star$.

We claim $q^\star(y)>0$ for all $y$. If $q^\star(y_0)=0$, take $q_t=q^\star+t(e_{y_0}-e_{y_1})$ as before. The $y_0$-coordinate contributes $(1-\eta)\cdot t\log t$ to $F_\eta(q_t)$ plus terms linear in $t$; the right derivative at $t=0^+$ is dominated by $(1-\eta)(\log t+1)\to-\infty$, so $F_\eta$ decreases along this direction, contradicting minimality.

With $q^\star$ in the interior, we minimize over $\{q\in X_+:\sum_y q(y)=1\}$. There is no inequality constraint, so Slater’s condition holds trivially. The stationarity condition for the Lagrangian $L(q,\nu)=F_\eta(q)+\nu(\sum_y q(y)-1)$ reads

$$
(1-\eta)(\log q^\star(y)+1)-\log p_{\mathrm{ctx}}(y)+\eta\log p_{\mathrm{pri}}(y)+\nu=0. \tag{42}
$$

Since $1-\eta>0$, divide through:

$$
\log q^\star(y)=\frac{1}{1-\eta}\log p_{\mathrm{ctx}}(y)-\frac{\eta}{1-\eta}\log p_{\mathrm{pri}}(y)+C, \tag{43}
$$

where $C=-1-\nu/(1-\eta)$ is independent of $y$. Set $\tau=1/(1-\eta)$, giving $1-\tau=-\eta/(1-\eta)$, so that

$$
q^\star(y)\propto p_{\mathrm{pri}}(y)^{1-\tau}\,p_{\mathrm{ctx}}(y)^{\tau}=q_\tau(y). \tag{44}
$$

Since $\eta\in(0,1)$, we have $\tau=1/(1-\eta)\in(1,+\infty)$. The boundary case $\eta=0$ gives $F_0(q)=\mathbb{D}_{\mathrm{KL}}(q\|p_{\mathrm{ctx}})$, whose unique minimizer is $q^\star=p_{\mathrm{ctx}}$, corresponding to $\tau=1$. ∎
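As a numerical sanity check of this result, the sketch below evaluates the expanded objective of Eq. (41) at $q_\tau$ with $\tau=1/(1-\eta)$ and at a few other points of the simplex. The comparison points and distributions are arbitrary illustrative choices, and the check is not a substitute for the proof.

```python
import math

def power_family(p_pri, p_ctx, tau):
    """q_tau(y) is proportional to p_pri(y)^(1-tau) * p_ctx(y)^tau."""
    w = [a ** (1 - tau) * b ** tau for a, b in zip(p_pri, p_ctx)]
    total = sum(w)
    return [x / total for x in w]

def f_eta(q, p_pri, p_ctx, eta):
    """Expanded objective from Eq. (41); terms with q(y) = 0 contribute zero."""
    return sum(
        (1 - eta) * qi * math.log(qi) - qi * math.log(c) + eta * qi * math.log(p)
        for qi, p, c in zip(q, p_pri, p_ctx)
        if qi > 0
    )

p_pri = [0.7, 0.2, 0.1]        # illustrative distributions
p_ctx = [0.1, 0.8, 0.1]
eta = 0.5
tau = 1 / (1 - eta)            # Theorem 2 mapping, here tau = 2
q_star = power_family(p_pri, p_ctx, tau)

# Arbitrary comparison points on the simplex.
others = [p_ctx, p_pri, [1 / 3, 1 / 3, 1 / 3], power_family(p_pri, p_ctx, 1.5)]
f_star = f_eta(q_star, p_pri, p_ctx, eta)
f_others = [f_eta(q, p_pri, p_ctx, eta) for q in others]
```

Every comparison point gives a strictly larger objective than `f_star`, as expected for the unique minimizer of a strictly convex function.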

### B.3 Proposition 3

###### Proof.

The normalization constant $Z_\tau$ cancels in the pairwise ratio:

$$
\frac{q_\tau(a)}{q_\tau(b)}=\left(\frac{p_{\mathrm{pri}}(a)}{p_{\mathrm{pri}}(b)}\right)^{1-\tau}\left(\frac{p_{\mathrm{ctx}}(a)}{p_{\mathrm{ctx}}(b)}\right)^{\tau}. \tag{45}
$$

Taking the log gives $\ell_{a,b}(\tau)=(1-\tau)\ell^{\mathrm{pri}}_{a,b}+\tau\,\ell^{\mathrm{ctx}}_{a,b}=\ell^{\mathrm{pri}}_{a,b}+\tau\,\Delta_{a,b}$. When $\Delta_{a,b}\neq 0$, this affine function has a unique zero at $\tau^\star_{a,b}=-\ell^{\mathrm{pri}}_{a,b}/\Delta_{a,b}$. ∎

### B.4 Corollary 5

###### Proof.

We instantiate Proposition [4](https://arxiv.org/html/2606.10298#Thmcorollary4) for each state.

Correction. $\ell^{\mathrm{pri}}_{a,b}<0<\ell^{\mathrm{ctx}}_{a,b}$ gives $\Delta_{a,b}>0$ and $-\ell^{\mathrm{pri}}_{a,b}>0$, so $\tau^\star_{a,b}>0$. Also $\tau^\star_{a,b}<1\Leftrightarrow-\ell^{\mathrm{pri}}_{a,b}<\Delta_{a,b}\Leftrightarrow\ell^{\mathrm{ctx}}_{a,b}>0$, which holds by assumption. Hence $\tau^\star_{a,b}\in(0,1)$.

Resistance. $\ell^{\mathrm{pri}}_{a,b}>0>\ell^{\mathrm{ctx}}_{a,b}$ gives $\Delta_{a,b}<0$ and $-\ell^{\mathrm{pri}}_{a,b}<0$, so $\tau^\star_{a,b}>0$. Dividing $-\ell^{\mathrm{pri}}_{a,b}>\Delta_{a,b}$ by $\Delta_{a,b}<0$ flips the inequality, giving $\tau^\star_{a,b}<1$ iff $\ell^{\mathrm{ctx}}_{a,b}<0$, which holds. Since $\ell_{a,b}(0)=\ell^{\mathrm{pri}}_{a,b}>0$ and the slope $\Delta_{a,b}<0$, $\ell_{a,b}(\tau)$ decreases through zero at $\tau^\star_{a,b}$ and continues into the negative region for $\tau>1$, amplifying the wrong preference.

Agreement. For $\tau\in[0,1]$, $\ell_{a,b}(\tau)$ is a convex combination of two positive numbers and stays positive. For $\tau>1$ with $\Delta_{a,b}<0$, an analogous calculation gives $\tau^\star_{a,b}>1$, so any reversal lies strictly inside the extrapolation regime; for $\Delta_{a,b}\geq 0$, no reversal occurs at all. ∎
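The reversal threshold $\tau^\star_{a,b}=-\ell^{\mathrm{pri}}_{a,b}/\Delta_{a,b}$ and the three cases can be illustrated on two-token toy distributions, where index 0 plays the role of $a$ and index 1 the role of $b$. The numbers and function names are assumptions chosen only to land in each state.

```python
import math

def log_odds(p, a, b):
    """Pairwise log-odds l_{a,b} = log p(a) - log p(b)."""
    return math.log(p[a] / p[b])

def log_odds_at(p_pri, p_ctx, a, b, tau):
    """Affine identity: l_{a,b}(tau) = (1 - tau) * l_pri + tau * l_ctx."""
    return (1 - tau) * log_odds(p_pri, a, b) + tau * log_odds(p_ctx, a, b)

def reversal_tau(p_pri, p_ctx, a, b):
    """tau* = -l_pri / Delta, or None when Delta = 0 (no crossing)."""
    l_pri = log_odds(p_pri, a, b)
    delta = log_odds(p_ctx, a, b) - l_pri
    if delta == 0:
        return None
    return -l_pri / delta

# Correction: prior prefers b, context prefers a.
tau_cor = reversal_tau([0.2, 0.8], [0.9, 0.1], 0, 1)
# Resistance: prior prefers a, context prefers b.
tau_res = reversal_tau([0.9, 0.1], [0.2, 0.8], 0, 1)
# Agreement with Delta < 0: both prefer a, the context less strongly.
tau_agr = reversal_tau([0.9, 0.1], [0.6, 0.4], 0, 1)

# In the resistance case, extrapolating to tau = 2 flips the preference toward b.
res_at_2 = log_odds_at([0.9, 0.1], [0.2, 0.8], 0, 1, tau=2.0)
```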

### B.5 Corollary 6

###### Proof.

Apply Proposition [4](https://arxiv.org/html/2606.10298#Thmcorollary4) with $a=c$ and $b=s$. The affine identity gives $\ell_{c,s}(\tau)=\ell^{\mathrm{ctx}}_{c,s}+(\tau-1)\Delta_{c,s}$, so for $\tau>1$ the displacement $(\tau-1)|\Delta_{c,s}|$ grows linearly without bound. The three cases follow by reading off the sign of $\Delta_{c,s}$. ∎

## Appendix C TriState-Bench Construction Details

### C.1 Sampled Fact Construction: A Source-First Anchor-Driven Pipeline

We adopt a source-first strategy: candidate entities are first pulled from DBpedia along type-specific categories, and only then rewritten by an LLM under the constraint of Wikipedia evidence. By shifting the decision of which entity to write about from the LLM to a structured source, we mitigate two structural failure modes of free LLM generation. First, head clustering: free generation tends to repeatedly surface high-frequency entities that already saturate the pretraining corpus. Second, homogeneous difficulty: it is hard to produce enough $\mathcal{F}_{\text{wrong}}^{M}$ samples through LLM imagination alone, yet these samples are precisely what diagnoses the fragility of extrapolation-based methods.

#### Anchor extraction\.

Anchors are sampled from DBpedia along three answer types, and each type carries its own bucketing scheme so that coverage is enforced along orthogonal axes\. For person anchors, we query DBpedia with a class plus birth\-date constraint, with a subject\-category regex as fallback, and bucket along occupation and era\. For location anchors, we query DBpedia with place\-class and subject\-category constraints, and bucket along feature type and continent\. For scientific\-term anchors, we query DBpedia with subject\-category regex constraints, and bucket along subfield\. Each candidate is then passed through a type\-specific surface\-form filter that discards items DBpedia mislabels, for example proper nouns that look like persons but resolve to fictional characters or ships\.

#### Evidence retrieval\.

For each surviving anchor, we issue a web search and concatenate the knowledge graph card with the organic snippets to form a single evidence string\. An anchor is kept only if its evidence is sufficiently long, contains the canonical answer as a substring, and is not sourced from a banned URL list covering online forums, community Q&A sites, and machine\-translated mirrors\. Anchors that fail any of these conditions are discarded before any LLM call is made, which keeps generation cost bounded by retrieval quality\.

#### LLM rewriting and validation\.

The surviving anchor\-evidence pairs are handed to DeepSeek\-V4\-Flash, which produces a final truth string, an alias set, and a normalized evidence block\. The output then undergoes format and length checks on the truth field, followed by surface\-form string matching and embedding\-level similarity to remove near\-duplicates\. The resulting fact repository is therefore anchored in entity composition by structured sources, while its linguistic surface is normalized by the LLM\.

### C.2 Prior Calibration

Given a target model $M$ and a fact $f_i$, its three question variants $q_i^{(1..3)}$ are decoded without any context. Each question is probed in two stages: the greedy decision is treated as the anchor, and stochastic sampling only refines its confidence.

#### Anchor probing\.

A single greedy decoding pass acts as a hard gate\. If greedy decoding fails on a question, no number of subsequent stochastic hits can promote that question to*matched*; if greedy succeeds, a few stochastic misses cannot demote it to*missed*either\. This anchor removes sampling flukes and ensures that the calibration verdict reflects the model’s most confident behaviour rather than tail draws\.

#### Confidence sampling and verdict\.

A bounded number of stochastic decodes then estimate the prior’s stability under small perturbations, with hits scored by normalized substring matching against the alias set $\mathcal{A}_i$. Let $h_i^{(k)}$ be the stochastic hit count on the $k$-th question. The verdict is *matched* when greedy hits and $h_i^{(k)}$ clears a high-hit threshold, *missed* when greedy misses and $h_i^{(k)}$ stays under a low-hit threshold, and *uncertain* otherwise; the two thresholds are tuned to balance prior concentration against tolerance to sampling noise.

At evaluation time, the same question is decoded with a fixed\-seed greedy pass, so the test\-time inference behavior matches the one used during calibration\.
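The two-stage verdict can be sketched as a small decision function. The threshold values, the inclusive comparisons, and the normalization rule (lowercasing, stripping punctuation, collapsing whitespace) are all assumptions, since the text does not specify them.

```python
import string

def normalize(text):
    """Assumed normalization: lowercase, drop punctuation, collapse whitespace."""
    text = "".join(ch for ch in text.lower() if ch not in string.punctuation)
    return " ".join(text.split())

def is_hit(output, aliases):
    """Normalized substring match of any alias inside the model output."""
    norm_out = normalize(output)
    return any(normalize(alias) in norm_out for alias in aliases)

def calibration_verdict(greedy_output, sampled_outputs, aliases,
                        high_threshold=4, low_threshold=1):
    """Greedy decoding is the hard gate; stochastic hits only refine confidence.

    high_threshold and low_threshold are hypothetical values for 5 samples.
    """
    greedy_hit = is_hit(greedy_output, aliases)
    hits = sum(is_hit(out, aliases) for out in sampled_outputs)
    if greedy_hit and hits >= high_threshold:
        return "matched"
    if not greedy_hit and hits <= low_threshold:
        return "missed"
    return "uncertain"

aliases = ["George Washington", "Washington"]
v_matched = calibration_verdict("George Washington.", ["Washington"] * 5, aliases)
v_missed = calibration_verdict("John Adams", ["John Adams"] * 5, aliases)
# A greedy miss cannot be promoted to matched, however many samples hit.
v_gated = calibration_verdict("John Adams", ["George Washington"] * 5, aliases)
```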

## Appendix D Dataset and Model Details

### D.1 QA Datasets

Some QA datasets (e.g., NQ, TriviaQA, HotpotQA) do not have publicly available test sets, so we report results on the validation set. Following Shi et al. ([2024](https://arxiv.org/html/2606.10298#bib.bib18)), we sub-sample datasets with very large test sets to expedite inference. Across all datasets we use greedy decoding with a maximum generation length of 32 tokens, and we truncate contexts to a maximum length of 4,064 tokens. We report Exact Match (EM) and token-level F1 as evaluation metrics.

- Natural Questions (NQ; Kwiatkowski et al., [2019](https://arxiv.org/html/2606.10298#bib.bib6)) is a large-scale QA dataset consisting of real user queries issued to Google Search paired with Wikipedia passages as evidence. From the NQ validation set (originally 7.83K examples), we select instances that have short answers, yielding roughly 3,200 samples. This dataset represents a low-conflict, standard RAG setting in which the context generally supports the model’s pre-trained memory.
- TriviaQA (Joshi et al., [2017](https://arxiv.org/html/2606.10298#bib.bib3)) consists of questions written by trivia enthusiasts together with evidence documents that were independently collected from multiple sources. It requires models to process long passages and to reason across multiple sentences. We randomly sample 2,000 instances from the TriviaQA Wiki validation set, which contains 8K examples in total.
- HotpotQA (Yang et al., [2018](https://arxiv.org/html/2606.10298#bib.bib23)) is a multi-hop QA dataset that requires reasoning over multiple Wikipedia documents. We evaluate on the full validation set under the distractor setting (7,405 instances), which probes information integration and conflict filtering in a complex retrieval environment.
- TabMWP (Lu et al., [2022](https://arxiv.org/html/2606.10298#bib.bib12)) is a semi-structured math reasoning QA dataset whose contexts are tables, requiring the model to understand tabular data and perform numerical reasoning. We use the official test1k subset of 1,000 instances to evaluate the model’s ability to process structured contexts.
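The EM and token-level F1 metrics reported on these datasets are commonly computed as in the sketch below. The SQuAD-style normalization (lowercasing, removing punctuation and English articles) is an assumption; the exact normalization used in the experiments is not stated here.

```python
import re
import string
from collections import Counter

def normalize(text):
    """Assumed SQuAD-style normalization of an answer string."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))

def token_f1(prediction, gold):
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

em = exact_match("The George Washington.", "george washington")
f1 = token_f1("George Washington was first", "George Washington")
```

When a question has several gold answers, a common convention is to take the maximum score over them; that step is omitted here.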

| Dataset | Context $c$ | Query $x$ |
|---|---|---|
| NQ | The Shivalik Hills to the northeast. | Which geographical part of Haryana is Shivalik Hills situated? |
| TabMWP | Table: Name — Number of coins; Braden — 76; Camilla — 94; Rick — 86; Mary — 84; Hector — 80; Devin — 83; Emily — 82; Avery — 87 | Some friends discussed the sizes of their coin collections. What is the mean of the numbers? |
| TriviaQA | *Astronomical Glossary:* … Cepheid Variable, … Pulsar (a rotating neutron star emitting periodic radio pulses) … | What general name is given to a rotating star which emits a regular beat of radiation? |
| HotpotQA | *Ed Wood (film):* Ed Wood is a 1994 American biographical period comedy-drama film … *Scott Derrickson:* Scott Derrickson is an American director, screenwriter and producer … | Were Scott Derrickson and Ed Wood of the same nationality? |
| TriState-Bench – $\mathcal{S}_{\mathrm{agr}}$ (Agreement) | George Washington, a prominent American Founding Father and military officer, served as the first president of the United States from 1789 to 1797 … | Who was the first president of the United States? |
| TriState-Bench – $\mathcal{S}_{\mathrm{cor}}$ (Correction) | Cecilia Payne-Gaposchkin, a pioneering British-born American astronomer, established the fundamental composition of stars in her 1925 doctoral thesis … | Who determined that stars are composed primarily of hydrogen and helium? |
| TriState-Bench – $\mathcal{S}_{\mathrm{res}}$ (Resistance) | John Adams (October 30, 1735 – July 4, 1826) was an American Founding Father, statesman, and politician who served as the first president of the United States from 1789 to 1797 … | Who was the first president of the United States? |

Table 5: Examples of $(c,x)$ pairs from each dataset. The top four blocks are standard QA benchmarks; the bottom three are the disjoint subsets of our TriState-Bench, defined by the conflict state: $\mathcal{S}_{\mathrm{agr}}$, $\mathcal{S}_{\mathrm{cor}}$, and $\mathcal{S}_{\mathrm{res}}$. The same query $x$ appears in $\mathcal{S}_{\mathrm{agr}}$ and $\mathcal{S}_{\mathrm{res}}$ but with different $c$, isolating conflict from question difficulty. Long passages are abbreviated with “…”.

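The three conflict states can be read as a function of two binary facts: whether the model's prior answers the query correctly, and whether the context is gold or corrupted (the definitions given in the captions of Tables 6, 7, and 9). The following is a minimal sketch of that mapping; the names `ConflictState` and `conflict_state` are hypothetical and are not taken from the released code, and how prior correctness is calibrated per model is outside its scope.

```python
from enum import Enum
from typing import Optional


class ConflictState(Enum):
    CORRECTION = "cor"   # gold context, prior incorrect
    RESISTANCE = "res"   # corrupted context, prior correct
    AGREEMENT = "agr"    # gold context, prior correct


def conflict_state(prior_correct: bool, context_is_gold: bool) -> Optional[ConflictState]:
    """Map (prior correctness, context quality) to a TriState-Bench subset.

    Returns None for the fourth combination (corrupted context, prior
    incorrect), which is not one of the three subsets described here.
    """
    if context_is_gold:
        return ConflictState.AGREEMENT if prior_correct else ConflictState.CORRECTION
    if prior_correct:
        return ConflictState.RESISTANCE
    return None


# The Washington / Adams pair from Table 5 shares one query and differs only in context.
print(conflict_state(prior_correct=True, context_is_gold=True).value)   # gold context
print(conflict_state(prior_correct=True, context_is_gold=False).value)  # corrupted context
```

Because the assignment depends on `prior_correct`, the same $(c,x)$ pair can fall into different subsets for different models, which is what makes the protocol model-aware.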
### D.2 Models

Our main experiments cover four mid-sized base models: Llama3-8B (Grattafiori et al., [2024](https://arxiv.org/html/2606.10298#bib.bib1)), Mistral-7B (Jiang et al., [2023](https://arxiv.org/html/2606.10298#bib.bib2)), Qwen2.5-7B (Qwen et al., [2025](https://arxiv.org/html/2606.10298#bib.bib16)), and Llama2-13B (Touvron et al., [2023](https://arxiv.org/html/2606.10298#bib.bib19)). The instruction-following experiments use the corresponding instruction-tuned variants: Llama3-8B-Instruct, Mistral-7B-Instruct-v0.1, Qwen2.5-7B-Instruct, and Llama2-13B-Chat. All models are obtained from the HuggingFace Hub and run in FP16 precision. We do not fine-tune any model; every method is applied purely at inference time, during decoding.

### D.3 Licenses

Datasets are released under the following licenses:

- •Natural Questions: Apache\-2\.0 license
- •TriviaQA: Apache\-2\.0 license
- •HotpotQA: CC BY\-SA 4\.0 license
- •TabMWP: CC BY\-NC\-SA 4\.0 license

The models we use have the following licenses:

- •
- •
- •Mistral: Apache\-2\.0 license
- •Qwen2\.5: Apache\-2\.0 license

## Appendix E More Results in TriState-Bench

A detailed comparison of all methods under additional parameter settings for Llama3-8B, Mistral-7B, Qwen2.5-7B, and Llama2-13B is presented in Tables [6](https://arxiv.org/html/2606.10298#A5.T6) and [7](https://arxiv.org/html/2606.10298#A5.T7).

| Model | Method | $\mathcal{S}_{\mathrm{cor}}$ EM | $\mathcal{S}_{\mathrm{cor}}$ F1 | $\mathcal{S}_{\mathrm{res}}$ EM | $\mathcal{S}_{\mathrm{res}}$ F1 | $\mathcal{S}_{\mathrm{agr}}$ EM | $\mathcal{S}_{\mathrm{agr}}$ F1 |
|---|---|---|---|---|---|---|---|
| Llama3-8B | Greedy | 59.50 | 70.79 | 4.50 | 9.78 | 64.75 | 73.92 |
| | Greedy_no_ctx | 2.00 | 11.33 | 91.50 | 94.00 | 91.00 | 93.65 |
| | CAD ($\alpha$=0.25) | 44.50 | 58.00 | 1.50 | 7.68 | 41.50 | 54.62 |
| | CAD ($\alpha$=0.5) | 33.00 | 49.35 | 1.00 | 7.67 | 30.50 | 45.96 |
| | CAD ($\alpha$=0.75) | 26.75 | 44.72 | 0.75 | 7.19 | 24.50 | 40.49 |
| | CAD ($\alpha$=1.0) | 21.75 | 41.07 | 0.75 | 7.11 | 21.25 | 37.92 |
| | COIECD | 30.25 | 47.35 | 1.75 | 6.29 | 35.00 | 50.99 |
| | AdaCAD | 51.50 | 64.19 | 3.25 | 9.67 | 48.50 | 60.43 |
| | CoCoA | 31.25 | 47.88 | 1.00 | 7.68 | 30.00 | 45.52 |
| | Simple Interp ($\tau$=0.25) | 10.50 | 20.31 | 71.00 | 75.83 | 92.50 | 95.02 |
| | Simple Interp ($\tau$=0.5) | 40.75 | 52.66 | 42.50 | 48.70 | 89.25 | 92.59 |
| | Simple Interp ($\tau$=0.75) | 70.50 | 79.25 | 17.75 | 24.20 | 81.75 | 87.01 |
| | ARR (Ours) | 75.00 | 82.68 | 33.25 | 38.59 | 76.75 | 83.15 |
| Mistral-7B | Greedy | 91.25 | 94.75 | 5.25 | 12.99 | 82.75 | 87.76 |
| | Greedy_no_ctx | 0.00 | 8.96 | 71.75 | 81.10 | 71.75 | 81.10 |
| | CAD ($\alpha$=0.25) | 90.75 | 94.14 | 1.75 | 9.68 | 83.75 | 87.95 |
| | CAD ($\alpha$=0.5) | 87.00 | 91.08 | 0.75 | 8.34 | 73.25 | 75.91 |
| | CAD ($\alpha$=0.75) | 79.75 | 85.47 | 0.50 | 6.75 | 47.00 | 49.14 |
| | CAD ($\alpha$=1.0) | 70.50 | 78.48 | 0.25 | 5.47 | 25.00 | 27.51 |
| | COIECD | 85.00 | 89.46 | 1.00 | 8.61 | 72.75 | 75.56 |
| | AdaCAD | 91.00 | 94.25 | 2.25 | 10.06 | 83.25 | 87.95 |
| | CoCoA | 55.50 | 70.10 | 0.50 | 6.98 | 32.00 | 48.19 |
| | Simple Interp ($\tau$=0.25) | 2.25 | 14.20 | 45.00 | 60.01 | 75.50 | 83.36 |
| | Simple Interp ($\tau$=0.5) | 20.50 | 34.15 | 28.50 | 43.44 | 79.75 | 85.99 |
| | Simple Interp ($\tau$=0.75) | 66.00 | 74.97 | 15.75 | 27.14 | 80.75 | 86.45 |
| | ARR (Ours) | 88.00 | 92.73 | 22.50 | 30.66 | 82.00 | 87.46 |

Table 6: Performance of all methods on the three TriState-Bench subsets using Llama3-8B and Mistral-7B. $\mathcal{S}_{\mathrm{cor}}$ (Correction): gold context, prior incorrect; $\mathcal{S}_{\mathrm{res}}$ (Resistance): corrupted context, prior correct; $\mathcal{S}_{\mathrm{agr}}$ (Agreement): gold context, prior correct. ARR consistently dominates the resistance subset $\mathcal{S}_{\mathrm{res}}$ where all baselines collapse, while staying competitive on $\mathcal{S}_{\mathrm{cor}}$ and $\mathcal{S}_{\mathrm{agr}}$. Bold marks the best value within each model block.

| Model | Method | $\mathcal{S}_{\mathrm{cor}}$ EM | $\mathcal{S}_{\mathrm{cor}}$ F1 | $\mathcal{S}_{\mathrm{res}}$ EM | $\mathcal{S}_{\mathrm{res}}$ F1 | $\mathcal{S}_{\mathrm{agr}}$ EM | $\mathcal{S}_{\mathrm{agr}}$ F1 |
|---|---|---|---|---|---|---|---|
| Qwen2.5-7B | Greedy | 89.50 | 93.32 | 1.00 | 10.04 | 86.00 | 90.76 |
| | Greedy_no_ctx | 0.00 | 8.76 | 74.25 | 81.73 | 74.25 | 81.73 |
| | CAD ($\alpha$=0.25) | 77.50 | 84.72 | 0.50 | 8.46 | 73.50 | 83.45 |
| | CAD ($\alpha$=0.5) | 58.50 | 71.46 | 0.25 | 7.24 | 60.25 | 74.26 |
| | CAD ($\alpha$=0.75) | 38.50 | 57.62 | 0.25 | 6.75 | 49.25 | 67.73 |
| | CAD ($\alpha$=1.0) | 25.75 | 49.65 | 0.25 | 6.30 | 36.25 | 56.18 |
| | COIECD | 65.75 | 76.63 | 0.50 | 7.19 | 56.75 | 71.96 |
| | AdaCAD | 81.25 | 87.50 | 0.75 | 8.94 | 82.25 | 88.56 |
| | CoCoA | 26.50 | 46.41 | 0.25 | 6.19 | 30.75 | 50.60 |
| | Simple Interp ($\tau$=0.25) | 6.00 | 20.60 | 36.50 | 53.05 | 81.25 | 86.92 |
| | Simple Interp ($\tau$=0.5) | 31.25 | 48.41 | 12.50 | 29.81 | 85.50 | 90.47 |
| | Simple Interp ($\tau$=0.75) | 73.00 | 80.68 | 3.50 | 15.98 | 87.00 | 91.23 |
| | ARR (Ours) | 85.75 | 91.09 | 15.75 | 22.92 | 88.25 | 92.27 |
| Llama2-13B | Greedy | 86.50 | 90.79 | 1.75 | 9.58 | 93.75 | 95.83 |
| | Greedy_no_ctx | 0.00 | 8.16 | 94.25 | 96.50 | 94.25 | 96.50 |
| | CAD ($\alpha$=0.25) | 85.25 | 89.85 | 1.75 | 9.63 | 92.00 | 94.61 |
| | CAD ($\alpha$=0.5) | 80.25 | 85.44 | 1.50 | 9.13 | 85.50 | 89.46 |
| | CAD ($\alpha$=0.75) | 73.50 | 80.52 | 1.00 | 8.45 | 78.00 | 84.33 |
| | CAD ($\alpha$=1.0) | 66.00 | 75.74 | 0.50 | 7.75 | 72.50 | 80.35 |
| | COIECD | 80.00 | 84.91 | 1.00 | 8.62 | 84.00 | 88.25 |
| | AdaCAD | 85.25 | 89.83 | 1.75 | 9.65 | 93.25 | 95.40 |
| | CoCoA | 80.25 | 85.44 | 1.50 | 9.13 | 85.50 | 89.46 |
| | Simple Interp ($\tau$=0.25) | 11.75 | 18.25 | 62.75 | 66.74 | 94.25 | 96.08 |
| | Simple Interp ($\tau$=0.5) | 38.50 | 43.86 | 29.25 | 33.25 | 93.75 | 95.81 |
| | Simple Interp ($\tau$=0.75) | 71.50 | 75.91 | 11.25 | 17.90 | 94.00 | 95.99 |
| | ARR (Ours) | 84.00 | 89.12 | 17.00 | 23.17 | 93.75 | 95.81 |

Table 7: Performance of all methods on the three TriState-Bench subsets using Qwen2.5-7B and Llama2-13B. $\mathcal{S}_{\mathrm{cor}}$ (Correction): gold context, prior incorrect; $\mathcal{S}_{\mathrm{res}}$ (Resistance): corrupted context, prior correct; $\mathcal{S}_{\mathrm{agr}}$ (Agreement): gold context, prior correct. ARR consistently dominates the resistance subset $\mathcal{S}_{\mathrm{res}}$ where all baselines collapse, while staying competitive on $\mathcal{S}_{\mathrm{cor}}$ and $\mathcal{S}_{\mathrm{agr}}$. Bold marks the best value within each model block.

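The CAD and Simple Interp rows of Tables 6 and 7 can be placed on a single axis. The sketch below rests on two assumptions that this appendix does not state explicitly: that the affine family is parameterized as $z_\tau = (1-\tau)\,z_p + \tau\,z_c$, where $z_p$ and $z_c$ are our notation for the prior (no-context) and context logits, and that CAD takes its standard form $(1+\alpha)\,z_c - \alpha\,z_p$.

```latex
\begin{align*}
z_\tau(y) &= (1-\tau)\, z_p(y) + \tau\, z_c(y), \\
p_\tau(y) &\propto \exp\!\big(z_\tau(y)\big) \;\propto\; p_p(y)^{\,1-\tau}\; p_c(y)^{\,\tau}, \\
z_{\mathrm{CAD}}(y) &= (1+\alpha)\, z_c(y) - \alpha\, z_p(y) \;=\; z_\tau(y)\,\big|_{\tau = 1+\alpha}.
\end{align*}
```

Under these assumptions, Greedy_no_ctx sits at $\tau=0$, Simple Interp at $\tau\in\{0.25, 0.5, 0.75\}$, Greedy at $\tau=1$, and CAD with $\alpha\in\{0.25, 0.5, 0.75, 1.0\}$ at $\tau\in\{1.25, 1.5, 1.75, 2.0\}$, that is, in the extrapolation regime.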
## Appendix F Instruction-tuned LLMs Experiments

We replicate the main evaluation on four instruction-tuned backbones: Llama-3-8B-Instruct, Mistral-7B-Instruct, Qwen2.5-7B-Instruct, and Llama-2-13B-Instruct (Tables [8](https://arxiv.org/html/2606.10298#A6.T8) and [9](https://arxiv.org/html/2606.10298#A6.T9)).

#### Direction-consistent gain on $\mathcal{S}_{\mathrm{res}}$.

ARR is the only method that pushes $\mathcal{S}_{\mathrm{res}}$ EM into double digits, which it does on the three instruct backbones where EM is comparable; every baseline remains at or below $2.50$ EM on this subset. On Mistral-Instruct, ARR moves $\mathcal{S}_{\mathrm{res}}$ from $2.50$ to $26.50$ EM ($13.93 \to 37.74$ F1); on Llama-3-Instruct from $0.75$ to $24.50$ EM ($11.10 \to 35.68$ F1); on Llama-2-13B-Instruct from $0.50$ to $11.00$ EM ($8.90 \to 23.60$ F1). On Qwen2.5-Instruct, where EM is non-comparable (see below), ARR raises $\mathcal{S}_{\mathrm{res}}$ F1 from $7.74$ to $10.75$. This mirrors the base-model finding: ARR’s advantage concentrates on the resistance subset where existing methods uniformly collapse.

#### TriState\-Bench aggregate: two wins, two losses\.

ARR achieves a net TriState-Bench EM gain of $+5.75$ over the strongest baseline on both Llama-3-Instruct ($66.75$ vs. Greedy $61.00$) and Mistral-Instruct ($60.92$ vs. AdaCAD $55.17$). On Llama-2-13B-Instruct, however, ARR loses $6.00$ EM to Greedy ($44.17$ vs. $50.17$): the $\mathcal{S}_{\mathrm{cor}}$ cost ($-17.8$ EM), together with a $10.7$ EM drop on $\mathcal{S}_{\mathrm{agr}}$ (Table 9), outweighs the $\mathcal{S}_{\mathrm{res}}$ gain ($+10.5$ EM). We report this as a limitation rather than suppressing it; the trade-off is inherent when $\tau<1$ over-corrects on a backbone whose prior is weaker than its context-reading ability.
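The Llama-2-13B-Instruct loss can be reproduced from the subset scores in Table 9. The sketch below assumes that the aggregate TriState-Bench EM is the unweighted mean of the three subset EMs; this is not stated in the text, but it matches the TriState column of Table 8 for the rows used here. The helper name `tristate_em` is hypothetical.

```python
def tristate_em(cor: float, res: float, agr: float) -> float:
    """Aggregate EM, assumed to be the unweighted mean of the three subsets."""
    return (cor + res + agr) / 3.0


# Subset EM for Llama-2-13B-Instruct, copied from Table 9.
greedy = {"cor": 77.00, "res": 0.50, "agr": 73.00}
arr = {"cor": 59.20, "res": 11.00, "agr": 62.30}

# Per-subset change when switching from Greedy to ARR.
deltas = {name: arr[name] - greedy[name] for name in greedy}
net = tristate_em(**arr) - tristate_em(**greedy)

for name, delta in deltas.items():
    print(f"{name}: {delta:+.1f} EM")
print(f"aggregate: {tristate_em(**greedy):.2f} -> {tristate_em(**arr):.2f} ({net:+.2f} EM)")
```

The resistance gain recovers only part of what is lost on the two gold-context subsets, so the aggregate moves against ARR on this backbone.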

#### Qwen2\.5\-Instruct collapses EM, not F1\.

Every method on Qwen2.5-7B-Instruct produces TriState-Bench EM $\leq 2.83$. The model reformats every answer into a full explanatory sentence (e.g., “*Cecilia Payne-Gaposchkin determined that stars are composed primarily of hydrogen and helium*”), so normalized first-line EM cannot match the gold span. Token-level F1 remains well-defined and ranks methods consistently with other backbones: ARR leads all baselines on $\mathcal{S}_{\mathrm{res}}$ F1 by $3$ points. We therefore rely on F1 for this backbone.
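A small example shows why a sentence-style answer zeroes EM while leaving F1 informative. The sketch assumes SQuAD-style normalization (lowercasing, stripping punctuation and articles) applied to the first output line; the paper's exact normalization is not given in this appendix, and the gold span used here is a hypothetical one chosen to match the quoted example.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def first_line(text: str) -> str:
    """Keep only the first line of a generation (empty string if none)."""
    stripped = text.strip()
    return stripped.splitlines()[0] if stripped else ""


def exact_match(prediction: str, gold: str) -> bool:
    return normalize(first_line(prediction)) == normalize(gold)


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(first_line(prediction)).split()
    gold_tokens = normalize(gold).split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


gold = "Cecilia Payne-Gaposchkin"  # hypothetical gold span
pred = ("Cecilia Payne-Gaposchkin determined that stars are composed "
        "primarily of hydrogen and helium")
print(exact_match(pred, gold), round(token_f1(pred, gold), 3))
```

The prediction contains the whole gold span, so recall is perfect, but the extra sentence tokens break the exact match and lower precision. This is why F1 still separates methods on this backbone.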

#### Traditional QA\.

ARR’s traditional-QA performance is mixed across instruct backbones. On Llama-3-Instruct, ARR achieves the best overall average ($33.33$ EM) and wins 3 of 4 datasets (TabMWP, HotpotQA, TriviaQA), losing NQ by about $6$ EM relative to Greedy. On Mistral-Instruct, Greedy leads overall ($38.42$ vs. ARR $38.02$ EM); ARR matches or exceeds baselines only on TriviaQA. On Llama-2-13B-Instruct, ARR wins TabMWP and HotpotQA but loses NQ and TriviaQA, trailing Greedy by $2.35$ EM on average. Unlike the base-model setting, ARR does not uniformly dominate traditional QA when applied to instruct models.

#### Why the gain is narrower than on base models\.

Instruction tuning saturates the $\mathcal{S}_{\mathrm{cor}}$ and $\mathcal{S}_{\mathrm{agr}}$ subsets: all baselines already read context faithfully when context is correct, leaving little headroom for decoding-time intervention. The only subset where pulling toward the prior ($\tau<1$) retains measurable effect is $\mathcal{S}_{\mathrm{res}}$, where the model must override a wrong context using parametric knowledge. ARR’s directional gain therefore concentrates there, and the high baseline on $\mathcal{S}_{\mathrm{cor}}$/$\mathcal{S}_{\mathrm{agr}}$ dilutes the overall average improvement.

| Model | Method | NQ EM | NQ F1 | TabMWP EM | TabMWP F1 | HotpotQA EM | HotpotQA F1 | TriviaQA EM | TriviaQA F1 | TriState EM | TriState F1 | Avg. EM | Avg. F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Llama3-8B-Inst | Greedy | 31.08 | 50.80 | 14.20 | 19.25 | 8.70 | 22.94 | 37.00 | 47.52 | 61.00 | 66.96 | 30.40 | 41.49 |
| | CAD | 21.17 | 41.57 | 3.40 | 7.39 | 1.34 | 12.25 | 8.85 | 18.81 | 38.08 | 50.17 | 14.57 | 26.04 |
| | COIECD | 32.93 | 52.32 | 4.80 | 11.08 | 4.77 | 18.12 | 26.10 | 36.92 | 58.75 | 65.58 | 25.47 | 36.80 |
| | AdaCAD | 31.86 | 51.26 | 10.60 | 16.24 | 6.70 | 20.40 | 30.15 | 40.30 | 60.17 | 66.41 | 27.90 | 38.92 |
| | CoCoA | 32.11 | 51.32 | 6.20 | 11.35 | 3.70 | 16.12 | 18.20 | 29.36 | 56.92 | 64.52 | 23.43 | 34.53 |
| | ARR (Ours) | 24.98 | 44.80 | 16.70 | 21.39 | 10.22 | 24.39 | 48.00 | 59.83 | 66.75 | 73.73 | 33.33 | 44.83 |
| Mistral-7B-Inst | Greedy | 35.45 | 53.25 | 34.50 | 39.21 | 20.63 | 34.76 | 47.00 | 57.44 | 54.50 | 62.81 | 38.42 | 49.49 |
| | CAD | 26.12 | 45.54 | 21.90 | 28.27 | 10.30 | 22.23 | 22.70 | 34.70 | 44.92 | 54.92 | 25.19 | 37.13 |
| | COIECD | 33.85 | 52.57 | 31.50 | 35.80 | 17.93 | 31.84 | 39.50 | 51.69 | 51.33 | 60.28 | 34.82 | 46.44 |
| | AdaCAD | 36.71 | 54.30 | 33.90 | 38.29 | 19.78 | 33.70 | 45.10 | 56.08 | 55.17 | 63.20 | 38.13 | 49.11 |
| | CoCoA | 31.14 | 50.09 | 30.40 | 34.63 | 10.01 | 24.20 | 27.10 | 40.01 | 0.25 | 19.28 | 19.78 | 33.64 |
| | ARR (Ours) | 31.43 | 48.26 | 30.40 | 36.59 | 19.86 | 33.86 | 47.50 | 57.66 | 60.92 | 69.57 | 38.02 | 49.19 |
| Qwen2.5-7B-Inst | Greedy | 0.88 | 18.18 | 1.50 | 4.43 | 0.41 | 9.62 | 3.15 | 12.51 | 1.25 | 21.01 | 1.44 | 13.15 |
| | CAD | 4.59 | 20.33 | 3.80 | 8.21 | 0.32 | 8.97 | 1.25 | 9.57 | 2.83 | 17.77 | 2.56 | 12.97 |
| | COIECD | 1.32 | 17.95 | 5.80 | 9.16 | 0.34 | 9.85 | 2.50 | 11.40 | 2.42 | 19.48 | 2.48 | 13.57 |
| | AdaCAD | 1.92 | 19.02 | 5.90 | 9.26 | 0.30 | 9.88 | 2.25 | 11.22 | 2.25 | 20.10 | 2.52 | 13.90 |
| | CoCoA | 3.64 | 20.55 | 7.50 | 11.19 | 0.30 | 9.68 | 1.35 | 10.07 | 1.50 | 18.45 | 2.86 | 13.99 |
| | ARR (Ours) | 0.50 | 16.02 | 2.10 | 4.24 | 0.62 | 10.07 | 2.20 | 13.36 | 0.92 | 20.47 | 1.27 | 12.83 |
| Llama2-13B-Inst | Greedy | 20.57 | 41.36 | 6.80 | 13.87 | 7.98 | 20.27 | 37.10 | 48.96 | 50.17 | 58.93 | 24.52 | 36.68 |
| | CAD | 13.87 | 33.32 | 2.20 | 7.86 | 1.98 | 10.84 | 12.00 | 25.26 | 25.13 | 43.60 | 11.04 | 24.18 |
| | COIECD | 22.39 | 42.86 | 3.60 | 9.53 | 6.25 | 17.46 | 32.30 | 44.68 | 49.63 | 58.50 | 22.83 | 34.61 |
| | AdaCAD | 22.39 | 42.48 | 6.50 | 12.53 | 7.58 | 18.86 | 35.90 | 47.63 | 50.07 | 58.93 | 24.49 | 36.09 |
| | CoCoA | 3.18 | 26.93 | 3.00 | 8.16 | 1.23 | 11.82 | 6.80 | 24.18 | 7.50 | 37.27 | 4.34 | 21.67 |
| | ARR (Ours) | 14.53 | 34.16 | 8.20 | 14.58 | 8.04 | 20.64 | 35.90 | 48.27 | 44.17 | 57.07 | 22.17 | 34.94 |

Table 8: Performance comparison on Instruct models. Each benchmark reports EM and F1. The Avg. column is the arithmetic mean over all five benchmarks (NQ, TabMWP, HotpotQA, TriviaQA, and TriState). Bold marks the best value within each model block.

| Model | Method | $\mathcal{S}_{\mathrm{cor}}$ EM | $\mathcal{S}_{\mathrm{cor}}$ F1 | $\mathcal{S}_{\mathrm{res}}$ EM | $\mathcal{S}_{\mathrm{res}}$ F1 | $\mathcal{S}_{\mathrm{agr}}$ EM | $\mathcal{S}_{\mathrm{agr}}$ F1 |
|---|---|---|---|---|---|---|---|
| Llama3-8B-Inst | Greedy | 92.50 | 95.96 | 0.75 | 11.10 | 89.75 | 93.82 |
| | CAD | 60.75 | 75.38 | 0.25 | 9.55 | 53.25 | 65.57 |
| | COIECD | 90.25 | 93.75 | 0.25 | 11.33 | 85.75 | 91.66 |
| | AdaCAD | 91.75 | 95.13 | 0.25 | 10.88 | 88.50 | 93.22 |
| | CoCoA | 87.25 | 92.06 | 0.25 | 10.81 | 83.25 | 90.70 |
| | ARR (Ours) | 86.75 | 91.75 | 24.50 | 35.68 | 89.00 | 93.77 |
| Mistral-7B-Inst | Greedy | 81.00 | 87.37 | 2.50 | 13.93 | 80.00 | 87.12 |
| | CAD | 75.25 | 83.26 | 0.25 | 10.61 | 59.25 | 70.90 |
| | COIECD | 80.75 | 87.33 | 1.00 | 11.83 | 72.25 | 81.68 |
| | AdaCAD | 84.75 | 89.98 | 1.50 | 12.79 | 79.25 | 86.82 |
| | CoCoA | 0.50 | 28.03 | 0.00 | 5.66 | 0.25 | 24.16 |
| | ARR (Ours) | 76.50 | 83.66 | 26.50 | 37.74 | 79.75 | 87.30 |
| Qwen2.5-7B-Inst | Greedy | 1.25 | 27.81 | 0.00 | 7.74 | 2.50 | 27.49 |
| | CAD | 6.50 | 27.93 | 0.00 | 5.32 | 2.00 | 20.07 |
| | COIECD | 2.50 | 25.23 | 0.00 | 6.99 | 4.75 | 26.24 |
| | AdaCAD | 2.50 | 27.64 | 0.00 | 7.31 | 4.25 | 25.36 |
| | CoCoA | 2.50 | 27.15 | 0.00 | 6.42 | 2.00 | 21.78 |
| | ARR (Ours) | 0.75 | 24.44 | 0.25 | 10.75 | 1.75 | 26.24 |
| Llama2-13B-Inst | Greedy | 77.00 | 85.30 | 0.50 | 8.90 | 73.00 | 82.60 |
| | CAD | 50.20 | 68.80 | 0.00 | 7.40 | 25.20 | 54.60 |
| | COIECD | 74.20 | 83.70 | 0.20 | 8.70 | 74.50 | 83.10 |
| | AdaCAD | 78.50 | 86.70 | 0.20 | 8.40 | 71.50 | 81.70 |
| | CoCoA | 11.50 | 59.80 | 0.00 | 5.50 | 11.00 | 46.50 |
| | ARR (Ours) | 59.20 | 72.80 | 11.00 | 23.60 | 62.30 | 74.80 |

Table 9: Performance on the three TriState-Bench subsets for Instruct models. $\mathcal{S}_{\mathrm{cor}}$ (Correction): gold context, prior incorrect; $\mathcal{S}_{\mathrm{res}}$ (Resistance): corrupted context, prior correct; $\mathcal{S}_{\mathrm{agr}}$ (Agreement): gold context, prior correct. ARR consistently dominates the resistance subset $\mathcal{S}_{\mathrm{res}}$ where all baselines collapse, while staying competitive on $\mathcal{S}_{\mathrm{cor}}$ and $\mathcal{S}_{\mathrm{agr}}$. Bold marks the best value within each model block.

## Appendix G Case Study Details

Figure [7](https://arxiv.org/html/2606.10298#A7.F7) completes the full EM-vs-$\tau$ curves, and Figures [8](https://arxiv.org/html/2606.10298#A7.F8) and [9](https://arxiv.org/html/2606.10298#A7.F9) show the detailed cases.

![Refer to caption](https://arxiv.org/html/2606.10298v1/x7.png)

Figure 7: Blue/red shading separates the interpolation ($\tau\in[0,1]$) and extrapolation ($\tau>1$) regimes. $\mathcal{S}_{\mathrm{res}}$ decays monotonically with $\tau$; $\mathcal{S}_{\mathrm{cor}}$ peaks near $\tau\approx 1$ and collapses under extrapolation, with model-dependent severity (sharpest for Llama-3-8B, mildest for Llama-3-8B-Instruct).

![Refer to caption](https://arxiv.org/html/2606.10298v1/x8.png)

Figure 8: Detailed cases on Llama-3-8B, illustrating how varying $\tau$ affects decoding outputs and exposes the structural failure of extrapolation methods.

![Refer to caption](https://arxiv.org/html/2606.10298v1/x9.png)

Figure 9: The detailed failure cases in TriState-Bench on Llama-3-8B, Qwen2.5-7B, and Mistral-7B.

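The two regimes shaded in Figure 7 can be illustrated on a toy three-token vocabulary. The sketch assumes the affine family is parameterized as $(1-\tau)\,z_p + \tau\,z_c$ over prior and context logits (a parameterization not spelled out in this appendix); the logit values and all function names are hypothetical. It also checks numerically that the affine logit combination equals the normalized power form $p_p^{1-\tau}\,p_c^{\tau}$.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def affine_logits(prior_logits, ctx_logits, tau):
    """(1 - tau) * prior + tau * context, token by token (assumed form)."""
    return [(1.0 - tau) * zp + tau * zc for zp, zc in zip(prior_logits, ctx_logits)]


def power_family(prior_probs, ctx_probs, tau):
    """Normalized p_prior^(1 - tau) * p_ctx^tau."""
    weights = [pp ** (1.0 - tau) * pc ** tau for pp, pc in zip(prior_probs, ctx_probs)]
    total = sum(weights)
    return [w / total for w in weights]


def regime(tau):
    """Regime names follow the Figure 7 caption."""
    if tau < 0:
        raise ValueError("tau < 0 is outside the regimes discussed here")
    return "interpolation" if tau <= 1 else "extrapolation"


# Toy vocabulary: index 0 = prior's answer, 1 = context's answer, 2 = other.
prior_logits = [2.0, 0.0, -1.0]
ctx_logits = [0.0, 2.0, -1.0]

for tau in (0.25, 1.0, 2.0):
    probs = softmax(affine_logits(prior_logits, ctx_logits, tau))
    print(f"tau={tau} ({regime(tau)}):", [round(p, 3) for p in probs])
```

In this toy case, $\tau=0.25$ keeps the prior's answer on top, while $\tau=2$ pushes nearly all mass onto the context's answer. When the context is wrong, that is the failure mode the resistance subset measures.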
## Appendix H Prompts

We use the QA prompt template shown in Table [10](https://arxiv.org/html/2606.10298#A8.T10).

Question Answering Prompt Template

| With Context | Without Context |
|---|---|
| Using only the references listed below, answer the following question. Context: {context} Question: {question}? Answer: | Answer the following question. Question: {question}? Answer: |

Table 10: Prompt templates
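The two templates can be filled with a small helper. This is an illustrative sketch only: the line breaks between fields are an assumption (the extracted table does not preserve them), `build_prompt` is a hypothetical name, and stripping a trailing question mark from the query is our choice to avoid a doubled "??", since the template already appends one.

```python
from typing import Optional

# Field order and wording follow Table 10; the newline placement is assumed.
WITH_CONTEXT = (
    "Using only the references listed below, answer the following question.\n"
    "Context: {context}\n"
    "Question: {question}?\n"
    "Answer:"
)
WITHOUT_CONTEXT = (
    "Answer the following question.\n"
    "Question: {question}?\n"
    "Answer:"
)


def build_prompt(question: str, context: Optional[str] = None) -> str:
    """Fill the with-context template if a context is given, else the bare one."""
    question = question.strip().rstrip("?").strip()  # template supplies the "?"
    if context is None:
        return WITHOUT_CONTEXT.format(question=question)
    return WITH_CONTEXT.format(context=context.strip(), question=question)


query = "Who was the first president of the United States?"
print(build_prompt(query))
print(build_prompt(query, context="George Washington ... served as the first president ..."))
```

The without-context prompt is what a no-context baseline such as Greedy_no_ctx would see; the with-context prompt differs only in the added instruction and the `Context:` field.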
