
# Backbone-Equated Diffusion OOD via Sparse Internal Snapshots
Source: [https://arxiv.org/html/2605.11014](https://arxiv.org/html/2605.11014)
Yadang Alexis Rouzoumka (DEMR, ONERA & SONDRA, Université Paris-Saclay; yadang-alexis.rouzoumka@centralesupelec.fr, rouzoumkaalexis@yahoo.fr), Jean Pinsolle (SONDRA, CentraleSupélec, Université Paris-Saclay), Eugénie Terreaux (DEMR, ONERA, Université Paris-Saclay), Christèle Morisseau (DEMR, ONERA, Université Paris-Saclay), Jean-Philippe Ovarlez (DEMR, ONERA & SONDRA, Université Paris-Saclay), Chengfang Ren (SONDRA, CentraleSupélec, Université Paris-Saclay)

###### Abstract

Fair comparison between diffusion-based OOD detectors is challenging, as conclusions can vary with backbone choice, corruption parameterization, and test-time budget. We address this issue through a *Mutualized Backbone-Equated* (MBE) protocol that aligns canonical corruption levels and logical test-time cost across diffusion backbones. Within this setting, we introduce *Canonical Feature Snapshots* (CFS), a family of detectors that probes a frozen diffusion backbone using only a tiny number of native internal activations at canonical low-noise levels. On a controlled CIFAR-scale benchmark, the strongest one-forward CFS variant is CFS(1×2), while an even smaller decoder-only variant remains highly competitive. This shows that much of the relative-OOD signal exposed by frozen diffusion backbones is concentrated in a small number of sparse internal states, rather than requiring full denoising trajectories or high-capacity downstream heads. We further provide a local diagnostic theory explaining these observations through conditional encoder-decoder complementarity, diagonal-score separation, and low-noise corruption stability. The official implementation is available at [https://github.com/RouzAY/cfs-diffusion-ood/](https://github.com/RouzAY/cfs-diffusion-ood/).

## 1 Introduction

Out-of-distribution (OOD) detection asks whether a model can identify inputs that fall outside the regime where its predictions should be trusted. In diffusion models, once the backbone is frozen, the same image can be probed across a structured family of corruption levels, producing a hierarchy of hidden states before the final denoising output.

Most diffusion-based OOD methods probe this backbone in *output space*: through reconstructions, denoising residuals, trajectories, output-side geometry, or posterior-consistency scores (Mahmood et al., [2021](https://arxiv.org/html/2605.11014#bib.bib13); Graham et al., [2023](https://arxiv.org/html/2605.11014#bib.bib14); Liu et al., [2023](https://arxiv.org/html/2605.11014#bib.bib11); Gao et al., [2023](https://arxiv.org/html/2605.11014#bib.bib12); Heng et al., [2024](https://arxiv.org/html/2605.11014#bib.bib8); Rouzoumka et al., [2026](https://arxiv.org/html/2605.11014#bib.bib7); Barkley et al., [2026](https://arxiv.org/html/2605.11014#bib.bib9); Shoushtari et al., [2026](https://arxiv.org/html/2605.11014#bib.bib10)). This has produced useful detectors, but it leaves open a more basic question: within a frozen diffusion backbone, where does the most useful information for OOD detection reside?

We argue that this question is obscured by two coupled issues. The first is *protocol confounding*: in diffusion OOD, conclusions can depend strongly on the checkpoint family (e.g., DDPM vs. EDM), corruption coordinates, and test-time budget, much as protocol choices already matter in post-hoc OOD benchmarking more broadly (Yang et al., [2022](https://arxiv.org/html/2605.11014#bib.bib18); Zhang et al., [2024](https://arxiv.org/html/2605.11014#bib.bib19)). The second is *representation mismatch*: output-space summaries rely on compressed readouts and may miss discriminative information still present in internal states, consistent with the growing view of diffusion models as representation learners (Yang and Wang, [2023](https://arxiv.org/html/2605.11014#bib.bib33); Luo et al., [2023](https://arxiv.org/html/2605.11014#bib.bib32); Yu et al., [2025](https://arxiv.org/html/2605.11014#bib.bib35)).

We study this question under a controlled evaluation setting. First, we introduce a *Mutualized Backbone-Equated* (MBE) protocol that aligns checkpoint-family policy, canonical corruption levels, and logical test-time cost across diffusion backbones. Second, within this setting, we propose *Canonical Feature Snapshots* (CFS), a family of OOD detectors that probes a frozen diffusion backbone through a tiny number of aligned internal activations.

On a controlled CIFAR-scale benchmark, the strongest one-forward operating point is CFS(1×2), while a decoder-only variant, CFS_dec(1×1), remains highly competitive. Under controlled evaluation, useful OOD signal in frozen diffusion backbones is already strongly concentrated in a tiny number of sparse native internal snapshots.

To explain these trends, we develop a measurable local-testing view of sparse diffusion probing. It yields three diagnostic principles: conditional encoder-decoder complementarity, diagonal-score separation, and low-noise corruption stability. The theory is local rather than universal, but it produces directly estimable quantities for selecting hooks, levels, and encoder-decoder pairings, and we show empirically that these diagnostics track downstream OOD behavior across both improved-diffusion and EDM backbones.

##### Contributions.

- **A controlled protocol and sparse detector.** We formulate diffusion OOD comparison under a *Mutualized Backbone-Equated* (MBE) protocol, and introduce *Canonical Feature Snapshots* (CFS), a detector family that probes frozen diffusion backbones through a tiny number of aligned internal activations.
- **Evidence that the OOD signal is highly concentrated.** Under MBE, CFS(1×2) is a strong one-forward operating point, while the single late-decoder probe CFS_dec(1×1) remains highly competitive, showing that a relevant signal is strongly concentrated in sparse native snapshots.
- **A local diagnostic explanation.** We develop a measurable local-testing view that explains conditional encoder-decoder complementarity, diagonal-score separation, and low-noise stability, and show that its diagnostics track empirical behavior.

## 2 Related Work and Positioning

**Post-hoc OOD detection and protocol sensitivity.** A large literature studies post-hoc OOD scores for pretrained discriminative models, including Mahalanobis distances, energy scores, activation shaping, virtual-logit matching, and nearest-neighbor geometry (Lee et al., [2018](https://arxiv.org/html/2605.11014#bib.bib27); Liu et al., [2020](https://arxiv.org/html/2605.11014#bib.bib28); Sun et al., [2021](https://arxiv.org/html/2605.11014#bib.bib29); Wang et al., [2022](https://arxiv.org/html/2605.11014#bib.bib30); Sun et al., [2022](https://arxiv.org/html/2605.11014#bib.bib31); Lu et al., [2025](https://arxiv.org/html/2605.11014#bib.bib44)). Benchmarking efforts such as OpenOOD and OpenOOD v1.5 show how strongly protocol choices can affect conclusions (Yang et al., [2022](https://arxiv.org/html/2605.11014#bib.bib18); Zhang et al., [2024](https://arxiv.org/html/2605.11014#bib.bib19)). These concerns are even sharper in diffusion OOD, where checkpoint family, corruption coordinates, and test-time budget introduce additional confounders.

**Diffusion OOD in output space.** Most diffusion-based OOD methods probe denoisers through output-space quantities or output-derived summaries. MSMA aggregates multiscale score norms (Mahmood et al., [2021](https://arxiv.org/html/2605.11014#bib.bib13)); DDPM-OOD and LMD use denoising, reconstruction, or inpainting behavior (Graham et al., [2023](https://arxiv.org/html/2605.11014#bib.bib14); Liu et al., [2023](https://arxiv.org/html/2605.11014#bib.bib11)); likelihood-based diffusion OOD has also been studied (Goodier and Campbell, [2023](https://arxiv.org/html/2605.11014#bib.bib15)); DiffGuard adds conditional guidance (Gao et al., [2023](https://arxiv.org/html/2605.11014#bib.bib12)); and DiffPath summarizes denoising trajectories from a single unconditional diffusion backbone (Heng et al., [2024](https://arxiv.org/html/2605.11014#bib.bib8)). Other methods use output-side geometric or consistency structure: GEPC measures transformation-induced posterior-consistency violations in denoiser outputs (Rouzoumka et al., [2026](https://arxiv.org/html/2605.11014#bib.bib7)), while SCOPED and EigenScore study output-side geometry or uncertainty (Barkley et al., [2026](https://arxiv.org/html/2605.11014#bib.bib9); Shoushtari et al., [2026](https://arxiv.org/html/2605.11014#bib.bib10)). Despite their differences, these methods share the same basic viewpoint: the OOD signal is extracted primarily from denoiser outputs, reconstructions, trajectories, consistency residuals, or output-side geometry.

**Diffusion models as representation learners.** A parallel literature views diffusion models as representation learners. Pretrained diffusion backbones provide useful internal features for downstream tasks (Yang and Wang, [2023](https://arxiv.org/html/2605.11014#bib.bib33)), and semantically meaningful descriptors can be consolidated from multi-layer and multi-timestep states (Luo et al., [2023](https://arxiv.org/html/2605.11014#bib.bib32)). Related work also suggests encoder/decoder asymmetry: Faster Diffusion reports that encoder activations vary less across denoising steps than decoder activations (Li et al., [2024](https://arxiv.org/html/2605.11014#bib.bib17)). While not an OOD result, this supports the idea that different internal states play different functional roles. REPA further argues that strong hidden representations are important for generation quality (Yu et al., [2025](https://arxiv.org/html/2605.11014#bib.bib35)). Recent OOD work also supports representation-space modeling beyond raw-pixel likelihoods (Ding et al., [2025](https://arxiv.org/html/2605.11014#bib.bib42); Järve et al., [2025](https://arxiv.org/html/2605.11014#bib.bib43)). These works motivate our representation-first viewpoint.

**Positioning.** We study diffusion OOD under a *Mutualized Backbone-Equated* protocol and ask: under a shared-source, backbone-equated, budget-accounted comparison, how much relative-OOD signal is already present in a tiny number of sparse native frozen snapshots? CFS answers this question with a deliberately minimal detector family: no reconstruction module, no guidance machinery, no reverse-path recursion, and no high-capacity downstream head. Table [1](https://arxiv.org/html/2605.11014#S2.T1) summarizes the positioning of CFS.

Table 1: Positioning relative to diffusion-based OOD detection. Our distinction is not merely that internal features can help, but that under a shared-source, backbone-equated protocol, a tiny number of sparse native frozen snapshots already captures a strong relative-OOD signal.

## 3 Mutualized Backbone-Equated Protocol (MBE)

OOD benchmarking has shown that evaluation details can dominate perceived progress (Yang et al., [2022](https://arxiv.org/html/2605.11014#bib.bib18); Zhang et al., [2024](https://arxiv.org/html/2605.11014#bib.bib19)). In diffusion OOD, this problem is amplified by additional degrees of freedom, including checkpoint family, corruption parameterization, and hidden test-time budget. We therefore evaluate all methods under a shared protocol designed to remove these confounders.

### 3.1 Canonical corruption and cross-backbone alignment

A first ingredient of MBE is a shared corruption view across backbone families. Diffusion backbones evaluate corrupted versions of an input across a family of noise levels (Ho et al., [2020](https://arxiv.org/html/2605.11014#bib.bib1); Nichol and Dhariwal, [2021](https://arxiv.org/html/2605.11014#bib.bib4); Karras et al., [2022](https://arxiv.org/html/2605.11014#bib.bib5)), but their native interfaces differ: some expose discrete timesteps $t$, while others use continuous noise scales such as $\sigma$. To compare OOD detectors across backbones, we therefore work with a common canonical corruption parameterization.

For a clean sample $\mathbf{x}_{0}$, we write corruption as

$$\mathbf{x}_{\lambda} = a(\lambda)\,\mathbf{x}_{0} + b(\lambda)\,\boldsymbol{\varepsilon}, \qquad \boldsymbol{\varepsilon} \sim \mathcal{N}(0, I), \tag{1}$$

and use the canonical coordinate, with a slight abuse of notation,

$$\lambda := \log\frac{a(\lambda)^{2}}{b(\lambda)^{2}}, \tag{2}$$

i.e., the logSNR. Large $\lambda$ corresponds to cleaner observations and small $\lambda$ to noisier ones.

For improved-diffusion checkpoints, $a_{t} = \sqrt{\bar{\alpha}_{t}}$, $b_{t} = \sqrt{1-\bar{\alpha}_{t}}$, and $\lambda_{t} = \log\frac{\bar{\alpha}_{t}}{1-\bar{\alpha}_{t}}$, so a desired canonical level is matched to the nearest native timestep in logSNR space. EDM-style backbones instead expose a continuous noise input; our adapter maps each canonical $\lambda$ to the appropriate backbone-specific model input and returns coefficients satisfying Eq. ([1](https://arxiv.org/html/2605.11014#S3.E1)). Canonicalization matters because many diffusion OOD methods are intrinsically multilevel: without a shared corruption coordinate, methods may be compared at levels that are not actually matched in corruption strength.
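As an illustration, the nearest-timestep matching for a discrete backbone can be sketched in a few lines of NumPy. The linear-$\beta$ schedule below is a hypothetical stand-in; a real checkpoint would expose its own $\bar{\alpha}_t$ table.

```python
import numpy as np

def logsnr_from_alphabar(alpha_bar):
    """Canonical coordinate lambda = log(a^2 / b^2) with a^2 = alpha_bar, b^2 = 1 - alpha_bar."""
    return np.log(alpha_bar) - np.log(1.0 - alpha_bar)

def nearest_timestep(lambda_target, alpha_bar):
    """Match a canonical logSNR level to the closest native discrete timestep."""
    lambdas = logsnr_from_alphabar(alpha_bar)
    return int(np.argmin(np.abs(lambdas - lambda_target)))

# Hypothetical linear-beta schedule for illustration only.
T = 1000
betas = np.linspace(1e-4, 2e-2, T)
alpha_bar = np.cumprod(1.0 - betas)

t = nearest_timestep(4.0, alpha_bar)
a, b = np.sqrt(alpha_bar[t]), np.sqrt(1.0 - alpha_bar[t])  # coefficients of Eq. (1)
```

Cleaner canonical levels (larger logSNR) map to earlier timesteps, and the returned coefficients satisfy the variance-preserving identity $a_t^2 + b_t^2 = 1$.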

### 3.2 Controlled comparison under MBE

Under MBE, a method is evaluated under the same source-family policy, preprocessing, canonical corruption levels, ID/OOD splits, and logical test-time cost as its competitors. The purpose of MBE is scientific comparison, not per-backbone peak tuning. A diffusion OOD method can otherwise appear stronger for reasons unrelated to its scoring rule: a better-matched checkpoint, a different input normalization, a different corruption coordinate, or a larger hidden test-time budget.

Concretely, MBE routes all methods through the same canonical corruption semantics and shared adapter interface, and evaluates them under the same split policy and logical budget accounting. It does not force identical native implementations. Detailed canonicalization, adapter outputs, and the baseline implementation taxonomy are deferred to Appendix [C](https://arxiv.org/html/2605.11014#A3) and Appendix [D](https://arxiv.org/html/2605.11014#A4).

For discrete backbones, mapping a continuous logSNR grid to native timesteps can produce duplicates. We therefore distinguish the candidate grid resolution $K_{\mathrm{grid}}$ from the number of effective canonical levels $K_{c}$ actually used by a method; Appendix [C.3](https://arxiv.org/html/2605.11014#A3.SS3)–[C.4](https://arxiv.org/html/2605.11014#A3.SS4) gives the precise construction.
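The collapse from $K_{\mathrm{grid}}$ to $K_c$ is easy to demonstrate: a tightly spaced logSNR grid can round to far fewer distinct timesteps. A minimal sketch, again assuming a hypothetical linear-$\beta$ schedule:

```python
import numpy as np

def effective_levels(lambda_grid, alpha_bar):
    """Map a canonical logSNR grid to native timesteps, then deduplicate:
    K_grid candidate levels may collapse to K_c <= K_grid effective ones."""
    lambdas_native = np.log(alpha_bar) - np.log(1.0 - alpha_bar)
    ts = [int(np.argmin(np.abs(lambdas_native - lam))) for lam in lambda_grid]
    return sorted(set(ts))

# Hypothetical linear-beta schedule for illustration only.
T = 1000
betas = np.linspace(1e-4, 2e-2, T)
alpha_bar = np.cumprod(1.0 - betas)

grid = np.linspace(9.0, 9.5, 16)            # K_grid = 16 tightly spaced levels
levels = effective_levels(grid, alpha_bar)  # K_c can be much smaller
```

Here the grid sits above the schedule's highest logSNR resolution, so many candidate levels round to the same low-noise timestep.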

We report the logical test-time cost as

$$\mathrm{Cost}_{m} = \#F_{m} + \#J_{m},$$

where $\#F_{m}$ is the number of backbone forward evaluations per image and $\#J_{m}$ the number of Jacobian-type evaluations, when applicable. In the main comparisons of this paper, the dominant variation is in $\#F$, and all compared methods have $\#J = 0$.

## 4 Method: Canonical Feature Snapshots (CFS)

### 4.1 Canonical Feature Snapshots

We ask whether a tiny number of native frozen internal activations already captures useful OOD signal once protocol confounders are removed. Let $P_{\star}$ denote the source distribution used to train a frozen diffusion checkpoint, and let $P$ and $Q$ denote the evaluation ID and OOD datasets. We do not treat the checkpoint as an OOD oracle for $P_{\star}$; instead, we use it as a frozen representation map and define OOD relative to the evaluation reference bank $P$.

A CFS instance is specified by a small set of canonical levels $\Lambda$, a small set of native internal hooks $\mathcal{H}$, and an ID-only scoring head on the resulting pooled slot descriptors. For each $\lambda \in \Lambda$, we form

$$\mathbf{x}_{\lambda} = a(\lambda)\,\mathbf{x}_{0} + b(\lambda)\,\boldsymbol{\varepsilon}, \qquad \boldsymbol{\varepsilon} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}),$$

run one forward pass through the frozen diffusion backbone, and pool each selected activation $\mathbf{u}_{\lambda,h}(\mathbf{x}_{\lambda})$ into a descriptor

$$\mathbf{z}_{\lambda,h}(\mathbf{x}_{0}) := \operatorname{Pool}\big(\mathbf{u}_{\lambda,h}(\mathbf{x}_{\lambda})\big).$$

The sparse representation is

$$\boldsymbol{\Phi}(\mathbf{x}_{0}) = \big[\mathbf{z}_{\lambda,h}(\mathbf{x}_{0})\big]_{(\lambda,h)\in\mathcal{S}}, \qquad \mathcal{S} \subseteq \Lambda \times \mathcal{H}.$$

OOD detection then reduces to fitting ID-only slot statistics on $\boldsymbol{\Phi}(P)$ and scoring deviation from that reference geometry.

The paper focuses on two operating points: CFS(1×2), which uses one deep encoder and one late decoder hook at a single low-noise level, and CFS_dec(1×1), a decoder-only variant. Because all retained hooks at a fixed level are captured within the same backbone forward pass, the logical cost depends only on the number of selected levels:

$$\#F_{\textsc{CFS}} = |\Lambda|, \qquad \#J_{\textsc{CFS}} = 0.$$
Figure [1](https://arxiv.org/html/2605.11014#S4.F1) summarizes this pipeline.


Figure 1: Overview of CFS. The input is corrupted at a canonical level, processed by a frozen diffusion U-Net, probed through a small number of native internal snapshots, and scored with a lightweight ID-only head.
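The snapshot mechanics of this pipeline can be sketched with standard PyTorch forward hooks. The tiny network below is a hypothetical stand-in for a frozen diffusion U-Net; in practice the hooks would attach to backbone-specific encoder/decoder block outputs.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a frozen diffusion backbone (illustration only).
backbone = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),   # "deep encoder" stand-in
    nn.ReLU(),
    nn.Conv2d(8, 3, 3, padding=1),   # "late decoder" stand-in
).eval()
for p in backbone.parameters():
    p.requires_grad_(False)

captured = {}
def make_hook(name):
    def hook(module, inputs, output):
        captured[name] = output.detach()  # store the native internal snapshot
    return hook

backbone[0].register_forward_hook(make_hook("enc_deep"))
backbone[2].register_forward_hook(make_hook("dec_late"))

def cfs_descriptors(x0, a, b):
    """One forward pass at one canonical level; pool each hooked feature map
    into [spatial mean; spatial std] per channel."""
    x_lambda = a * x0 + b * torch.randn_like(x0)
    backbone(x_lambda)
    out = {}
    for name, feat in captured.items():
        mu = feat.mean(dim=(2, 3))   # channel-wise spatial mean
        sd = feat.std(dim=(2, 3))    # channel-wise spatial std
        out[name] = torch.cat([mu, sd], dim=1)
    return out

z = cfs_descriptors(torch.randn(4, 3, 32, 32), a=0.98, b=0.2)
```

Both snapshots come from the same forward pass, which is why the logical cost counts only the number of selected levels.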
### 4.2 A local diagnostic view

Our theory is local and diagnostic rather than universal: it asks which frozen internal statistics should be most useful for testing membership in the evaluation ID reference distribution, and it focuses on quantities directly measurable on held-out data.

Fix a canonical level $\lambda$. Let $\mathbf{z}_{d,\lambda}(\mathbf{x}_{0})$ and $\mathbf{z}_{e,\lambda}(\mathbf{x}_{0})$ denote a selected late decoder snapshot and a selected deep encoder snapshot, and define

$$\mathbf{z}_{\lambda}(\mathbf{x}_{0}) = \big[\mathbf{z}_{d,\lambda}(\mathbf{x}_{0})^{\top}, \mathbf{z}_{e,\lambda}(\mathbf{x}_{0})^{\top}\big]^{\top}.$$

We use the local testing approximation

$$\mathbf{z}_{\lambda}(\mathbf{x}_{0}) \mid H_{0} \sim \mathcal{N}(\boldsymbol{\mu}_{\lambda}, \boldsymbol{\Sigma}_{\lambda}), \qquad \mathbf{z}_{\lambda}(\mathbf{x}_{0}) \mid H_{1} \sim \mathcal{N}(\boldsymbol{\mu}_{\lambda} + \boldsymbol{\Delta}_{\lambda}, \boldsymbol{\Sigma}_{\lambda}), \tag{3}$$

where $H_{0}: \mathbf{x}_{0} \sim P$ and $H_{1}: \mathbf{x}_{0} \sim Q$. This is a local model on pooled internal representations, not a global image model.

###### Theorem 1 (Conditional encoder-decoder complementarity).

Under Eq\. \([3](https://arxiv.org/html/2605.11014#S4.E3)\), the paired separation decomposes as

Seppair​\(λ\)=Sepdec​\(λ\)\+Rese∣d​\(λ\),Rese∣d​\(λ\)≥0\.\\mathrm\{Sep\}\_\{\\mathrm\{pair\}\}\(\\lambda\)=\\mathrm\{Sep\}\_\{\\mathrm\{dec\}\}\(\\lambda\)\+\\mathrm\{Res\}\_\{e\\mid d\}\(\\lambda\),\\qquad\\mathrm\{Res\}\_\{e\\mid d\}\(\\lambda\)\\geq 0\.The residual vanishes iff the encoder shift is fully explained by the decoder shift\. Hence

Seppair​\(λ\)≥Sepdec​\(λ\)\.\\mathrm\{Sep\}\_\{\\mathrm\{pair\}\}\(\\lambda\)\\geq\\mathrm\{Sep\}\_\{\\mathrm\{dec\}\}\(\\lambda\)\.The full expression is given in Appendix[B\.3](https://arxiv.org/html/2605.11014#A2.SS3)\.

Thus, the local model supports treating the late decoder as the primary sparse probe, while viewing the encoder as a complementary source of residual information.

###### Proposition 1 (Low-noise corruption stability).

Let

$$\mathbf{x}_{\lambda} = a(\lambda)\,\mathbf{x}_{0} + b(\lambda)\,\boldsymbol{\varepsilon}, \qquad \boldsymbol{\varepsilon} \sim \mathcal{N}(\mathbf{0},\mathbf{I}), \qquad \mathbf{z}_{\lambda,h}(\mathbf{x}_{0}) = \boldsymbol{\phi}_{\lambda,h}(\mathbf{x}_{\lambda}).$$

Assume that $\boldsymbol{\phi}_{\lambda,h}$ admits a first-order mean-square expansion around $a(\lambda)\mathbf{x}_{0}$, with Jacobian

$$\mathbf{J}_{\lambda,h}(\mathbf{x}_{0}) := \nabla_{\mathbf{x}}\,\boldsymbol{\phi}_{\lambda,h}(\mathbf{x})\big|_{\mathbf{x}=a(\lambda)\mathbf{x}_{0}}.$$

Then, conditionally on $\mathbf{x}_{0}$,

$$\mathbb{E}\left[\big\|\mathbf{z}_{\lambda,h}(\mathbf{x}_{0}) - \boldsymbol{\phi}_{\lambda,h}(a(\lambda)\mathbf{x}_{0})\big\|_{2}^{2} \,\middle|\, \mathbf{x}_{0}\right] = b(\lambda)^{2}\operatorname{tr}\big(\mathbf{J}_{\lambda,h}(\mathbf{x}_{0})\,\mathbf{J}_{\lambda,h}(\mathbf{x}_{0})^{\top}\big) + o\big(b(\lambda)^{2}\big). \tag{4}$$

Thus, the within-image corruption variance of the hooked representation is, to first order, proportional to $b(\lambda)^{2}$.
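Proposition 1 can be checked numerically in the linear case, where the first-order term of Eq. (4) is exact: for $\boldsymbol{\phi}(\mathbf{x}) = \mathbf{J}\mathbf{x}$, the conditional corruption variance equals $b(\lambda)^{2}\operatorname{tr}(\mathbf{J}\mathbf{J}^{\top})$ with no remainder. A small Monte Carlo sketch (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out = 16, 8
J = rng.normal(size=(d_out, d_in))
phi = lambda x: J @ x                # linear hook: the Jacobian is J everywhere

x0 = rng.normal(size=d_in)
a, b = 0.99, 0.05                    # low-noise canonical level
draws = np.stack([phi(a * x0 + b * rng.normal(size=d_in)) for _ in range(20000)])

# Monte Carlo estimate of the left-hand side of Eq. (4)
mc = np.mean(np.sum((draws - phi(a * x0)) ** 2, axis=1))
# First-order prediction b^2 tr(J J^T), exact in the linear case
pred = b ** 2 * np.trace(J @ J.T)
```

With 20,000 draws the Monte Carlo estimate matches the prediction to within a few percent, illustrating the $b(\lambda)^{2}$ scaling that makes low-noise levels stable probing points.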

Finally, Appendix[B\.2](https://arxiv.org/html/2605.11014#A2.SS2)shows that the oracle diagonal score has mean11underH0H\_\{0\}and1\+κλ/dλ1\+\\kappa\_\{\\lambda\}/d\_\{\\lambda\}underH1H\_\{1\}, where

𝐃λ=diag⁡\(𝚺λ\),κλ=𝚫λ⊤​𝐃λ−1​𝚫λ,dλ=dim\(𝐳λ\)\.\\mathbf\{D\}\_\{\\lambda\}=\\operatorname\{diag\}\(\\boldsymbol\{\\Sigma\}\_\{\\lambda\}\),\\qquad\\kappa\_\{\\lambda\}=\\boldsymbol\{\\Delta\}\_\{\\lambda\}^\{\\top\}\\mathbf\{D\}\_\{\\lambda\}^\{\-1\}\\boldsymbol\{\\Delta\}\_\{\\lambda\},\\qquad d\_\{\\lambda\}=\\dim\(\\mathbf\{z\}\_\{\\lambda\}\)\.These results motivate two diagnostics:

κ^λ​\(𝒮\),R^h​\(λ\),\\hat\{\\kappa\}\_\{\\lambda\}\(\\mathcal\{S\}\),\\qquad\\hat\{R\}\_\{h\}\(\\lambda\),corresponding to the estimated diagonal separation and content\-to\-instability ratio\. Their operational estimators are introduced in the next subsection, while their formal analysis and supplementary interpretation are deferred to Appendix[B](https://arxiv.org/html/2605.11014#A2)\.

### 4.3 Hook selection and lightweight scoring

We hook *block outputs* rather than arbitrary leaf submodules. This improves portability across architectures and avoids unstable low-level activations. The local theory suggests two priorities: late decoder states are the primary candidates, and a deep encoder hook, when used, should be interpreted as complementary residual information.

Within each structural region, we refine the final choice with a small ID-only proxy. For a candidate module $m$, let $\mathbf{z}_{i,r}^{(m)}$ be its pooled feature for image $i$ under corruption draw $r$, averaged over the selected canonical levels, and let

$$\bar{\mathbf{z}}_{i}^{(m)} = \frac{1}{R}\sum_{r=1}^{R}\mathbf{z}_{i,r}^{(m)}.$$

We estimate

$$\widehat{\mathbf{C}}_{\mathrm{img}}^{(m)} = \widehat{\operatorname{Cov}}_{i}\big(\bar{\mathbf{z}}_{i}^{(m)}\big), \qquad \widehat{\mathbf{C}}_{\mathrm{corr}}^{(m)} = N^{-1}\sum_{i=1}^{N}\widehat{\operatorname{Cov}}_{r}\big(\mathbf{z}_{i,r}^{(m)}\big),$$

and score the candidate by

$$\mathrm{Proxy}(m) = \frac{\operatorname{tr}\widehat{\mathbf{C}}_{\mathrm{img}}^{(m)}}{\operatorname{tr}\widehat{\mathbf{C}}_{\mathrm{corr}}^{(m)}}. \tag{5}$$

This favors modules that vary across ID images while remaining stable under stochastic corruption at fixed content.
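The proxy of Eq. (5) is straightforward to estimate from a tensor of pooled features; a minimal sketch on synthetic data (hypothetical values, for illustration only):

```python
import numpy as np

def proxy_score(Z):
    """Z: array of shape (N images, R corruption draws, D feature dims).
    Returns tr(C_img) / tr(C_corr) as in Eq. (5)."""
    z_bar = Z.mean(axis=1)                                   # per-image mean over draws
    tr_img = np.trace(np.cov(z_bar, rowvar=False))           # across-image variability
    tr_corr = np.mean([np.trace(np.cov(Z[i], rowvar=False))  # within-image corruption jitter
                       for i in range(Z.shape[0])])
    return tr_img / tr_corr

# Synthetic features: strong content signal, weak corruption jitter.
rng = np.random.default_rng(0)
content = rng.normal(scale=3.0, size=(64, 1, 5))   # fixed per image
jitter = rng.normal(scale=0.3, size=(64, 8, 5))    # varies per corruption draw
Z = content + jitter
```

A module whose features track image content while staying stable under re-noising yields a proxy well above one, which is exactly the profile the hook-selection step looks for.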

For a selected slot $(k,\ell)$, let

$$\mathbf{F}_{k,\ell}(\mathbf{x}_{0}) \in \mathbb{R}^{C_{k,\ell}\times H_{k,\ell}\times W_{k,\ell}}$$

denote the hooked feature map. We use the pooled descriptor

$$\mathbf{z}_{k,\ell}(\mathbf{x}_{0}) = \big[\operatorname{Mean}_{\mathrm{sp}}(\mathbf{F}_{k,\ell}(\mathbf{x}_{0}))^{\top}, \operatorname{Std}_{\mathrm{sp}}(\mathbf{F}_{k,\ell}(\mathbf{x}_{0}))^{\top}\big]^{\top} \in \mathbb{R}^{D_{k,\ell}}, \tag{6}$$

where $\operatorname{Mean}_{\mathrm{sp}}$ and $\operatorname{Std}_{\mathrm{sp}}$ are the channel-wise spatial mean and standard deviation. Using only ID-train data, we fit diagonal statistics $(\hat{\boldsymbol{\mu}}_{k,\ell}, \hat{\mathbf{v}}_{k,\ell})$. The slot score is

$$s_{k,\ell}(\mathbf{x}_{0}) = \frac{1}{D_{k,\ell}}\sum_{j=1}^{D_{k,\ell}}\frac{\big(\mathbf{z}_{k,\ell}^{(j)}(\mathbf{x}_{0}) - \hat{\boldsymbol{\mu}}_{k,\ell}^{(j)}\big)^{2}}{\hat{\mathbf{v}}_{k,\ell}^{(j)}}, \tag{7}$$

and the final score is

$$S_{\textsc{CFS}}(\mathbf{x}_{0}) = \frac{1}{|\mathcal{S}|}\sum_{(k,\ell)\in\mathcal{S}} s_{k,\ell}(\mathbf{x}_{0}). \tag{8}$$
We intentionally use a lightweight diagonal score, also referred to as an ID-only score or ID-only proxy: CFS is meant to test the quality of sparse frozen representations, not to rely on a high-capacity downstream classifier. While stronger downstream heads may improve absolute performance, they would partially confound the representation-quality interpretation targeted by MBE. Alternative heads are therefore reported only as analysis in Appendix [E.6](https://arxiv.org/html/2605.11014#A5.SS6). For the local diagnostics, we estimate

$$\hat{\kappa}_{\lambda}(\mathcal{S}) = \hat{\boldsymbol{\Delta}}_{\lambda,\mathcal{S}}^{\top}\hat{\mathbf{D}}_{\lambda,\mathcal{S}}^{-1}\hat{\boldsymbol{\Delta}}_{\lambda,\mathcal{S}}, \qquad \hat{R}_{h}(\lambda) = \frac{\operatorname{tr}\widehat{\mathbf{C}}_{\mathrm{img}}^{(h,\lambda)}}{\operatorname{tr}\widehat{\mathbf{C}}_{\mathrm{corr}}^{(h,\lambda)}}.$$

These diagnostics are not used to fit the detector; they only test whether sparse hooks and canonical levels behave as predicted by the local theory.
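The scoring head of Eqs. (6)–(8) reduces to a few lines once pooled descriptors are in hand. The sketch below uses synthetic descriptors (hypothetical data) and recovers the near-unit ID score mean predicted by the oracle analysis:

```python
import numpy as np

def fit_diag(Z_id, eps=1e-6):
    """ID-only diagonal statistics (mu_hat, v_hat) per feature dimension."""
    return Z_id.mean(axis=0), Z_id.var(axis=0) + eps

def slot_score(z, mu, v):
    """Dimension-averaged diagonal Mahalanobis score, Eq. (7)."""
    return np.mean((z - mu) ** 2 / v, axis=-1)

def cfs_score(slots_test, slots_stats):
    """Average the slot scores over the selected set S, Eq. (8)."""
    return np.mean([slot_score(z, *slots_stats[k])
                    for k, z in slots_test.items()], axis=0)

rng = np.random.default_rng(0)
Z_id = rng.normal(size=(500, 32))                 # synthetic ID-train descriptors
stats = {"dec_late": fit_diag(Z_id)}
s_id = cfs_score({"dec_late": rng.normal(size=(100, 32))}, stats)
s_ood = cfs_score({"dec_late": rng.normal(loc=2.0, size=(100, 32))}, stats)
```

On this toy data the ID scores concentrate near 1 while mean-shifted inputs score markedly higher, mirroring the $1$ versus $1 + \kappa_\lambda/d_\lambda$ separation of the oracle diagonal score.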

## 5 Experiments

### 5.1 Setup

Our main benchmark is CIFAR-scale, where protocol control is the cleanest. We use

$$\mathcal{I}_{\mathrm{small}} = \{\text{CIFAR-10}, \text{SVHN}, \text{CelebA32}\}, \qquad \mathcal{O}_{\mathrm{small}} = \{\text{CIFAR-10}, \text{SVHN}, \text{CelebA32}, \text{CIFAR-100}, \text{DTD}\}.$$

For each $i \in \mathcal{I}_{\mathrm{small}}$, all datasets in $\mathcal{O}_{\mathrm{small}} \setminus \{i\}$ are treated as OOD, yielding $12$ ID$\to$OOD pairs per backbone.

We evaluate two diffusion backbone families through the same adapter and canonical corruption interface: improved-diffusion backbones with discrete timesteps and native $\varepsilon$-prediction, and EDM-family backbones with continuous noise conditioning and one-shot $\hat{x}_{0}$ estimation. In the main benchmark, all methods use a shared CIFAR-10 checkpoint family; we then repeat the same benchmark with a shared CelebA32 checkpoint family as a source-family robustness check. Larger-scale ImageNet transfer results are deferred to Appendix [I](https://arxiv.org/html/2605.11014#A9).

### 5.2 Baselines and metrics

All baselines are evaluated under the shared MBE pipeline. Our main comparators are MSMA (Mahmood et al., [2021](https://arxiv.org/html/2605.11014#bib.bib13)), DiffPath (Heng et al., [2024](https://arxiv.org/html/2605.11014#bib.bib8)), DDPM-OOD (Graham et al., [2023](https://arxiv.org/html/2605.11014#bib.bib14)), and GEPC (Rouzoumka et al., [2026](https://arxiv.org/html/2605.11014#bib.bib7)). The detailed baseline taxonomy and implementation choices are deferred to Appendix [C](https://arxiv.org/html/2605.11014#A3) and Appendix [D](https://arxiv.org/html/2605.11014#A4).

Unless stated otherwise, the main-paper operating point is CFS(1×2), a one-level, two-hook variant using paired encoder and decoder snapshots at the same low-noise canonical level. We also report CFS_dec(1×1), a decoder-only companion. Richer variants such as CFS(2×2) and CFS(2×4) are studied in Appendix [E.1](https://arxiv.org/html/2605.11014#A5.SS1).

All methods output OOD\-high scores\. Our primary metric is AUROC; full FPR95 breakdowns are reported in the appendix\. Across pairs, we summarize performance by

$$\mathrm{AvgAUROC}_{m,b}=\frac{1}{|\mathcal{P}|}\sum_{p\in\mathcal{P}}\mathrm{AUROC}_{m,b}(p),\qquad(9)$$

$$\mathrm{AvgWorstAUROC}_{m,b}=\frac{1}{|\mathcal{I}|}\sum_{i\in\mathcal{I}}\min_{o\in\mathcal{O}(i)}\mathrm{AUROC}_{m,b}(i\to o),\qquad(10)$$

and report logical test-time cost $\mathrm{Cost}_{m}=\#F_{m}+\#J_{m}$ (with $\#J_{m}=0$ for all main-paper comparisons).
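Eqs. (9) and (10) can be sketched directly; the rank-based AUROC below uses the standard Mann-Whitney identity and is an illustration, not the paper's implementation:

```python
import numpy as np

def auroc(scores_id, scores_ood):
    """AUROC of an OOD-high score via the Mann-Whitney rank statistic
    (no tie handling; for illustration only)."""
    scores = np.concatenate([scores_id, scores_ood])
    ranks = scores.argsort().argsort() + 1          # 1-based ranks
    n_id, n_ood = len(scores_id), len(scores_ood)
    return (ranks[n_id:].sum() - n_ood * (n_ood + 1) / 2) / (n_id * n_ood)

def summarize(aurocs):
    """Eq. (9) and Eq. (10): `aurocs` maps (id_name, ood_name) -> AUROC."""
    avg = sum(aurocs.values()) / len(aurocs)
    ids = sorted({i for i, _ in aurocs})
    avg_worst = sum(
        min(v for (i, _), v in aurocs.items() if i == i0) for i0 in ids
    ) / len(ids)
    return avg, avg_worst

pairs = {("A", "B"): 0.9, ("A", "C"): 0.7, ("B", "A"): 0.8, ("B", "C"): 0.6}
avg, avg_worst = summarize(pairs)  # avg = 0.75, avg_worst = (0.7 + 0.6) / 2
```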

Appendix [E.9](https://arxiv.org/html/2605.11014#A5.SS9) provides an architecture-transfer sanity check on U-ViT, where CFS is applied to early/middle/late transformer-block snapshots rather than U-Net encoder/decoder maps.

## 6 Results

**Claim scope.** Our claim is strictly about controlled comparison under MBE: when source-family policy, canonical corruption semantics, and logical test-time cost are matched, sparse native snapshot probing is stronger than the output-space alternatives we evaluate, at comparable or lower logical cost.

### 6.1 Main CIFAR-scale comparison under MBE

Table [2](https://arxiv.org/html/2605.11014#S6.T2) reports the main controlled CIFAR-scale comparison under MBE.

Table 2: Main comparison under MBE on the CIFAR-scale benchmark. IDs: CIFAR-10, SVHN, CelebA32. OODs: the remaining datasets among {CIFAR-10, SVHN, CelebA32, CIFAR-100, DTD}, yielding 12 pairs per backbone. All methods use the same source-family policy, preprocessing, canonical corruption construction, and split policy. Complementary results are available in Appendix [F](https://arxiv.org/html/2605.11014#A6).

| Method | Backbone | AvgAUROC ↑ | AvgWorstAUROC ↑ | #F/img ↓ | Notes |
| --- | --- | --- | --- | --- | --- |
| MSMA | improved | 0.792 | 0.688 | 10 | multiscale output descriptor |
| MSMA | EDM | 0.796 | 0.689 | 10 | multiscale output descriptor |
| DiffPath | improved | 0.778 | 0.641 | 10 | recursive path statistic |
| DiffPath | EDM | 0.792 | 0.635 | 10 | recursive path statistic |
| DDPM-OOD | improved | 0.550 | 0.316 | 364 | reconstruction / manifold |
| DDPM-OOD | EDM | 0.559 | 0.320 | 364 | reconstruction / manifold |
| GEPC | improved | 0.616 | 0.546 | 8 | output-space consistency |
| GEPC | EDM | 0.774 | 0.600 | 8 | output-space consistency |
| CFS_dec(1×1) | improved | 0.886 | 0.793 | 1 | single snapshot at low noise |
| CFS_dec(1×1) | EDM | 0.919 | 0.809 | 1 | single snapshot at low noise |
| CFS(1×2) | improved | 0.887 | 0.799 | 1 | paired snapshots at low noise |
| CFS(1×2) | EDM | 0.916 | 0.814 | 1 | paired snapshots at low noise |

At a one-forward budget, CFS(1×2) is the strongest operating point on both backbone families. Relative to the strongest harmonized output-space baselines, it improves AvgWorstAUROC from 0.689 to 0.814 on EDM and from 0.688 to 0.799 on improved-diffusion, while reducing logical cost from 8–10 forwards per image to 1. At the same time, CFS_dec(1×1) remains extremely close, showing that the representation-space gain is not only strong but also highly compressible. Budget-matched ablations are reported in Appendix [E.1](https://arxiv.org/html/2605.11014#A5.SS1).

DLSR is the closest prior in spirit, but it lies outside our MBE protocol because it introduces an additional learned feature-reconstruction module. Since DLSR is only available on its native published evaluation pairs, we report a comparison on that DLSR-native subset in Table [20](https://arxiv.org/html/2605.11014#A8.T20) of Appendix [H.1](https://arxiv.org/html/2605.11014#A8.SS1).

### 6.2 Theory diagnostics: measurable quantities predict sparse-probe quality

We next test whether the local-testing theory yields measurable quantities that predict sparse-probe quality. It does (see Table [3](https://arxiv.org/html/2605.11014#S6.T3)).

First, the diagonal noncentrality diagnostic $\hat{\kappa}_{\lambda}(\mathcal{S})/d$ is strongly aligned with downstream OOD performance across both backbone families: candidate sparse probes with larger estimated diagonal separation consistently yield larger AUROC. This supports the interpretation of the diagonal score as a local detector. Representative diagnostic scatter plots for $\hat{\kappa}_{\lambda}(\mathcal{S})/d$ versus AUROC are deferred to Appendix [B.7](https://arxiv.org/html/2605.11014#A2.SS7).

Second, the content-to-instability ratio $\hat{R}_{h}(\lambda)$ is also strongly predictive of hook quality. Moreover, the within-image corruption variance decreases sharply as $b(\lambda)^{2}$ decreases, in agreement with Proposition [1](https://arxiv.org/html/2605.11014#Thmproposition1), supporting the low-noise advantage. Appendix [B.7](https://arxiv.org/html/2605.11014#A2.SS7) reports the complementary $\hat{R}_{h}(\lambda)$ and low-noise stability plots.
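The Spearman correlations used in these diagnostics are plain rank correlations; a small sketch with hypothetical diagnostic values (the numbers below are illustrative, not the paper's measurements):

```python
import numpy as np

def spearman(x, y):
    """Spearman rank correlation (no tie correction)."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean()
    ry -= ry.mean()
    return float((rx @ ry) / np.sqrt((rx @ rx) * (ry @ ry)))

# Hypothetical probe diagnostics vs. downstream AUROC values.
kappa_over_d = np.array([0.1, 0.4, 0.9, 1.7, 2.6])
auroc_vals = np.array([0.61, 0.72, 0.80, 0.88, 0.93])
rho = spearman(kappa_over_d, auroc_vals)  # 1.0: perfectly monotone relation
```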

Table 3: Empirical validation of the measurable theory diagnostics. Spearman correlations are computed across candidate sparse probes and ID→OOD pairs. The diagnostics are not used to fit the detector; they only test whether the measured behavior of sparse hooks and canonical levels agrees with the local theory.
### 6.3 Focused ablations

The appendix addresses three narrower questions: whether the gain is explained by hidden test-time budget (Appendix [E.1](https://arxiv.org/html/2605.11014#A5.SS1)), whether the gain is driven by head complexity rather than representation quality (Appendix [E.4](https://arxiv.org/html/2605.11014#A5.SS4)), and whether the ranking is stable across stochastic seeds (Appendix [E.8](https://arxiv.org/html/2605.11014#A5.SS8)). Across all cases, the central conclusion is unchanged: the gain comes primarily from *where* the frozen backbone is probed.

**Hook-selection robustness.** The ID-only hook-selection proxy does not need to recover the exact oracle pair to be effective. On improved-diffusion, the selected pair reaches 0.886 AvgAUROC versus 0.895 for the best admissible pair, while on EDM the pairwise landscape exhibits a broad plateau of strong pairs. This indicates that CFS(1×2) is not driven by brittle hook cherry-picking; see Appendix [E.2](https://arxiv.org/html/2605.11014#A5.SS2).

**External positioning outside MBE.** Table [4](https://arxiv.org/html/2605.11014#S6.T4) provides a non-protocol-matched positioning against previously reported diffusion-OOD results. These comparisons are included for external positioning only.

Table 4: External positioning outside MBE (single-checkpoint setting only). Prior-method results are taken from the respective papers or officially reported artifacts. Full results are in Appendix [H](https://arxiv.org/html/2605.11014#A8).
### 6.4 Source-family robustness

In Table [5](https://arxiv.org/html/2605.11014#S6.T5), we repeat the same CIFAR-scale benchmark with a shared CelebA32 source family to test whether the ranking persists after changing the frozen source representation.

Table 5: Source-family robustness on the CIFAR-scale benchmark. Same protocol as Table [2](https://arxiv.org/html/2605.11014#S6.T2), but with a shared CelebA32 checkpoint family. More experiments are exposed in Appendix [G](https://arxiv.org/html/2605.11014#A7).

| Method | Improved AvgAUROC ↑ | Improved AvgWorstAUROC ↑ | EDM AvgAUROC ↑ | EDM AvgWorstAUROC ↑ | #F/img ↓ |
| --- | --- | --- | --- | --- | --- |
| MSMA | 0.881 | 0.808 | 0.790 | 0.673 | 10 |
| DiffPath | 0.829 | 0.743 | 0.776 | 0.650 | 10 |
| DDPM-OOD | 0.581 | 0.362 | 0.575 | 0.325 | 364 |
| GEPC | 0.749 | 0.650 | 0.745 | 0.648 | 8 |
| CFS_dec(1×1) | 0.907 | 0.823 | 0.914 | 0.827 | 1 |
| CFS(1×2) | 0.908 | 0.827 | 0.928 | 0.846 | 1 |

The ordering remains essentially unchanged under both source families: CFS(1×2) remains best on both backbone families, and CFS_dec(1×1) remains a near-tied one-forward companion. This argues against a checkpoint-family artifact and supports the probing-space interpretation.

## 7 Discussion and Limitations

Once protocol confounders are removed, the main bottleneck is not simply *more output-space engineering*, but *where* the frozen diffusion backbone is probed. Our results suggest that a small number of native internal snapshots can retain relative-OOD geometry that output-space summaries partly attenuate. In this sense, CFS should be read not only as a score family, but as evidence that, under controlled MBE comparison, a representation-first probing strategy can be more informative than broader output-space summaries. Across controlled comparisons, CFS(1×2) is the strongest one-forward operating point, while the sparse CFS_dec(1×1) remains remarkably competitive, showing that a large fraction of the useful signal is already concentrated in a single late decoder snapshot.

**Limitations.** CFS requires internal access to the diffusion backbone and is therefore less black-box than output-space scores. Its portability is structural rather than representational: canonical levels can be aligned across backbones, but internal features remain architecture- and source-dependent. Hard multimodal ID regimes such as CIFAR-10 remain challenging. Finally, our theory is intentionally local rather than universal.

## 8 Conclusion

We introduced MBE, a protocol for fair cross-backbone diffusion OOD evaluation, and CFS, a family of minimal representation-space detectors based on sparse native internal snapshots. Under MBE, the strongest one-forward operating point in our main controlled comparisons is CFS(1×2), while the even sparser CFS_dec(1×1) remains remarkably competitive. More broadly, once protocol confounders are removed, strong relative diffusion OOD detection can already be obtained far upstream of elaborate output-space probes: a tiny number of sparse, canonically aligned internal snapshots captures a large fraction of the useful relative-OOD geometry.

## References

- SCOPED: score-curvature out-of-distribution proximity evaluator for diffusion. In The Thirteenth International Conference on Learning Representations.
- Y. Ding, A. Aleksandrauskas, A. Ahmadian, J. Unger, F. Lindsten, and G. Eilertsen (2025) Revisiting likelihood-based out-of-distribution detection by modeling representations. In Image Analysis, Lecture Notes in Computer Science, pp. 166–179.
- R. Gao, C. Zhao, L. Hong, and Q. Xu (2023) DIFFGUARD: semantic mismatch-guided out-of-distribution detection using pre-trained diffusion models. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 1579–1589.
- J. Goodier and N. D. F. Campbell (2023) Likelihood-based out-of-distribution detection with denoising diffusion probabilistic models. arXiv preprint arXiv:2310.17432.
- M. S. Graham, W. H. L. Pinaya, P. Tudosiu, P. Nachev, S. Ourselin, and M. J. Cardoso (2023) Denoising diffusion models for out-of-distribution detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 2948–2957.
- A. Heng, A. H. Thiery, and H. Soh (2024) Out-of-distribution detection with a single unconditional diffusion model. In Advances in Neural Information Processing Systems.
- J. Ho, A. Jain, and P. Abbeel (2020) Denoising diffusion probabilistic models. In Advances in Neural Information Processing Systems.
- J. Järve, K. K. Haavel, and M. Kull (2025) Probability density from latent diffusion models for out-of-distribution detection. arXiv:2508.15737.
- T. Karras, M. Aittala, T. Aila, and S. Laine (2022) Elucidating the design space of diffusion-based generative models. In Advances in Neural Information Processing Systems, Vol. 35.
- K. Lee, K. Lee, H. Lee, and J. Shin (2018) A simple unified framework for detecting out-of-distribution samples and adversarial attacks. In Advances in Neural Information Processing Systems, Vol. 31.
- S. Li, T. Hu, J. van de Weijer, F. S. Khan, T. Liu, L. Li, S. Yang, Y. Wang, M. Cheng, and J. Yang (2024) Faster diffusion: rethinking the role of the encoder for diffusion model inference. In Advances in Neural Information Processing Systems, Vol. 37.
- W. Liu, X. Wang, J. D. Owens, and Y. Li (2020) Energy-based out-of-distribution detection. In Advances in Neural Information Processing Systems, Vol. 33, pp. 21464–21475.
- Z. Liu, J. P. Zhou, Y. Wang, and K. Q. Weinberger (2023) Unsupervised out-of-distribution detection with diffusion inpainting. In Proceedings of the 40th International Conference on Machine Learning, Proceedings of Machine Learning Research, Vol. 202, pp. 22528–22538.
- S. Lu, Y. Wang, L. Sheng, L. He, A. Zheng, and J. Liang (2025) Out-of-distribution detection: a task-oriented survey of recent advances. ACM Computing Surveys.
- G. Luo, L. Dunlap, D. H. Park, A. Holynski, and T. Darrell (2023) Diffusion hyperfeatures: searching through time and space for semantic correspondence. In Advances in Neural Information Processing Systems, Vol. 36.
- A. Mahmood, J. Oliva, and M. A. Styner (2021) Multiscale score matching for out-of-distribution detection. In International Conference on Learning Representations.
- A. Nichol and P. Dhariwal (2021) Improved denoising diffusion probabilistic models. In Proceedings of the 38th International Conference on Machine Learning, Proceedings of Machine Learning Research, Vol. 139, pp. 8162–8171.
- Y. A. Rouzoumka, J. Pinsolle, E. Terreaux, C. Morisseau, J. Ovarlez, and C. Ren (2026) GEPC: group-equivariant posterior consistency for out-of-distribution detection in diffusion models. arXiv:2602.00191.
- S. Shoushtari, Y. Wang, X. Shi, S. Asif, and U. S. Kamilov (2026) EigenScore: OOD detection using posterior covariance in diffusion models. In The Thirteenth International Conference on Learning Representations.
- Y. Sun, C. Guo, and Y. Li (2021) ReAct: out-of-distribution detection with rectified activations. In Advances in Neural Information Processing Systems, Vol. 34.
- Y. Sun, Y. Ming, X. Zhu, and Y. Li (2022) Out-of-distribution detection with deep nearest neighbors. In Proceedings of the 39th International Conference on Machine Learning, Proceedings of Machine Learning Research, Vol. 162, pp. 20827–20840.
- H. Wang, Z. Li, L. Feng, and W. Zhang (2022) ViM: out-of-distribution with virtual-logit matching. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4921–4930.
- J. Yang, P. Wang, D. Zou, Z. Zhou, K. Ding, W. Peng, H. Wang, G. Chen, B. Li, Y. Sun, X. Du, K. Zhou, W. Zhang, D. Hendrycks, Y. Li, and Z. Liu (2022) OpenOOD: benchmarking generalized out-of-distribution detection. In Advances in Neural Information Processing Systems, Datasets and Benchmarks Track.
- X. Yang and X. Wang (2023) Diffusion model as representation learner. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 18938–18949.
- Y. Yang, D. Cheng, C. Fang, Y. Wang, C. Jiao, L. Cheng, N. Wang, and X. Gao (2024) Diffusion-based layer-wise semantic reconstruction for unsupervised out-of-distribution detection. In Advances in Neural Information Processing Systems, Vol. 37.
- S. Yu, S. Kwak, H. Jang, J. Jeong, J. Huang, J. Shin, and S. Xie (2025) Representation alignment for generation: training diffusion transformers is easier than you think. In International Conference on Learning Representations.
- J. Zhang, J. Yang, P. Wang, H. Wang, Y. Lin, H. Zhang, Y. Sun, X. Du, K. Zhou, W. Zhang, Y. Li, Z. Liu, Y. Chen, and H. Li (2024) OpenOOD v1.5: enhanced benchmark for out-of-distribution detection. Journal of Data-centric Machine Learning Research.

## Appendix A Appendix Roadmap

This appendix supports the paper’s central claim: under a backbone\-equated protocol, a frozen diffusion checkpoint can be used as a relative\-OOD representation map, and a tiny number of sparse low\-noise internal snapshots already captures strong useful signal\.

It is organized as follows:

- Section [B](https://arxiv.org/html/2605.11014#A2): proofs and supplementary validations for the main-paper theory, including conditional encoder-decoder complementarity, low-noise corruption stability, diagonal-score separation, canonical-level matching, and the interpretation of the hook-selection proxy.
- Sections [C](https://arxiv.org/html/2605.11014#A3)–[D](https://arxiv.org/html/2605.11014#A4): shared protocol, canonicalization, logical cost accounting, and harmonized baseline implementations.
- Section [E](https://arxiv.org/html/2605.11014#A5): focused ablations testing budget, hook robustness, canonical-level robustness, pooling, head choice, bank size, and seed stability.
- Sections [F](https://arxiv.org/html/2605.11014#A6) and [G](https://arxiv.org/html/2605.11014#A7): full CIFAR-scale pairwise results under the primary and alternative source-family policies.
- Section [H](https://arxiv.org/html/2605.11014#A8): external positioning against prior reported diffusion-based CIFAR-scale results outside the controlled MBE protocol.
- Section [I](https://arxiv.org/html/2605.11014#A9): checkpoint-controlled large-scale results on ImageNet200 and ImageNet1K using a single official ImageNet-64 improved-diffusion backbone.
- Section [J](https://arxiv.org/html/2605.11014#A10): implementation and configuration details, including the main CFS hyperparameters, ID-only head configurations, determinism settings, profiling protocol, and representative compute-resource reporting.
- Section [K](https://arxiv.org/html/2605.11014#A11): broader impact and responsible-use discussion.

Unless noted otherwise, appendix ablations are centered on CFS(1×2), the primary main-paper operating point. We retain CFS_dec(1×1) only in targeted controls where the compression result is directly relevant.

## Appendix B Theory Appendix

This appendix supplies the formal statements, proofs, and supplementary validations supporting the main paper theory\. The main text relies directly on two load\-bearing claims: conditional encoder\-decoder complementarity and low\-noise corruption stability\. We additionally formalize here the diagonal\-score separation result deferred from the main text, a canonical\-level stability bound under discretization, and a formal interpretation of the hook\-selection proxy\.

### B.1 Notation

Fix a frozen diffusion backbone and evaluation domains $P$ and $Q$, corresponding to the relative ID and OOD distributions. For a canonical corruption level $\lambda$,

$$\mathbf{x}_{\lambda}=a(\lambda)\,\mathbf{x}_{0}+b(\lambda)\,\boldsymbol{\varepsilon},\qquad\boldsymbol{\varepsilon}\sim\mathcal{N}(\mathbf{0},\mathbf{I}).$$

For a selected late decoder hook and a selected deep encoder hook,

$$\mathbf{z}_{\lambda}(\mathbf{x}_{0})=\begin{bmatrix}\mathbf{z}_{d,\lambda}(\mathbf{x}_{0})\\ \mathbf{z}_{e,\lambda}(\mathbf{x}_{0})\end{bmatrix},$$

and the local model is

$$\mathbf{z}_{\lambda}(\mathbf{x}_{0})\mid H_{0}\sim\mathcal{N}(\boldsymbol{\mu}_{\lambda},\boldsymbol{\Sigma}_{\lambda}),\qquad\mathbf{z}_{\lambda}(\mathbf{x}_{0})\mid H_{1}\sim\mathcal{N}(\boldsymbol{\mu}_{\lambda}+\boldsymbol{\Delta}_{\lambda},\boldsymbol{\Sigma}_{\lambda}),$$

with $H_{0}:\mathbf{x}_{0}\sim P$ and $H_{1}:\mathbf{x}_{0}\sim Q$.

### B.2 Deferred statement from the main text

The following result is used in the interpretation of the diagonal CFS score, but is deferred here to keep the main text focused.

Let

$$\mathbf{D}_{\lambda}=\operatorname{diag}(\boldsymbol{\Sigma}_{\lambda}),\qquad d_{\lambda}=\dim(\mathbf{z}_{\lambda}),$$

and consider the oracle diagonal score

$$s^{\circ}_{\lambda}(\mathbf{z}):=\frac{1}{d_{\lambda}}\,(\mathbf{z}-\boldsymbol{\mu}_{\lambda})^{\top}\mathbf{D}_{\lambda}^{-1}(\mathbf{z}-\boldsymbol{\mu}_{\lambda}),\qquad\kappa_{\lambda}:=\boldsymbol{\Delta}_{\lambda}^{\top}\mathbf{D}_{\lambda}^{-1}\boldsymbol{\Delta}_{\lambda}.$$

###### Theorem 2 (Diagonal-score separation under the local model).

Under Eq. ([3](https://arxiv.org/html/2605.11014#S4.E3)),

$$\mathbb{E}_{H_{0}}[s^{\circ}_{\lambda}]=1,\qquad\mathbb{E}_{H_{1}}[s^{\circ}_{\lambda}]=1+\frac{\kappa_{\lambda}}{d_{\lambda}}.$$

Moreover, the variance of $s^{\circ}_{\lambda}$ is controlled by the correlation structure of $\boldsymbol{\Sigma}_{\lambda}$, yielding the detection-power bound proved below.
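The two expectations can be checked by Monte Carlo under the local model; a small sketch with synthetic parameters (all dimensions and values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
# Build a correlated covariance with unit variances, so D = I and R = Sigma.
A = rng.normal(size=(d, d))
sigma = A @ A.T
s = np.sqrt(np.diag(sigma))
sigma = sigma / np.outer(s, s)           # correlation matrix: unit diagonal
mu = rng.normal(size=d)
delta = 0.5 * rng.normal(size=d)
kappa = float(delta @ delta)             # Delta^T D^{-1} Delta with D = I

def diag_score(z):
    """Oracle diagonal score s_lambda applied row-wise."""
    return ((z - mu) ** 2).sum(axis=1) / d

z0 = rng.multivariate_normal(mu, sigma, size=200_000)          # H0 samples
z1 = rng.multivariate_normal(mu + delta, sigma, size=200_000)  # H1 samples
m0 = diag_score(z0).mean()               # should be close to 1
m1 = diag_score(z1).mean()               # should be close to 1 + kappa / d
```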

### B.3 Proof of Theorem [1](https://arxiv.org/html/2605.11014#Thmtheorem1)

###### Proof\.

Write

$$\boldsymbol{\Delta}_{\lambda}=\begin{bmatrix}\boldsymbol{\Delta}_{d,\lambda}\\ \boldsymbol{\Delta}_{e,\lambda}\end{bmatrix},\qquad\boldsymbol{\Sigma}_{\lambda}=\begin{bmatrix}\boldsymbol{\Sigma}_{dd,\lambda}&\boldsymbol{\Sigma}_{de,\lambda}\\ \boldsymbol{\Sigma}_{ed,\lambda}&\boldsymbol{\Sigma}_{ee,\lambda}\end{bmatrix}.$$

The paired Mahalanobis separation is

$$\mathrm{Sep}_{\mathrm{pair}}(\lambda)=\boldsymbol{\Delta}_{\lambda}^{\top}\boldsymbol{\Sigma}_{\lambda}^{-1}\boldsymbol{\Delta}_{\lambda},$$

while the decoder-only separation is

$$\mathrm{Sep}_{\mathrm{dec}}(\lambda)=\boldsymbol{\Delta}_{d,\lambda}^{\top}\boldsymbol{\Sigma}_{dd,\lambda}^{-1}\boldsymbol{\Delta}_{d,\lambda}.$$

Let

$$\boldsymbol{\Sigma}_{e\mid d,\lambda}=\boldsymbol{\Sigma}_{ee,\lambda}-\boldsymbol{\Sigma}_{ed,\lambda}\boldsymbol{\Sigma}_{dd,\lambda}^{-1}\boldsymbol{\Sigma}_{de,\lambda}$$

be the Schur complement of the decoder block. By the block inverse formula,

$$\boldsymbol{\Sigma}_{\lambda}^{-1}=\begin{bmatrix}\boldsymbol{\Sigma}_{dd,\lambda}^{-1}+\boldsymbol{\Sigma}_{dd,\lambda}^{-1}\boldsymbol{\Sigma}_{de,\lambda}\boldsymbol{\Sigma}_{e\mid d,\lambda}^{-1}\boldsymbol{\Sigma}_{ed,\lambda}\boldsymbol{\Sigma}_{dd,\lambda}^{-1}&-\boldsymbol{\Sigma}_{dd,\lambda}^{-1}\boldsymbol{\Sigma}_{de,\lambda}\boldsymbol{\Sigma}_{e\mid d,\lambda}^{-1}\\ -\boldsymbol{\Sigma}_{e\mid d,\lambda}^{-1}\boldsymbol{\Sigma}_{ed,\lambda}\boldsymbol{\Sigma}_{dd,\lambda}^{-1}&\boldsymbol{\Sigma}_{e\mid d,\lambda}^{-1}\end{bmatrix}.$$

Substituting this expression into $\boldsymbol{\Delta}_{\lambda}^{\top}\boldsymbol{\Sigma}_{\lambda}^{-1}\boldsymbol{\Delta}_{\lambda}$ and collecting terms gives

$$\mathrm{Sep}_{\mathrm{pair}}(\lambda)=\mathrm{Sep}_{\mathrm{dec}}(\lambda)+\mathrm{Res}_{e\mid d}(\lambda),$$

where

$$\mathrm{Res}_{e\mid d}(\lambda)=\bigl(\boldsymbol{\Delta}_{e,\lambda}-\boldsymbol{\Sigma}_{ed,\lambda}\boldsymbol{\Sigma}_{dd,\lambda}^{-1}\boldsymbol{\Delta}_{d,\lambda}\bigr)^{\top}\boldsymbol{\Sigma}_{e\mid d,\lambda}^{-1}\bigl(\boldsymbol{\Delta}_{e,\lambda}-\boldsymbol{\Sigma}_{ed,\lambda}\boldsymbol{\Sigma}_{dd,\lambda}^{-1}\boldsymbol{\Delta}_{d,\lambda}\bigr).$$

Since $\boldsymbol{\Sigma}_{e\mid d,\lambda}\succ 0$, the residual term is nonnegative. Therefore

$$\mathrm{Sep}_{\mathrm{pair}}(\lambda)\geq\mathrm{Sep}_{\mathrm{dec}}(\lambda).$$

Equality holds if and only if the conditional residual vanishes, namely

$$\boldsymbol{\Delta}_{e,\lambda}=\boldsymbol{\Sigma}_{ed,\lambda}\boldsymbol{\Sigma}_{dd,\lambda}^{-1}\boldsymbol{\Delta}_{d,\lambda}.$$

This proves the decomposition and the claimed nonnegativity. ∎
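The decomposition can be verified numerically on a random well-conditioned covariance; a small sketch (dimensions and matrices are synthetic):

```python
import numpy as np

rng = np.random.default_rng(1)
p, q = 5, 4                                  # decoder / encoder hook dims
A = rng.normal(size=(p + q, p + q))
sigma = A @ A.T + (p + q) * np.eye(p + q)    # well-conditioned joint covariance
delta = rng.normal(size=p + q)

S_dd, S_de = sigma[:p, :p], sigma[:p, p:]
S_ed, S_ee = sigma[p:, :p], sigma[p:, p:]
d_d, d_e = delta[:p], delta[p:]

sep_pair = delta @ np.linalg.solve(sigma, delta)
sep_dec = d_d @ np.linalg.solve(S_dd, d_d)

# Conditional residual: (Delta_e - S_ed S_dd^{-1} Delta_d)^T S_{e|d}^{-1} (...)
resid_mean = d_e - S_ed @ np.linalg.solve(S_dd, d_d)
S_cond = S_ee - S_ed @ np.linalg.solve(S_dd, S_de)   # Schur complement
res = resid_mean @ np.linalg.solve(S_cond, resid_mean)

print(np.isclose(sep_pair, sep_dec + res))   # True: decomposition holds
print(sep_pair >= sep_dec)                   # True: residual is nonnegative
```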

### B.4 Proof of Proposition [1](https://arxiv.org/html/2605.11014#Thmproposition1)

###### Proof\.

Fix $\mathbf{x}_{0}$ and write

$$\mathbf{x}_{\lambda}=a(\lambda)\,\mathbf{x}_{0}+b(\lambda)\,\boldsymbol{\varepsilon}.$$

By the assumed first-order mean-square expansion of $\boldsymbol{\phi}_{\lambda,h}$ around $a(\lambda)\mathbf{x}_{0}$, we have

$$\boldsymbol{\phi}_{\lambda,h}(\mathbf{x}_{\lambda})=\boldsymbol{\phi}_{\lambda,h}(a(\lambda)\mathbf{x}_{0})+b(\lambda)\,\mathbf{J}_{\lambda,h}(\mathbf{x}_{0})\,\boldsymbol{\varepsilon}+\mathbf{r}_{\lambda,h}(\mathbf{x}_{0},\boldsymbol{\varepsilon}),$$

with

$$\mathbb{E}\left[\left\|\mathbf{r}_{\lambda,h}(\mathbf{x}_{0},\boldsymbol{\varepsilon})\right\|_{2}^{2}\,\middle|\,\mathbf{x}_{0}\right]=o\bigl(b(\lambda)^{2}\bigr).$$

Therefore,

$$\mathbf{z}_{\lambda,h}(\mathbf{x}_{0})-\boldsymbol{\phi}_{\lambda,h}(a(\lambda)\,\mathbf{x}_{0})=b(\lambda)\,\mathbf{J}_{\lambda,h}(\mathbf{x}_{0})\,\boldsymbol{\varepsilon}+\mathbf{r}_{\lambda,h}(\mathbf{x}_{0},\boldsymbol{\varepsilon}).$$

Taking squared norms and conditional expectations gives

$$\mathbb{E}\left[\left\|\mathbf{z}_{\lambda,h}(\mathbf{x}_{0})-\boldsymbol{\phi}_{\lambda,h}(a(\lambda)\,\mathbf{x}_{0})\right\|_{2}^{2}\,\middle|\,\mathbf{x}_{0}\right]=b(\lambda)^{2}\,\mathbb{E}\bigl[\|\mathbf{J}_{\lambda,h}(\mathbf{x}_{0})\,\boldsymbol{\varepsilon}\|_{2}^{2}\bigr]+o\bigl(b(\lambda)^{2}\bigr).$$

Since $\boldsymbol{\varepsilon}\sim\mathcal{N}(\mathbf{0},\mathbf{I})$,

$$\mathbb{E}\bigl[\|\mathbf{J}_{\lambda,h}(\mathbf{x}_{0})\,\boldsymbol{\varepsilon}\|_{2}^{2}\bigr]=\operatorname{tr}\bigl(\mathbf{J}_{\lambda,h}(\mathbf{x}_{0})\,\mathbf{J}_{\lambda,h}(\mathbf{x}_{0})^{\top}\bigr).$$

Hence

$$\mathbb{E}\left[\left\|\mathbf{z}_{\lambda,h}(\mathbf{x}_{0})-\boldsymbol{\phi}_{\lambda,h}(a(\lambda)\,\mathbf{x}_{0})\right\|_{2}^{2}\,\middle|\,\mathbf{x}_{0}\right]=b(\lambda)^{2}\operatorname{tr}\bigl(\mathbf{J}_{\lambda,h}(\mathbf{x}_{0})\,\mathbf{J}_{\lambda,h}(\mathbf{x}_{0})^{\top}\bigr)+o\bigl(b(\lambda)^{2}\bigr),$$

which proves the result. ∎

### B\.5Proof of Theorem[2](https://arxiv.org/html/2605.11014#Thmtheorem2)

###### Proof\.

Let

$$\mathbf{D}_{\lambda}=\operatorname{diag}(\boldsymbol{\Sigma}_{\lambda}),\qquad\mathbf{R}_{\lambda}=\mathbf{D}_{\lambda}^{-1/2}\boldsymbol{\Sigma}_{\lambda}\mathbf{D}_{\lambda}^{-1/2},\qquad\boldsymbol{\delta}_{\lambda}=\mathbf{D}_{\lambda}^{-1/2}\boldsymbol{\Delta}_{\lambda}\,.$$

Define the whitened-but-correlated variable

$$\mathbf{y}=\mathbf{D}_{\lambda}^{-1/2}(\mathbf{z}-\boldsymbol{\mu}_{\lambda})\,.$$

Under $H_{0}$, we have $\mathbf{y}\sim\mathcal{N}(\mathbf{0},\mathbf{R}_{\lambda})$, and under $H_{1}$, $\mathbf{y}\sim\mathcal{N}(\boldsymbol{\delta}_{\lambda},\mathbf{R}_{\lambda})$. The oracle diagonal score can be written as

$$s_{\lambda}^{\circ}(\mathbf{z})=\frac{1}{d_{\lambda}}\,\mathbf{y}^{\top}\mathbf{y}\,.$$

Because $\mathbf{R}_{\lambda}$ is a correlation matrix, $\operatorname{tr}(\mathbf{R}_{\lambda})=d_{\lambda}$. Hence

$$\mathbb{E}_{H_{0}}[\mathbf{y}^{\top}\mathbf{y}]=\operatorname{tr}(\mathbf{R}_{\lambda})=d_{\lambda}\,,$$

so $\mathbb{E}_{H_{0}}[s_{\lambda}^{\circ}]=1$. Under $H_{1}$,

$$\mathbb{E}_{H_{1}}\!\left[\mathbf{y}^{\top}\mathbf{y}\right]=\operatorname{tr}(\mathbf{R}_{\lambda})+\boldsymbol{\delta}_{\lambda}^{\top}\boldsymbol{\delta}_{\lambda}=d_{\lambda}+\kappa_{\lambda}\,,$$

where

$$\kappa_{\lambda}=\boldsymbol{\Delta}_{\lambda}^{\top}\,\mathbf{D}_{\lambda}^{-1}\,\boldsymbol{\Delta}_{\lambda}=\boldsymbol{\delta}_{\lambda}^{\top}\boldsymbol{\delta}_{\lambda}\,.$$

Therefore

$$\mathbb{E}_{H_{1}}[s_{\lambda}^{\circ}]=1+\frac{\kappa_{\lambda}}{d_{\lambda}}\,.$$

For a Gaussian quadratic form $\mathbf{y}^{\top}\mathbf{A}\,\mathbf{y}$ with $\mathbf{A}=\mathbf{I}$, the variance satisfies

$$\operatorname{Var}\!\left(\mathbf{y}^{\top}\mathbf{y}\right)=2\operatorname{tr}\!\left(\mathbf{R}_{\lambda}^{2}\right)+4\,\boldsymbol{\delta}_{\lambda}^{\top}\,\mathbf{R}_{\lambda}\,\boldsymbol{\delta}_{\lambda}\,,$$

with the second term absent under $H_{0}$. Dividing by $d_{\lambda}^{2}$ gives

$$\operatorname{Var}_{H_{0}}(s_{\lambda}^{\circ})=\frac{2}{d_{\lambda}^{2}}\operatorname{tr}\!\left(\mathbf{R}_{\lambda}^{2}\right)\,,\qquad\operatorname{Var}_{H_{1}}(s_{\lambda}^{\circ})=\frac{2}{d_{\lambda}^{2}}\operatorname{tr}\!\left(\mathbf{R}_{\lambda}^{2}\right)+\frac{4}{d_{\lambda}^{2}}\,\boldsymbol{\delta}_{\lambda}^{\top}\,\mathbf{R}_{\lambda}\,\boldsymbol{\delta}_{\lambda}\,.$$

Finally, apply Cantelli's inequality under $H_{1}$ to the random variable $s_{\lambda}^{\circ}$. For any $\tau<\mathbb{E}_{H_{1}}[s_{\lambda}^{\circ}]$,

$$\Pr_{H_{1}}[s_{\lambda}^{\circ}\leq\tau]\leq\frac{\operatorname{Var}_{H_{1}}(s_{\lambda}^{\circ})}{\operatorname{Var}_{H_{1}}(s_{\lambda}^{\circ})+\left(\mathbb{E}_{H_{1}}[s_{\lambda}^{\circ}]-\tau\right)^{2}}\,.$$

Taking complements and substituting the mean gives

$$\Pr_{H_{1}}[s_{\lambda}^{\circ}>\tau]\geq 1-\frac{\operatorname{Var}_{H_{1}}(s_{\lambda}^{\circ})}{\operatorname{Var}_{H_{1}}(s_{\lambda}^{\circ})+\left(1+\kappa_{\lambda}/d_{\lambda}-\tau\right)^{2}}\,.$$

This proves the theorem. ∎
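The closed-form quantities in the proof can be evaluated directly. The helper below is an illustrative sketch (the function and argument names are ours, not from the released code): it computes the $H_{1}$ mean and variance and the resulting Cantelli lower bound from the scalar summaries $d_{\lambda}$, $\kappa_{\lambda}$, $\operatorname{tr}(\mathbf{R}_{\lambda}^{2})$, and $\boldsymbol{\delta}_{\lambda}^{\top}\mathbf{R}_{\lambda}\boldsymbol{\delta}_{\lambda}$.

```python
def cantelli_detection_bound(d, kappa, tr_R2, delta_R_delta, tau):
    """Lower bound on Pr_H1[s > tau] for the oracle diagonal score.

    Inputs are the scalar summaries appearing in the proof:
    d             -- slot dimension d_lambda
    kappa         -- diagonal noncentrality kappa_lambda
    tr_R2         -- tr(R_lambda^2)
    delta_R_delta -- delta^T R delta (inflates the H1 variance)
    tau           -- threshold, must lie below the H1 mean 1 + kappa/d
    """
    mean_h1 = 1.0 + kappa / d
    var_h1 = (2.0 * tr_R2 + 4.0 * delta_R_delta) / d**2
    if tau >= mean_h1:
        raise ValueError("Cantelli form requires tau < E_H1[s]")
    gap = mean_h1 - tau
    return 1.0 - var_h1 / (var_h1 + gap**2)
```

For instance, with $d=512$, $\kappa=256$ (so $\kappa/d=0.5$), $\mathbf{R}_{\lambda}=\mathbf{I}$ (hence $\operatorname{tr}(\mathbf{R}^{2})=d$ and $\boldsymbol{\delta}^{\top}\mathbf{R}\boldsymbol{\delta}=\kappa$), and $\tau=1.1$, the bound already exceeds $0.95$, illustrating how a moderate per-dimension noncentrality yields strong separation.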

### B\.6Canonical\-level matching sanity check

This diagnostic tests the implementation role of canonical logSNR matching. For a discrete backbone, a requested canonical level $\lambda_{\mathrm{ref}}$ must be mapped to a native timestep $t$, whose effective logSNR can differ from $\lambda_{\mathrm{ref}}$ and is denoted $\lambda_{t}$.

The following simple bound explains why such mismatches can matter: if the internal descriptor and the resulting slot score vary smoothly with the canonical level, then the logSNR mismatch induces controlled score drift\.

Fix a selected hook $h$. Assume that, on the evaluation domain $\mathcal{X}$, the pooled descriptor $\mathbf{z}_{\cdot,h}(\mathbf{x})$ is $L_{h}$-Lipschitz in the canonical level:

$$\|\mathbf{z}_{\lambda,h}(\mathbf{x})-\mathbf{z}_{\lambda^{\prime},h}(\mathbf{x})\|_{2}\leq L_{h}\,|\lambda-\lambda^{\prime}|\,,\qquad\forall\mathbf{x}\in\mathcal{X},\ \text{for any two corruption levels }\lambda,\lambda^{\prime}\,.$$

Assume also that the corresponding oracle slot score $s^{\circ}_{\cdot,h}$ is $M_{h}$-Lipschitz in its feature argument over the attained range. Then

$$|s^{\circ}_{\lambda,h}(\mathbf{x})-s^{\circ}_{\lambda^{\prime},h}(\mathbf{x})|\leq M_{h}L_{h}\,|\lambda-\lambda^{\prime}|\,.$$

Consequently, for a discrete backbone using a native matched level $\lambda_{t}$ in place of a requested canonical level $\lambda_{\mathrm{ref}}$, averaging these bounds over the selected hooks in the oracle CFS score gives

$$|S^{\circ}_{\textsc{CFS},\lambda_{\mathrm{ref}}}(\mathbf{x})-S^{\circ}_{\textsc{CFS},\lambda_{t}}(\mathbf{x})|\leq\frac{1}{|\mathcal{S}|}\sum_{h\in\mathcal{S}}M_{h}L_{h}\,|\lambda_{\mathrm{ref}}-\lambda_{t}|\,.$$
The same reasoning can be applied to the AUROC difference.

This diagnostic is only meaningful for discrete backbones (it is detailed in Appendix [C.4](https://arxiv.org/html/2605.11014#A3.SS4)), so we report it on improved-diffusion. Figure [2](https://arxiv.org/html/2605.11014#A2.F2) shows two complementary views. First, larger effective canonical mismatches tend to induce larger OOD score drift. Second, sufficiently large mismatches can also produce measurable AUROC degradation relative to the matched logSNR-uniform reference policy. The relation is not perfectly linear, which is expected under discrete timestep collisions and pair-dependent difficulty, but the global trend supports the role of canonical logSNR matching as a genuine methodological requirement rather than a cosmetic alignment choice.

![Refer to caption](https://arxiv.org/html/2605.11014v1/figs/canonical_gap_vs_drift_improved.png)

![Refer to caption](https://arxiv.org/html/2605.11014v1/figs/canonical_gap_vs_delta_auroc_improved.png)

Figure 2: Canonical-level matching sanity check on improved-diffusion. Left: standardized score drift versus effective logSNR mismatch $|\lambda_{\mathrm{ref}}-\lambda_{t}|$. Right: AUROC degradation relative to the matched logSNR policy. Large mismatches can induce score drift and degrade AUROC, supporting logSNR matching as an implementation requirement rather than a cosmetic alignment choice.
### B\.7Empirical diagnostic protocol

The theory of Section [4](https://arxiv.org/html/2605.11014#S4) yields two measurable quantities used in our empirical diagnostics: the diagonal noncentrality $\hat{\kappa}_{\lambda}(\mathcal{S})$ and the content-to-instability ratio $\hat{R}_{h}(\lambda)$. These quantities are estimated on held-out data and compared directly against downstream OOD performance.

For $\hat{\kappa}_{\lambda}(\mathcal{S})$, we study rank correlation with pairwise AUROC across candidate sparse probes and ID$\to$OOD pairs. This diagnostic is summarized in the main paper through Table [3](https://arxiv.org/html/2605.11014#S6.T3), with representative scatter plots deferred to Figure [3](https://arxiv.org/html/2605.11014#A2.F3). The data confirm the theory's prediction on both improved-diffusion and EDM backbones: across candidate sparse probes and ID$\to$OOD pairs, larger values of $\hat{\kappa}_{\lambda}(\mathcal{S})/d$ are strongly aligned with larger AUROC, supporting the interpretation of the diagonal score as a structured local detector rather than as an arbitrary lightweight classifier.
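A minimal sketch of how a per-dimension diagonal noncentrality estimate can be formed from pooled slot descriptors (the function and variable names here are illustrative assumptions, not taken from the released implementation): the quantity is the squared ID-standardized mean shift, averaged over feature dimensions.

```python
def diag_noncentrality_per_dim(id_feats, ood_feats, eps=1e-12):
    """Estimate kappa-hat / d from two sets of pooled descriptors.

    id_feats, ood_feats -- lists of equal-length feature vectors
    Returns the mean over dimensions of (mu_ood - mu_id)^2 / var_id,
    the per-dimension noncentrality used in the diagnostic.
    """
    d = len(id_feats[0])
    kappa = 0.0
    for j in range(d):
        id_col = [v[j] for v in id_feats]
        mu_id = sum(id_col) / len(id_col)
        var_id = sum((x - mu_id) ** 2 for x in id_col) / len(id_col)
        mu_ood = sum(v[j] for v in ood_feats) / len(ood_feats)
        kappa += (mu_ood - mu_id) ** 2 / max(var_id, eps)
    return kappa / d
```

Larger values of this statistic should, per Theorem 2, align with larger downstream AUROC for the corresponding probe.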

For $\hat{R}_{h}(\lambda)$, we study rank correlation with average AUROC across candidate hooks, and we additionally measure within-image corruption variance as a function of $b(\lambda)^{2}$. These validations are reported in Figure [4](https://arxiv.org/html/2605.11014#A2.F4).

![Refer to caption](https://arxiv.org/html/2605.11014v1/figs/kappa_vs_auroc_improved.png)

![Refer to caption](https://arxiv.org/html/2605.11014v1/figs/kappa_vs_auroc_edm.png)

Figure 3: Empirical validation of diagonal separation. The estimated diagonal noncentrality $\hat{\kappa}_{\lambda}(\mathcal{S})/d$ is strongly aligned with downstream AUROC on both improved-diffusion and EDM backbones.

![Refer to caption](https://arxiv.org/html/2605.11014v1/figs/ratio_vs_auroc_improved.png)

![Refer to caption](https://arxiv.org/html/2605.11014v1/figs/ratio_vs_auroc_edm.png)

![Refer to caption](https://arxiv.org/html/2605.11014v1/figs/within_vs_b2_improved.png)

![Refer to caption](https://arxiv.org/html/2605.11014v1/figs/within_vs_b2_edm.png)

Figure 4: Empirical validation of low-noise stability and the content-to-instability diagnostic. Top row: $\hat{R}_{h}(\lambda)$ versus downstream Avg AUROC across candidate hooks for improved-diffusion (left) and EDM (right). Bottom row: within-image corruption variance versus $b(\lambda)^{2}$ for the selected encoder and decoder hooks on improved-diffusion (left) and EDM (right). Across both backbone families, larger content-to-instability ratios align with stronger downstream performance, and lower-noise canonical probing reduces corruption-induced variability.
### B\.8Interpretation

Taken together, the appendix theory supports three concrete conclusions about sparse diffusion probing\.

First, the paired encoder-decoder separation decomposes into a decoder term plus a nonnegative conditional encoder residual. This supports using a late decoder snapshot as the primary sparse probe, while interpreting encoder snapshots as complementary rather than primary. Second, the diagonal CFS score is governed by a measurable diagonal noncentrality parameter $\kappa_{\lambda}$, so the score can be analyzed as a structured local detector rather than as an arbitrary lightweight head. Third, low-noise canonical probing reduces corruption-induced instability: under a local smoothness approximation, within-image feature variance scales as $b(\lambda)^{2}$, and the empirical content-to-instability diagnostic captures this effect.

When included, the canonical\-matching diagnostic further supports that logSNR alignment is not merely a bookkeeping device: large discrete\-level mismatches can induce score drift and degrade AUROC\.

The theory remains local rather than universal\. It does not claim that sparse internal probing dominates every conceivable output\-space detector\. Its purpose is narrower: to explain and predict which sparse internal probes should work best under a backbone\-equated protocol\.

## Appendix CExperimental Protocols, Splits, and Canonicalization

This section groups the protocol elements shared across methods: dataset splits, canonical level construction, backbone\-specific corruption mappings, adapter outputs, and logical cost accounting\.

### C\.1Small\-scale benchmark protocol

The small\-scale benchmark uses the three ID datasets

$$\mathcal{I}_{\text{small}}=\{\text{CIFAR-10},\,\text{SVHN},\,\text{CelebA32}\}\,,$$

and the OOD pool

$$\mathcal{O}_{\text{small}}=\{\text{CIFAR-10},\,\text{SVHN},\,\text{CelebA32},\,\text{CIFAR-100},\,\text{DTD}\}\,.$$

For each ID dataset, all remaining datasets in the pool are treated as OOD, producing 12 ID$\to$OOD pairs per backbone.

For each ID dataset, we define:

- •an ID\-fit split used to fit ID\-only statistics and density heads;
- •an ID\-test split used for in\-distribution evaluation;
- •external OOD\-test splits for all OOD datasets\.

These splits are fixed and shared across methods\.

##### Relative\-OOD interpretation\.

Let $P_{\star}$ denote the training distribution of the frozen checkpoint, and let $P$ denote the chosen evaluation ID dataset. Our detector should be interpreted as performing OOD detection *relative to the evaluation reference bank* $P$, not as an absolute test of membership in $P_{\star}$. This distinction is especially important in cross-source or transfer settings.

### C\.2ImageNet200 / ImageNet1K protocol

ImageNet200 and ImageNet1K are used as checkpoint\-controlled large\-scale benchmarks under a single official ImageNet\-64 improved\-diffusion backbone\. Since the official training split is not used in our setup, we adopt deterministic disjoint val\-only protocols\. For ImageNet200, we use a class\-stratified split of the validation set into:

- •ID\-fit / bank split:used to fit ID statistics;
- •ID\-test split:the complementary subset used for in\-distribution evaluation\.

For ImageNet1K, we use the same val\-only deterministic split policy to define ID\-fit and ID\-test subsets\.

In both benchmarks, the OOD suite contains two near-OOD sets, NINCO and SSB-hard, together with one far-OOD texture-shift set, Textures.

##### Checkpoint\-controlled protocol\.

All large\-scale results in this appendix use the same official ImageNet\-64 improved\-diffusion checkpoint \(imagenet64\_uncond\_100M\_1500K\.pt\)\. This removes cross\-family variability and isolates method differences under a single shared\-source backbone\.

### C.3 Canonical level construction: $K_{\text{grid}}$ vs. $K_{c}$

##### Purpose\.

Discrete backbones require mapping continuous canonical levels to discrete timesteps, which may produce duplicates\. We separate:

- •$K_{\text{grid}}$: candidate grid resolution used to build a dense set of possible canonical levels;
- •$K_{c}$: number of selected levels actually used by the method.

##### Construction\.

We construct a candidate grid

$$\{\lambda_{i}^{\mathrm{grid}}\}_{i=1}^{K_{\text{grid}}}\subset[\lambda_{\min},\lambda_{\max}]\,.$$

For discrete backbones, each candidate level is mapped to the nearest available timestep in logSNR space. When unique=true, repeated timestep assignments are removed, and the final selected set contains at most $K_{c}$ distinct levels. For continuous backbones, the selected levels are evenly spaced directly in canonical space.

##### Interpretation\.

Increasing $K_{\text{grid}}$ improves mapping fidelity on discrete backbones without necessarily increasing test-time cost, because only the final selected $K_{c}$ levels are used downstream.

### C\.4Backbone\-specific realization of canonical corruption

#### C\.4\.1Improved\-diffusion \(VP\-DDPM\)

For improved\-diffusion checkpoints, the native forward corruption process is

$$\mathbf{x}_{t}=\sqrt{\bar{\alpha}_{t}}\,\mathbf{x}_{0}+\sqrt{1-\bar{\alpha}_{t}}\,\boldsymbol{\varepsilon},\qquad\boldsymbol{\varepsilon}\sim\mathcal{N}(\mathbf{0},\mathbf{I})\,,$$

where $\bar{\alpha}_{t}$ is the cumulative noise-schedule coefficient of the checkpoint.

Hence, the corresponding corruption coefficients in the shared canonical form

$$\mathbf{x}_{\lambda}=a(\lambda)\,\mathbf{x}_{0}+b(\lambda)\,\boldsymbol{\varepsilon}$$

are

$$a_{t}=\sqrt{\bar{\alpha}_{t}}\,,\qquad b_{t}=\sqrt{1-\bar{\alpha}_{t}}\,,$$

and the induced discrete canonical logSNR is

$$\lambda_{t}=\log\frac{\bar{\alpha}_{t}}{1-\bar{\alpha}_{t}}\,.$$

A desired canonical level $\lambda$ is therefore mapped to the nearest native timestep in logSNR space:

$$t(\lambda)=\arg\min_{t}|\lambda_{t}-\lambda|\,.$$

When several candidate canonical levels map to the same native timestep, duplicates are removed by the unique-level construction described in Appendix [C.3](https://arxiv.org/html/2605.11014#A3.SS3).
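The level-matching and unique-level rules above can be sketched in a few lines. This is an illustrative pure-Python version (helper names are ours; the actual adapter code may differ):

```python
import math

def logsnr_from_alpha_bar(alpha_bar):
    """lambda_t = log(alpha_bar_t / (1 - alpha_bar_t))."""
    return math.log(alpha_bar / (1.0 - alpha_bar))

def match_canonical_levels(alpha_bars, lambda_grid, k_c):
    """Map each candidate canonical level to its nearest native timestep in
    logSNR space, dropping duplicate assignments (the unique-level rule),
    and keeping at most k_c distinct levels."""
    lam_t = [logsnr_from_alpha_bar(ab) for ab in alpha_bars]
    selected, seen = [], set()
    for lam in lambda_grid:
        # nearest native timestep in logSNR space
        t = min(range(len(lam_t)), key=lambda i: abs(lam_t[i] - lam))
        if t not in seen:
            seen.add(t)
            selected.append((t, lam_t[t]))
        if len(selected) == k_c:
            break
    return selected
```

With a dense candidate grid, several requested levels can collapse onto the same native timestep; the dedup step is why the final selected set has at most $K_{c}$ distinct levels even when $K_{\text{grid}}$ is large.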

#### C\.4\.2EDM family

EDM-style models expose a continuous noise variable, but the exact native parameterization depends on the checkpoint preconditioning family. Rather than matching checkpoints through their native interface directly, we canonicalize them through the effective corruption ratio

$$\tilde{\sigma}(\lambda):=\frac{b(\lambda)}{a(\lambda)}=\exp(-\lambda/2)\,,$$

which follows from

$$\lambda=\log\frac{a(\lambda)^{2}}{b(\lambda)^{2}}\,.$$

The adapter therefore maps each canonical level $\lambda$ to the checkpoint-specific model input corresponding to the same effective ratio $\tilde{\sigma}(\lambda)$, and returns coefficients $(a,b)$ such that

$$\mathbf{x}_{\lambda}=a\,\mathbf{x}_{0}+b\,\boldsymbol{\varepsilon}\,.$$

Different EDM-family preconditionings may realize the same $\tilde{\sigma}$ with different coefficient pairs $(a,b)$. For example, a VE/EDM-style realization uses

$$a=1,\qquad b=\tilde{\sigma}\,,$$

whereas a VP-style realization can be written as

$$a=(1+\tilde{\sigma}^{2})^{-1/2}\,,\qquad b=\tilde{\sigma}\,(1+\tilde{\sigma}^{2})^{-1/2}\,.$$

In both cases, $b/a=\tilde{\sigma}$, so the same canonical $\lambda$ corresponds to the same effective corruption strength even though the native checkpoint interface differs.
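The two realizations can be checked numerically. A small sketch (the function name and mode strings are illustrative) that returns $(a,b)$ for a given canonical $\lambda$ under either convention:

```python
import math

def edm_coeffs(lam, preconditioning="ve"):
    """Return (a, b) with b/a = sigma~(lambda) = exp(-lambda/2).

    've' uses the pair (1, sigma~); 'vp' uses the variance-preserving
    pair with a^2 + b^2 = 1. Both realize the same canonical logSNR.
    """
    sigma = math.exp(-lam / 2.0)
    if preconditioning == "ve":
        return 1.0, sigma
    if preconditioning == "vp":
        norm = (1.0 + sigma**2) ** -0.5
        return norm, sigma * norm
    raise ValueError(preconditioning)
```

Recomputing $\lambda=\log(a^{2}/b^{2})$ from either pair recovers the requested canonical level, which is the invariant the adapter relies on.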

#### C\.4\.3Shared canonical semantics

Although improved\-diffusion and EDM differ in interface and preconditioning, all methods operate through the same canonical corruption abstraction\. This allows meaningful cross\-backbone comparisons at matched canonical levels\.

### C\.5Shared adapter outputs and reconstruction rules

All harmonized methods are routed through a shared adapter layer\. At each canonical level, the adapter provides:

- •a denoised estimate $\hat{\mathbf{x}}_{0}$,
- •optional access to intermediate activations,
- •and, when needed, an $\hat{\boldsymbol{\varepsilon}}$ estimate recovered through $\hat{\boldsymbol{\varepsilon}}=(\mathbf{x}-a\,\hat{\mathbf{x}}_{0})/b$. (11)

This ensures that all output\-space baselines and internal\-feature methods use a compatible corruption semantics, regardless of the native output conventions of the original implementation\.

### C.6 Implementation details specific to CFS

The conceptual design of CFS is described in Section [4](https://arxiv.org/html/2605.11014#S4). Here we report only implementation-level details needed for exact reproduction.

#### C\.6\.1Admissible hook search and shortlist size

A candidate hook is admissible only if a dry forward pass returns a 4D tensor of shape $B\times C\times H\times W$. In practice, we restrict the search to stage-level encoder and decoder blocks and keep a small shortlist within each structural region before applying the ID-only proxy. For every experiment, we report:

- •the number of admissible encoder candidates,
- •the number of admissible decoder candidates,
- •the shortlist size retained in each region,
- •the final selected encoder and decoder module names\.

#### C\.6\.2ID\-only probe configuration

The proxy uses a small ID\-only probe set\. For exact reproducibility, we report:

- •the number of ID probe images,
- •the number of corruption repeats per image,
- •whether the same noise draw is reused across canonical levels.

#### C\.6\.3Exact pooled\-slot construction

For each retained slot, we apply channel-wise spatial mean and standard deviation pooling, yielding a descriptor of dimension $2C_{k,\ell}$. We report, for each backbone family and benchmark:

- •the selected canonical levels,
- •the selected encoder and decoder block names,
- •the resulting feature\-map shapes,
- •the pooled descriptor dimensions\.

#### C\.6\.4Diagonal score details

Each pooled slot is modeled independently with ID\-only diagonal statistics\. For exact reproduction, we report:

- •whether the slot features are standardized before fitting,
- •whether slot scores are averaged uniformly or reweighted\.

## Appendix DImplementation Taxonomy and Baseline Specifications

### D\.1Implementation taxonomy

We implement these methods to preserve the core probe and score logic of the original method while routing it through our shared adapter and canonicalization pipeline\.

##### Methods in this paper\.

- •MSMA: harmonized faithful port;
- •DiffPath: harmonized faithful port;
- •DDPM\-OOD: harmonized faithful port;
- •GEPC: harmonized faithful port;
- •CFS: proposed internal representation\-space method\.

##### Methods discussed but not retained in the strict main benchmark\.

We position SCOPED and EigenScore conceptually in related work, but do not include them in the strict main MBE table. In our attempts, we did not obtain a controlled, harmonized rerun of SCOPED suitable for the shared benchmark. EigenScore was also not retained because matched reruns within the shared adapter/canonicalization pipeline were substantially heavier, and some public checkpoint/artifact combinations did not yield reliable runs. We therefore restrict the strict main benchmark to methods for which we can provide controlled backbone-equated evaluation under matched corruption semantics and budget accounting.

##### Official repositories used as starting points\.

### D\.2Common harmonized interface

All baselines are evaluated through the same adapter interface, the same input normalization to $[-1,1]$, and the same canonical corruption semantics. For a selected canonical level $\lambda_{k}$ with coefficients $(a_{k},b_{k})$, we explicitly corrupt the clean image as

$$\mathbf{x}_{k}=a_{k}\,\mathbf{x}_{0}+b_{k}\,\boldsymbol{\varepsilon},\qquad\boldsymbol{\varepsilon}\sim\mathcal{N}(\mathbf{0},\mathbf{I})\,.$$

From the frozen backbone, we then recover native denoising quantities $\hat{\mathbf{x}}_{0,k}$ and $\hat{\boldsymbol{\varepsilon}}_{k}$. For improved-diffusion backbones, $\hat{\boldsymbol{\varepsilon}}_{k}$ is extracted natively from the model output using the correct timestep scaling and mean-parameterization conventions, and $\hat{\mathbf{x}}_{0,k}$ is recovered from the same forward pass. For EDM-style backbones, the adapter returns $\hat{\mathbf{x}}_{0,k}$ through one denoising call and reconstructs

$$\hat{\boldsymbol{\varepsilon}}_{k}=\frac{\mathbf{x}_{k}-a_{k}\,\hat{\mathbf{x}}_{0,k}}{b_{k}}\,.$$

Thus, one backbone evaluation at one level counts as one logical forward pass, even if both $\hat{\mathbf{x}}_{0}$ and $\hat{\boldsymbol{\varepsilon}}$ are recovered from that same call.
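The corruption and reconstruction rules reduce to two one-liners. A minimal sketch with flat lists standing in for image tensors (names are illustrative, not from the released adapter):

```python
import random

def corrupt(x0, a, b, rng):
    """Canonical corruption x_k = a_k * x0 + b_k * eps, eps ~ N(0, I).

    x0 is a flat list of floats standing in for an image tensor.
    """
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    return [a * v + b * e for v, e in zip(x0, eps)], eps

def recover_eps(x_k, x0_hat, a, b):
    """eps_hat_k = (x_k - a_k * x0_hat) / b_k, from the same forward pass."""
    return [(xi - a * x0i) / b for xi, x0i in zip(x_k, x0_hat)]
```

Note that `recover_eps` costs no extra backbone call: it is pure arithmetic on quantities already produced by the single denoising forward pass, which is why both $\hat{\mathbf{x}}_{0}$ and $\hat{\boldsymbol{\varepsilon}}$ count as one logical forward.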

### D\.3MSMA \(harmonized faithful port\)

MSMA is the closest baseline in our suite to a harmonized faithful port. Its core logic is preserved: build a multiscale descriptor from denoiser-derived quantities across $K_{c}$ levels, then fit an ID-only density head on the resulting feature vectors.

##### Feature construction\.

For each canonical level $\lambda_{k}$, we compute

$$f_{k}(\mathbf{x}_{0})=\|\hat{\boldsymbol{\varepsilon}}_{k}\|_{2}\,,$$

where $\hat{\boldsymbol{\varepsilon}}_{k}$ is obtained through the shared native adapter interface. The final multiscale descriptor is

$$F(\mathbf{x}_{0})=\big[f_{1}(\mathbf{x}_{0}),\dots,f_{K_{c}}(\mathbf{x}_{0})\big]\in\mathbb{R}^{K_{c}}\,.$$

##### Head and scoring\.

We fit an ID-only density model on $F(\mathbf{x})$, after optional feature standardization. Our implementation supports:

- •a diagonal Gaussian head,
- •a Gaussian mixture model \(GMM\),
- •a $k$-nearest-neighbor distance head.

The final score is always OOD\-high: negative log\-likelihood for Gaussian/GMM heads, or distance for the KNN head\.

##### What is preserved and what is adapted\.

The preserved part is the multiscale descriptor and the density score\. The adapted part is the use of a common adapter interface, canonical logSNR levels, and a shared improved/EDM implementation\.
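A minimal sketch of the two preserved pieces, the multiscale norm descriptor and a diagonal-Gaussian density head (illustrative names; flat lists stand in for per-level $\hat{\boldsymbol{\varepsilon}}$ maps):

```python
import math

def msma_descriptor(eps_hats):
    """F(x0): one L2 norm of eps_hat per canonical level."""
    return [math.sqrt(sum(v * v for v in e)) for e in eps_hats]

def diag_gaussian_nll(feat, means, variances):
    """OOD-high score: negative log-likelihood under an ID-fit
    diagonal Gaussian head (one mean/variance per level)."""
    return sum(
        0.5 * (math.log(2.0 * math.pi * v) + (f - m) ** 2 / v)
        for f, m, v in zip(feat, means, variances)
    )
```

An OOD input whose per-level residual norms drift away from the ID-fit means receives a larger NLL, matching the OOD-high convention used by all heads.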

### D\.4DiffPath \(harmonized faithful port\)

DiffPath is implemented as a faithful port for path-based diffusion OOD detection. Rather than reproducing a native implementation verbatim, we preserve the key idea: summarize a *multilevel denoising path* and fit an ID-only density model on the resulting path statistics.

##### Recursive path construction\.

Levels are ordered from clean to noisy\. We initialize

$$\mathbf{x}_{1}=a_{1}\,\mathbf{x}_{0}+b_{1}\,\boldsymbol{\varepsilon}\,,$$

and for each level $k$ compute $(\hat{\mathbf{x}}_{0,k},\hat{\boldsymbol{\varepsilon}}_{k})$ from the current state $\mathbf{x}_{k}$. A scalar path statistic $q_{k}$ is then extracted from $\hat{\boldsymbol{\varepsilon}}_{k}$ using one of the following reductions:

$$q_{k}\in\Big\{\mathrm{mean}(\hat{\boldsymbol{\varepsilon}}_{k}),\ \mathrm{mean}(|\hat{\boldsymbol{\varepsilon}}_{k}|),\ \mathrm{mean}(\hat{\boldsymbol{\varepsilon}}_{k}^{2})\Big\}\,.$$

We then propagate the path recursively:

$$\mathbf{x}_{k+1}=a_{k+1}\,\hat{\mathbf{x}}_{0,k}+b_{k+1}\,\hat{\boldsymbol{\varepsilon}}_{k}\,.$$

##### Feature variants\.

We use two harmonized feature families\. The 1D variant computes

$$\phi_{\mathrm{1d}}(x)=\sqrt{\frac{1}{K_{c}-1}\sum_{k=1}^{K_{c}-1}\left(\frac{q_{k+1}-q_{k}}{\Delta\lambda_{k}}\right)^{2}}\,.$$

The 6D variant computes low-order moments of both the path values $Q=(q_{1},\ldots,q_{K_{c}})$ and their level-wise differences $\Delta Q=(q_{2}-q_{1},\ldots,q_{K_{c}}-q_{K_{c}-1})$:

$$\phi_{\mathrm{6d}}(x)=\big[\operatorname{mean}(Q),\ \operatorname{mean}(Q^{2}),\ \|Q\|_{3},\ \operatorname{mean}(|\Delta Q|),\ \operatorname{mean}((\Delta Q)^{2}),\ \|\Delta Q\|_{3}\big]\,.$$

##### Head and scoring\.

We fit either a 1D KDE for the 1D feature or a diagonal Gaussian model for the 6D feature\. Scores are OOD\-high via negative log\-density\.
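The path statistic and the 1D feature can be sketched as follows (illustrative names; the released port may structure this differently):

```python
import math

def path_stat(eps_hat, reduction="abs_mean"):
    """Scalar q_k extracted from eps_hat_k by one of the three reductions."""
    n = len(eps_hat)
    if reduction == "mean":
        return sum(eps_hat) / n
    if reduction == "abs_mean":
        return sum(abs(v) for v in eps_hat) / n
    if reduction == "sq_mean":
        return sum(v * v for v in eps_hat) / n
    raise ValueError(reduction)

def phi_1d(q, lambdas):
    """RMS of finite-difference slopes of the path statistic along the
    canonical logSNR levels (the 1D DiffPath feature above)."""
    slopes = [
        ((q[k + 1] - q[k]) / (lambdas[k + 1] - lambdas[k])) ** 2
        for k in range(len(q) - 1)
    ]
    return math.sqrt(sum(slopes) / len(slopes))
```

The intuition is that ID inputs trace smoother paths: their level-to-level statistics change slowly in logSNR, so the RMS slope stays small relative to OOD inputs.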

### D\.5DDPM\-OOD \(harmonized faithful port\)

Our DDPM-OOD baseline is implemented as a harmonized *multi-start reconstruction*. It preserves the main scoring logic of DDPM-OOD: reconstruction error should be evaluated from several noisy starting points, normalized using ID statistics *per start*, and then aggregated into a single OOD score.

##### Canonical reverse schedule and starting points\.

We first build a canonical level set

$$\lambda_{1}>\lambda_{2}>\cdots>\lambda_{K_{c}}\,,$$

ordered from clean to noisy, and reverse it into a noisy-to-clean reconstruction schedule. A subset of starting points is then selected by subsampling this reverse schedule.

##### Multi\-start reconstruction\.

For a selected starting level $\lambda^{(s)}$, we generate

$$\mathbf{x}_{s}=a_{s}\,\mathbf{x}_{0}+b_{s}\,\boldsymbol{\varepsilon}\,,\qquad\boldsymbol{\varepsilon}\sim\mathcal{N}(\mathbf{0},\mathbf{I})\,,$$

and reconstruct deterministically along the corresponding reverse suffix. If the suffix levels are denoted

$$\lambda^{(s)}=\lambda_{j_{1}},\lambda_{j_{2}},\dots,\lambda_{j_{m}}\,,$$

then at each step we compute $(\hat{\mathbf{x}}_{0,j_{r}},\hat{\boldsymbol{\varepsilon}}_{j_{r}})$ from the current state and propagate toward the next cleaner canonical level:

$$\mathbf{x}_{j_{r+1}}=a_{j_{r+1}}\,\hat{\mathbf{x}}_{0,j_{r}}+b_{j_{r+1}}\,\hat{\boldsymbol{\varepsilon}}_{j_{r}}\,.$$

The final clean reconstruction is the terminal $\hat{\mathbf{x}}_{0}$ at the end of the suffix.

##### Per\-start reconstruction errors and ID normalization\.

For each start $s$, we compute the reconstruction error

$$m_{s}(\mathbf{x}_{0})=\frac{1}{d}\,\left\|\hat{\mathbf{x}}_{0,s}-\mathbf{x}_{0}\right\|_{2}^{2}\,.$$

This yields a vector of per-start errors

$$M(\mathbf{x}_{0})=\big[m_{1}(\mathbf{x}_{0}),\dots,m_{S}(\mathbf{x}_{0})\big]\,.$$

For each start $s$, we fit ID-only normalization statistics and form

$$z_{s}(\mathbf{x})=\frac{m_{s}(\mathbf{x})-c_{s}}{\sigma_{s}}\,.$$

##### Aggregation\.

The final scalar score is obtained by aggregating the per\-start normalized deviations:

$$S(\mathbf{x})=\operatorname{Agg}\big(z_{1}(\mathbf{x}),\dots,z_{S}(\mathbf{x})\big)\,,$$

where $\operatorname{Agg}$ denotes mean, median, or sum.
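The per-start normalization and aggregation step is a few lines once the errors $m_{s}$ and ID statistics $(c_{s},\sigma_{s})$ are in hand. A sketch with illustrative names:

```python
import statistics

def ddpm_ood_score(errors, id_stats, agg="mean"):
    """Aggregate per-start z-scored reconstruction errors into one
    OOD-high score.

    errors   -- [m_1, ..., m_S] for the test image
    id_stats -- [(c_s, sigma_s)] ID-fit location/scale per start
    """
    z = [(m - c) / s for m, (c, s) in zip(errors, id_stats)]
    if agg == "mean":
        return sum(z) / len(z)
    if agg == "median":
        return statistics.median(z)
    if agg == "sum":
        return sum(z)
    raise ValueError(agg)
```

Normalizing per start matters because reconstruction errors at noisier starting points are larger for every input; the z-scores put all starts on a comparable scale before aggregation.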

### D\.6GEPC \(harmonized faithful port\)

Our GEPC baseline is implemented as an MBE-adapted output-consistency faithful port. The central idea is preserved: if the denoiser behaves approximately equivariantly under a small discrete transformation group, then in-distribution inputs should exhibit stronger posterior consistency than OOD inputs.

##### Canonical corruption and transformed outputs\.

For each selected canonical level $\lambda_k$, we first form

$$\mathbf{x}_k=a_k\,\mathbf{x}_0+b_k\,\boldsymbol{\varepsilon}\,,\qquad\boldsymbol{\varepsilon}\sim\mathcal{N}(\mathbf{0},\mathbf{I})\,,$$

then evaluate the denoiser on both $\mathbf{x}_k$ and transformed copies $g\cdot\mathbf{x}_k$, where $g$ belongs to a small discrete group such as flips or $180^\circ$ rotation. From each output we recover $\hat{\mathbf{x}}_{0,k}$ and $\hat{\boldsymbol{\varepsilon}}_k$, and build the score proxy

$$\hat{\mathbf{s}}_k=-\frac{\hat{\boldsymbol{\varepsilon}}_k}{b_k}\,.$$

##### Consistency features\.

We compare the reference output to the transformed-and-brought-back outputs $g^{-1}\cdot\hat{\mathbf{s}}_k(g\cdot\mathbf{x}_k)$ and $g^{-1}\cdot\hat{\mathbf{x}}_{0,k}(g\cdot\mathbf{x}_k)$. This yields several OOD-high consistency features, including a normalized score discrepancy, a cosine consistency score, and a normalized $\hat{\mathbf{x}}_0$-consistency score.

##### Level\-wise calibration and aggregation\.

At each level, these raw consistency features are calibrated with an ID\-only head, typically KDE or z\-score normalization\. Feature scores are then aggregated within each level, and finally across canonical levels using mean, weighted mean, or trimmed mean aggregation\.
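A minimal sketch of the equivariance-consistency computation, using a 180-degree rotation as the group element. The `denoise` callable and the two features shown are simplified stand-ins for the full feature set; all names are illustrative.

```python
import numpy as np

def equivariance_consistency(x0, a_k, b_k, denoise, rng):
    """Two OOD-high posterior-consistency features under a 180-degree rotation.

    denoise maps a noisy image of shape (H, W) to (x0_hat, eps_hat);
    a_k, b_k are the canonical corruption coefficients at level k.
    """
    eps = rng.standard_normal(x0.shape)
    xk = a_k * x0 + b_k * eps                        # canonical corruption
    rot = lambda z: np.rot90(z, 2)                   # g: 180-degree rotation
    x0_ref, eps_ref = denoise(xk)                    # reference outputs
    x0_t, eps_t = denoise(rot(xk))                   # outputs on g . x_k
    x0_back, eps_back = rot(x0_t), rot(eps_t)        # g^{-1} . output(g . x_k)
    s_ref, s_back = -eps_ref / b_k, -eps_back / b_k  # score proxies s_hat_k
    score_disc = np.linalg.norm(s_ref - s_back) / (np.linalg.norm(s_ref) + 1e-8)
    cos = np.sum(x0_ref * x0_back) / (
        np.linalg.norm(x0_ref) * np.linalg.norm(x0_back) + 1e-8)
    return score_disc, 1.0 - cos                     # both are OOD-high
```

For a perfectly equivariant denoiser both features vanish, which is the mechanism the detector exploits: ID inputs sit closer to this equivariant regime than OOD inputs.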

## Appendix E Focused Ablations

We restrict appendix ablations to controls that directly test the paper’s central claim: the gain of CFS comes from *where* the frozen diffusion backbone is probed, rather than from hidden budget, brittle hook choices, or downstream head complexity.

### E.1 Pareto budget analysis and budget-matched comparisons

A standard concern in diffusion OOD comparisons is hidden compute\. We therefore report a budget\-aware comparison in which logical test\-time cost is made explicit\. The goal is simple: if output\-space baselines are given a matched or larger budget, does the representation\-space advantage persist?

More precisely, we vary the number of selected canonical levels while keeping the test-time Monte Carlo count $\mathrm{MC}_{\mathrm{test}}=1$, so that the dominant logical cost scales with the number of backbone evaluations. For CFS, $K_c\times K_s$ denotes the number of selected canonical levels and retained stage slots. Since all retained hooks at a given level are extracted in the same forward pass, logical cost depends on $K_c$, not on $K_s$.

Table 6: Budget-aware comparison under MBE. We vary the logical test-time budget while keeping $\mathrm{MC}_{\mathrm{test}}=1$. This disentangles representation quality from hidden multilevel accumulation. For CFS, $K_c\times K_s$ denotes the number of selected canonical levels and retained stage slots, respectively. Because all retained hooks at a given canonical level are extracted within the same backbone forward pass, the logical cost depends on $K_c$ only, not on $K_s$.

Table [6](https://arxiv.org/html/2605.11014#A5.T6) shows that the best one-forward operating points are already sparse CFS variants. In particular, $\textsc{CFS}_{\mathrm{dec}}(1\times 1_{\mathrm{low}})$ and $\textsc{CFS}(1\times 2_{\mathrm{low}})$ dominate budget-matched output-space baselines while using only one backbone evaluation per image. Richer CFS variants increase representation size without changing the backbone cost at fixed $K_c$, but the gains beyond the strongest one-forward operating points are modest. The key conclusion is therefore not merely that CFS is accurate, but that its gain does not come from hidden multilevel accumulation.

### E.2 Hook-pair robustness and proxy validation

A natural concern is that the encoder-decoder pair selected by $\textsc{CFS}(1\times 2)$ could be fragile or cherry-picked. To test this, we evaluate multiple admissible encoder and decoder candidates within the same structural regions and report (i) the pairwise performance landscape, (ii) the relation between the ID-side pair proxy and final pair performance, and (iii) a complementary region-wise proxy validation. All results are computed at the fixed low-noise canonical level used in the main paper.

![Refer to caption](https://arxiv.org/html/2605.11014v1/figs/hook_heatmap_improved.png)

![Refer to caption](https://arxiv.org/html/2605.11014v1/figs/hook_heatmap_edm.png)

Figure 5: Corresponding hook-pair heatmaps for $\textsc{CFS}(1\times 2)$. Each cell reports the Avg AUROC of one admissible pair at the low-noise canonical level used in the main paper. The marker denotes the final pair obtained from the ID-side proxy shortlist. For the improved backbone, the selected pair falls in the same high-performing basin as the oracle pair. For EDM, the dominant pattern is a broad plateau of strong pairs, indicating low sensitivity to the exact hook choice even when the proxy is less predictive.

Figure [5](https://arxiv.org/html/2605.11014#A5.F5) supports robustness more strongly than exact pair recovery. For the improved backbone, the landscape is structured rather than spiky: the proxy-selected pair reaches 0.886 Avg AUROC, versus 0.895 for the oracle admissible pair, so the proxy lands in the correct high-performing basin even without identifying the exact optimum. For EDM, the heatmap is flatter, and the main conclusion is different: several nearby pairs perform similarly well, so the method is not driven by a single brittle choice.

Table 7: Region-wise proxy-selected modules versus empirical oracle modules for $\textsc{CFS}(1\times 2)$. For each backbone and region, we compare the downstream AvgAUROC obtained with the proxy-selected module to that obtained with the best admissible module in hindsight, while keeping the opposite region fixed to its proxy-selected choice. Small gaps indicate that the proxy-selected module lies close to the empirical oracle without using any OOD labels.

Table [7](https://arxiv.org/html/2605.11014#A5.T7) shows that, on improved-diffusion, the proxy is more informative on the decoder side than on the encoder side, matching the stronger decoder-column structure in the heatmap. On EDM, the proxy is weaker as a regional ranker, but the oracle gaps remain small. Overall, the selection rule does not need to find the exact best pair; it only needs to place $\textsc{CFS}(1\times 2)$ in a stable, high-performing part of the network without exhaustive search.

### E.3 Canonical-level robustness

The main paper identifies low\-noise canonical probing as the strongest operating regime\. We now test whether this effect reflects a stable region or a narrowly tuned choice of canonical level\.

Since the final method uses single-level variants, we sweep the canonical level $\lambda$ and evaluate two representative detectors: $\textsc{CFS}_{\mathrm{dec}}(1\times 1)$ and $\textsc{CFS}(1\times 2)$. The key question is whether performance remains strong over a neighborhood of low-noise levels, rather than peaking at a single hand-picked value.

![Refer to caption](https://arxiv.org/html/2605.11014v1/figs/lambda_sweep_avgauroc_run.png)

![Refer to caption](https://arxiv.org/html/2605.11014v1/figs/lambda_sweep_avgworst_run.png)

![Refer to caption](https://arxiv.org/html/2605.11014v1/figs/lambda_sweep_avgauroc_run_edm.png)

![Refer to caption](https://arxiv.org/html/2605.11014v1/figs/lambda_sweep_avgworst_run_edm.png)

Figure 6: Canonical-level robustness for single-level CFS variants. Top: improved-diffusion backbone. Bottom: EDM backbone. Left: AvgAUROC. Right: AvgWorstAUROC. Across both backbones, performance is near chance for strongly negative $\lambda$, rises sharply through the intermediate regime, and then enters a broad high-performing plateau at positive $\lambda$. This supports the claim that the low-noise advantage is robust rather than tied to a single finely tuned canonical level.

Figure [6](https://arxiv.org/html/2605.11014#A5.F6) shows the same qualitative pattern on both backbones. For strongly negative $\lambda$, both detectors remain close to chance, indicating that highly corrupted canonical levels are not useful. Performance then rises rapidly as $\lambda$ moves toward lower-noise regimes, before flattening into a broad plateau for positive $\lambda$. Thus, the benefit of low-noise probing is not confined to one operating point.

There are, however, mild backbone-specific differences. On improved-diffusion, $\textsc{CFS}(1\times 2)$ is most helpful in the transition regime, whereas on the final high-$\lambda$ plateau the gap to decoder-only probing becomes small. EDM shows an even flatter plateau once $\lambda$ becomes positive. Overall, the sweeps suggest that the canonical level is the primary driver, whereas richer representation composition mainly improves the intermediate regime and worst-case robustness.

### E.4 CFS family analysis

The role of representation composition is already visible in the budget-matched rows of Table [6](https://arxiv.org/html/2605.11014#A5.T6), so we do not duplicate those results. At fixed logical cost, the comparison between $\textsc{CFS}(2\times 1_{\mathrm{enc}})$, $\textsc{CFS}(2\times 1_{\mathrm{dec}})$, and $\textsc{CFS}(2\times 2)$ shows how performance changes when using encoder-only, decoder-only, or combined encoder-decoder snapshots across two canonical levels.

Decoder\-only probing already captures most of the gain once the canonical level reaches the strong low\-noise regime\. Encoder\-decoder composition is most useful before the final plateau is reached and for worst\-case robustness\. The main claim is therefore not that composition is always necessary, but that it provides a meaningful robustness margin beyond an already strong decoder\-only baseline\.

### E.5 Pooling rule

We compare mean-only, std-only, and mean+std pooling for $\textsc{CFS}(1\times 2)$. This tests whether first-order channel averages suffice or whether within-channel dispersion carries additional useful signal once both encoder and decoder snapshots are retained. Results are presented in Table [8](https://arxiv.org/html/2605.11014#A5.T8).

Table 8: Pooling rule ablation for $\textsc{CFS}(1\times 2)$. We report both backbone families separately to test whether the gain of the primary detector relies only on first-order channel averages or also on within-channel dispersion captured by standard-deviation pooling.

Combining mean and standard-deviation pooling is clearly beneficial on improved-diffusion, where it gives the best average and worst-case performance. On EDM, mean+std gives the best average AUROC, while mean-only is marginally better in AvgWorstAUROC. Overall, dispersion features help the main operating point, but their gain is not perfectly uniform across criteria.

### E.6 Diagonal score versus alternative ID-only scores

The main paper uses a lightweight diagonal score so that the comparison stays focused on representation quality rather than downstream classifier capacity. We now test whether the strong performance of $\textsc{CFS}(1\times 2)$ persists under stronger but still ID-only scores.

Table 9: Head sensitivity under the same sparse representation for $\textsc{CFS}(1\times 2)$. We report both backbone families separately. The goal is to test whether the usefulness of the primary detector comes mainly from the selected representation rather than from a particular downstream head.

Table [9](https://arxiv.org/html/2605.11014#A5.T9) shows that the sparse $\textsc{CFS}(1\times 2)$ representation remains strong across several ID-only heads, which supports the view that the main signal comes from the selected features rather than from a specialized classifier. At the same time, the exact ranking is not head-invariant: KNN is strongest on both backbones, and shrinkage covariance also improves worst-case robustness over the diagonal head.

We therefore interpret the diagonal score as a conservative evaluation choice rather than as the empirically strongest one\. Its role in the main paper is to keep the detector lightweight and to avoid conflating feature quality with additional head capacity\.
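To make the "lightweight diagonal score" concrete, here is a hedged sketch of a diagonal Gaussian head fit on ID-only features; the function names and the variance floor `eps` are our own assumptions, not the paper's exact Eq. (7).

```python
import numpy as np

def fit_diagonal_head(id_features, eps=1e-6):
    """Fit per-dimension mean and variance on an ID-only bank of pooled features.

    id_features : array of shape (N, D); eps floors the variance for stability.
    """
    mu = id_features.mean(axis=0)
    var = id_features.var(axis=0) + eps
    return mu, var

def diagonal_score(f, mu, var):
    """OOD-high diagonal Mahalanobis-style score for one feature vector f."""
    return float(np.sum((f - mu) ** 2 / var))
```

The appeal of such a head is exactly what the text argues: it has no learned capacity of its own, so detection quality reflects the probed representation rather than the classifier.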

### E.7 ID-fit bank size sensitivity

Because $\textsc{CFS}(1\times 2)$ is calibrated from ID-only reference statistics, practical performance may depend on the size of the ID-fit bank. We therefore vary the number of ID-fit samples used to estimate the slot statistics and report the resulting performance.

Table 10: Sensitivity to the ID-fit bank size for $\textsc{CFS}(1\times 2)$. Each entry reports the mean over three random seeds. The goal is to assess whether the sparse two-slot representation reaches stable performance with moderate ID-only calibration data.

Table [10](https://arxiv.org/html/2605.11014#A5.T10) shows that $\textsc{CFS}(1\times 2)$ is only weakly sensitive to the ID-fit bank size on both backbones. Even a bank of 100 ID samples already performs very close to the larger-bank regime. There is therefore no evidence that the main operating point depends on an unusually large calibration set.

### E.8 Statistical stability across random seeds

Finally, we report repeated runs that vary the stochastic seed in the evaluation pipeline\. The goal is not to estimate every source of variance exhaustively, but to check whether the main conclusions remain stable under the stochastic components that most directly affect the reported scores\.

Tables[11](https://arxiv.org/html/2605.11014#A5.T11)and[12](https://arxiv.org/html/2605.11014#A5.T12)report seed stability for bothCFS​\(1×2\)\\textsc\{CFS\}\(1\{\\times\}2\)andCFSdec​\(1×1\)\\textsc\{CFS\}\_\{\\mathrm\{dec\}\}\(1\{\\times\}1\)\.

Table 11: Statistical stability across random seeds on improved-diffusion. We report mean $\pm$ standard deviation over repeated runs. The very small deviations indicate that the comparison between the two sparse CFS variants is not driven by stochastic fluctuations.

Table 12: Statistical stability across random seeds on EDM. We report mean $\pm$ standard deviation over repeated runs. Again, the deviations are extremely small, showing that the reported trade-off between the two sparse CFS variants is stable across seeds.

Across both backbones, the standard deviations are tiny, so the observed rankings are highly stable across repeated runs. For improved-diffusion, $\textsc{CFS}(1\times 2)$ consistently remains slightly stronger than $\textsc{CFS}_{\mathrm{dec}}(1\times 1)$ in both average and worst-case AUROC while also improving FPR95. For EDM, the trade-off is equally stable but slightly different: $\textsc{CFS}_{\mathrm{dec}}(1\times 1)$ retains a small advantage in AvgAUROC and FPR95, whereas $\textsc{CFS}(1\times 2)$ retains the better AvgWorstAUROC.

### E.9 Architecture-transfer sanity check

Although our main experiments use U\-Net\-style diffusion backbones, CFS only requires access to block\-level internal activations at canonical corruption levels\. As a sanity check, we apply the same sparse snapshot construction to a transformer\-based diffusion backbone, U\-ViT\. Unlike convolutional U\-Nets, U\-ViT keeps a token sequence of nearly constant size across its transformer blocks; its U\-shaped structure comes from input blocks, a middle block, output blocks, and long skip connections rather than from explicit spatial downsampling and upsampling\. We therefore replace the U\-Net encoder/decoder hook taxonomy by early, middle, and late transformer\-block snapshots, and pool token features using channel\-wise mean and standard deviation over image tokens\.

The purpose is to test whether the sparse-snapshot principle transfers beyond convolutional U-Nets. All runs use the same low-noise canonical level, with target $\lambda=5.0$, effective $\lambda=5.0329$, native U-ViT timestep $t=21$, and $b^2=6.48\times 10^{-3}$. The hooked U-ViT activations have shape $B\times 257\times 512$, corresponding to 256 image tokens plus one time token; CFS pools the image tokens only. We also include an output-space baseline using the same U-ViT checkpoint and the same canonical level: it scores images by the final reconstruction MSE $\|\hat{\mathbf{x}}_0-\mathbf{x}_0\|^2$ with an ID-only one-dimensional density fit.
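The token pooling described above can be sketched as follows. Note one explicit assumption: we place the time token in the last slot and drop it with a leading slice; if the actual U-ViT layout prepends it, the slice changes accordingly.

```python
import numpy as np

def pool_uvit_tokens(acts, n_image_tokens=256):
    """Pool hooked U-ViT activations of shape (B, 257, 512) into (B, 1024).

    Keeps the 256 image tokens (the extra time token, assumed last here,
    is dropped) and concatenates channel-wise mean and std over the token axis.
    """
    img = acts[:, :n_image_tokens, :]                     # drop the time token
    return np.concatenate([img.mean(axis=1), img.std(axis=1)], axis=1)
```

The resulting 1024-dimensional vector per image plays the same role as the pooled U-Net snapshot features in the main CFS pipeline.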

Table 13: Architecture-transfer sanity check on U-ViT. CFS probes early, middle, and late U-ViT block outputs instead of U-Net encoder/decoder hooks. All variants use one canonical low-noise level, one backbone forward, token mean/std pooling, and an ID-only diagonal score. The output baseline uses the same checkpoint, level, and one-forward budget, but scores only the final reconstruction error. Averages are computed over 12 ordered pairs: for each ID dataset in {CIFAR-10, SVHN, CelebA32}, we evaluate against the other two source datasets plus CIFAR-100 and DTD as OOD.

Tables [13](https://arxiv.org/html/2605.11014#A5.T13) and [14](https://arxiv.org/html/2605.11014#A5.T14) support the architecture-transfer interpretation: sparse internal snapshots remain informative even when spatial U-Net feature maps are replaced by token-level transformer states. Importantly, the gain does not come merely from using a low-noise U-ViT forward: the output reconstruction baseline reaches only 0.778 AvgAUROC and 0.653 AvgFPR95, whereas CFS-late improves to 0.891 AvgAUROC and 0.213 AvgFPR95 at the same one-forward cost. The paired early+late variant slightly improves AvgAUROC, while the late snapshot gives the best AvgFPR95, suggesting that late transformer blocks already concentrate most of the useful signal in this setting. We emphasize that this result should not be read as a new U-ViT OOD benchmark: performance is not uniform across all source datasets, and the CIFAR-10-as-ID case remains the main failure mode.

Table 14:Per\-source breakdown of the U\-ViT architecture\-transfer sanity check\.Each cell reports the average over the four OOD datasets associated with the corresponding ID source\. The output baseline uses final reconstruction MSE, while CFS variants use internal token snapshots\. The main failure mode is CIFAR\-10 as ID, especially against CelebA32 and CIFAR\-100\.

## Appendix F Extended CIFAR-Scale Results

This section provides the full pairwise breakdown behind the compact summaries shown in the main paper\. Its role is descriptive rather than argumentative: it exposes where the main representation\-space advantage comes from, pair by pair, under the same controlled protocol\.

We begin with the full pairwise breakdown for the primary shared-source benchmark, i.e. the CIFAR10-source checkpoint family used in the main paper. Tables [15](https://arxiv.org/html/2605.11014#A6.T15) and [16](https://arxiv.org/html/2605.11014#A6.T16) report AUROC and FPR95 for all ID$\to$OOD pairs. To keep the appendix readable, we expose pairwise operating-point behavior mainly for the two main sparse variants, $\textsc{CFS}_{\mathrm{dec}}(1\times 1)$ and $\textsc{CFS}(1\times 2)$.

Table 15: Full per-pair AUROC on the CIFAR-scale benchmark under the primary CIFAR10-source policy. Each entry corresponds to one ID$\to$OOD pair. The same fixed shared-source protocol is used for all methods. All metrics are reported with three decimal places.

Table 16: Full per-pair FPR95 on the CIFAR-scale benchmark under the primary CIFAR10-source policy. Each entry corresponds to one ID$\to$OOD pair. Lower is better. All metrics are reported with three decimal places.

The full tables make two patterns clear. First, the main-paper summaries are not driven by one isolated pair: the sparse CFS variants are strong across a broad set of ID$\to$OOD configurations, especially on compact IDs and texture-driven shifts. Second, the decoder-only and paired encoder-decoder variants remain very close on many pairs, which reinforces the compression result already visible in the main tables.

## Appendix G Full Source-Family Robustness Results

This section tests whether the main representation\-space pattern is tied to one checkpoint family or survives a controlled change in the frozen source representation\.

We report the full pairwise breakdown for the alternative CelebA32\-source checkpoint family\. Tables[17](https://arxiv.org/html/2605.11014#A7.T17)and[18](https://arxiv.org/html/2605.11014#A7.T18)provide the detailed AUROC and FPR95 results\. The goal is to test whether the source\-family change modifies only absolute performance or also the qualitative ranking between representation\-space and output\-space methods\.

Table 17: Full per-pair AUROC on the CIFAR-scale benchmark under the alternative CelebA32-source policy. This is the detailed source-family robustness breakdown complementary to the main-paper summary. All metrics are reported with three decimal places.

Table 18: Full per-pair FPR95 on the CIFAR-scale benchmark under the alternative CelebA32-source policy. This is the detailed source-family robustness breakdown complementary to the main-paper summary. Lower is better. All metrics are reported with three decimal places.

A source-family change modifies the representation bias of the frozen checkpoint. The relevant question is therefore not whether absolute numbers remain identical, but whether the ranking and the representation-versus-output-space contrast remain qualitatively stable.

The full tables support the same broad conclusion as the compact main-paper summary: changing the source family shifts absolute numbers but does not erase the main pattern. Sparse representation-space probing remains highly competitive, and the contrast between CFS and broader output-space baselines survives the change in frozen source representation.

## Appendix H External Positioning Against Prior Reported Diffusion Results

This section provides external context by comparing CFS to previously reported diffusion-based CIFAR-scale results.

Table 19: External positioning against prior reported diffusion-based results on the CIFAR-scale benchmark. Rows above the final block are taken from prior work. We separate: (i) prior diffusion-based methods reported in ID-specific settings, and (ii) single-checkpoint diffusion methods using one frozen checkpoint across multiple ID/OOD pairs.

Table [19](https://arxiv.org/html/2605.11014#A8.T19) shows that the sparse representation-space advantage of CFS is not confined to the controlled MBE setting. Even outside backbone-equated evaluation, CFS remains competitive with or stronger than previously reported single-checkpoint diffusion results, while using only one backbone evaluation per image.

### H.1 Positioning against DLSR on its native published settings

DLSR is the closest prior in spirit, but it studies a different object: learned feature reconstruction rather than sparse probing of native frozen states\. It also falls outside our MBE protocol\. Since DLSR is only available on the ID/OOD pairs supported by its published setup, we report in Table[20](https://arxiv.org/html/2605.11014#A8.T20)a positioning comparison on the DLSR\-native overlapping subset, without treating this table as part of the main MBE claim\.

Table 20: Positioning against DLSR on its native published evaluation pairs. DLSR lies outside the MBE protocol and uses an additional learned feature-reconstruction module, so we do not report a backbone-forward-equivalent logical cost for it. For our methods, Cost denotes the backbone-forward logical cost. Averages are computed over the non-missing overlapping entries of each method.

DLSR is therefore evaluated only on pairs supported by the original protocol/repo. The full cross-ID benchmark would require extending the original DLSR training and evaluation pipeline beyond the published setup.

## Appendix I Checkpoint-Controlled Large-Scale Results

This appendix reports a checkpoint\-controlled large\-scale comparison on ImageNet200 and ImageNet1K using a single official ImageNet\-64 improved\-diffusion checkpoint\. These experiments are not meant to replace the controlled CIFAR\-scale evidence in the main paper\. Their role is narrower: to test whether the same representation\-first signal remains informative in a substantially harder large\-scale regime under a fixed backbone\.

In all experiments below, all methods use the same official ImageNet-64 improved-diffusion checkpoint (imagenet64_uncond_100M_1500K.pt). We evaluate on two near-OOD regimes, NINCO and SSB-hard, and one far-OOD regime, Textures. Unless stated otherwise, both CFS variants use a single low-noise canonical level and therefore retain a logical test-time cost of $1F+0J$, whereas MSMA and DiffPath use $10F+0J$.

Table[21](https://arxiv.org/html/2605.11014#A9.T21)reports the resulting large\-scale comparison\. In addition to the primary main\-paper operating pointCFS​\(1×2\)\\textsc\{CFS\}\(1\{\\times\}2\), we also include the ultra\-sparse decoder\-only companionCFSdec​\(1×1\)\\textsc\{CFS\}\_\{\\mathrm\{dec\}\}\(1\{\\times\}1\)\.

Table 21: Checkpoint-controlled large-scale comparison on the official ImageNet-64 improved-diffusion backbone. All methods use the same official improved-diffusion checkpoint (imagenet64_uncond_100M_1500K.pt). We report AUROC on NINCO, SSB-hard, and Textures, together with averaged AUROC, AUPR, and FPR95 over these three OOD sets. The logical test-time cost of the displayed CFS variants is $1F$; for MSMA and DiffPath it is $10F$.

These numbers constitute a checkpoint-controlled stress test of whether the representation-first signal remains informative outside the controlled CIFAR-scale regime.

This checkpoint-controlled large-scale view yields a clear pattern. First, near-OOD ImageNet-scale detection remains difficult for all methods, especially on SSB-hard. Second, Textures is substantially more discriminative and reveals a clear advantage for sparse CFS probes over MSMA and DiffPath. Third, under this official improved-backbone setting, the sparse decoder-only variant $\textsc{CFS}_{\mathrm{dec}}(1\times 1)$ is consistently slightly stronger than $\textsc{CFS}(1\times 2)$ on both ImageNet200 and ImageNet1K, while preserving the same $1F$ logical cost.

Interestingly, in the ImageNet-scale checkpoint-controlled setting, $\textsc{CFS}_{\mathrm{dec}}(1\times 1)$ is slightly stronger than $\textsc{CFS}(1\times 2)$. This suggests that, in this harder large-scale regime, the conditional encoder residual of Theorem [1](https://arxiv.org/html/2605.11014#Thmtheorem1) is either small or not reliably exploitable by the lightweight diagonal score, so the late decoder snapshot remains the most robust sparse probe.

The NINCO and SSB\-hard results indicate that large\-scale semantic OOD remains difficult for all diffusion\-based methods under this fixed single\-backbone evaluation setting\.

## Appendix J Implementation and Configuration Details

This section reports the implementation details needed to reproduce the main CFS runs and the corresponding ablations. All methods are evaluated through the shared MBE adapter described in Appendix [C](https://arxiv.org/html/2605.11014#A3). Unless stated otherwise, images are normalized to $[-1,1]$, the test-time Monte Carlo count is one, and all reported main-paper methods use OOD-high scores.

### J.1 CFS probing configuration

Table[22](https://arxiv.org/html/2605.11014#A10.T22)reports the sparse probing configuration used for the primary one\-levelCFSruns\. The same canonical probing setup is used forCFS​\(1×2\)\\textsc\{CFS\}\(1\{\\times\}2\), while decoder\-only variants restrict the retained region to the decoder side\.

Table 22: Sparse CFS probing configuration. This configuration specifies the canonical level, hook policy, pooling rule, Monte Carlo settings, and ID-only hook-selection proxy used for the main sparse probing runs.

For the main results, each selected slot is scored with the lightweight diagonal statistic in Eq. ([7](https://arxiv.org/html/2605.11014#S4.E7)). Table [23](https://arxiv.org/html/2605.11014#A10.T23) reports the corresponding ID-only head configuration. For the head-sensitivity ablation in Appendix [E.6](https://arxiv.org/html/2605.11014#A5.SS6), we additionally evaluate stronger ID-only heads on the same sparse representation.

Table 23: Default ID-only head configuration for the main CFS results. The main paper uses the diagonal score to keep the detector lightweight and to avoid conflating representation quality with downstream head capacity.

##### Pooling.

The main-paper pooling rule is the one defined in Eq. ([6](https://arxiv.org/html/2605.11014#S4.E6)): channel-wise spatial mean concatenated with channel-wise spatial standard deviation. Mean-only and std-only variants are treated as pooling ablations in Appendix [E.5](https://arxiv.org/html/2605.11014#A5.SS5), not as the default CFS configuration.
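A minimal sketch of this pooling rule on a convolutional feature map; the `(B, C, H, W)` layout and function name are our own conventions, not taken from the official code.

```python
import numpy as np

def pool_mean_std(feat):
    """Channel-wise spatial mean concatenated with spatial std.

    feat : activation map of shape (B, C, H, W); returns shape (B, 2C).
    """
    b, c = feat.shape[:2]
    flat = feat.reshape(b, c, -1)                 # flatten the spatial dims
    return np.concatenate([flat.mean(-1), flat.std(-1)], axis=1)
```

Mean-only or std-only ablations simply drop one of the two concatenated halves.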

### J.2 Compute profiling protocol

We profile wall-clock inference cost using the same benchmark runner as the reported experiments in Table [24](https://arxiv.org/html/2605.11014#A10.T24). For each method, profiling is performed after ID-only fitting and measures the full scoring path: canonical corruption, backbone evaluations, hook or output extraction, and score computation. We use 5 warm-up batches and 50 measured batches, with CUDA synchronization before and after the measured region. Peak memory is measured with torch.cuda.max_memory_allocated. All timings are reported at the same batch size used in the benchmark.

We distinguish logical cost from measured runtime. Logical cost counts backbone evaluations per image, $\#F$; measured runtime additionally includes feature pooling, density-head evaluation, data movement, and method-specific aggregation overhead.

Table 24: Representative compute profile on the CIFAR-scale benchmark. Timings are measured after ID-only fitting using the full scoring path with 5 warm-up batches and 50 measured batches. GPU-hours estimate the fit plus evaluation cost for one ID-vs-OOD benchmark run under the reported split sizes.

All profiles in Table [24](https://arxiv.org/html/2605.11014#A10.T24) were obtained on a single NVIDIA GeForce RTX 4060 Laptop GPU with batch size 128, and PyTorch 2.10.0. For DDPM-OOD with EDM, we report the logical cost but omit wall-clock profiling because the $364F$ scoring path is prohibitively expensive; the improved-diffusion profile already illustrates this order-of-magnitude cost.
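The warm-up/measured timing split can be sketched framework-free as below; this is an illustrative harness, not the benchmark runner itself, and on GPU one would wrap the timed region with torch.cuda.synchronize() as described above.

```python
import time

def profile_scoring(score_fn, batches, warmup=5, measured=50):
    """Average seconds per batch over the full scoring path.

    score_fn : the complete scoring callable (corruption, backbone forward,
               hook extraction, score computation) applied to one batch.
    batches  : iterable of input batches; the first `warmup` are untimed.
    """
    for x in batches[:warmup]:                    # warm-up batches, untimed
        score_fn(x)
    t0 = time.perf_counter()
    for x in batches[warmup:warmup + measured]:   # measured region
        score_fn(x)
    return (time.perf_counter() - t0) / measured  # mean seconds per batch
```

This measures the full scoring path rather than the backbone alone, which is exactly the distinction the protocol draws between logical cost and measured runtime.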

## Appendix K Broader Impact and Responsible Use

This work studies post\-hoc OOD detection for frozen diffusion backbones\. A potential positive impact is improved reliability and auditability of vision systems: sparse internal probes may help identify inputs that fall outside an evaluation reference distribution, while keeping test\-time cost low and making protocol confounders explicit\.

The main negative risk is over\-reliance\. An OOD score is not a certificate of safety, correctness, or fairness, and false negatives may create unwarranted confidence in high\-stakes settings\. This is especially important for sensitive visual domains such as medical imaging, biometric analysis, surveillance, or safety\-critical autonomy, where dataset shift, demographic imbalance, or acquisition bias may produce failures not captured by the evaluation protocol\.

Our method is intended as a diagnostic and benchmarking tool rather than a standalone deployment safeguard\. In practical use, it should be combined with domain\-specific validation, calibrated thresholds, uncertainty auditing, and human oversight where decisions may affect people\.
