RRISE: Robust Radius Inference via a Surrogate Estimator

Summary

RRISE introduces a learned surrogate estimator that reduces the Monte Carlo sampling cost of randomized smoothing for certified robustness to a single forward pass, maintaining accuracy within 0.84 percentage points while replacing up to 10^4 evaluations per query.

arXiv:2606.02876v1 Announce Type: new Abstract: Randomized smoothing (RS) uses a smoothed classifier to provide architecture-agnostic certificates of $\ell_2$ classification robustness, but its dependence on per-input Monte Carlo (MC) sampling undermines its use in real-time systems. We argue that this cost is structural rather than fundamental, such that it can be significantly reduced by sharing information across the deployment stream. We introduce RRISE, an RS framework that compresses certification into a single forward pass through a learned surrogate. RRISE trains the surrogate against precomputed MC class-count targets via a soft-label cross-entropy loss and converts surrogate predictions into provably conservative certified radii through a one-time conformal calibration step. The resulting certificate is deployment-verifiable: whenever the calibrated radius is positive, the surrogate's prediction provably matches the smoothed classifier's and the smoothed classifier is constant on a ball of that radius around the input. Across image classification benchmarks, RRISE matches fixed-budget MC certified accuracy within $0.84$ percentage points while replacing up to $10^4$ noisy base-model evaluations per query with a single surrogate forward pass, recouping MC training cost after $\approx 10^5$ deployment queries. On CIFAR-100 and Tiny ImageNet, where the only prior offline-surrogate method collapses, RRISE achieves $1.23$ to $1.91\times$ higher certified accuracy, establishing efficient randomized smoothing as a practical path to certified robustness in repeated-deployment settings.

# Robust Radius Inference via a Surrogate Estimator
Source: [https://arxiv.org/html/2606.02876](https://arxiv.org/html/2606.02876)
Jong-Ik Park (Carnegie Mellon University, jongikp@andrew.cmu.edu), Shreyas Chaudhari (Carnegie Mellon University, shreyasc@andrew.cmu.edu), Carlee Joe-Wong (Carnegie Mellon University, cjoewong@andrew.cmu.edu), José M. F. Moura (Carnegie Mellon University, moura@andrew.cmu.edu)

###### Abstract

Randomized smoothing (RS) uses a smoothed classifier to provide architecture-agnostic certificates of $\ell_2$ classification robustness, but its dependence on per-input Monte Carlo (MC) sampling undermines its use in real-time systems. We argue that this cost is structural rather than fundamental, such that it can be significantly reduced by sharing information across the deployment stream. We introduce RRISE, an RS framework that compresses certification into a single forward pass through a learned surrogate. RRISE trains the surrogate against precomputed MC class-count targets via a soft-label cross-entropy loss and converts surrogate predictions into provably conservative certified radii through a one-time conformal calibration step. The resulting certificate is deployment-verifiable: whenever the calibrated radius is positive, the surrogate’s prediction provably matches the smoothed classifier’s and the smoothed classifier is constant on a ball of that radius around the input. Across image classification benchmarks, RRISE matches fixed-budget MC certified accuracy within $0.84$ percentage points while replacing up to $10^{4}$ noisy base-model evaluations per query with a single surrogate forward pass, recouping MC training cost after $\approx 10^{5}$ deployment queries. On CIFAR-100 and Tiny ImageNet, where the only prior offline-surrogate method collapses, RRISE achieves $1.23$ to $1.91\times$ higher certified accuracy, establishing efficient randomized smoothing as a practical path to certified robustness in repeated-deployment settings.

## 1 Introduction

Modern AI classification systems increasingly operate in high-stakes, real-time settings, where performance depends not only on pointwise accuracy but also on stability under input perturbations (Fawzi et al., [2018](https://arxiv.org/html/2606.02876#bib.bib42); Liu et al., [2025](https://arxiv.org/html/2606.02876#bib.bib49)). Physically realizable perturbations, such as changes in viewpoint, lighting, or sensor noise, can trigger safety-critical failures in autonomous driving (Eykholt et al., [2018](https://arxiv.org/html/2606.02876#bib.bib29); Chi et al., [2024](https://arxiv.org/html/2606.02876#bib.bib41)), while subtle variations in medical images may compromise clinical decision-making (Finlayson et al., [2019](https://arxiv.org/html/2606.02876#bib.bib46); Ma et al., [2021](https://arxiv.org/html/2606.02876#bib.bib30)). Similar concerns arise in real-time robotics (Cao et al., [2023](https://arxiv.org/html/2606.02876#bib.bib39)) and speech recognition (Xie et al., [2020](https://arxiv.org/html/2606.02876#bib.bib40)), where reliable decisions must be produced under strict latency constraints despite naturally occurring or adversarial input perturbations. These settings motivate a *geometric* view of robustness, in which predictions should remain invariant within a neighborhood of the input, and the size of this neighborhood defines an operational safety margin (Hein and Andriushchenko, [2017](https://arxiv.org/html/2606.02876#bib.bib15); Wang et al., [2018](https://arxiv.org/html/2606.02876#bib.bib38)).
By contrast, widely used pointwise reliability measures, such as confidence scores, predictive uncertainty, and calibration metrics (Guo et al., [2017](https://arxiv.org/html/2606.02876#bib.bib1); Lakshminarayanan et al., [2017](https://arxiv.org/html/2606.02876#bib.bib3); Gal and Ghahramani, [2016](https://arxiv.org/html/2606.02876#bib.bib2); Geifman and El-Yaniv, [2017](https://arxiv.org/html/2606.02876#bib.bib47)), do not directly certify neighborhood invariance.

Randomized smoothing (RS) (Lecuyer et al., [2019](https://arxiv.org/html/2606.02876#bib.bib17); Cohen et al., [2019](https://arxiv.org/html/2606.02876#bib.bib16); Li et al., [2019](https://arxiv.org/html/2606.02876#bib.bib32)) has emerged as a leading approach for certifying classifier robustness. RS provides instance-specific guarantees of prediction invariance under bounded perturbations. Unlike bound-propagation and convex-relaxation methods (Weng et al., [2018](https://arxiv.org/html/2606.02876#bib.bib36); Singh et al., [2019](https://arxiv.org/html/2606.02876#bib.bib35)) that rely on architectural assumptions and remain difficult to scale to large networks, RS is architecture-agnostic, requiring only black-box query access to the classifier and applying broadly through Monte Carlo (MC) sampling.

Despite these advantages, standard RS entails substantial computational costs, which hinder its deployment in real-time, safety-critical, risk-aware decision-making systems (Kumari et al., [2023](https://arxiv.org/html/2606.02876#bib.bib51)). Certification requires estimating “smoothed” class probabilities via MC sampling for each input (Cohen et al., [2019](https://arxiv.org/html/2606.02876#bib.bib16)), and achieving high-confidence guarantees may require on the order of $10^{5}$ forward passes per input example (Salman et al., [2019](https://arxiv.org/html/2606.02876#bib.bib25)). In latency-sensitive settings, this overhead is prohibitive. On modern GPU hardware, a single forward pass for a large RGB color image can take several milliseconds (Xu et al., [2024](https://arxiv.org/html/2606.02876#bib.bib31)), resulting in per-input certification times on the order of hundreds of seconds (Cohen et al., [2019](https://arxiv.org/html/2606.02876#bib.bib16); Bhardwaj et al., [2024](https://arxiv.org/html/2606.02876#bib.bib28)), far exceeding the requirements of typical latency-sensitive applications like autonomous driving or speech recognition. This gap between certified robustness guarantees and practical deployment thus motivates the development of substantially more efficient randomized smoothing certification methods.

##### Contributions\.

We introduce RRISE (Robust Radius Inference via a Surrogate Estimator), a computationally efficient framework for randomized-smoothing certification that replaces per-input MC sampling with a single surrogate forward pass. Our contributions are twofold.

(i) A principled surrogate-training method for computationally efficient smoothing. We fine-tune the base classifier to predict the smoothed class distribution under Gaussian noise, supervised by soft-label cross-entropy against finite-budget MC class-count targets. Since cross-entropy is linear in its target, its gradient is an unbiased estimate of the gradient at the realized MC target. Divergence-based alternatives used in prior offline-surrogate work (Bhardwaj et al., [2024](https://arxiv.org/html/2606.02876#bib.bib28)) are nonlinear in their first argument and incur a curvature-induced gradient bias (Appendix [D](https://arxiv.org/html/2606.02876#A4)). Fine-tuning, rather than training from scratch, lets the surrogate inherit the noise-invariant representations the base classifier has already learned through Gaussian-noise augmentation.

(ii) A conformal calibration layer that yields deployment-verifiable certificates. On a held-out calibration set, we compute a single scalar offset $\delta$ that, at inference time, converts the surrogate’s top-class probability into a high-probability lower bound on the smoothed top-class probability, and thus into a certified radius computed entirely from one surrogate forward pass. When this radius is positive, the surrogate’s prediction provably matches the smoothed classifier’s. The standard assumption underlying amortized certification (that the surrogate’s argmax agrees with the smoothed classifier’s) becomes a condition the practitioner can check at inference time, with one calibration covering the entire deployment.

The rest of this paper is organized as follows. After giving an overview of the problem background and related work (Section [2](https://arxiv.org/html/2606.02876#S2)), we present the RRISE methodology in Section [3](https://arxiv.org/html/2606.02876#S3) and evaluate it in Section [4](https://arxiv.org/html/2606.02876#S4). We discuss potential limitations of RRISE in Section [5](https://arxiv.org/html/2606.02876#S5) before concluding in Section [6](https://arxiv.org/html/2606.02876#S6).

## 2 Background and Related Work

### 2.1 Preliminaries

Randomized smoothing (RS) (Cohen et al., [2019](https://arxiv.org/html/2606.02876#bib.bib16)) constructs classifiers with provable robustness against $\ell_{2}$-bounded adversarial perturbations. Unlike empirical defenses, which remain vulnerable to adaptive attacks (Carlini and Wagner, [2017](https://arxiv.org/html/2606.02876#bib.bib52); Akhtar and Mian, [2018](https://arxiv.org/html/2606.02876#bib.bib54); Tramer et al., [2020](https://arxiv.org/html/2606.02876#bib.bib57)), RS yields certified guarantees that hold for *any* perturbation, no matter its source, within a prescribed radius. The core idea is to convolve a base classifier with isotropic Gaussian noise, producing a smoothed classifier whose decision is provably stable in a neighborhood of each input.

Let $f:\mathbb{R}^{d}\to\{1,\dots,K\}$ be a base classifier trained with a standard supervised objective. For a smoothing parameter $\sigma>0$ and input $\mathbf{x}$, RS defines the smoothed class probabilities for each class $k$:

$$p(k\mid\mathbf{x},\sigma)\;\triangleq\;\mathbb{P}_{\boldsymbol{\varepsilon}\sim\mathcal{N}(\mathbf{0},\sigma^{2}\mathbf{I})}\!\big(f(\mathbf{x}+\boldsymbol{\varepsilon})=k\big), \tag{1}$$

and the induced *smoothed classifier* $g(\mathbf{x};\sigma)\triangleq\arg\max_{k}p(k\mid\mathbf{x},\sigma)$ returns the most likely class under noise. Letting $p_{A}=\max_{k}p(k\mid\mathbf{x},\sigma)$ denote the smoothed top-class probability, Cohen et al. ([2019](https://arxiv.org/html/2606.02876#bib.bib16)) prove that whenever $p_{A}>1/2$, the smoothed classifier $g$ is robust within $\ell_{2}$-radius

$$R(\mathbf{x};\sigma)\;\triangleq\;\sigma\,\Phi^{-1}(p_{A}), \tag{2}$$

in the sense that $g(\mathbf{x}+\boldsymbol{\delta};\sigma)=g(\mathbf{x};\sigma)$ for all $\|\boldsymbol{\delta}\|_{2}\leq R(\mathbf{x};\sigma)$, where $\Phi^{-1}$ is the inverse standard Gaussian CDF. Since $p_{A}$ cannot be computed in closed form, the standard approach is to estimate it via Monte Carlo (MC) sampling. Drawing $n$ noise vectors $\boldsymbol{\varepsilon}_{j}\sim\mathcal{N}(\mathbf{0},\sigma^{2}\mathbf{I})$, the perturbed inputs $\mathbf{x}+\boldsymbol{\varepsilon}_{j}$ are each classified by $f$, and the most-frequently-predicted class $\widehat{c}_{A}$ is taken as the prediction of the smoothed classifier $g(\mathbf{x};\sigma)$. The fraction of samples voting for that class, $\widehat{p}_{A}=\tfrac{1}{n}\sum_{j=1}^{n}\mathbf{1}\{f(\mathbf{x}+\boldsymbol{\varepsilon}_{j})=\widehat{c}_{A}\}$, is an empirical estimate of $p_{A}$. Since $\widehat{p}_{A}$ is itself noisy, a one-sided Clopper–Pearson lower confidence bound $\underline{p}_{A}\leq\widehat{p}_{A}$ is used in place of $p_{A}$, yielding the high-probability radius $\widehat{R}(\mathbf{x};\sigma)\triangleq\sigma\,\Phi^{-1}(\underline{p}_{A})$.
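The MC certification procedure described above can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the reference implementation: the names `binom_tail_ge`, `clopper_pearson_lower`, `mc_certify`, and `toy_f` are hypothetical, the Clopper–Pearson bound is found by bisection on the binomial tail rather than through a Beta quantile, and the same $n$ votes are used both to pick the top class and to bound its probability, as in the description above.

```python
import math
import random
from collections import Counter
from statistics import NormalDist


def binom_tail_ge(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), summing terms in log space."""
    if k <= 0:
        return 1.0
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        log_coef = math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        total += math.exp(log_coef + i * log_p + (n - i) * log_q)
    return min(total, 1.0)


def clopper_pearson_lower(k, n, alpha):
    """One-sided lower confidence bound (level 1 - alpha) on a binomial proportion."""
    if k == 0:
        return 0.0
    # Assumption: alpha < 1/2, so the bound lies below the empirical fraction k/n.
    lo, hi = 0.0, k / n
    for _ in range(60):  # bisection on the increasing map p -> P(X >= k)
        mid = (lo + hi) / 2
        if binom_tail_ge(k, n, mid) > alpha:
            hi = mid
        else:
            lo = mid
    return lo  # lo errs on the conservative (smaller) side


def mc_certify(f, x, sigma, n, alpha, rng):
    """Estimate g(x; sigma) and a high-probability l2 radius from n noisy votes."""
    votes = Counter()
    for _ in range(n):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        votes[f(noisy)] += 1
    c_hat, k = votes.most_common(1)[0]
    p_lower = clopper_pearson_lower(k, n, alpha)
    if p_lower <= 0.5:
        return c_hat, None  # abstain: no positive radius can be certified
    return c_hat, sigma * NormalDist().inv_cdf(p_lower)


# Hypothetical toy base classifier: class 1 if the first coordinate is positive.
toy_f = lambda v: int(v[0] > 0.0)
pred, radius = mc_certify(toy_f, [1.0, 0.0], sigma=0.25, n=1000, alpha=0.001,
                          rng=random.Random(0))
```

Every call to `mc_certify` spends $n$ evaluations of the base classifier, which is the per-input cost the rest of the paper aims to remove.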

This procedure is statistically sound and broadly applicable, but its cost scales with the per-input MC budget. Cohen et al. ([2019](https://arxiv.org/html/2606.02876#bib.bib16)) use up to $n=10^{5}$ MC samples per certified ImageNet image, amounting to over $1{,}500$ GPU-hours to certify the 50K images (computed from the $\sim$110 s per-image certification time on a single NVIDIA RTX 2080 Ti reported in Cohen et al. ([2019](https://arxiv.org/html/2606.02876#bib.bib16))). This cost is structural, as $p_{A}$ is estimated from scratch at every input, with no information shared across inputs. We organize the remainder of the paper around the question this raises: *can the dependence of the certificate on $p_{A}$ be amortized across inputs, so that certifying a new input $\mathbf{x}$ no longer requires many forward passes through $f$?* Section [3](https://arxiv.org/html/2606.02876#S3) answers affirmatively by training a neural surrogate that predicts the smoothed class distribution directly, and in particular Section [3.2](https://arxiv.org/html/2606.02876#S3.SS2) shows that a one-time conformal calibration converts surrogate predictions into certified radii with a high-probability coverage guarantee.
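As a quick check of the GPU-hour figure, assuming the reported per-image time of roughly 110 s applies uniformly to all 50K images:

$$110\ \text{s/image}\times 50{,}000\ \text{images}=5.5\times 10^{6}\ \text{s}\approx 1{,}528\ \text{GPU-hours}.$$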

### 2.2 Reliability, Smoothing, and Acceleration

##### Pointwise reliability signals.

Calibration, predictive entropy, Bayesian approximations, ensembles, distance-aware models, selective prediction, and out-of-distribution detectors provide useful pointwise reliability information for a given classification model (Gal and Ghahramani, [2016](https://arxiv.org/html/2606.02876#bib.bib2); Guo et al., [2017](https://arxiv.org/html/2606.02876#bib.bib1); Lakshminarayanan et al., [2017](https://arxiv.org/html/2606.02876#bib.bib3); Geifman and El-Yaniv, [2017](https://arxiv.org/html/2606.02876#bib.bib47); Maddox et al., [2019](https://arxiv.org/html/2606.02876#bib.bib4); Liang et al., [2018](https://arxiv.org/html/2606.02876#bib.bib13); Liu et al., [2020](https://arxiv.org/html/2606.02876#bib.bib6), [2023](https://arxiv.org/html/2606.02876#bib.bib5)). These signals can often be computed with little additional cost per input, but they do not certify neighborhood invariance and therefore do not provide an instance-specific robustness radius.

##### Certified randomized smoothing\.

Randomized smoothing has been extended beyond the original Gaussian $\ell_{2}$ setting to additional norms, transformations, architectures, and smoothing distributions (Lecuyer et al., [2019](https://arxiv.org/html/2606.02876#bib.bib17); Cohen et al., [2019](https://arxiv.org/html/2606.02876#bib.bib16); Li et al., [2019](https://arxiv.org/html/2606.02876#bib.bib32); Yang et al., [2020](https://arxiv.org/html/2606.02876#bib.bib27); Fischer et al., [2020](https://arxiv.org/html/2606.02876#bib.bib26); Pfrommer et al., [2023](https://arxiv.org/html/2606.02876#bib.bib33)). Another line of work studies data-dependent or input-adaptive smoothing levels (Alfarra et al., [2022](https://arxiv.org/html/2606.02876#bib.bib34)). These methods improve the flexibility or quality of smoothing certificates, but the certification step still relies on expensive per-input MC estimation of the smoothed class probabilities, making them difficult to deploy for latency-sensitive applications.

##### Reducing the Monte Carlo cost\.

Several methods reduce the online sampling burden without replacing MC certification entirely. Confidence-sequence and early-stopping approaches adaptively terminate sampling once the radius estimate is sufficiently stable (Voracek, [2024](https://arxiv.org/html/2606.02876#bib.bib20)). Input-specific budgeting methods allocate fewer samples to easy inputs and more samples to ambiguous ones (Seferis et al., [2024](https://arxiv.org/html/2606.02876#bib.bib18)). Incremental certification methods reuse information across related classifiers (Ugare et al., [2024](https://arxiv.org/html/2606.02876#bib.bib23)). These approaches reduce average sampling cost but still require noisy base-model evaluations at test time. The offline surrogate approach of Bhardwaj et al. ([2024](https://arxiv.org/html/2606.02876#bib.bib28)) is closest to ours, as it also trains a surrogate on precomputed MC targets. RRISE differs in two ways: it uses a cross-entropy objective whose finite-budget loss is unbiased at fixed parameters, and it adds a conformal calibration layer that converts surrogate probabilities into conservative certified radii. Appendix [D](https://arxiv.org/html/2606.02876#A4) gives a detailed comparison.

## 3 Methodology

Here, we describe RRISE: a computationally efficient alternative to MC-based randomized smoothing certification. At its core is a learned surrogate $q_{\boldsymbol{\theta}}$ that predicts the smoothed class distribution from the clean input, replacing $n$ noisy forward passes through $f$ with a single forward pass through $q_{\boldsymbol{\theta}}$. A one-time calibration procedure converts the surrogate’s predictions into certified radii with a high-probability guarantee. We proceed to describe the surrogate and its training (Section [3.1](https://arxiv.org/html/2606.02876#S3.SS1)), and the calibration procedure (Section [3.2](https://arxiv.org/html/2606.02876#S3.SS2)).

### 3.1 Training the RRISE Surrogate

RRISE replaces the per-input MC estimate of $p(\cdot\mid\mathbf{x},\sigma)$ with a learned predictor $q_{\boldsymbol{\theta}}:\mathbb{R}^{d}\to\Delta^{K-1}$ whose $k$-th output approximates the smoothed class probability in ([1](https://arxiv.org/html/2606.02876#S2.E1)): $q_{\boldsymbol{\theta}}(\mathbf{x})_{k}\approx p(k\mid\mathbf{x},\sigma)$. The surrogate’s argmax predicts the output of the smoothed classifier, and its top-class probability estimates $p_{A}$. Concretely, the surrogate’s predicted class and top-class probability are

$$\widehat{g}(\mathbf{x})\;\triangleq\;\arg\max_{k}q_{\boldsymbol{\theta}}(\mathbf{x})_{k},\qquad q_{A}(\mathbf{x})\;\triangleq\;\max_{k}q_{\boldsymbol{\theta}}(\mathbf{x})_{k}, \tag{3}$$

which mirror the smoothed classifier $g(\mathbf{x};\sigma)$ and the top-class probability $p_{A}$ in ([2](https://arxiv.org/html/2606.02876#S2.E2)), but are computed in a single forward pass rather than from $n$ noisy evaluations of $f$. We fix $\sigma$ throughout and treat $q_{\boldsymbol{\theta}}$ as $\sigma$-specific; the framework can be extended to the multi-$\sigma$ setting.

We train $q_{\boldsymbol{\theta}}$ on a precomputed dataset of MC targets. For each training input $\mathbf{x}_{i}$, we draw $n$ i.i.d. (independently and identically distributed) noise samples $\boldsymbol{\varepsilon}_{i,j}\sim\mathcal{N}(\mathbf{0},\sigma^{2}\mathbf{I})$ and form the empirical smoothed distribution $\widehat{p}_{i}\in\Delta^{K-1}$ with $\widehat{p}_{i,k}\triangleq\tfrac{1}{n}\sum_{j=1}^{n}\mathbf{1}\{f(\mathbf{x}_{i}+\boldsymbol{\varepsilon}_{i,j})=k\}$. The dataset $\{(\mathbf{x}_{i},\widehat{p}_{i})\}$ is fixed and reused across epochs and hyperparameter sweeps. We initialize $q_{\boldsymbol{\theta}}$ as $f$, and fine-tune by minimizing the soft-target cross-entropy $\widehat{\mathcal{L}}(\boldsymbol{\theta})\;\triangleq\;\frac{1}{|\mathcal{B}|}\sum_{i\in\mathcal{B}}\mathsf{CE}\big(\widehat{p}_{i},\,q_{\boldsymbol{\theta}}(\mathbf{x}_{i})\big)$ on minibatches $\mathcal{B}$.
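The target construction and loss above can be illustrated with a small standard-library sketch. It is a toy under stated assumptions rather than the authors' training code: `mc_target`, `soft_cross_entropy`, `batch_loss`, and `toy_f` are hypothetical names, surrogate outputs are plain probability lists instead of network outputs, and the `eps` guard against $\log 0$ is an added numerical convenience.

```python
import math
import random
from collections import Counter


def mc_target(f, x, sigma, n, num_classes, rng):
    """Empirical smoothed distribution p_hat: normalized class counts over n noisy votes."""
    counts = Counter(f([xi + rng.gauss(0.0, sigma) for xi in x]) for _ in range(n))
    return [counts[k] / n for k in range(num_classes)]


def soft_cross_entropy(p_hat, q, eps=1e-12):
    """CE(p_hat, q) = -sum_k p_hat[k] * log q[k]; eps guards against log(0)."""
    return -sum(p * math.log(max(qk, eps)) for p, qk in zip(p_hat, q))


def batch_loss(targets, predictions):
    """Minibatch average of the soft-target cross-entropy."""
    losses = [soft_cross_entropy(p, q) for p, q in zip(targets, predictions)]
    return sum(losses) / len(losses)


# Hypothetical 2-class base classifier and one training input near its boundary.
toy_f = lambda v: int(v[0] > 0.0)
p_hat = mc_target(toy_f, [0.1, 0.0], sigma=0.25, n=2000, num_classes=2,
                  rng=random.Random(0))
```

Because the targets depend only on the frozen base classifier, `mc_target` needs to run once per training input; the resulting vectors can then be cached and reused across epochs.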

Since $\widehat{p}_{i}$ is itself an empirical estimate of $p(\cdot\mid\mathbf{x}_{i},\sigma)$, training against it incurs an $O(1/\sqrt{n})$ estimation error in the target that is easily controlled by the choice of $n$. In Appendix [B.2](https://arxiv.org/html/2606.02876#A2.SS2) we report how the surrogate performance varies with the $n$ used for dataset construction, and in Appendices [C](https://arxiv.org/html/2606.02876#A3) and [D](https://arxiv.org/html/2606.02876#A4) we show that the cross-entropy loss provides unbiased gradients with respect to the estimated distribution $\widehat{p}$, whereas alternative divergences used by existing approaches do not. In our experiments, $q_{\boldsymbol{\theta}}$ uses the same architecture as the base classifier, is initialized from the base classifier, and trains only the estimator head; Appendix [B.2](https://arxiv.org/html/2606.02876#A2.SS2) ablates this choice against end-to-end and random-initialized variants.
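The unbiasedness claim follows from the linearity of cross-entropy in its target. A short derivation, assuming only that $\widehat{p}_{i}$ is built from i.i.d. noise draws so that $\mathbb{E}[\widehat{p}_{i,k}]=p(k\mid\mathbf{x}_{i},\sigma)$, and holding $\boldsymbol{\theta}$ fixed:

$$\mathbb{E}\Big[\mathsf{CE}\big(\widehat{p}_{i},\,q_{\boldsymbol{\theta}}(\mathbf{x}_{i})\big)\Big]=-\sum_{k=1}^{K}\mathbb{E}\big[\widehat{p}_{i,k}\big]\log q_{\boldsymbol{\theta}}(\mathbf{x}_{i})_{k}=\mathsf{CE}\big(p(\cdot\mid\mathbf{x}_{i},\sigma),\,q_{\boldsymbol{\theta}}(\mathbf{x}_{i})\big).$$

Differentiating both sides with respect to $\boldsymbol{\theta}$ gives the same identity for gradients. A divergence that is nonlinear in the target does not commute with the expectation in this way.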

### 3.2 From Surrogate Predictions to Certified Radii

We now turn the surrogate’s predictions into certified radii with a provable lower-bound guarantee. A single forward pass through $q_{\boldsymbol{\theta}}$ yields the predicted class $\widehat{g}(\mathbf{x})$ and top probability $q_{A}(\mathbf{x})$ in ([3](https://arxiv.org/html/2606.02876#S3.E3)). Since the true radius $R(\mathbf{x};\sigma)=\sigma\,\Phi^{-1}(p_{A}(\mathbf{x}))$ is monotone in the smoothed top probability $p_{A}(\mathbf{x})$, any high-probability lower bound on $p_{A}(\mathbf{x})$ immediately yields a high-probability lower bound on $R(\mathbf{x};\sigma)$. We thus reduce the certification problem to lower-bounding $p_{A}(\mathbf{x})$ from $q_{A}(\mathbf{x})$, and address it with a one-time conformal calibration (Shafer and Vovk, [2008](https://arxiv.org/html/2606.02876#bib.bib43); Lei et al., [2018](https://arxiv.org/html/2606.02876#bib.bib55)) that determines a single parameter $\delta\geq 0$ on a held-out set such that $q_{A}(\mathbf{x})-\delta\leq p_{A}(\mathbf{x})$ with high probability. The calibrated radius

$$\widetilde{R}(\mathbf{x};\sigma)\;\triangleq\;\sigma\,\Phi^{-1}\!\big(q_{A}(\mathbf{x})-\delta\big) \tag{4}$$

is then a lower bound on $R(\mathbf{x};\sigma)$, computed from the same forward pass that produces $\widehat{g}(\mathbf{x})$. As a result, certification at deployment is label-free, MC-free, and requires no test-time sampling.

The key observation enabling this reduction is that, by definition of the smoothed argmax, $p_{A}(\mathbf{x})\geq p(\widehat{g}(\mathbf{x})\mid\mathbf{x},\sigma)$ for any $\mathbf{x}$, with equality whenever $\widehat{g}(\mathbf{x})=g(\mathbf{x};\sigma)$. Calibrating against the surrogate’s own argmax $\widehat{g}$ therefore always yields a valid lower bound on $p_{A}(\mathbf{x})$.

###### Proposition 1 (Calibrated lower bound on the smoothed top probability).

Fix confidence parameters $\beta,\gamma\in(0,1)$. Let $\{\mathbf{x}_{i}^{\mathrm{cal}}\}_{i=1}^{M}$ be a calibration set drawn i.i.d. from the test distribution, disjoint from surrogate training. For each calibration point, draw $n$ noise samples $\boldsymbol{\varepsilon}_{i,j}\sim\mathcal{N}(\mathbf{0},\sigma^{2}\mathbf{I})$ and let $\underline{p}_{i}$ be the one-sided Clopper–Pearson lower bound at confidence $1-\beta$ on $p(\widehat{g}(\mathbf{x}_{i}^{\mathrm{cal}})\mid\mathbf{x}_{i}^{\mathrm{cal}},\sigma)$. Define residuals $r_{i}=q_{A}(\mathbf{x}_{i}^{\mathrm{cal}})-\underline{p}_{i}$ and set $\delta$ to the $\lceil(M+1)(1-\gamma)\rceil$-th smallest of $r_{1},\dots,r_{M}$. Then for an independent test point $\mathbf{x}$,

$$\mathbb{P}\big[\,p_{A}(\mathbf{x})\geq q_{A}(\mathbf{x})-\delta\,\big]\;\geq\;1-\beta-\gamma. \tag{5}$$

###### Proof sketch\.

We have $p(\widehat{g}(\mathbf{x})\mid\mathbf{x},\sigma)\leq p_{A}(\mathbf{x})$ for any $\mathbf{x}$. A Clopper–Pearson lower bound $\underline{p}_{\mathrm{test}}$ on $p(\widehat{g}(\mathbf{x})\mid\mathbf{x},\sigma)$ from $n$ fresh noise samples satisfies $\underline{p}_{\mathrm{test}}\leq p(\widehat{g}(\mathbf{x})\mid\mathbf{x},\sigma)$ with probability $\geq 1-\beta$. Since calibration and test residuals are exchangeable, the conformal guarantee gives $q_{A}(\mathbf{x})-\delta\leq\underline{p}_{\mathrm{test}}$ with probability $\geq 1-\gamma$. A union bound chains these as $q_{A}(\mathbf{x})-\delta\leq p(\widehat{g}(\mathbf{x})\mid\mathbf{x},\sigma)\leq p_{A}(\mathbf{x})$ with probability $\geq 1-\beta-\gamma$. The full proof is in Appendix [E](https://arxiv.org/html/2606.02876#A5). ∎
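The calibration step in Proposition 1 reduces to selecting one order statistic of the residuals. The sketch below is a minimal illustration with hypothetical names (`conformal_offset` and its arguments), and it takes the per-point Clopper–Pearson bounds as precomputed inputs. Two details are assumptions not spelled out in the text: the fallback to $\delta=1$ when the requested rank exceeds $M$, and clipping at zero to keep $\delta\geq 0$.

```python
import math


def conformal_offset(q_tops, p_lowers, gamma):
    """Offset delta from calibration residuals r_i = q_A(x_i) - p_lower_i.

    q_tops[i]   : surrogate top-class probability on calibration point i
    p_lowers[i] : Clopper-Pearson lower bound on p(g_hat(x_i) | x_i, sigma)
    gamma       : conformal miscoverage level
    """
    residuals = sorted(q - p for q, p in zip(q_tops, p_lowers))
    m = len(residuals)
    rank = math.ceil((m + 1) * (1 - gamma))  # 1-indexed order statistic
    if rank > m:
        # Assumption: too few calibration points for this gamma, so fall back
        # to delta = 1, under which no input receives a positive radius.
        return 1.0
    # Assumption: clip at zero so that delta >= 0, which only adds conservatism.
    return max(residuals[rank - 1], 0.0)


# Hypothetical calibration data: seven points with residuals 0.01, ..., 0.07.
q_cal = [0.9] * 7
p_cal = [0.9 - r for r in (0.03, 0.01, 0.07, 0.05, 0.02, 0.06, 0.04)]
delta = conformal_offset(q_cal, p_cal, gamma=0.25)  # selects the 6th smallest residual
```

Since `delta` is a single scalar, it is computed once and then reused unchanged for every deployment query.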

In Proposition [1](https://arxiv.org/html/2606.02876#Thmproposition1), $\beta$ controls the Clopper–Pearson noise from calibration sampling and $\gamma$ the conformal slack absorbing the surrogate’s over-prediction; their sum $\beta+\gamma$ is the total miscoverage budget. Any allocation with $\beta+\gamma=\alpha$ yields a $(1-\alpha)$-confidence lower bound on $R(\mathbf{x};\sigma)$ from a single forward pass through $q_{\boldsymbol{\theta}}$, in contrast to direct MC-based certification, which scales the per-input noise budget $n$ to reach the same confidence at deployment.

Moreover, the proposition certifies that with high probability, the smoothed classifier $g(\cdot;\sigma)$ is constant on the ball of radius $\widetilde{R}(\mathbf{x};\sigma)$ around $\mathbf{x}$. Deployment, meanwhile, returns the surrogate’s prediction $\widehat{g}(\mathbf{x})$ as a fast stand-in for $g(\mathbf{x};\sigma)$. To bridge the two, we can consider the argmax-agreement event $E(\mathbf{x})\;\triangleq\;\{\widehat{g}(\mathbf{x})=g(\mathbf{x};\sigma)\}$. A naïve guarantee for $\widehat{g}$ would *assume* $E(\mathbf{x})$ at the center and conclude that $\widehat{g}(\mathbf{x})$ inherits the smoothed classifier’s certificate. Such an assumption unfortunately cannot be verified at deployment, as checking $\widehat{g}(\mathbf{x})=g(\mathbf{x};\sigma)$ requires evaluating $g(\mathbf{x};\sigma)$, which is precisely the expensive MC computation we seek to avoid. The following corollary inverts this dependence, demonstrating that $E(\mathbf{x})$ becomes a *consequence* of a deployment-observable condition, namely, the certified radius being positive.

###### Corollary 1.1 (Surrogate prediction matches the smoothed classifier on positive radii).

Under the conditions of Proposition [1](https://arxiv.org/html/2606.02876#Thmproposition1), if $q_{A}(\mathbf{x})-\delta>1/2$, then with probability $\geq 1-\beta-\gamma$ the surrogate’s prediction $\widehat{g}(\mathbf{x})$ coincides with the smoothed classifier’s prediction $g(\mathbf{x};\sigma)$, and $g(\cdot;\sigma)$ is constant with value $\widehat{g}(\mathbf{x})$ on the ball $\{\mathbf{x}^{\prime}:\|\mathbf{x}^{\prime}-\mathbf{x}\|_{2}\leq\widetilde{R}(\mathbf{x};\sigma)\}$.

Together, the clauses of the corollary turn the surrogate into a deployable certified classifier. The first clause, prediction match at the center, closes the gap between Proposition [1](https://arxiv.org/html/2606.02876#Thmproposition1) (which certifies $g(\cdot;\sigma)$) and deployment (which queries $\widehat{g}$). When the radius is positive, the surrogate’s prediction at $\mathbf{x}$ is provably the same prediction the smoothed classifier would have made. The second clause, constancy on the ball with value $\widehat{g}(\mathbf{x})$, is the standard randomized-smoothing certificate (Cohen et al., [2019](https://arxiv.org/html/2606.02876#bib.bib16)), now anchored to the value returned by the single forward pass of the surrogate.
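At deployment, the corollary amounts to a threshold check on one surrogate output followed by Eq. (4). A minimal sketch, assuming the surrogate's probability vector `q` and the calibrated offset `delta` are already available; the function name `rrise_certify` and the clamp just below 1 (which keeps the inverse CDF finite) are illustrative choices, not part of the paper:

```python
from statistics import NormalDist


def rrise_certify(q, delta, sigma):
    """Return (predicted class, calibrated radius or None) from one surrogate output q."""
    pred = max(range(len(q)), key=lambda k: q[k])  # g_hat(x): surrogate argmax
    p_lower = q[pred] - delta                      # calibrated bound on p_A(x)
    if p_lower <= 0.5:
        return pred, None  # abstain: the radius in Eq. (4) would not be positive
    # Assumption: clamp just below 1 so the inverse CDF stays finite.
    p_lower = min(p_lower, 1.0 - 1e-12)
    return pred, sigma * NormalDist().inv_cdf(p_lower)


# Hypothetical surrogate outputs for two queries.
certified = rrise_certify([0.05, 0.90, 0.05], delta=0.06, sigma=0.25)
abstained = rrise_certify([0.40, 0.60], delta=0.20, sigma=0.25)
```

When the second element is `None`, the prediction is still returned but carries no certificate, because the condition $q_{A}(\mathbf{x})-\delta>1/2$ of the corollary fails.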

## 4Experimental Evaluation

##### Evaluation Goals and Baselines

We organize the evaluation around three questions that are answered explicitly in the results below\.Q1: Certified accuracy\.DoesRRISEpreserve the certified accuracy of fixed\-budget MC randomized smoothing while replacing repeated noisy evaluations with one surrogate forward pass at inference?Q2: Boundary\-radius reliability\.After calibration, doesRRISEavoid inflated radii in the boundary\-confidence regime, where small probability errors can change whether an input is certified?Q3: Computational break\-even\.After accounting for offline target construction and surrogate training, how many deployment queries are required beforeRRISEbecomes cheaper than MC\-based certification? We compare against four baselines\.

Baseline 1is fixed\-budget MC randomized smoothing\(Cohenet al\.,[2019](https://arxiv.org/html/2606.02876#bib.bib16)\), which drawsnnnoisy samples per input and certifies via a one\-sided Clopper–Pearson lower bound\.Baseline 2is an input\-specific sample\-budgeting method followingSeferiset al\.\([2024](https://arxiv.org/html/2606.02876#bib.bib18)\), using a pilot estimate and budget\-mapping rule to reduce noisy evaluations while still certifying from realized count evidence\.Baseline 3is a budget\-prediction and early\-stopping method inspired byVoracek \([2024](https://arxiv.org/html/2606.02876#bib.bib20)\); the stopping rule is adaptive, but the final radius is again computed from a Clopper–Pearson lower bound\.Baseline 4is the offline Jensen–Shannon divergence surrogate ofBhardwajet al\.\([2024](https://arxiv.org/html/2606.02876#bib.bib28)\), which sharesRRISE’s MC class\-count targets but has no calibration procedure\. We equip it withRRISE’s conformal calibration for a fair radius comparison\.

These baselines span the relevant design space: Baseline 1 tests whether RRISE preserves the reference MC certificate; Baselines 2 and 3 test whether full surrogate inference provides additional savings beyond adaptive MC sampling; and Baseline 4 tests whether the calibrated surrogate design in Section [3](https://arxiv.org/html/2606.02876#S3) improves over an offline surrogate trained from MC class-count targets. Unless otherwise stated, all methods use an MC budget of $n=10{,}000$, with Baselines 2–3 in their tighter $1\%$ configuration: Baseline 2 uses decline level $0.01$, and Baseline 3 uses stopping tolerance $0.01$. These hyperparameters make the adaptive MC baselines more conservative by allowing less approximation slack before continuing sampling. RRISE is initialized from the base classifier with only the prediction head trained; Baseline 4 uses random initialization per its original setting. Appendix [B](https://arxiv.org/html/2606.02876#A2) ablates MC budget, training strategy, calibration level, and baseline hyperparameters.

##### Experimental Setup

We evaluate on FashionMNIST, CIFAR-10, CIFAR-100, and Tiny ImageNet using MLP-Mixer-Tiny, ResNet-18 with a CIFAR-style stem, EfficientNet-B0, and ViT-Tiny, respectively; the RRISE surrogate inherits each base architecture. Following standard randomized-smoothing practice, base classifiers are trained with Gaussian noise augmentation at level $\sigma_{\mathrm{base}}$, and certification, as well as surrogate-target construction, uses smoothing level $\sigma$:

$$(\sigma_{\mathrm{base}},\sigma)=(0.5,0.25)\quad\text{on FashionMNIST and CIFAR-10,}$$

and

$$(\sigma_{\mathrm{base}},\sigma)=(0.25,0.10)\quad\text{on CIFAR-100 and Tiny ImageNet.}$$

The RRISE offline target dataset stores the normalized class-count vector obtained by evaluating the frozen base classifier under noisy perturbations of each training input. The surrogate is trained on clean inputs with the cross-entropy objective from Section [3](https://arxiv.org/html/2606.02876#S3); model selection follows the cross-validation procedure in Appendix [A](https://arxiv.org/html/2606.02876#A1).
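The target construction and training loss described above can be sketched in a few lines. This is a minimal standard-library illustration, assuming a hypothetical `base_classifier` callable that maps a flat list of floats to a class index; the function names, the list-based input representation, and the toy classifier in the usage note are all assumptions of this sketch, not part of the paper.

```python
import math
import random
from typing import Callable, List, Sequence


def mc_class_frequencies(
    x: Sequence[float],
    base_classifier: Callable[[List[float]], int],
    sigma: float,
    n: int,
    num_classes: int,
    rng: random.Random,
) -> List[float]:
    """Normalized class counts of the frozen base classifier under Gaussian noise."""
    counts = [0] * num_classes
    for _ in range(n):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        counts[base_classifier(noisy)] += 1
    return [c / n for c in counts]


def soft_label_cross_entropy(logits: Sequence[float],
                             target: Sequence[float]) -> float:
    """Cross-entropy between a soft target and softmax(logits)."""
    m = max(logits)  # subtract the max for a stable log-sum-exp
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum(t * (z - log_z) for t, z in zip(target, logits))
```

For example, `mc_class_frequencies([0.3], lambda v: int(v[0] > 0), 0.25, 1000, 2, random.Random(0))` produces a two-class soft target for a toy threshold classifier, which can then be passed as `target` to `soft_label_cross_entropy` together with the surrogate's logits on the clean input.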

##### Calibration and Confidence Matching

For MC baselines, certificates use one-sided Clopper–Pearson lower bounds at failure level $\alpha_{\mathrm{MC}}$. For RRISE and Baseline 4, we use the calibration procedure of Section [3.2](https://arxiv.org/html/2606.02876#S3.SS2). A $10\%$ calibration split estimates the scalar offset $\delta$, and the same offset is used for reporting. In the main comparison, $\alpha_{\mathrm{MC}}=0.25$ for Baselines 1–3 and $\beta_{\mathrm{sur}}=0.001$, $\gamma_{\mathrm{sur}}=0.249$ for surrogate methods, so $\beta_{\mathrm{sur}}+\gamma_{\mathrm{sur}}=0.25$. Appendix [B](https://arxiv.org/html/2606.02876#A2) includes stricter surrogate calibration levels $\beta_{\mathrm{sur}}+\gamma_{\mathrm{sur}}\in\{0.10,0.05,0.01\}$. Throughout this section, we denote by $\widetilde{p}_{A}(\mathbf{x})$ the method-specific lower bound on the smoothed top-class probability used for certification: the one-sided Clopper–Pearson lower bound for MC baselines, and the calibrated quantity $q_{A}(\mathbf{x})-\delta$ for RRISE and Baseline 4.
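To make the role of the scalar offset concrete, the sketch below shows one generic way a split-conformal offset could be estimated and applied. It is an assumption-laden illustration, not the procedure of Section 3.2: the score definition (surrogate value minus a reference lower bound), the single miscoverage level `gamma`, the clamp at zero, and all function names are hypothetical, and the way the paper splits the budget between $\beta_{\mathrm{sur}}$ and $\gamma_{\mathrm{sur}}$ is not reproduced here.

```python
import math
from typing import Sequence


def conformal_offset(q_cal: Sequence[float],
                     p_ref_cal: Sequence[float],
                     gamma: float) -> float:
    """Split-conformal quantile of overestimation scores q - p_ref.

    Assumption of this sketch: p_ref_cal holds reference lower bounds
    (e.g. from MC) on the calibration split.
    """
    scores = sorted(q - p for q, p in zip(q_cal, p_ref_cal))
    m = len(scores)
    rank = math.ceil((m + 1) * (1.0 - gamma))
    if rank > m:
        # Too few calibration points to support this level.
        return float("inf")
    # Clamping at zero is an extra conservative choice of this sketch.
    return max(scores[rank - 1], 0.0)


def calibrated_lower_bound(q_a: float, delta: float) -> float:
    """The calibrated quantity q_A(x) - delta used for certification."""
    return q_a - delta
```

With an infinite offset every calibrated bound falls below $1/2$, so the sketch abstains everywhere rather than certifying from an under-sized calibration split.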

##### Metrics

All results are reported as mean $\pm$ standard deviation over seeds $\{100,200,300\}$. Let $\widehat{y}(\mathbf{x})$ denote the method's predicted class, $y(\mathbf{x})$ the ground-truth label, $\widetilde{p}_{A}(\mathbf{x})$ the method-specific lower top-probability estimate, and $\widetilde{R}(\mathbf{x})$ the corresponding calibrated or MC-certified radius. Certified accuracy at threshold $r$ is

$$\mathrm{CertAcc}(r)=\frac{1}{|\mathcal{D}_{\mathrm{test}}|}\sum_{\mathbf{x}\in\mathcal{D}_{\mathrm{test}}}\mathbf{1}\!\left[\widehat{y}(\mathbf{x})=y(\mathbf{x}),\ \widetilde{p}_{A}(\mathbf{x})>\frac{1}{2},\ \widetilde{R}(\mathbf{x})\geq r\right]. \tag{6}$$

CertAcc@0 is the fraction of test inputs that are simultaneously correctly classified and certified with a lower top-probability bound above $1/2$; higher values indicate that the method certifies more of the test set with a non-trivial radius.
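Equation (6) translates directly into code. The sketch below is a minimal Python version, assuming per-input predictions, labels, lower bounds, and radii are already available as parallel sequences; the function and argument names are hypothetical.

```python
from typing import Sequence


def cert_acc(y_hat: Sequence[int],
             y: Sequence[int],
             p_tilde: Sequence[float],
             radius: Sequence[float],
             r: float) -> float:
    """Fraction of test inputs that are correct, certified, and have radius >= r."""
    hits = sum(
        1
        for yh, yt, p, rad in zip(y_hat, y, p_tilde, radius)
        if yh == yt and p > 0.5 and rad >= r
    )
    return hits / len(y)
```

Calling it with `r=0.0` gives CertAcc@0; sweeping `r` over a grid traces the usual certified-accuracy curve.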

To study behavior near the certification boundary, we consider the boundary-confidence subset

$$\mathcal{B}\triangleq\{\mathbf{x}\in\mathcal{D}_{\mathrm{test}}:0.5<\widetilde{p}_{A}(\mathbf{x})<0.75\}. \tag{7}$$

Boundary Mass is $|\mathcal{B}|/|\mathcal{D}_{\mathrm{test}}|$, the fraction of test inputs the method places in the diagnostic region just above the certification threshold. A method that places too few inputs in $\mathcal{B}$ may be over-confident, while a method that places too many may be under-confident; therefore boundary mass should be read jointly with average radius and CertAcc. OCA denotes ordinary classification accuracy on the specified subset; in the boundary tables it is

$$\mathrm{OCA}(\mathcal{B})=\frac{1}{|\mathcal{B}|}\sum_{\mathbf{x}\in\mathcal{B}}\mathbf{1}[\widehat{y}(\mathbf{x})=y(\mathbf{x})], \tag{8}$$

when $|\mathcal{B}|>0$, and is undefined otherwise. Avg. Radius is the average of $\widetilde{R}(\mathbf{x})$ on the same subset.
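The three boundary diagnostics defined in Equations (7) and (8) can be computed together. The following is a minimal sketch with hypothetical names, returning `None` for OCA and Avg. Radius when $\mathcal{B}$ is empty, which is one way to represent the "undefined" case.

```python
from typing import Optional, Sequence, Tuple


def boundary_stats(
    y_hat: Sequence[int],
    y: Sequence[int],
    p_tilde: Sequence[float],
    radius: Sequence[float],
) -> Tuple[float, Optional[float], Optional[float]]:
    """Return (boundary mass, OCA on B, average radius on B)."""
    # B collects the inputs whose lower bound lies strictly in (0.5, 0.75).
    idx = [i for i, p in enumerate(p_tilde) if 0.5 < p < 0.75]
    mass = len(idx) / len(y)
    if not idx:
        return mass, None, None  # OCA and Avg. Radius are undefined on an empty B
    oca = sum(1 for i in idx if y_hat[i] == y[i]) / len(idx)
    avg_radius = sum(radius[i] for i in idx) / len(idx)
    return mass, oca, avg_radius
```

Returning all three values from one pass keeps them tied to the same subset, which matters because the text asks for boundary mass to be read jointly with average radius.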

The certified-radius distribution (CRD) is reported in two complementary forms. Boundary CRD measures the fraction of all test inputs that are both in $\mathcal{B}$ and have radius above threshold $t$:
