Enforcing Constraints in Generative Sampling via Adaptive Correction Scheduling
Summary
This research paper introduces adaptive correction scheduling for enforcing hard constraints in generative sampling, demonstrating that it improves the cost-accuracy frontier compared to terminal or stepwise projection methods.
# Enforcing Constraints in Generative Sampling via Adaptive Correction Scheduling
Source: [https://arxiv.org/html/2605.11214](https://arxiv.org/html/2605.11214)
Noah Trupin, Yexiang Xue (Department of Computer Science, Purdue University), {ntrupin,yexiang}@purdue.edu
###### Abstract
Hard constraints in generative sampling are typically enforced by projection, applied either once at the end of sampling or after every update. This binary framing overlooks a fundamental issue: *projection changes the distribution of states which future updates depend on.* As a result, delayed projection can produce samples that are feasible but inconsistent with the intended sampling dynamics, even after final projection. We formalize constraint enforcement as a correction scheduling problem over the generative rollout. Using one-step constraint defect as a local signal of geometric mismatch, we introduce *adaptive correction scheduling*, a state-dependent policy that allocates projection budget to the steps that most strongly perturb the trajectory. Terminal and stepwise projection arise as limiting cases of this family. Across controlled manifold rollouts and a learned projected diffusion sampler, adaptive scheduling improves the cost–accuracy frontier at matched projection budgets, recovering $71.2\%$ of full stepwise benefit with $75\%$ fewer corrections. These results show that constraint timing is a first-class design variable in generative sampling, and that enforcing feasibility alone is insufficient to preserve the intended constrained sampling dynamics.
Figure 1: *Adaptive correction preserves rollouts at a fraction of the cost.* Terminal correction projects only after the trajectory has already drifted from the ridged terrain, producing a feasible but dynamically inconsistent path. Stepwise correction prevents drift by projecting after every update, but pays the full projection cost. Our adaptive scheduler uses the same projection operator but spends only $25\%$ of the projection calls, concentrating them where defect is largest to closely track the stepwise rollout.

## 1 Introduction
Generative models are increasingly deployed in settings where outputs must satisfy hard constraints, including trajectory synthesis, robot planning, and structured generation (Li et al., [2024](https://arxiv.org/html/2605.11214#bib.bib11); Liang et al., [2025](https://arxiv.org/html/2605.11214#bib.bib13); Zhang et al., [2025](https://arxiv.org/html/2605.11214#bib.bib20); Christopher et al., [2024](https://arxiv.org/html/2605.11214#bib.bib19); Cardei et al., [2025b](https://arxiv.org/html/2605.11214#bib.bib18); Santi et al., [2025](https://arxiv.org/html/2605.11214#bib.bib17)). In these domains, samples are required to lie on manifolds defined by geometric, physical, or task-specific structures. A common strategy is to run the generative process in the ambient space and enforce feasibility through projection. This raises the question: *when should projection be applied?*
In practice, two strategies dominate: terminal correction projects only at the end of sampling, while stepwise correction projects after every update\. Both are widely used, and both can produce feasible outputs\. However, they treat correction timing as a fixed design choice rather than part of the algorithm\.
This framing ignores that generative updates are state\-dependent, as each step is computed from the current iterate\. Once a rollout leaves the constraint set, subsequent updates are evaluated at states that a constrained trajectory would never visit\. A later projection can restore feasibility, but it cannot in general undo the distributional shift induced by these off\-manifold states\. As a result, terminally corrected samples are often feasible but wrong\. They lie on the constraint set, but correspond to a trajectory that deviated from the constrained dynamics during rollout\.
Constraint enforcement is therefore not only about satisfying feasibility at the endpoint, but about preserving the trajectory distribution that produces that endpoint\. This makes correction timing a resource allocation problem\. In typical rollouts, most updates remain close to the constraint set, while a smaller subset produces large excursions that drive downstream error\. Terminal correction under\-allocates effort to these critical steps, while stepwise correction over\-allocates effort everywhere\. The central challenge is thus how to allocate a limited number of corrections across the rollout\.
We address this by reframing constraint enforcement as a correction scheduling problem\. We use one\-step constraint defect, the distance of a proposed update from the feasible set, as a local signal of geometric mismatch\. Small defect indicates that the rollout remains close to the constrained trajectory, while large defect identifies steps whose effects are unlikely to be corrected later\. This leads to an online budgeted policy: spend corrections on high\-defect steps, where projection is expected to most reduce downstream trajectory error\.
This yields a family of online budgeted schedules that interpolate between terminal and stepwise correction while adapting to the realized defect profile of each rollout\. Fundamentally, this policy is not a heuristic layered onto existing methods, but a direct consequence of viewing constraint enforcement through the lens of trajectory consistency\. It is model\-agnostic, requires no gradients through projection, and introduces negligible overhead beyond defect evaluation\.
Our contribution is a scheduling view of constrained generative sampling:
1. *Timing changes the rollout.* We show that delayed correction alters the distribution of future updates, producing endpoint deviations that cannot be resolved by terminal projection alone.
2. *Constraint enforcement is a budget allocation problem.* We formalize correction as allocating a limited number of projections across the rollout, making terminal, periodic, and adaptive strategies directly comparable.
3. *Defect provides an effective value proxy.* An online budgeted scheduler uses defect to spend corrections on the steps that most strongly perturb the trajectory.
4. *Adaptive scheduling improves the cost–accuracy frontier.* At matched correction cost, adaptive policies recover near-stepwise fidelity while using substantially fewer projections.
Across controlled manifold rollouts and learned diffusion settings, we observe that constraint violations are highly heterogeneous across time, and that selectively correcting high\-defect steps consistently outperforms uniform strategies\. These results demonstrate that constraint timing is a fundamental degree of freedom in generative sampling, and that enforcing feasibility alone is insufficient to preserve the intended constrained distribution\.
## 2 Related Work
Prior work enforces hard constraints in generative modeling by incorporating projection or optimization into the sampling loop (Christopher et al., [2024](https://arxiv.org/html/2605.11214#bib.bib19); Santi et al., [2025](https://arxiv.org/html/2605.11214#bib.bib17); Utkarsh et al., [2025](https://arxiv.org/html/2605.11214#bib.bib21); Zhang et al., [2025](https://arxiv.org/html/2605.11214#bib.bib20); Hoseinpour and Dvorkin, [2025](https://arxiv.org/html/2605.11214#bib.bib16)). In diffusion and flow-based models, projected sampling methods often apply projection after each step to maintain feasibility throughout generation (Chi et al., [2024](https://arxiv.org/html/2605.11214#bib.bib10); Li et al., [2024](https://arxiv.org/html/2605.11214#bib.bib11); Xiao et al., [2024](https://arxiv.org/html/2605.11214#bib.bib14); Janner et al., [2022](https://arxiv.org/html/2605.11214#bib.bib15); Liang et al., [2025](https://arxiv.org/html/2605.11214#bib.bib13); Yang et al., [2026](https://arxiv.org/html/2605.11214#bib.bib12); Santi et al., [2025](https://arxiv.org/html/2605.11214#bib.bib17); Utkarsh et al., [2025](https://arxiv.org/html/2605.11214#bib.bib21); Ni and Qureshi, [2024](https://arxiv.org/html/2605.11214#bib.bib1)). Variants introduce projected gradients, differentiable optimization layers, or discrete constraint operators (Cardei et al., [2025a](https://arxiv.org/html/2605.11214#bib.bib23), [b](https://arxiv.org/html/2605.11214#bib.bib18); Cheng et al., [2024](https://arxiv.org/html/2605.11214#bib.bib22); Hoseinpour and Dvorkin, [2025](https://arxiv.org/html/2605.11214#bib.bib16)).
These approaches address how to enforce constraints, but largely treat correction timing as fixed\. Stepwise projection is typically adopted as a default, while terminal projection is used as a cheaper alternative\. Our work instead focuses on when correction should be applied, holding the underlying generative process fixed\.
This question is closely related to geometric integration, where the interaction between an update rule and a projection or retraction determines long-horizon behavior (Hairer, [2001](https://arxiv.org/html/2605.11214#bib.bib7); Séguin et al., [2024](https://arxiv.org/html/2605.11214#bib.bib8); Christopher et al., [2024](https://arxiv.org/html/2605.11214#bib.bib19); McLachlan et al., [2014](https://arxiv.org/html/2605.11214#bib.bib6); Liñán and Diego, [2023](https://arxiv.org/html/2605.11214#bib.bib9)). Classical results show that projection and dynamics do not generally commute, leading to trajectory drift even when endpoints are feasible. Our setting differs in two key ways: the dynamics are learned and stochastic, and the objective is efficient allocation of correction compute alongside accuracy.
A separate line of work studies adaptive schedules in diffusion, selecting timesteps or noise levels to trade off speed and sample quality. Methods such as AdaDiff, TDPM, stepsize distillation, and active noise estimation allocate denoising compute across time (Zhang et al., [2024](https://arxiv.org/html/2605.11214#bib.bib5); Ye et al., [2025](https://arxiv.org/html/2605.11214#bib.bib4); Pei et al., [2025](https://arxiv.org/html/2605.11214#bib.bib3); Kim and Kim, [2026](https://arxiv.org/html/2605.11214#bib.bib2)). These approaches operate along an orthogonal axis: they control the generative dynamics themselves. In contrast, we allocate constraint enforcement, treating projection as a limited resource applied to preserve trajectory consistency.
Closest are methods that interleave optimization with sampling, but these apply correction uniformly or through fixed heuristics\. We instead treat correction as a budgeted allocation problem\.
## 3 Adaptive Correction Scheduling
Projection timing is a control decision. Rather than projecting only at the end or after every update, we treat correction as a finite resource to be allocated across the rollout. Given a rollout of length $T$ and a correction budget $B$, the scheduler must choose which proposed updates to project. The ideal policy would spend corrections where they most reduce future trajectory error, but this marginal value is not directly observable in practice. Our method uses one-step constraint defect as an online proxy.
At each step, the scheduler first applies the generative update without correction, measures how far the proposed state departs from the constraint set, and then decides whether this departure is worth spending one unit of budget\. Large defect indicates that the rollout has entered a region where future updates are likely to differ from the constrained trajectory, while small defect indicates that projection can be safely delayed\. As such, the adaptive scheduler spends corrections where the realized rollout is most likely to drift, while matching the projection budget of fixed periodic baselines\.
This section formalizes that idea: we describe rollouts with correction events, define defect as the scheduling signal, and derive our budget\-aware online policy\.
### 3.1 Rollouts with correction events

Figure 2: *Adaptive gains appear when trajectory error is concentrated in time.* Each panel shows normalized state error from the stepwise reference alongside projection events. In the volatile terrain setting (left), a few high-defect regions dominate path error; periodic correction spends budget uniformly and misses several of these events, while adaptive correction concentrates projections where they most reduce downstream drift. In the homogeneous setting (right), defects are more diffuse, so the gap between adaptive and periodic narrows. This supports that adaptive correction helps most when a subset of rollout steps controls the trajectory error.

We consider a generative process that ideally evolves on a constraint set $\mathcal{M} \subset \mathcal{X}$, where $\mathcal{X} \subseteq \mathbb{R}^{d}$ is an ambient space. In many applications, $\mathcal{M}$ is a smooth embedded manifold or a set defined by hard constraints, such as kinematic feasibility, geometric structure, or physical validity.
In practice, however, generative models operate in the ambient space. A rollout proceeds by repeatedly applying a learned update rule

$$x_{t+1} = \Phi_{h}(x_{t}), \qquad t = 0, \ldots, T-1,$$

where $\Phi_{h}$ denotes a discrete update operator, such as a reverse diffusion step or flow-matching update, with step size $h$. To enforce feasibility, we assume access to a projection or retraction operator $\Pi : \mathcal{X} \to \mathcal{M}$ which maps arbitrary states back to the constraint set.
The standard objective is to produce a final sample $x_{T} \in \mathcal{M}$. However, this endpoint view is incomplete, as the generative process is inherently trajectory-dependent: each update is computed from the current state. If a rollout leaves $\mathcal{M}$, subsequent updates are evaluated at states a constrained trajectory would never visit. While later projection may restore feasibility, it generally cannot undo the off-manifold updates already taken. Fig. [2](https://arxiv.org/html/2605.11214#S3.F2) illustrates greater accumulation of error throughout sampling without principled constraint enforcement. As such, projection is an intervention in the dynamics, and applying it changes both the states from which future updates are computed and the final sample.
We formalize this intervention by introducing a correction schedule $\sigma : \{0, \ldots, T-1\} \to \{0, 1\}$, where $\sigma(t) = 1$ indicates that projection is applied after step $t$. Given the proposed update

$$\tilde{x}_{t+1} = \Phi_{h}(x_{t}), \tag{1}$$

the corrected rollout is defined by

$$x_{t+1} = \begin{cases} \Pi(\tilde{x}_{t+1}) & \text{if } \sigma(t) = 1, \\ \tilde{x}_{t+1} & \text{if } \sigma(t) = 0. \end{cases}$$
This formulation unifies standard strategies: *terminal correction* sets $\sigma(t) = 0$ for all $t < T-1$ and projects once at the end, *stepwise correction* sets $\sigma(t) = 1$ for all $t$, and *periodic correction* corrects at fixed intervals. All these correspond to different ways of allocating correction events across the rollout. Our goal is to construct a fourth schedule that spends up to the same budget as a periodic rule but chooses correction times from the realized rollout.
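As a minimal sketch of the scheduled rollout above, the following Python code applies a binary schedule to a toy problem. Everything specific here is an assumption made for illustration and does not come from the paper: the constraint set is the unit circle, `project_to_circle` stands in for $\Pi$, and `swirl_step` is a hypothetical state-dependent update standing in for $\Phi_h$.

```python
import numpy as np

def project_to_circle(x):
    """Toy projection Pi: map a nonzero point onto the unit circle."""
    return x / np.linalg.norm(x)

def swirl_step(x, h=0.1):
    """Hypothetical update Phi_h: rotate by an angle that grows with the
    radius, then push outward, so off-circle states evolve differently."""
    angle = h * np.linalg.norm(x)
    c, s = np.cos(angle), np.sin(angle)
    return (1.0 + h) * (np.array([[c, -s], [s, c]]) @ x)

def rollout(x0, sigma, step=swirl_step, project=project_to_circle):
    """x_{t+1} = Pi(Phi_h(x_t)) if sigma[t] == 1, else Phi_h(x_t)."""
    x = np.asarray(x0, dtype=float)
    path = [x]
    for correct in sigma:
        proposal = step(x)
        x = project(proposal) if correct else proposal
        path.append(x)
    return np.stack(path)

T = 20
terminal = [0] * (T - 1) + [1]                        # project once, at the end
stepwise = [1] * T                                    # project after every update
periodic = [int((t + 1) % 4 == 0) for t in range(T)]  # project every 4th step

x0 = [1.0, 0.0]
end_terminal = rollout(x0, terminal)[-1]
end_stepwise = rollout(x0, stepwise)[-1]
# Both endpoints lie on the circle, yet they are different points.
gap = np.linalg.norm(end_terminal - end_stepwise)
```

In this toy setting both endpoints are feasible, but `gap` is far from zero: the terminal rollout rotated faster while it was off the circle, which is the "feasible but inconsistent" effect described in the text.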
### 3.2 Defect as the scheduling signal

To determine when correction is necessary, we require a signal that quantifies how much a proposed update departs from the constraint set. We define a defect function $d(x, \mathcal{M})$ that measures the distance or constraint violation of a state $x$ relative to $\mathcal{M}$. This may be Euclidean distance from the constraint surface or some other domain-specific metric.

For a proposed update $\tilde{x}_{t+1} = \Phi_{h}(x_{t})$, we define the one-step defect

$$s_{t} = d(\tilde{x}_{t+1}, \mathcal{M}). \tag{2}$$

This scalar captures the local mismatch introduced by the update. A small $s_{t}$ indicates that the rollout remains close to the constraint set, while large $s_{t}$ implies that the update produces a significant excursion. Large defect marks steps where future updates are likely to differ most from the constrained rollout, making $s_{t}$ a local proxy for downstream trajectory distortion.
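Two possible defect functions can be sketched as follows. Both are illustrative assumptions rather than the defects used in the paper: `sphere_defect` is the closed-form Euclidean distance to a unit sphere, and `projection_defect` is a generic fallback that measures how far a projection would move the state.

```python
import numpy as np

def sphere_defect(x):
    """Euclidean distance from x to the unit sphere: | ||x|| - 1 |."""
    return float(abs(np.linalg.norm(x) - 1.0))

def projection_defect(x, project):
    """Generic defect ||x - Pi(x)||, given only a projection callable."""
    x = np.asarray(x, dtype=float)
    return float(np.linalg.norm(x - project(x)))

point = np.array([3.0, 4.0])
d_closed_form = sphere_defect(point)
d_via_projection = projection_defect(point, lambda v: v / np.linalg.norm(v))
```

The two agree on the sphere, but they differ in cost: `projection_defect` calls the projection itself, so it only saves work when a cheaper closed-form or residual-based defect is unavailable and the projection is inexpensive relative to the update.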
### 3.3 Budgeted online allocation

We distinguish two types of error: *endpoint error*, the deviation of the final sample from a reference constrained trajectory, and *pathwise error*, the cumulative deviation from the constraint set during rollout.

A final projection eliminates endpoint infeasibility, but not the pathwise error accumulated during rollout. We measure pathwise deviation via cumulative defect:

$$E_{\mathrm{path}} = \sum_{t=1}^{T} d(x_{t}, \mathcal{M}).$$
From this perspective, constraint enforcement becomes a resource allocation problem\. Each projection incurs cost, and a schedule determines how a limited number of projections is distributed across time\. Terminal and stepwise correction are extreme allocations: one spends the budget at the end, the other spends it everywhere\. If large deviations are concentrated in a small subset of steps, both are inefficient\. This motivates adaptive, state\-dependent schedules\.
A budgeted schedule therefore answers an online marginal question: at time $t$, with $b$ corrections remaining, is the current deviation worth spending one correction, or should the budget be reserved for future steps? Let

$$V_{t}(x_{t}) = \mathbb{E}\!\left[E_{\mathrm{uncorrected}} - E_{\mathrm{corrected}} \mid x_{t}\right]$$

denote the downstream value of correcting at step $t$. This quantity captures how much correcting the current step changes the remainder of the rollout.

Given a rollout of length $T$ with a finite correction budget $B$, the ideal policy would allocate corrections to maximize total reduction in trajectory error:

$$\max_{\sigma} \sum_{t=0}^{T-1} \sigma(t) V_{t}(x_{t}) \quad \text{s.t.} \quad \sum_{t=0}^{T-1} \sigma(t) \leq B, \tag{3}$$

where $\sigma(t) \in \{0, 1\}$ indicates whether correction is applied at step $t$.

This formulation makes explicit that correction scheduling is a budgeted allocation problem over time. In practice, however, $V_{t}$ is not directly observable. Computing it would require estimating how a correction changes the remainder of the rollout under future stochastic updates. We therefore require a tractable online proxy. Noting that large one-step defect identifies precisely the states where the rollout has entered regions that induce large downstream deviation, we use defect as a surrogate of this marginal value.
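To make the allocation problem concrete, here is a hedged hindsight sketch. It assumes the per-step values are already known and fixed, in which case the budgeted objective is maximized by picking the (at most) $B$ largest positive values. The function name `hindsight_schedule` is hypothetical, and the sketch deliberately ignores that in a real rollout $V_t(x_t)$ depends on earlier corrections through $x_t$.

```python
import numpy as np

def hindsight_schedule(values, budget):
    """Return a 0/1 schedule selecting the `budget` largest positive values.

    `values[t]` plays the role of V_t (or of the defect s_t used as its
    proxy). This needs the whole trace, so it is an offline reference,
    not an online policy.
    """
    values = np.asarray(values, dtype=float)
    sigma = np.zeros(len(values), dtype=int)
    if budget <= 0:
        return sigma
    top = np.argsort(-values, kind="stable")[:budget]
    sigma[top[values[top] > 0]] = 1  # never spend budget on zero value
    return sigma

defect_trace = [0.1, 0.9, 0.0, 0.5, 0.3]
sigma = hindsight_schedule(defect_trace, budget=2)
```

Because this selection requires the full defect trace in advance, it can serve only as an offline comparison point; an online scheduler has to decide at step $t$ without seeing later defects.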
### 3.4 Budget-aware thresholds

The ideal policy in Eq. ([3](https://arxiv.org/html/2605.11214#S3.E3)) requires the downstream value $V_{t}(x_{t})$ of correcting at each step. This value is not observable online. We therefore use one-step defect $s_{t}$ as a local proxy for marginal value, but unlike a fixed-threshold rule, the decision must account for the remaining budget.

Let $b_{t}$ denote the number of corrections remaining before step $t$. Our scheduler uses a family of thresholds

$$\lambda_{t,b}, \qquad t = 0, \ldots, T-1, \quad b = 0, \ldots, B,$$

where $\lambda_{t,b}$ is the marginal price of spending one correction at time $t$ with $b$ corrections remaining. Given a proposed update Eq. ([1](https://arxiv.org/html/2605.11214#S3.E1)) and defect Eq. ([2](https://arxiv.org/html/2605.11214#S3.E2)), the online budgeted policy is

$$x_{t+1} = \begin{cases} \Pi(\tilde{x}_{t+1}) & \text{if } b_{t} > 0 \text{ and } s_{t} \geq \lambda_{t,b_{t}}, \\ \tilde{x}_{t+1} & \text{otherwise}. \end{cases}$$

If projection is applied, the remaining budget is updated as $b_{t+1} = b_{t} - 1$; otherwise $b_{t+1} = b_{t}$.
The dependence on both $t$ and $b$ proves essential: the same defect may be worth correcting late in the rollout, when few future opportunities remain, but not early, when the scheduler should reserve budget for larger future excursions. Thus $\lambda_{t,b}$ implements an online marginal-value threshold conditioned on the remaining horizon and correction budget.

The thresholds $\lambda_{t,b}$ determine how aggressively the scheduler spends its remaining corrections. In principle, they approximate the value boundary of the budgeted allocation problem: correction is applied when the observed defect is large enough to justify consuming one unit of budget. In Sec. [4](https://arxiv.org/html/2605.11214#S4), thresholds are estimated on held-out calibration rollouts. For each time $t$ and remaining budget $b$, we choose $\lambda_{t,b}$ so that the scheduler spends corrections at the desired pace while preserving budget for future high-defect events. This yields an online policy that uses exactly the same total projection budget as periodic baselines, but allocates that budget according to the geometry of the realized rollout. In our experiments, this surface is estimated by held-out quantiles of future defect traces.

This budget-aware policy reduces to familiar strategies as special cases: setting all thresholds to $+\infty$ recovers terminal correction and setting all feasible thresholds to $0$ recovers stepwise correction.
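One way such a threshold surface could be calibrated from held-out defect traces is sketched below. The specific quantile rule is an assumption, not the paper's procedure: for each $(t, b)$ it pools the calibration defects from step $t$ onward and picks the level that roughly a fraction $b / (T - t)$ of them exceed. The name `calibrate_thresholds` is hypothetical.

```python
import numpy as np

def calibrate_thresholds(defect_traces, budget):
    """Estimate lambda[t, b] from calibration traces of shape (N, T).

    Assumed rule: lambda[t, b] is the (1 - b / (T - t)) quantile of the
    defects observed at steps t, ..., T-1, so that about b of the
    remaining steps are expected to clear the threshold.
    """
    traces = np.asarray(defect_traces, dtype=float)
    T = traces.shape[1]
    lam = np.full((T, budget + 1), np.inf)  # b = 0: never correct
    for t in range(T):
        remaining = traces[:, t:].ravel()
        for b in range(1, budget + 1):
            q = max(0.0, 1.0 - b / (T - t))
            lam[t, b] = np.quantile(remaining, q)
    return lam

# Synthetic calibration data, for illustration only.
rng = np.random.default_rng(0)
calibration = rng.exponential(size=(50, 12))
lam = calibrate_thresholds(calibration, budget=3)
```

Under this rule the threshold falls as the remaining budget grows, and drops to the smallest observed defect once the budget covers every remaining step, which mirrors the qualitative behavior described above.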
### 3.5 Algorithm

Algorithm 1: Online Adaptive Correction Scheduling

1. Input: budget $B$, thresholds $\{\lambda_{t,b}\}_{t=0,\,b=0}^{T-1,\,B}$
2. $b \leftarrow B$
3. for $t = 0, \ldots, T-1$ do
4. Propose $\tilde{x}_{t+1} \leftarrow \Phi_{h}(x_{t})$
5. Compute defect $s_{t} \leftarrow d(\tilde{x}_{t+1}, \mathcal{M})$
6. if $b > 0$ and $s_{t} \geq \lambda_{t,b}$ then
7. $x_{t+1} \leftarrow \Pi(\tilde{x}_{t+1})$
8. $b \leftarrow b - 1$
9. else
10. $x_{t+1} \leftarrow \tilde{x}_{t+1}$
11. end if
12. end for
Alg. [1](https://arxiv.org/html/2605.11214#alg1) summarizes the online budgeted scheduler. The policy observes only the proposed update, its defect, the current time, and the remaining correction budget. It does not require gradients through projection, retraining the generative model, or modifying the update rule. Its overhead is the cost of evaluating the defect and comparing it to the precomputed threshold $\lambda_{t,b}$.

Since the policy conditions on remaining budget, it avoids the main failure mode of fixed thresholding: spending corrections too early on moderate defects and exhausting the budget before larger excursions occur. The scheduler is therefore adaptive in two senses: it responds to the realized geometry of the rollout through $s_{t}$, and it adapts its spending rule to the remaining budget through $\lambda_{t,b}$.
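The online scheduler can be sketched in a few lines of Python. This is an illustrative transcription under stated assumptions, not the authors' implementation: `step`, `project`, and `defect` are caller-supplied callables standing in for $\Phi_h$, $\Pi$, and $d(\cdot, \mathcal{M})$, and the toy usage (radial push on the unit circle with a constant threshold) is invented for the example.

```python
import numpy as np

def adaptive_rollout(x0, T, budget, thresholds, step, project, defect):
    """Online budgeted correction: project when b > 0 and s_t >= lambda[t, b]."""
    x = np.asarray(x0, dtype=float)
    b = budget
    path, events = [x], []
    for t in range(T):
        proposal = step(x)               # propose the uncorrected update
        s = defect(proposal)             # one-step defect s_t
        if b > 0 and s >= thresholds[t, b]:
            x = project(proposal)        # spend one unit of budget
            b -= 1
            events.append(t)
        else:
            x = proposal                 # keep the uncorrected proposal
        path.append(x)
    return np.stack(path), events

# Toy usage (all choices below are assumptions for illustration).
T, B = 10, 3
lam = np.full((T, B + 1), 0.3)  # constant threshold for b >= 1
lam[:, 0] = np.inf              # no budget left: never correct
path, events = adaptive_rollout(
    x0=[1.0, 0.0], T=T, budget=B, thresholds=lam,
    step=lambda x: 1.2 * x,                          # push away from the circle
    project=lambda x: x / np.linalg.norm(x),         # back onto the unit circle
    defect=lambda x: abs(np.linalg.norm(x) - 1.0),   # distance to the circle
)
```

In this toy run the defect crosses the threshold on every second step, so the three corrections are spent at steps 1, 3, and 5, after which the budget is exhausted and the rollout drifts uncorrected. A constant threshold is exactly the fixed-threshold failure mode discussed above; a $(t, b)$-dependent surface is meant to avoid it.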
## 4 Experiments
We evaluate adaptive correction scheduling across manifold\-valued and trajectory diffusion settings\. The experiments ask: \(i\) how projection timing affects trajectory error, \(ii\) whether online allocation improves performance at fixed correction budget, and \(iii\) whether defect concentration explains these gains\.
### 4.1 Projection timing induces persistent trajectory error

Figure 3: *Adaptive scheduling improves the cost–accuracy frontier across controlled manifolds.* We plot Normalized Excess Path Error (NEPE), where stepwise correction is $0$ and terminal correction is $1$; lower is better. At matched projection budget $B/T$, adaptive correction consistently lies below periodic correction, showing that *where* projections are applied matters more than simply applying them at a fixed frequency. The largest gains occur in the impulse, lever, and ridge variants, where defect is most concentrated.

We begin by isolating the effect of correction timing. Fig. [1](https://arxiv.org/html/2605.11214#S0.F1) visualizes representative rollouts under terminal, periodic, and adaptive schedules. Although terminal correction restores feasibility at the final step, the resulting trajectory deviates substantially from the constrained rollout.
This effect is quantified in Tab. [1](https://arxiv.org/html/2605.11214#S4.T1), which measures endpoint distance to the stepwise reference as a function of correction budget. Even at low defect levels, delayed correction induces a persistent bias: the rollout evolves off-manifold, and subsequent updates compound this deviation. Final projection enforces feasibility but does not recover the original trajectory. An accompanying figure for the endpoint distances reported in Tab. [1](https://arxiv.org/html/2605.11214#S4.T1) can be found in App. [E.7](https://arxiv.org/html/2605.11214#A5.SS7).
These results establish that constraint enforcement cannot be understood purely as a final projection step\. Projection timing affects the entire rollout, and the resulting error grows continuously with defect\.
### 4.2 Scheduling dominates frequency at fixed budget

Table 1: *Fixed-budget synthetic summary at $B/T \approx 0.25$.* Normalized Excess Path Error (NEPE) is normalized between stepwise correction ($0$) and terminal correction ($1$); lower is better. All methods are evaluated at matched projection budget, and entries report mean $\pm$ SE over paired seeds. Adaptive scheduling consistently improves NEPE, with especially large gains in the concentrated-defect variants; endpoint improvements are geometry-dependent, reflecting that pathwise consistency and final endpoint displacement are related but not identical.

Fig. [3](https://arxiv.org/html/2605.11214#S4.F3) shows the main cost–accuracy frontier. Across all six controlled domains, adaptive correction achieves lower NEPE than periodic correction at matched projection budget, with the largest gains in the impulse, lever, and ridge variants. Tab. [1](https://arxiv.org/html/2605.11214#S4.T1) reports the fixed-budget slice at $B/T \approx 0.25$: adaptive consistently improves pathwise fidelity, while endpoint distance is geometry-dependent. This distinction is expected, as pathwise consistency measures whether the rollout followed the constrained dynamics while endpoint displacement also depends on how each geometry maps trajectory errors into final state error.
Paired win-rate diagnostics in App. [E.7](https://arxiv.org/html/2605.11214#A5.SS7) confirm this trend: adaptive wins on the pathwise metric in 85–90% of matched synthetic comparisons, showing that at fixed budget, *where* corrections are applied is more important than *how often* they are applied.
### 4.3 Adaptive schedules track defect concentration

Figure 4: *Adaptive scheduling recovers more of the original PDM sampler at every projection budget.* We keep the PDM model, constraint set, and projection operator fixed, and vary only projection timing. Normalized Excess Path Error (NEPE) is measured relative to PDM (stepwise), which projects after every inner Langevin update and has NEPE $0$; terminal correction has NEPE $1$. Across budgets, adaptive scheduling remains below periodic correction, showing that defect-aware projection timing is a better use of the same projection budget.

To understand why scheduling matters, we examine how defect is distributed along the rollout. Fig. [2](https://arxiv.org/html/2605.11214#S3.F2) visualizes defect over time together with correction events for different policies. Defect is highly non-uniform: large excursions are concentrated in specific regions of the trajectory, while many steps incur negligible deviation.
Periodic schedules allocate corrections uniformly, ignoring this structure. In contrast, adaptive schedules concentrate corrections on high-defect regions, avoiding unnecessary projections elsewhere. This behavior is consistent with the budgeted formulation in Sec. [3](https://arxiv.org/html/2605.11214#S3): defect estimates marginal value, while the $(t,b)$-dependent threshold determines whether that value justifies spending one of the remaining corrections.
This mechanism explains the improvements observed in Fig. [3](https://arxiv.org/html/2605.11214#S4.F3) and Tab. [1](https://arxiv.org/html/2605.11214#S4.T1). By allocating corrections to the small subset of steps that dominate trajectory distortion, adaptive schedules approximate the optimal budget allocation.
### 4.4 Generalization to diffusion and planning domains

Figure 5: *Projection timing changes constrained diffusion trajectories.* We wrap the Projected Diffusion Models (PDM) trajectory sampler and vary only when its projection operator is applied. Terminal correction delays projection until the end, producing paths that visibly collide with obstacles before being corrected. Periodic correction uses the same projection budget as adaptive but spends it uniformly, leaving avoidable detours and collisions. Adaptive correction spends projections only at high-defect steps, yielding a trajectory closer to PDM (stepwise), which projects after every inner Langevin update. Raster ticks denote projection calls.

Table 2: *Scheduling projection inside Projected Diffusion Models at fixed budget.* Original PDM projects after every inner Langevin update and defines Normalized Excess Path Error (NEPE) $= 0$; terminal correction defines NEPE $= 1$. At $25\%$ of the projection cost, adaptive scheduling recovers $71.2\%$ of the pathwise benefit of original PDM, compared with $51.4\%$ for periodic correction. Thus, adaptive timing gives a $41\%$ reduction in excess pathwise error over periodic correction at the same budget.

Finally, we test whether the same scheduling principle applies inside an existing constrained diffusion sampler. Projected Diffusion Models (PDM) enforce constraints by applying projection throughout sampling (Christopher et al., [2024](https://arxiv.org/html/2605.11214#bib.bib19); Liang et al., [2025](https://arxiv.org/html/2605.11214#bib.bib13)). In our terminology, the original PDM sampler is the stepwise baseline, projecting after every inner Langevin update. We keep the trained model, constraint set, and projection operator fixed, and vary only the timing of projection. Rather than introducing a new sampler, we wrap an existing projected diffusion method and replace its all-step projection rule with terminal, periodic, and online budgeted schedules.
We measure normalized excess pathwise error (NEPE) relative to the original PDM sampler, so original PDM has NEPE $0$ and terminal correction has NEPE $1$. Lower values therefore indicate closer agreement with the fully projected PDM trajectory. A complete definition of NEPE appears in App. [D.3](https://arxiv.org/html/2605.11214#A4.SS3).
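For orientation, the simplest normalization consistent with these two anchor values is a linear rescaling. This is an assumed form, written with a generic pathwise error $E(\sigma)$ for schedule $\sigma$; the authoritative definition is the one in App. D.3, which is not reproduced here.

$$\mathrm{NEPE}(\sigma) = \frac{E(\sigma) - E_{\mathrm{stepwise}}}{E_{\mathrm{terminal}} - E_{\mathrm{stepwise}}}, \qquad \mathrm{NEPE}(\sigma_{\mathrm{stepwise}}) = 0, \quad \mathrm{NEPE}(\sigma_{\mathrm{terminal}}) = 1.$$

Under this reading, $1 - \mathrm{NEPE}(\sigma)$ is the fraction of the stepwise benefit that schedule $\sigma$ recovers relative to terminal correction.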
Fig. [5](https://arxiv.org/html/2605.11214#S4.F5) visualizes representative trajectories under each projection schedule. Terminal correction satisfies the constraint only after the trajectory has already drifted, while periodic correction improves stability by spending projections uniformly. The adaptive scheduler instead allocates the same projection budget according to the realized defect profile, producing trajectories that remain visibly closer to the original PDM sampler while using only sparse projection events.
Fig. [4](https://arxiv.org/html/2605.11214#S4.F4) quantifies this behavior across projection budgets. At every tested budget, adaptive scheduling recovers more of the original PDM behavior than periodic correction. The fixed-budget summary in Tab. [2](https://arxiv.org/html/2605.11214#S4.T2) shows that at $B/T \approx 0.25$, periodic correction achieves NEPE $0.486 \pm 0.009$, while adaptive scheduling reduces this to $0.288 \pm 0.009$, a $41\%$ reduction in excess pathwise error at the same projection budget. Since original PDM projects after every inner Langevin update, this policy recovers $71.2\%$ of PDM's pathwise benefit at $25\%$ of the projection cost.
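The reported percentages follow directly from the two NEPE means. The short check below reproduces the arithmetic, assuming that "recovered benefit" is read as $1 - \mathrm{NEPE}$ (which matches the figures quoted for Tab. 2); the variable names are illustrative.

```python
# NEPE means reported at B/T ≈ 0.25 (standard errors omitted).
nepe_periodic = 0.486
nepe_adaptive = 0.288

# Assumed reading: recovered benefit = 1 - NEPE.
recovered_periodic = 1.0 - nepe_periodic
recovered_adaptive = 1.0 - nepe_adaptive

# Relative reduction in excess pathwise error, adaptive vs. periodic.
relative_reduction = (nepe_periodic - nepe_adaptive) / nepe_periodic

print(f"{recovered_periodic:.1%} {recovered_adaptive:.1%} {relative_reduction:.0%}")
# prints "51.4% 71.2% 41%"
```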
Thus, even when the projection operator and diffusion model are inherited from a prior constrained diffusion method, projection timing remains a meaningful algorithmic degree of freedom\.
## 5 Discussion
Projection is usually treated as a feasibility operation: run the sampler, then force the result back onto the constraint set\. Our results suggest a different view: in state\-dependent generative rollouts, projection is an intervention in the dynamics, and delaying it changes the states from which future updates are computed\. Applying it everywhere avoids this drift, but spends correction effort indiscriminately\. Correction timing is therefore a budgeted control decision rather than a binary choice between terminal and stepwise projection\.
At matched projection budgets, adaptive scheduling improves pathwise fidelity over periodic correction by spending projections on high\-defect steps rather than uniformly in time\. The gains are largest when defect is concentrated, and smaller when violations are diffuse, which is exactly the regime predicted by the scheduling view\. In the PDM experiment, the same idea improves a learned projected diffusion sampler without changing its model, constraints, or projection operator\.
### 5.1 Limitations and future work
Our scheduler uses one\-step defect as a proxy for downstream trajectory error\. This is effective when local constraint violation identifies steps that will perturb future updates, but it can fail when defect is poorly aligned with trajectory distortion or when small violations are dynamically amplified\. The method also assumes access to a stable projection operator\. If projection is cheap, stepwise correction may be preferable, and if projection is unstable or approximate, both the defect signal and corrected rollout may degrade\. Finally, our learned diffusion experiment isolates projection timing by wrapping an existing PDM sampler, but larger models and higher\-dimensional constraints may introduce interactions or noise not captured here\.
Future work could learn correction value directly, add lookahead or uncertainty estimates, and train generative models with the scheduler in the loop\. More broadly, constrained generation should treat feasibility operations as trajectory\-level interventions: the path that produces a feasible sample matters\.
## References
- Constrained Discrete Diffusion\.arXiv\.Note:arXiv:2503\.09790 \[cs\]External Links:[Link](http://arxiv.org/abs/2503.09790),[Document](https://dx.doi.org/10.48550/arXiv.2503.09790)Cited by:[§2](https://arxiv.org/html/2605.11214#S2.p1.1)\.
- M\. Cardei, J\. K\. Christopher, B\. Kailkhura, T\. Hartvigsen, and F\. Fioretto \(2025b\)Constrained Molecular Generation with Discrete Diffusion for Drug Discovery\.\(en\)\.External Links:[Link](https://openreview.net/forum?id=YVbHExvm05)Cited by:[§1](https://arxiv.org/html/2605.11214#S1.p1.1),[§2](https://arxiv.org/html/2605.11214#S2.p1.1)\.
- C\. Cheng, B\. Han, D\. C\. Maddix, A\. F\. Ansari, A\. Stuart, M\. W\. Mahoney, and B\. Wang \(2024\)Gradient\-Free Generation for Hard\-Constrained Systems\.\(en\)\.External Links:[Link](https://openreview.net/forum?id=teE4pl9ftK)Cited by:[§2](https://arxiv.org/html/2605.11214#S2.p1.1)\.
- C\. Chi, Z\. Xu, S\. Feng, E\. Cousineau, Y\. Du, B\. Burchfiel, R\. Tedrake, and S\. Song \(2024\)Diffusion Policy: Visuomotor Policy Learning via Action Diffusion\.arXiv\.Note:arXiv:2303\.04137 \[cs\] version: 5External Links:[Link](http://arxiv.org/abs/2303.04137),[Document](https://dx.doi.org/10.48550/arXiv.2303.04137)Cited by:[§2](https://arxiv.org/html/2605.11214#S2.p1.1)\.
- J\. K\. Christopher, S\. Baek, and F\. Fioretto \(2024\)Constrained Synthesis with Projected Diffusion Models\.\(en\)\.External Links:[Link](https://openreview.net/forum?id=FsdB3I9Y24&referrer=%5Bthe%20profile%20of%20Ferdinando%20Fioretto%5D(%2Fprofile%3Fid%3D%CB%9CFerdinando_Fioretto1))Cited by:[§F\.1](https://arxiv.org/html/2605.11214#A6.SS1.p1.1),[§1](https://arxiv.org/html/2605.11214#S1.p1.1),[§2](https://arxiv.org/html/2605.11214#S2.p1.1),[§2](https://arxiv.org/html/2605.11214#S2.p3.1),[§4\.4](https://arxiv.org/html/2605.11214#S4.SS4.p1.2)\.
- E\. Hairer \(2001\)Geometric Integration of Ordinary Differential Equations on Manifolds\.BIT Numerical Mathematics41\(5\),pp\. 996–1007\(en\)\.External Links:ISSN 1572\-9125,[Link](https://doi.org/10.1023/A:1021989212020),[Document](https://dx.doi.org/10.1023/A%3A1021989212020)Cited by:[§2](https://arxiv.org/html/2605.11214#S2.p3.1)\.
- M\. Hoseinpour and V\. Dvorkin \(2025\)Constrained Diffusion Models for Synthesizing Representative Power Flow Datasets\.arXiv\.Note:arXiv:2506\.11281 \[cs\]External Links:[Link](http://arxiv.org/abs/2506.11281),[Document](https://dx.doi.org/10.48550/arXiv.2506.11281)Cited by:[§2](https://arxiv.org/html/2605.11214#S2.p1.1)\.
- M\. Janner, Y\. Du, J\. B\. Tenenbaum, and S\. Levine \(2022\)Planning with Diffusion for Flexible Behavior Synthesis\.arXiv\.Note:arXiv:2205\.09991 \[cs\]External Links:[Link](http://arxiv.org/abs/2205.09991),[Document](https://dx.doi.org/10.48550/arXiv.2205.09991)Cited by:[§2](https://arxiv.org/html/2605.11214#S2.p1.1)\.
- K\. Kim and S\. Kim \(2026\)Model Already Knows the Best Noise: Bayesian Active Noise Selection via Attention in Video Diffusion Model\.arXiv\.Note:arXiv:2505\.17561 \[cs\]External Links:[Link](http://arxiv.org/abs/2505.17561),[Document](https://dx.doi.org/10.48550/arXiv.2505.17561)Cited by:[§2](https://arxiv.org/html/2605.11214#S2.p4.1)\.
- A\. Li, Z\. Ding, A\. B\. Dieng, and R\. Beeson \(2024\)DiffuSolve: Diffusion\-based Solver for Non\-convex Trajectory Optimization\.arXiv\.Note:arXiv:2403\.05571 \[cs\] version: 4External Links:[Link](http://arxiv.org/abs/2403.05571),[Document](https://dx.doi.org/10.48550/arXiv.2403.05571)Cited by:[§1](https://arxiv.org/html/2605.11214#S1.p1.1),[§2](https://arxiv.org/html/2605.11214#S2.p1.1)\.
- J\. Liang, J\. K\. Christopher, S\. Koenig, and F\. Fioretto \(2025\)Simultaneous Multi\-Robot Motion Planning with Projected Diffusion Models\.arXiv\.Note:arXiv:2502\.03607 \[cs\]External Links:[Link](http://arxiv.org/abs/2502.03607),[Document](https://dx.doi.org/10.48550/arXiv.2502.03607)Cited by:[§F\.1](https://arxiv.org/html/2605.11214#A6.SS1.p1.1),[§1](https://arxiv.org/html/2605.11214#S1.p1.1),[§2](https://arxiv.org/html/2605.11214#S2.p1.1),[§4\.4](https://arxiv.org/html/2605.11214#S4.SS4.p1.2)\.
- M\. B\. Liñán and D\. M\. d\. Diego \(2023\)Retraction maps: a seed of geometric integrators\.Foundations of Computational Mathematics23\(4\),pp\. 1335–1380\.Note:arXiv:2106\.00607 \[math\]External Links:ISSN 1615\-3375, 1615\-3383,[Link](http://arxiv.org/abs/2106.00607),[Document](https://dx.doi.org/10.1007/s10208-022-09571-x)Cited by:[§2](https://arxiv.org/html/2605.11214#S2.p3.1)\.
- R\. I\. McLachlan, K\. Modin, O\. Verdier, and M\. Wilkins \(2014\)Geometric Generalisations of SHAKE and RATTLE\.Foundations of Computational Mathematics14\(2\),pp\. 339–370\.Note:arXiv:1207\.3367 \[math\]External Links:ISSN 1615\-3375, 1615\-3383,[Link](http://arxiv.org/abs/1207.3367),[Document](https://dx.doi.org/10.1007/s10208-013-9163-y)Cited by:[§2](https://arxiv.org/html/2605.11214#S2.p3.1)\.
- R\. Ni and A\. H\. Qureshi \(2024\)Physics\-informed Neural Motion Planning on Constraint Manifolds\.arXiv\.Note:arXiv:2403\.05765 \[cs\]External Links:[Link](http://arxiv.org/abs/2403.05765),[Document](https://dx.doi.org/10.48550/arXiv.2403.05765)Cited by:[§2](https://arxiv.org/html/2605.11214#S2.p1.1)\.
- J\. Pei, H\. Hu, and S\. Gu \(2025\)Optimal Stepsize for Diffusion Sampling\.arXiv\.Note:arXiv:2503\.21774 \[cs\] version: 1External Links:[Link](http://arxiv.org/abs/2503.21774),[Document](https://dx.doi.org/10.48550/arXiv.2503.21774)Cited by:[§2](https://arxiv.org/html/2605.11214#S2.p4.1)\.
- R\. D\. Santi, K\. Protopapas, Y\. Hsieh, and A\. Krause \(2025\)Verifier\-Constrained Flow Expansion for Discovery Beyond the Data\.\(en\)\.External Links:[Link](https://openreview.net/forum?id=IfDYQbsWf4)Cited by:[§1](https://arxiv.org/html/2605.11214#S1.p1.1),[§2](https://arxiv.org/html/2605.11214#S2.p1.1)\.
- A\. Séguin, G\. Ceruti, and D\. Kressner \(2024\)From low\-rank retractions to dynamical low\-rank approximation and back\.BIT Numerical Mathematics64\(3\),pp\. 25\(en\)\.External Links:ISSN 1572\-9125,[Link](https://doi.org/10.1007/s10543-024-01028-7),[Document](https://dx.doi.org/10.1007/s10543-024-01028-7)Cited by:[§2](https://arxiv.org/html/2605.11214#S2.p3.1)\.
- U\. Utkarsh, P\. Cai, A\. Edelman, R\. Gomez\-Bombarelli, and C\. V\. Rackauckas \(2025\)Physics\-Constrained Flow Matching: Sampling Generative Models with Hard Constraints\.arXiv\.Note:arXiv:2506\.04171 \[cs\]External Links:[Link](http://arxiv.org/abs/2506.04171),[Document](https://dx.doi.org/10.48550/arXiv.2506.04171)Cited by:[§2](https://arxiv.org/html/2605.11214#S2.p1.1)\.
- W\. Xiao, T\. Wang, C\. Gan, R\. Hasani, M\. Lechner, and D\. Rus \(2024\)SafeDiffuser: Safe Planning with Diffusion Probabilistic Models\.\(en\)\.External Links:[Link](https://openreview.net/forum?id=ig2wk7kK9J)Cited by:[§2](https://arxiv.org/html/2605.11214#S2.p1.1)\.
- Z\. Yang, X\. Dai, D\. Yu, Z\. Li, M\. Khadiv, S\. Hirche, and S\. Haddadin \(2026\)UniConFlow: A Unified Constrained Flow\-Matching Framework for Certified Motion Planning\.arXiv\.Note:arXiv:2506\.02955 \[cs\] version: 2External Links:[Link](http://arxiv.org/abs/2506.02955),[Document](https://dx.doi.org/10.48550/arXiv.2506.02955)Cited by:[§2](https://arxiv.org/html/2605.11214#S2.p1.1)\.
- Z\. Ye, Z\. Chen, T\. Li, Z\. Huang, W\. Luo, and G\. Qi \(2025\)Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation\.arXiv\.Note:arXiv:2412\.01243 \[cs\] version: 3External Links:[Link](http://arxiv.org/abs/2412.01243),[Document](https://dx.doi.org/10.48550/arXiv.2412.01243)Cited by:[§2](https://arxiv.org/html/2605.11214#S2.p4.1)\.
- H\. Zhang, Z\. Wu, Z\. Xing, J\. Shao, and Y\. Jiang \(2024\)AdaDiff: Adaptive Step Selection for Fast Diffusion Models\.arXiv\.Note:arXiv:2311\.14768 \[cs\] version: 2External Links:[Link](http://arxiv.org/abs/2311.14768),[Document](https://dx.doi.org/10.48550/arXiv.2311.14768)Cited by:[§2](https://arxiv.org/html/2605.11214#S2.p4.1)\.
- J\. Zhang, L\. Zhao, A\. Papachristodoulou, and J\. Umenberger \(2025\)Constrained Diffusers for Safe Planning and Control\.arXiv\.Note:arXiv:2506\.12544 \[eess\]External Links:[Link](http://arxiv.org/abs/2506.12544),[Document](https://dx.doi.org/10.48550/arXiv.2506.12544)Cited by:[§1](https://arxiv.org/html/2605.11214#S1.p1.1),[§2](https://arxiv.org/html/2605.11214#S2.p1.1)\.
## Appendix A Overview
This appendix provides additional details for the formulation, implementation, metrics, and experiments in the main paper. Sec. [B](https://arxiv.org/html/2605.11214#A2) fixes notation and describes all correction schedules, Sec. [C](https://arxiv.org/html/2605.11214#A3) gives the full online budgeted scheduler and threshold construction, Sec. [D](https://arxiv.org/html/2605.11214#A4) defines all reported metrics, Sec. [E](https://arxiv.org/html/2605.11214#A5) describes the controlled manifold experiments, Sec. [F](https://arxiv.org/html/2605.11214#A6) details the Projected Diffusion Models experiment, Sec. [G](https://arxiv.org/html/2605.11214#A7) gives implementation details, Sec. [H](https://arxiv.org/html/2605.11214#A8) gives reproducibility notes, and Sec. [I](https://arxiv.org/html/2605.11214#A9) gives additional diagnostics and failure modes.
## Appendix B Notation and Correction Schedules
#### Generative rollout\.
We consider a generative rollout in an ambient space $\mathcal{X} \subseteq \mathbb{R}^{d}$ with a constraint set $\mathcal{M} \subset \mathcal{X}$. A rollout is generated by repeatedly applying an update operator
$$\tilde{x}_{t+1} = \Phi_{h}(x_{t}), \qquad t = 0, \ldots, T-1,$$
where $h$ denotes the step size or sampler discretization parameter. In diffusion models, $\Phi_{h}$ may be one reverse diffusion update, one denoising update, or one inner Langevin update, depending on the sampler. In all experiments, the scheduling horizon $T$ counts the number of update locations at which projection could be applied.
#### Projection or retraction\.
We assume access to a correction map
$$\Pi : \mathcal{X} \to \mathcal{M},$$
which maps a proposed state back to the constraint set. We use "projection" broadly: $\Pi$ may be an exact Euclidean projection, a retraction, a constraint solver, or the projection operator inherited from a projected diffusion method. The scheduler does not require gradients through $\Pi$.
#### Defect\.
For a proposed update $\tilde{x}_{t+1}$, the one-step defect is
$$s_{t} = d(\tilde{x}_{t+1}, \mathcal{M}),$$
where $d$ is a domain-specific constraint violation or distance-to-feasibility score. When a closed-form defect is unavailable, we use the projection residual
$$d(x, \mathcal{M}) = \|x - \Pi(x)\|,$$
with the norm chosen to match the state representation.
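The projection residual can be illustrated with a minimal Python sketch. The unit-circle constraint set and the function names `project_to_unit_circle` and `projection_residual` are assumptions chosen for illustration; they are not taken from the paper's experiments.

```python
import math

def project_to_unit_circle(x):
    """Hypothetical correction map Pi: radial projection of a 2-D point onto the unit circle."""
    norm = math.hypot(x[0], x[1])
    if norm == 0.0:
        # The projection is not unique at the origin; pick an arbitrary feasible point.
        return (1.0, 0.0)
    return (x[0] / norm, x[1] / norm)

def projection_residual(x, project):
    """Defect d(x, M) = ||x - Pi(x)||, here with the Euclidean norm."""
    px = project(x)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, px)))

# The point (3, 4) has norm 5, so it lies at distance 4 from the unit circle.
defect = projection_residual((3.0, 4.0), project_to_unit_circle)
```

Note that this defect costs one call to the projection, so a closed-form violation score is cheaper when one is available.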
#### Correction schedule\.
A correction schedule is a binary policy
$$\sigma : \{0, \ldots, T-1\} \to \{0, 1\},$$
where $\sigma(t) = 1$ means projection is applied after update $t$. The corrected rollout is
$$x_{t+1} = \begin{cases} \Pi(\tilde{x}_{t+1}) & \text{if } \sigma(t) = 1, \\ \tilde{x}_{t+1} & \text{if } \sigma(t) = 0. \end{cases}$$
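As a rough sketch, the corrected rollout is a loop that applies the correction map only where the schedule says so. The scalar toy dynamics, the clipping projection, and the name `corrected_rollout` below are hypothetical and serve only to show the mechanics.

```python
def corrected_rollout(x0, update, project, schedule):
    """Apply x_{t+1} = Pi(update(x_t)) when schedule[t] == 1, else x_{t+1} = update(x_t)."""
    x = x0
    states = [x0]
    for sigma_t in schedule:
        proposal = update(x)
        x = project(proposal) if sigma_t == 1 else proposal
        states.append(x)
    return states

# Toy example (assumed, not from the paper): feasible set is the interval [-1, 1].
clip = lambda x: max(-1.0, min(1.0, x))
grow = lambda x: 1.5 * x

# Project only after the second update; the third update then starts from a feasible state.
states = corrected_rollout(1.0, grow, clip, [0, 1, 0])
```

Because each update reads the current state, the choice of which steps to project changes the inputs to all later updates, not only the feasibility of the visited states.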
#### Standard schedules\.
The main paper compares four correction schedules:
- Terminal correction: no intermediate projection is applied, and projection is applied only at the end of the rollout.
- Stepwise correction: projection is applied after every update.
- Periodic correction: projection is applied uniformly in time subject to a fixed projection budget.
- Adaptive budgeted correction: projection is applied online according to defect, time, and remaining budget.
#### Projection budget\.
For a rollout of length $T$, a budget $B$ permits at most $B$ projection calls. We report budget as the fraction
$$B/T.$$
Periodic and adaptive schedules are compared at matched $B/T$. Stepwise correction has $B/T = 1$. Terminal correction has only a final correction; when normalized against stepwise cost, its intermediate projection budget is effectively zero.
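One way to build a periodic schedule at a given budget is sketched below. The text only states that projections are spread uniformly in time, so the exact placement rule here (one projection at the end of each of $B$ equal-width windows) is an assumption, as is the name `periodic_schedule`.

```python
def periodic_schedule(T, B):
    """Binary schedule with exactly min(B, T) projections spread evenly over T steps."""
    B = max(0, min(B, T))
    schedule = [0] * T
    for k in range(B):
        # Place the k-th projection at the last step of the k-th of B equal-width windows.
        t = ((k + 1) * T) // B - 1
        schedule[t] = 1
    return schedule

# Budget fraction B/T = 0.25 on a horizon of 20 steps gives 5 evenly spaced projections.
example = periodic_schedule(20, 5)
```

With this placement rule the final step is always projected when $B \geq 1$, and $B = T$ reduces to stepwise correction.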
## Appendix C Online Budgeted Scheduling
### C.1 Ideal budgeted allocation
The scheduling problem can be written as a finite-budget allocation problem. Let $V_{t}(x_{t})$ denote the downstream value of correcting at step $t$, i.e. the expected reduction in future trajectory error if projection is applied at the current proposal. The ideal schedule solves
$$\max_{\sigma} \sum_{t=0}^{T-1} \sigma(t) V_{t}(x_{t}) \qquad \text{s.t.} \qquad \sum_{t=0}^{T-1} \sigma(t) \leq B,$$
where $\sigma(t) \in \{0, 1\}$.
This formulation is conceptually useful but intractable in practice, as estimating $V_{t}(x_{t})$ requires knowing how a correction changes future stochastic updates, which depends on the remaining sampler trajectory. We therefore approximate marginal value using the observable one-step defect $s_{t}$.
### C.2 Budget-aware thresholds
The online budgeted scheduler uses thresholds indexed by time and remaining budget:
$$\lambda_{t,b}, \qquad t = 0, \ldots, T-1, \quad b = 0, \ldots, B.$$
The threshold $\lambda_{t,b}$ is interpreted as the marginal price of spending one correction at time $t$ with $b$ corrections remaining. Given a proposed update $\tilde{x}_{t+1}$ and defect $s_{t}$, the policy is
$$x_{t+1} = \begin{cases} \Pi(\tilde{x}_{t+1}) & \text{if } b_{t} > 0 \text{ and } s_{t} \geq \lambda_{t,b_{t}}, \\ \tilde{x}_{t+1} & \text{otherwise.} \end{cases}$$
If projection is applied, then $b_{t+1} = b_{t} - 1$; otherwise $b_{t+1} = b_{t}$.
The dependence on both $t$ and $b$ is important. Early in the rollout, a moderate defect may not justify spending a scarce projection if many future opportunities remain. Late in the rollout, the same defect may be worth correcting because unused budget has less future value. The threshold surface $\lambda_{t,b}$ captures this online tradeoff.
### C.3 Empirical threshold construction
We estimate $\lambda_{t,b}$ on held-out calibration rollouts disjoint from evaluation seeds. The calibration procedure collects defect traces from rollouts without intermediate adaptive correction. Let
$$\mathcal{S}_{t} = \{s_{i,t}\}_{i=1}^{N_{\mathrm{cal}}}$$
denote the empirical distribution of defects at time $t$ across calibration rollouts. A simple budget-aware threshold can be constructed by considering the future defect pool
$$\mathcal{S}_{\geq t} = \{s_{i,u} : i = 1, \ldots, N_{\mathrm{cal}},\ u = t, \ldots, T-1\}.$$
For remaining budget $b$, we set
$$\lambda_{t,b} = Q_{1 - b/(T-t)}(\mathcal{S}_{\geq t}),$$
where $Q_{q}$ is the empirical $q$-quantile. The convention is:
$$\lambda_{t,0} = +\infty, \qquad \lambda_{t,b} = -\infty \text{ if } b \geq T - t.$$
Thus, when no budget remains, correction is impossible; when enough budget remains to correct every future step, the scheduler corrects all remaining proposals.
This rule estimates the defect level above which a future proposal belongs to the top $b$ remaining events. It is online, model-agnostic, and uses only held-out defect statistics.
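The threshold table can be sketched as follows, assuming all calibration traces share the same length $T$. The quantile convention is an assumption: the sketch takes $Q_{1-b/(T-t)}$ to be the $(N_{\mathrm{cal}} \cdot b)$-th largest pooled defect, which matches the "top $b$ remaining events" reading under the $s_t \geq \lambda_{t,b}$ rule; an interpolating quantile would give slightly different values. The name `build_thresholds` is hypothetical.

```python
import math

def build_thresholds(defect_traces, B):
    """Return lam[t][b] for t in 0..T-1 and b in 0..B from equal-length calibration traces."""
    n_cal = len(defect_traces)
    T = len(defect_traces[0])
    lam = []
    for t in range(T):
        # Future defect pool: all calibration defects at steps t, ..., T-1, sorted ascending.
        pool = sorted(trace[u] for trace in defect_traces for u in range(t, T))
        remaining = T - t
        row = []
        for b in range(B + 1):
            if b == 0:
                row.append(math.inf)       # no budget left: never project
            elif b >= remaining:
                row.append(-math.inf)      # enough budget for every remaining step
            else:
                top_count = n_cal * b      # size of the top b/(T-t) fraction of the pool
                row.append(pool[len(pool) - top_count])
        lam.append(row)
    return lam

# One calibration trace with increasing defect, budget B = 2.
lam = build_thresholds([[1.0, 2.0, 3.0, 4.0]], B=2)
```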
### C.4 Budget compliance
Adaptive schedules are evaluated against periodic schedules with the same nominal budget $B/T$. In implementations where the threshold rule may underspend, we record the achieved budget
$$\widehat{B}/T = \frac{1}{T} \sum_{t=0}^{T-1} \sigma(t).$$
Main results use settings in which adaptive and periodic budgets are matched up to numerical tolerance. Tables report the target budget, and logs record achieved budget means and standard errors.
### C.5 Algorithm
Algorithm 2: Online budgeted adaptive correction
1. Require: rollout length $T$, budget $B$, thresholds $\{\lambda_{t,b}\}$, update rule $\Phi_{h}$, projection $\Pi$, defect $d$
2. $b \leftarrow B$
3. for $t = 0, \ldots, T-1$ do
4. $\quad \tilde{x}_{t+1} \leftarrow \Phi_{h}(x_{t})$
5. $\quad s_{t} \leftarrow d(\tilde{x}_{t+1}, \mathcal{M})$
6. $\quad$ if $b > 0$ and $s_{t} \geq \lambda_{t,b}$ then
7. $\qquad x_{t+1} \leftarrow \Pi(\tilde{x}_{t+1})$
8. $\qquad b \leftarrow b - 1$
9. $\quad$ else
10. $\qquad x_{t+1} \leftarrow \tilde{x}_{t+1}$
11. $\quad$ end if
12. end for
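Algorithm 2 translates almost line for line into Python. The sketch below is an illustration under assumptions: `thresholds[t][b]` is a precomputed table, and the scalar toy problem (clipping to $[-1, 1]$, constant drift, flat threshold) is hypothetical and unrelated to the paper's domains.

```python
import math

def adaptive_rollout(x0, T, B, thresholds, update, project, defect):
    """Online budgeted correction: project when budget remains and s_t >= thresholds[t][b]."""
    x, b = x0, B
    states, schedule = [x0], []
    for t in range(T):
        proposal = update(x)
        s_t = defect(proposal)
        if b > 0 and s_t >= thresholds[t][b]:
            x = project(proposal)
            b -= 1
            schedule.append(1)
        else:
            x = proposal
            schedule.append(0)
        states.append(x)
    return states, schedule

# Toy usage (all assumed): scalar state, feasible set [-1, 1], drift of 0.25 per step.
clip = lambda x: max(-1.0, min(1.0, x))
violation = lambda x: max(0.0, abs(x) - 1.0)
T, B = 5, 2
# Flat threshold of 0.5 whenever budget remains; +inf when b == 0.
thresholds = [[math.inf] + [0.5] * B for _ in range(T)]
states, schedule = adaptive_rollout(1.0, T, B, thresholds, lambda x: x + 0.25, clip, violation)
```

In this toy run the scheduler skips the small violations and spends both projections on the steps where the defect reaches the threshold.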
## Appendix D Metrics
### D.1 Endpoint distance to stepwise
Endpoint distance measures how far a method's final sample deviates from the stepwise reference:
$$E_{\mathrm{end}}(\sigma) = \rho(x_{T}^{\sigma}, x_{T}^{\mathrm{step}}),$$
where $\rho$ is the domain-appropriate state distance. For Euclidean states, $\rho$ is the Euclidean norm, and for manifold-valued states, $\rho$ is the corresponding geodesic or product distance. In all cases, endpoint distance is computed between matched rollouts sharing the same initial condition and stochastic seed.
Endpoint distance isolates the distributional shift induced by delayed correction. A terminally projected sample may be feasible, but if it differs substantially from the stepwise endpoint, then final feasibility did not recover the constrained rollout.
### D.2 Pathwise error
We use pathwise error to measure trajectory-level deviation. In the synthetic experiments, the primary pathwise score is cumulative constraint defect:
$$E_{\mathrm{path}}(\sigma) = \sum_{t=1}^{T} d(x_{t}^{\sigma}, \mathcal{M}).$$
For comparisons to a stepwise reference trajectory, we also compute state-space pathwise deviation:
$$E_{\mathrm{state}}(\sigma) = \sum_{t=1}^{T} \rho(x_{t}^{\sigma}, x_{t}^{\mathrm{step}}).$$
The main paper reports normalized excess pathwise error using the pathwise score appropriate to each experiment.
### D.3 Normalized excess pathwise error
Normalized excess pathwise error (NEPE) is defined between stepwise correction and terminal correction:
$$\mathrm{NEPE}(\sigma) = \frac{E_{\mathrm{path}}(\sigma) - E_{\mathrm{path}}(\mathrm{stepwise})}{E_{\mathrm{path}}(\mathrm{terminal}) - E_{\mathrm{path}}(\mathrm{stepwise})}.$$
Thus stepwise correction has NEPE $=0$ and terminal correction has NEPE $=1$. Lower values are better. When the denominator is below a numerical tolerance, the rollout is marked degenerate and excluded from normalized summaries. Raw pathwise values are retained in logs.
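A minimal sketch of the pathwise score and its normalization follows. The tolerance value `1e-8` and the use of `None` to mark degenerate rollouts are assumptions; the paper does not state either.

```python
def pathwise_error(states, defect):
    """Cumulative defect over x_1, ..., x_T (states[0] is the initial state x_0)."""
    return sum(defect(x) for x in states[1:])

def nepe(e_sigma, e_stepwise, e_terminal, tol=1e-8):
    """Return NEPE, or None when the terminal-stepwise gap is below tol (degenerate rollout)."""
    gap = e_terminal - e_stepwise
    if gap < tol:
        return None
    return (e_sigma - e_stepwise) / gap

# A schedule whose pathwise error sits halfway between stepwise and terminal has NEPE 0.5.
halfway = nepe(e_sigma=6.0, e_stepwise=2.0, e_terminal=10.0)
```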
### D.4 Improvement percentages
For a metric $m$ where lower is better, the adaptive improvement over periodic is
$$\Delta_{m} = \frac{m_{\mathrm{periodic}} - m_{\mathrm{adaptive}}}{m_{\mathrm{periodic}}}.$$
Positive values indicate adaptive improvement. Negative values indicate periodic is better. Tables compute $\Delta_{m}$ from unrounded paired values.
For PDM, we also report the fraction of full correction benefit recovered:
$$\mathrm{Benefit}(\sigma) = 1 - \mathrm{NEPE}(\sigma).$$
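As a worked check, the two formulas can be applied to the rounded PDM means reported at $B/T \approx 0.25$ (periodic NEPE $0.486$, adaptive NEPE $0.288$). Since the tables use unrounded paired values, this only approximates the reported figures; the function names are illustrative.

```python
def relative_improvement(m_periodic, m_adaptive):
    """Delta_m = (m_periodic - m_adaptive) / m_periodic for a lower-is-better metric."""
    return (m_periodic - m_adaptive) / m_periodic

def benefit(nepe_value):
    """Fraction of the full (stepwise) correction benefit recovered."""
    return 1.0 - nepe_value

improvement = relative_improvement(0.486, 0.288)  # roughly 0.407, i.e. about 41%
adaptive_benefit = benefit(0.288)                 # 0.712, i.e. 71.2%
periodic_benefit = benefit(0.486)                 # 0.514, i.e. 51.4%
```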
### D.5 Win rates
Win rates compare adaptive and periodic schedules at matched domain, seed, and projection budget. For metric $m$, the adaptive win indicator is
$$\mathbf{1}[m_{\mathrm{adaptive}} < m_{\mathrm{periodic}}].$$
The reported win rate is the empirical mean of this indicator across paired comparisons. Uncertainty is reported as binomial standard error:
$$\mathrm{SE} = \sqrt{\frac{\hat{p}(1 - \hat{p})}{N}},$$
where $\hat{p}$ is the observed win rate and $N$ is the number of paired comparisons.
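The win rate and its binomial standard error can be computed from paired metric values as sketched below. The sample numbers are made up for illustration, and ties count as losses because the indicator uses a strict inequality.

```python
import math

def win_rate_with_se(adaptive_values, periodic_values):
    """Empirical win rate of adaptive over periodic and its binomial standard error."""
    wins = [1 if a < p else 0 for a, p in zip(adaptive_values, periodic_values)]
    n = len(wins)
    p_hat = sum(wins) / n
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat, se

# Four hypothetical paired comparisons; adaptive is lower (better) in three of them.
p_hat, se = win_rate_with_se([0.2, 0.3, 0.5, 0.1], [0.4, 0.2, 0.6, 0.3])
```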
## Appendix E Synthetic Manifold Experiments
### E.1 Overview
The synthetic experiments are designed to isolate correction timing under controlled geometry. Each domain specifies:
- a constraint set $\mathcal{M}$;
- an ambient update rule $\Phi_{h}$;
- a projection or retraction $\Pi$;
- a defect function $d(x, \mathcal{M})$;
- a distance $\rho$ used for endpoint and state-space errors.
We evaluate six domains:
$$\mathrm{SO}(3), \quad \mathrm{SE}(3), \quad \mathrm{Terrain}, \quad \mathrm{SO}(3)\text{-Impulse}, \quad \mathrm{SE}(3)\text{-Lever}, \quad \mathrm{Terrain}\text{-Ridge}.$$
The first three provide smooth baseline settings. The latter three introduce localized high-defect events, producing the heterogeneous defect profiles for which adaptive scheduling is designed.
### E.2 $\mathrm{SO}(3)$
States are represented as matrices in $\mathbb{R}^{3 \times 3}$ with the constraint set
$$\mathcal{M} = \mathrm{SO}(3) = \{R \in \mathbb{R}^{3 \times 3} : R^{\top} R = I,\ \det(R) = 1\}.$$
Projection is implemented by polar decomposition. Given an ambient matrix $A$, compute
$$A = U \Sigma V^{\top},$$
and set
$$\Pi(A) = U V^{\top},$$
with the determinant corrected if necessary. The defect is the orthogonality residual
$$d(A, \mathrm{SO}(3)) = \|A^{\top} A - I\|_{F} + |\det(A) - 1|.$$
Endpoint distances use the geodesic rotation distance
$$\rho(R_{1}, R_{2}) = \|\log(R_{1}^{\top} R_{2})\|_{F} / \sqrt{2}.$$
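The three $\mathrm{SO}(3)$ ingredients can be sketched with NumPy. Two choices here are assumptions rather than details from the paper: the determinant is corrected by flipping the last singular direction, and the geodesic distance is computed from the trace, using the fact that $\|\log(R)\|_{F}/\sqrt{2}$ equals the rotation angle of $R$. The helper `rotation_z` exists only for the example.

```python
import numpy as np

def project_so3(A):
    """Nearest rotation to A via SVD, flipping the last singular direction if det(U V^T) = -1."""
    U, _, Vt = np.linalg.svd(A)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    return U @ D @ Vt

def defect_so3(A):
    """Orthogonality residual ||A^T A - I||_F + |det(A) - 1|."""
    return np.linalg.norm(A.T @ A - np.eye(3), ord="fro") + abs(np.linalg.det(A) - 1.0)

def geodesic_distance(R1, R2):
    """Rotation angle of R1^T R2, equal to ||log(R1^T R2)||_F / sqrt(2)."""
    cos_theta = (np.trace(R1.T @ R2) - 1.0) / 2.0
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

def rotation_z(theta):
    """Rotation by theta radians about the z-axis (helper for the example)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# A uniformly scaled rotation is infeasible; projection recovers the underlying rotation.
R = rotation_z(0.3)
P = project_so3(1.1 * R)
```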
### E.3 $\mathrm{SE}(3)$
States consist of a rotation and translation pair $(R, p)$, with
$$R \in \mathrm{SO}(3), \qquad p \in \mathbb{R}^{3}.$$
Projection applies the $\mathrm{SO}(3)$ projection to the rotational block and leaves translation unchanged unless the domain-specific constraint requires translation correction. The defect combines rotational feasibility and translational constraint violation:
$$d((A, p), \mathrm{SE}(3)) = d(A, \mathrm{SO}(3)) + \alpha\, d_{\mathrm{trans}}(p).$$
Endpoint distance is the weighted product metric
$$\rho((R_{1}, p_{1}), (R_{2}, p_{2})) = \rho_{\mathrm{SO}(3)}(R_{1}, R_{2}) + \alpha \|p_{1} - p_{2}\|_{2}.$$
### E.4 Terrain
The terrain domain uses a graph-like constraint manifold
$$\mathcal{M} = \{(u, v, z) : z = f(u, v)\},$$
where $f$ is a smooth terrain height field. Projection maps an ambient point $(u, v, z)$ to
$$\Pi(u, v, z) = (u, v, f(u, v)).$$
The defect is vertical deviation:
$$d((u, v, z), \mathcal{M}) = |z - f(u, v)|.$$
Endpoint and pathwise distances are computed in ambient Euclidean coordinates\.
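The terrain projection and defect reduce to evaluating the height field. In the sketch below the height field `height` is a hypothetical stand-in, since the specific terrain function is not given in this section.

```python
import math

def height(u, v):
    """Hypothetical smooth height field f(u, v), used only for illustration."""
    return math.sin(u) * math.cos(v)

def project_terrain(point):
    """Pi(u, v, z) = (u, v, f(u, v)): keep the horizontal coordinates, snap z to the surface."""
    u, v, _ = point
    return (u, v, height(u, v))

def defect_terrain(point):
    """Vertical deviation |z - f(u, v)|."""
    u, v, z = point
    return abs(z - height(u, v))

# A point hovering 0.7 above the surface at the origin, where f(0, 0) = 0.
p = (0.0, 0.0, 0.7)
projected = project_terrain(p)
```

This projection is vertical rather than the nearest-point Euclidean projection, which is consistent with the broad use of "projection" in the notation section.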
### E.5 Impulse, lever, and ridge variants
The volatile variants introduce localized regions where the ambient dynamics produce larger mismatch with the constraint set\. These variants are not separate methods; their purpose is to create rollouts in which defect mass is concentrated in a small subset of steps\.
#### $\mathrm{SO}(3)$-Impulse.
The impulse variant adds a localized ambient perturbation to the rotation update, producing short bursts of large orthogonality defect\.
#### $\mathrm{SE}(3)$-Lever.
The lever variant couples rotational and translational errors so that small rotational drift can induce amplified endpoint displacement\. This tests whether adaptive correction can identify geometrically consequential errors rather than merely large Euclidean deviations\.
#### Terrain\-Ridge\.
The ridge terrain adds high\-curvature regions to the height field\. Rollouts crossing the ridge produce localized projection–dynamics mismatch, while flatter regions remain low\-defect\.
### E.6 Defect concentration
To quantify how concentrated defect is along a rollout, we compute top-$q$ defect mass. Let $s_{1}, \ldots, s_{T}$ be the defect sequence and let $I_{q}$ index the largest $\lceil qT \rceil$ defect values. The top-$q$ mass is
$$C_{q} = \frac{\sum_{t \in I_{q}} s_{t}}{\sum_{t=1}^{T} s_{t}}.$$
In the main experiments, $q = 0.2$ unless otherwise stated. Large $C_{q}$ indicates that a small fraction of steps accounts for most of the defect. This is the regime where adaptive scheduling should have the largest advantage over periodic correction.
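A small sketch of the top-$q$ defect mass follows. The zero-total convention and the tiny tolerance inside the ceiling (to guard against floating-point overshoot in $qT$) are assumptions, and the example defect sequences are made up.

```python
import math

def top_q_mass(defects, q=0.2):
    """Share of total defect carried by the ceil(q * T) largest defect values."""
    total = sum(defects)
    if total == 0:
        return 0.0  # assumed convention when the rollout has no defect at all
    k = math.ceil(q * len(defects) - 1e-12)  # tolerance guards against float overshoot
    top = sorted(defects, reverse=True)[:k]
    return sum(top) / total

# Concentrated profile: two spikes among ten steps carry most of the defect.
concentrated = top_q_mass([0.1] * 8 + [5.0, 4.2])
# Diffuse profile: every step contributes equally, so the top 20% carries 20%.
diffuse = top_q_mass([1.0] * 10)
```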
### E.7 Additional Results
Figure 6: *Endpoint distance to the stepwise constrained reference.* Delayed correction can change the final sample even when a terminal projection restores feasibility. Adaptive scheduling often reduces this endpoint shift at fixed projection budget, especially in volatile domains, but endpoint gains are more geometry-dependent than pathwise gains. This complements the main Normalized Excess Path Error results in Fig. [3](https://arxiv.org/html/2605.11214#S4.F3).
Table 3: *Adaptive scheduling is reliable on paired budget-matched comparisons.* Each comparison matches domain, seed, and projection budget. Endpoint win rate reports how often adaptive ends closer to the stepwise reference than periodic; pathwise win rate reports how often adaptive obtains lower Normalized Excess Path Error. Adaptive is especially reliable on the pathwise metric, winning $85$–$90\%$ of paired comparisons across domains, while endpoint wins vary with geometry.
Fig. [6](https://arxiv.org/html/2605.11214#A5.F6) visualizes endpoint distance to the stepwise reference for the periodic and adaptive policies, alongside the metrics in Tab. [1](https://arxiv.org/html/2605.11214#S4.T1). Tab. [3](https://arxiv.org/html/2605.11214#A5.T3) provides paired win-rate diagnostics to accompany the rollouts reported in Tab. [1](https://arxiv.org/html/2605.11214#S4.T1).
## Appendix F Projected Diffusion Models Experiment
### F.1 Purpose
The PDM experiment tests whether correction scheduling can wrap an existing constrained diffusion sampler. Unlike the controlled synthetic domains, this experiment does not introduce a new generative model or projection operator: we inherit the model, constraints, and projection machinery from Projected Diffusion Models (PDM), and vary only when projection is applied \[Christopher et al., [2024](https://arxiv.org/html/2605.11214#bib.bib19), Liang et al., [2025](https://arxiv.org/html/2605.11214#bib.bib13)\].
### F.2 Original PDM as stepwise projection
PDM applies projection throughout sampling. In the trajectory experiment used here, projection is applied after every inner Langevin update. Therefore, in our terminology, original PDM is the stepwise baseline:
$$\sigma_{\mathrm{PDM}}(t) = 1 \qquad \forall t.$$
The scheduling horizon $T$ counts inner Langevin updates, not just outer diffusion noise levels. This is important for cost accounting: if the sampler uses $K$ outer noise levels and $L$ inner Langevin steps per level, then
$$T = KL.$$
A budget $B/T = 0.25$ therefore means projection is applied to one quarter of all inner Langevin update locations.
### F.3 Wrapped schedules
We evaluate four variants:
- Original PDM / stepwise: project after every inner Langevin update.
- Terminal: run the sampler without intermediate projection and apply projection at the end.
- Periodic: apply projection uniformly over inner Langevin updates under budget $B$.
- Adaptive budgeted: apply projection when $s_{t} \geq \lambda_{t,b_{t}}$ and budget remains.
All variants use the same trained PDM model, the same constraints, and the same projection operator\.
### F.4 Defect for PDM
When PDM exposes an explicit constraint violation, we use that violation as the defect. Otherwise, we define defect by projection residual:
$$s_{t} = \|\tilde{x}_{t+1} - \Pi(\tilde{x}_{t+1})\|.$$
This measures how far the proposed state lies from the constraint set under the same projection operator used by PDM\.
### F.5 Metrics
PDM NEPE is normalized between original PDM and terminal correction:
$$\mathrm{NEPE}(\sigma) = \frac{E_{\mathrm{path}}(\sigma) - E_{\mathrm{path}}(\mathrm{PDM})}{E_{\mathrm{path}}(\mathrm{terminal}) - E_{\mathrm{path}}(\mathrm{PDM})}.$$
Thus, original PDM has NEPE $=0$ and terminal correction has NEPE $=1$. Lower values indicate closer agreement with the fully projected PDM trajectory.
### F.6 Fixed-budget result
At $B/T \approx 0.25$, periodic correction achieves NEPE $0.486 \pm 0.009$, while adaptive budgeted correction achieves NEPE $0.288 \pm 0.009$. The improvement over periodic is $41\%$. Since original PDM uses projection at every inner Langevin update, this operating point uses only $25\%$ of the original projection calls, saving $75\%$ of projection calls. In absolute terms, adaptive scheduling recovers $71.2\%$ of the full PDM correction benefit at one quarter of the projection cost.
## Appendix G Implementation Details
### G.1 Random seeds and paired evaluation
All comparisons are paired by seed\. For each initial condition and random seed, we run terminal, periodic, adaptive, and stepwise schedules under the same underlying stochastic updates whenever possible\. This reduces variance and ensures that differences are attributable to projection timing rather than different noise realizations\.
### G.2 Calibration/evaluation split
Thresholds are estimated using held\-out calibration rollouts, and evaluation rollouts are disjoint from calibration rollouts\. Synthetic and PDM experiments each use separate calibration and evaluation seeds to prevent the adaptive scheduler from selecting thresholds based on evaluation trajectories\.
### G.3 Budget grid
Experiments evaluate a grid of projection budgets from $0.00$ to $1.00$ with a step size of $0.05$. Tables report a representative operating point at $B/T \approx 0.25$.
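A grid like this can be built from integer indices to avoid accumulated floating-point drift. In the sketch below, the horizon `T = 100` and the rounding of fractional budgets to integer projection counts are assumptions for illustration only.

```python
from typing import List


def budget_grid(num_intervals: int = 20) -> List[float]:
    """Budget fractions 0.00, 0.05, ..., 1.00, built from integer indices."""
    return [k / num_intervals for k in range(num_intervals + 1)]


def integer_budgets(fractions: List[float], num_steps: int) -> List[int]:
    """Convert budget fractions B/T into integer projection counts B."""
    return [round(f * num_steps) for f in fractions]


grid = budget_grid()
budgets = integer_budgets(grid, num_steps=100)  # T = 100 is hypothetical
```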
### G.4 Uncertainty
Curves report means over seeds. Shaded regions, when shown, indicate standard error of the mean. Tables report mean $\pm$ standard error unless otherwise stated. Win-rate uncertainties use binomial standard error.
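Both uncertainty estimates have standard closed forms. The sketch below assumes the sample standard deviation (with $n-1$ in the denominator) for the standard error of the mean and the usual plug-in formula $\sqrt{p(1-p)/n}$ for the binomial standard error; the source does not state which variants were used.

```python
import math
import statistics
from typing import List, Tuple


def mean_and_sem(values: List[float]) -> Tuple[float, float]:
    """Return (mean, standard error of the mean) over per-seed values."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to estimate a standard error")
    return statistics.fmean(values), statistics.stdev(values) / math.sqrt(n)


def binomial_se(wins: int, trials: int) -> float:
    """Plug-in standard error of a win rate estimated from paired trials."""
    p = wins / trials
    return math.sqrt(p * (1.0 - p) / trials)
```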
### G.5 Numerical safeguards
Normalized metrics are not reported when the normalization denominator is below tolerance. Specifically, if
$$E_{\mathrm{path}}(\mathrm{terminal}) - E_{\mathrm{path}}(\mathrm{stepwise}) < \epsilon,$$
the corresponding NEPE value is marked degenerate. This prevents near-identical terminal and stepwise runs from producing unstable normalized values.
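A guarded NEPE computation might look like the following. This is a sketch under stated assumptions: the tolerance value `eps=1e-8` is a placeholder (the source does not give $\epsilon$), `None` is used here to mark a degenerate value, and `e_reference` denotes the fully corrected baseline (stepwise in the synthetic domains, original PDM in the PDM experiments).

```python
from typing import Optional


def nepe(
    e_sigma: float,
    e_reference: float,
    e_terminal: float,
    eps: float = 1e-8,
) -> Optional[float]:
    """Normalized pathwise error of a schedule, or None when degenerate.

    0 corresponds to the fully corrected reference, 1 to terminal correction.
    """
    denominator = e_terminal - e_reference
    if denominator < eps:
        return None  # terminal and reference runs are too close to normalize
    return (e_sigma - e_reference) / denominator
```

Returning a sentinel instead of a number keeps degenerate cases out of downstream averages unless they are handled explicitly.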
### G.6 Projection cost
The primary cost metric is the number of projection calls\. This is the relevant cost for settings where projection or constraint solving is expensive relative to a model update\. We do not claim that projection count exactly equals wall\-clock time in every domain; rather, it gives a model\-independent accounting of constraint enforcement effort\. In PDM, where projection is applied after inner Langevin updates, cost is counted at the inner\-update level\.
## Appendix H Reproducibility Notes
A full evaluation run consists of:
1. generating calibration rollouts;
2. estimating $\lambda_{t,b}$;
3. running paired evaluation rollouts for all schedules;
4. aggregating endpoint and pathwise metrics;
5. producing figures and tables.

All full paper evaluations run on a single NVIDIA A100 GPU via Modal in fewer than $8$ hours, with lower-fidelity and preliminary evaluations running within $24$ hours on an Apple M1 (16GB).
## Appendix I Additional Diagnostics
### I.1 Budget usage
For each method, domain, and target budget, we log:
$$\widehat{B}/T = \frac{1}{T}\sum_{t=0}^{T-1}\sigma(t).$$
Periodic schedules use the requested budget by construction, up to rounding. Adaptive schedules are checked to ensure that achieved budgets match periodic budgets within tolerance.
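The logged quantity is the mean of the 0/1 schedule indicators, so the budget check reduces to a comparison of two averages. In this sketch the tolerance `tol=0.01` is an assumed placeholder; the source does not state the value used.

```python
from typing import Sequence


def achieved_budget(schedule: Sequence[int]) -> float:
    """Fraction of steps with a projection: (1/T) * sum of sigma(t)."""
    if len(schedule) == 0:
        raise ValueError("schedule must contain at least one step")
    return sum(schedule) / len(schedule)


def budgets_match(
    adaptive: Sequence[int], periodic: Sequence[int], tol: float = 0.01
) -> bool:
    """Check that two schedules spend the same projection budget within tol."""
    return abs(achieved_budget(adaptive) - achieved_budget(periodic)) <= tol
```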
### I.2 Degenerate regimes
Some domains or sampler settings may produce little difference between terminal and stepwise correction\. In such cases, NEPE becomes unstable because the denominator is small\. These regimes indicate that projection timing is not consequential under that sampler and constraint configuration\. We report raw pathwise errors in diagnostics and exclude degenerate normalized values\.
### I.3 Failure modes
The method can fail or underperform when:
- defect is poorly aligned with downstream trajectory distortion;
- the projection operator is unstable or discontinuous;
- constraint violation is diffuse and uniform, making periodic correction competitive;
- the adaptive threshold surface is poorly calibrated;
- projection cost is negligible, making stepwise correction preferable.
These cases are consistent with the scheduling interpretation, that adaptive allocation is most useful when projection is costly and defect is concentrated.