Tag
This paper proposes Geodesic Flow Matching, a Riemannian transport method for denoising Spatial Semantic Pointers (SSPs) on toroidal manifolds, and demonstrates a 72% reduction in tracking error and 40% efficiency gain in a spiking neural SLAM system.
This paper introduces Temporal-Spatial Parallel Decoding (TSPD) and Confidence Extrapolation (CE) to accelerate inference in diffusion-based large language models by dynamically deciding when tokens have converged and forecasting logit trends, reducing unnecessary denoising steps while preserving output quality.
This paper introduces GARD, a diffusion-based framework that operates in the feature space of a feed-forward 3D reconstructor to jointly recover scene geometry and high-quality imagery from degraded inputs.
This paper revisits uniform diffusion models, identifies a mismatch between the plug-in ELBO and the cross-entropy denoising objective, and proposes leave-one-out parameterizations along with an absorbing-state reformulation that improves generation without additional training.
This paper identifies two weaknesses in existing reinforcement learning methods for diffusion language models, namely a lack of temporal credit assignment and biased likelihood estimates, and proposes DACA-GRPO, a plug-and-play enhancement that introduces denoising progress scores and stratified masking likelihood, achieving consistent improvements across reasoning, code generation, and constrained generation benchmarks.
This paper introduces a triangulation-agnostic flow matching method for mesh-based signal generation, using Matérn processes as noise and PoissonNet as denoiser, achieving high-quality results on large meshes.
This paper introduces a new energy-based model for linear inverse problems that learns normalized posterior densities, overcoming limitations of diffusion models. It enables unbiased sampling, adaptive sampling, and blind degradation estimation, with competitive performance on ImageNet, CelebA, and AFHQ.
This paper proposes Sphere Latent Encoder, an efficient few-step image generation framework that performs denoising entirely in a spherical latent space, achieving high-quality 256×256 images with significantly reduced computational cost and improved FID scores on ImageNet-1K.
This paper introduces the Safety-Aware Denoiser (SAD), a framework for integrating safety constraints into text diffusion models during the denoising process. It aims to reduce unsafe generations while preserving quality, addressing a gap in safety research for non-autoregressive models.
This paper introduces DAWN, a latent generative baseline for World-Action Interactive Models (WAIMs) that jointly models scene evolution and action generation through recursive refinement, achieving strong long-horizon planning in autonomous driving scenarios.
This paper introduces JuRe (Just Repair), a minimal denoising network for time series anomaly detection that matches or exceeds complex neural baselines on the TSB-AD and UCR benchmarks, demonstrating that a proper manifold-projection training objective is more important than architectural complexity.
This paper identifies a Signal-to-Noise Ratio timestep (SNR-t) bias in diffusion probabilistic models during inference, where SNR-timestep alignment from training is disrupted at inference time. The authors propose a differential correction method that decomposes samples into frequency components and corrects each separately, improving generation quality across models like IDDPM, ADM, DDIM, EDM, and FLUX with minimal computational overhead.