latent-diffusion

#latent-diffusion

DiffGI: Differentiable Geometry Images for High-Fidelity Thin-Shell 3D Generation

Hugging Face Daily Papers · 6d ago

DiffGI introduces a differentiable geometry image representation for high-fidelity thin-shell 3D generation, enabling end-to-end optimization and superior reconstruction quality.

#latent-diffusion

Multiplayer Interactive World Models with Representation Autoencoders

Hugging Face Daily Papers · 2026-07-06

This paper introduces MIRA, the first large-scale multiplayer world model for highly dynamic physics-based environments, trained on 10,000 hours of Rocket League gameplay. The 5-billion-parameter latent diffusion model generates stable four-player rollouts in real time, with distributional quality holding steady for hours.

#latent-diffusion

Patch-PODiff-ViT: Structured Latent Diffusion with Patchwise POD for Super-Resolution and Uncertainty Quantification

arXiv cs.LG · 2026-07-01

Patch-PODiff-ViT introduces a structured latent diffusion framework using patchwise Proper Orthogonal Decomposition (POD) for super-resolution and uncertainty quantification, enabling efficient diffusion with a fixed linear orthonormal basis and analytic propagation of predictive variance.
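
The summary's point about a fixed linear orthonormal basis can be made concrete: when the decoder is linear, a diagonal Gaussian over the latent coefficients maps to pixel-wise variance in closed form. The following is a minimal sketch of that idea only, not the paper's implementation. It assumes NumPy, and the function names, patch size, rank, and random data are all hypothetical.

```python
import numpy as np

def fit_patch_pod(patches, rank):
    """Fit a POD basis to flattened patches of shape (n_samples, patch_dim).

    Returns the mean patch and an orthonormal basis of shape (patch_dim, rank).
    """
    mean = patches.mean(axis=0)
    # SVD of the centered snapshots; the rows of vt are orthonormal POD modes.
    _, _, vt = np.linalg.svd(patches - mean, full_matrices=False)
    return mean, vt[:rank].T

def decode_with_variance(z_mean, z_var, mean, basis):
    """Push a diagonal Gaussian over coefficients through x = mean + basis @ z."""
    x_mean = mean + basis @ z_mean
    # Var(x_i) = sum_j basis[i, j]**2 * z_var[j] for independent coefficients.
    x_var = (basis ** 2) @ z_var
    return x_mean, x_var

# Hypothetical data: 200 flattened 4x4 patches.
rng = np.random.default_rng(0)
patches = rng.normal(size=(200, 16))
mean, basis = fit_patch_pod(patches, rank=4)

z_mean = basis.T @ (patches[0] - mean)  # project one patch onto the basis
z_var = np.full(4, 0.1)                 # assumed predictive variance per coefficient
x_mean, x_var = decode_with_variance(z_mean, z_var, mean, basis)
```

Because the basis columns have unit norm, the total pixel variance equals the total coefficient variance, which is a quick sanity check on any implementation of this kind.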

#latent-diffusion

BrainG3N: A Dual-Purpose Tokenizer for Controllable 3D Brain MRI Generation

arXiv cs.AI · 2026-06-20

This paper introduces BrainG3N, a dual-purpose tokenizer for 3D brain MRI latent diffusion. It pairs a frozen masked-autoencoder encoder, which provides clinically informative embeddings, with a CNN decoder for reconstruction. The authors report state-of-the-art performance on a 23-task benchmark, along with support for controllable generation and longitudinal forecasting.

#latent-diffusion

Smoothing Dark Areas in Molecular Latent Diffusion

arXiv cs.LG · 2026-06-15

This paper introduces TopVAE, a topology-optimized VAE that reduces 'dark areas' in molecular latent diffusion by making the decoder internalize structural and chemical constraints, achieving significant improvements in molecular generation quality.

#latent-diffusion

@artemZholus: thanks! in the second paper (https://arxiv.org/abs/2605.06388) we used your (and RAE's) recipe and it worked.

X AI KOLs Following · 2026-05-26

This paper systematically compares reconstruction-based and semantic latent spaces for action-conditioned latent diffusion world models in robotics. It finds that semantic encoders like V-JEPA 2.1 generally outperform reconstruction encoders on policy-relevant metrics, advocating for semantic latent spaces as a stronger foundation for robotics world models.

#latent-diffusion

@xuanchi13: The latent-vs-pixel debate misses the point. GPT Image 2 shows what users notice: pixel-level fidelity. Latent models s…

X AI KOLs Timeline · 2026-05-26

NVIDIA introduces PiD, a Pixel Diffusion Decoder that replaces traditional VAE/RAE decoders in latent diffusion models, enabling fast, high-resolution decoding with up to 6× speedup and improved visual fidelity.

#latent-diffusion

@FeitengLi: NVIDIA Spatial Intelligence Lab proposes PiD, redesigning the decoding stage in latent diffusion models. Current mainstream text-to-image generation happens in latent space, then uses a VAE decoder to map back to pixels. This decoder's…

X AI KOLs Timeline · 2026-05-25

NVIDIA Spatial Intelligence Lab proposes PiD, which redesigns the decoding stage of latent diffusion models as a conditional pixel diffusion process, unifying decoding and upsampling to achieve low-latency, high-resolution decoding.

#latent-diffusion

AirfoilGen: A valid-by-construction and performance-aware latent diffusion model for airfoil generation

arXiv cs.LG · 2026-05-21

This paper proposes AirfoilGen, a latent diffusion model for airfoil shape generation that ensures geometric validity via a circle sweeping representation and enables control over aerodynamic performance (lift/drag coefficients). Experiments show 98.41% performance-conditioning accuracy, using a new dataset of over 200,000 airfoils.

#latent-diffusion

Provably Learning Diffusion Models under the Manifold Hypothesis: Collapse and Refine

arXiv cs.LG · 2026-05-21

This paper identifies a collapse-and-refine mechanism in diffusion models under the manifold hypothesis, proposing Score-induced Latent Diffusion (SiLD) that provably avoids the curse of dimensionality. Experiments show SiLD matches or outperforms VAE-based latent diffusion models.

#latent-diffusion

Stable Audio 3

Hacker News Top · 2026-05-20

Stable Audio 3 introduces a family of fast latent diffusion models for variable-length audio generation and editing, with open-source release of small and medium model weights.

#latent-diffusion

When Latent Geometry Is Not Enough: Draft-Conditioned Latent Refinement for Non-Autoregressive Text Generation

arXiv cs.CL · 2026-05-18

This technical report investigates draft-conditioned latent refinement for non-autoregressive text generation, showing that good latent geometry does not guarantee good decoding and emphasizing decoder recoverability as a key evaluation metric.

#latent-diffusion

ByteDance-Seed/Cola-DLM · Hugging Face

Reddit r/LocalLLaMA · 2026-05-15

ByteDance releases Cola-DLM, a hierarchical continuous latent-space diffusion language model combining a Text VAE with a block-causal Diffusion Transformer, available on Hugging Face with model weights, code, and paper.
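
The summary does not describe how Cola-DLM's block-causal attention is implemented. As a rough illustration of what "block-causal" usually means, the sketch below builds a mask in which each position may attend to every position in its own block and in all earlier blocks. The function name and the fixed block size are assumptions made for illustration, not details from the release.

```python
def block_causal_mask(seq_len, block_size):
    """Return a seq_len x seq_len boolean mask.

    mask[i][j] is True when position i may attend to position j, i.e. when
    j's block index is less than or equal to i's block index.
    """
    return [
        [(j // block_size) <= (i // block_size) for j in range(seq_len)]
        for i in range(seq_len)
    ]

# Hypothetical example: 6 positions grouped into blocks of 2.
mask = block_causal_mask(seq_len=6, block_size=2)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

Within a block the mask is fully bidirectional, which is what lets a diffusion model denoise all positions of a block jointly, while the causal ordering across blocks keeps generation sequential.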

#latent-diffusion

The DAWN of World-Action Interactive Models

Hugging Face Daily Papers · 2026-05-12

This paper introduces DAWN, a latent generative baseline for World-Action Interactive Models (WAIMs) that jointly models scene evolution and action generation through recursive refinement, achieving strong long-horizon planning in autonomous driving scenarios.

#latent-diffusion

L2P: Unlocking Latent Potential for Pixel Generation

Hugging Face Daily Papers · 2026-05-12

The L2P paper introduces a Latent-to-Pixel transfer paradigm that leverages pre-trained latent diffusion models to create efficient pixel-space models capable of 4K generation with minimal training overhead.

#latent-diffusion

What Matters for Diffusion-Friendly Latent Manifold? Prior-Aligned Autoencoders for Latent Diffusion

Hugging Face Daily Papers · 2026-05-08

This paper introduces Prior-Aligned Autoencoders (PAE), a method for building diffusion-friendly latent manifolds that achieves state-of-the-art image generation quality while enabling 13× faster training convergence.

#latent-diffusion

TextLDM: Language Modeling with Continuous Latent Diffusion

Hugging Face Daily Papers · 2026-05-08

This paper introduces TextLDM, a method that adapts visual latent diffusion transformers for language modeling by mapping discrete tokens to continuous latents. It demonstrates that this approach, enhanced by representation alignment, matches GPT-2 performance and unifies visual and text generation architectures.

#latent-diffusion

zhen-nan/L2P

Hugging Face Models Trending · 2026-05-03

L2P proposes an efficient transfer paradigm that leverages pre-trained latent diffusion models to build pixel-space diffusion models, enabling high-quality generation with minimal computational overhead and data requirements, and supporting native 4K resolution.

#latent-diffusion

RuneXX/LTX-2.3-Workflows

Hugging Face Models Trending · 2026-03-05

This Hugging Face repository provides workflows and model downloads for Lightricks' LTX-2.3 video generation model, designed for use with ComfyUI, including split models, GGUF versions, and required custom nodes.
