latent-diffusion

#latent-diffusion

@artemZholus: thanks! in the second paper (https://arxiv.org/abs/2605.06388) we used your (and RAE's) recipe and it worked.

X AI KOLs Following · 2026-05-26 Cached

This paper systematically compares reconstruction-based and semantic latent spaces for action-conditioned latent diffusion world models in robotics. It finds that semantic encoders like V-JEPA 2.1 generally outperform reconstruction encoders on policy-relevant metrics, advocating for semantic latent spaces as a stronger foundation for robotics world models.

@xuanchi13: The latent-vs-pixel debate misses the point. GPT Image 2 shows what users notice: pixel-level fidelity. Latent models s…

X AI KOLs Timeline · 2026-05-26 Cached

NVIDIA introduces PiD, a Pixel Diffusion Decoder that replaces traditional VAE/RAE decoders in latent diffusion models, enabling fast, high-resolution decoding with up to 6× speedup and improved visual fidelity.

@FeitengLi: NVIDIA Spatial Intelligence Lab proposes PiD, redesigning the decoding stage in latent diffusion models. Current mainstream text-to-image generation happens in latent space, then uses a VAE decoder to map back to pixels. This decoder's…

X AI KOLs Timeline · 2026-05-25 Cached

NVIDIA Spatial Intelligence Lab proposes PiD, which redesigns the decoding stage of latent diffusion models as a conditional pixel diffusion process, unifying decoding and upsampling to achieve low-latency, high-resolution decoding.

AirfoilGen: A valid-by-construction and performance-aware latent diffusion model for airfoil generation

arXiv cs.LG · 2026-05-21 Cached

This paper proposes AirfoilGen, a latent diffusion model for airfoil shape generation that ensures geometric validity via a circle-sweeping representation and enables control over aerodynamic performance (lift and drag coefficients). Experiments on a new dataset of over 200,000 airfoils show 98.41% performance-conditioning accuracy.

Provably Learning Diffusion Models under the Manifold Hypothesis: Collapse and Refine

arXiv cs.LG · 2026-05-21 Cached

This paper identifies a collapse-and-refine mechanism in diffusion models under the manifold hypothesis and proposes Score-induced Latent Diffusion (SiLD), which provably avoids the curse of dimensionality. Experiments show SiLD matches or outperforms VAE-based latent diffusion models.

Stable Audio 3

Hacker News Top · 2026-05-20 Cached

Stable Audio 3 introduces a family of fast latent diffusion models for variable-length audio generation and editing, with open-source release of small and medium model weights.

When Latent Geometry Is Not Enough: Draft-Conditioned Latent Refinement for Non-Autoregressive Text Generation

arXiv cs.CL · 2026-05-18 Cached

This technical report investigates draft-conditioned latent refinement for non-autoregressive text generation, showing that good latent geometry does not guarantee good decoding and emphasizing decoder recoverability as a key evaluation metric.

ByteDance-Seed/Cola-DLM · Hugging Face

Reddit r/LocalLLaMA · 2026-05-15 Cached

ByteDance releases Cola-DLM, a hierarchical continuous latent-space diffusion language model combining a Text VAE with a block-causal Diffusion Transformer, available on Hugging Face with model weights, code, and paper.

The DAWN of World-Action Interactive Models

Hugging Face Daily Papers · 2026-05-12 Cached

This paper introduces DAWN, a latent generative baseline for World-Action Interactive Models (WAIMs) that jointly models scene evolution and action generation through recursive refinement, achieving strong long-horizon planning in autonomous driving scenarios.

L2P: Unlocking Latent Potential for Pixel Generation

Hugging Face Daily Papers · 2026-05-12 Cached

The L2P paper introduces a Latent-to-Pixel transfer paradigm that leverages pre-trained latent diffusion models to create efficient pixel-space models capable of 4K generation with minimal training overhead.

What Matters for Diffusion-Friendly Latent Manifold? Prior-Aligned Autoencoders for Latent Diffusion

Hugging Face Daily Papers · 2026-05-08 Cached

This article introduces Prior-Aligned Autoencoders (PAE), a new method for creating diffusion-friendly latent manifolds that achieves state-of-the-art image generation quality while enabling 13x faster training convergence.

TextLDM: Language Modeling with Continuous Latent Diffusion

Hugging Face Daily Papers · 2026-05-08 Cached

This paper introduces TextLDM, a method that adapts visual latent diffusion transformers for language modeling by mapping discrete tokens to continuous latents. It demonstrates that this approach, enhanced by representation alignment, matches GPT-2 performance and unifies visual and text generation architectures.

zhen-nan/L2P

Hugging Face Models Trending · 2026-05-03 Cached

L2P proposes an efficient transfer paradigm that leverages pre-trained latent diffusion models to build pixel-space diffusion models, enabling high-quality generation with minimal computational overhead and data requirements, and supporting native 4K resolution.

RuneXX/LTX-2.3-Workflows

Hugging Face Models Trending · 2026-03-05 Cached

This Hugging Face repository provides workflows and model downloads for Lightricks' LTX-2.3 video generation model, designed for use with ComfyUI, including split models, GGUF versions, and required custom nodes.
