masked-diffusion


Adaptive Order Policies for Masked Diffusion

arXiv cs.LG · 2d ago

Proposes learning the unmasking order in masked diffusion models with a lightweight policy network and a weighted loss, outperforming heuristic orderings on combinatorial tasks and protein design.

DLLM-JEPA: Joint Embedding Predictive Architectures for Masked Diffusion Language Models

arXiv cs.CL · 2d ago

Introduces DLLM-JEPA, a JEPA formulation for masked diffusion language models that constructs two views from a single input via the diffusion noise schedule, reducing training FLOPs by 33% relative to LLM-JEPA and improving fine-tuning performance on tasks like GSM8K.

The Confidence Shortcut: A Reasoning Failure Mode of Masked Diffusion Models

arXiv cs.AI · 6d ago

This paper identifies a failure mode in masked diffusion language models where confidence-based decoding leads to high-confidence errors on complex reasoning tasks, and shows that confidence-aligned training exacerbates this issue while random masking preserves reasoning performance.

Masked Diffusion Language Models are Strong and Steerable Text-Based World Models for Agentic RL [R]

Reddit r/MachineLearning · 2026-05-21

This paper proposes using Masked Diffusion Language Models (MDLMs) as text-based world models for agentic reinforcement learning, showing that their any-order denoising objective avoids prefix mode collapse and leads to stronger performance than autoregressive baselines.

AnchorDiff: Topology-Aware Masked Diffusion with Confidence-based Rewriting for Radiology Report Generation

arXiv cs.AI · 2026-05-19

AnchorDiff proposes a topology-aware masked diffusion framework for radiology report generation, integrating RadGraph-derived clinical anchors and confidence-based rewriting to achieve state-of-the-art results on MIMIC-CXR and MIMIC-RG4 benchmarks.

Discrete Stochastic Localization for Non-autoregressive Generation

arXiv cs.LG · 2026-05-14

Introduces Discrete Stochastic Localization (DSL), a continuous-state diffusion framework for non-autoregressive text generation that uses unit-sphere token embeddings and a timestep-invariant denoiser, achieving better distributional faithfulness than masked discrete diffusion models on OpenWebText.

Remask, Don't Replace: Token-to-Mask Refinement in Masked Diffusion Language Models

arXiv cs.CL · 2026-04-22

Introduces Token-to-Mask (T2M) remasking, which fixes generation errors in masked diffusion LMs by resetting suspect tokens to the mask state instead of overwriting them, yielding an accuracy gain of up to 5.92 on CMATH without extra training or parameters.

CRoCoDiL: Continuous and Robust Conditioned Diffusion for Language

arXiv cs.CL · 2026-04-20

CRoCoDiL is a continuous, robust conditioned diffusion approach for language that moves masked diffusion models into a continuous semantic space, achieving higher generation quality and 10x faster sampling than discrete methods such as LLaDA.
