diffusion-llms

#diffusion-llms

WaveFilter: Enhancing the Long-Context Capability of Diffusion LLMs via Wavelet-Guided KV Cache Filtering

arXiv cs.CL · 2026-06-02

WaveFilter is a training-free, wavelet-guided KV cache filtering framework for diffusion large language models. It identifies key tokens and builds sparse caches from them, improving performance on complex long-context tasks.

Efficient Diffusion LLMs via Temporal-Spatial Parallel Decoding and Confidence Extrapolation

arXiv cs.CL · 2026-06-01

This paper introduces Temporal-Spatial Parallel Decoding (TSPD) and Confidence Extrapolation (CE) to accelerate inference in diffusion-based large language models by dynamically deciding when tokens have converged and forecasting logit trends, reducing unnecessary denoising steps while preserving output quality.

Roll Out and Roll Back: Diffusion LLMs are Their Own Efficiency Teachers

arXiv cs.CL · 2026-05-19

This paper introduces WINO and WINO+, methods that enable revokable parallel decoding in diffusion LLMs and distill efficient denoising trajectories, significantly improving the quality-speed trade-off.

DARE: Diffusion Language Model Activation Reuse for Efficient Inference

arXiv cs.LG · 2026-05-12

This paper introduces DARE, a method for improving the inference efficiency of Diffusion Large Language Models by reusing cached key-value and output activations to reduce computational redundancy with negligible quality loss.
