parallel-generation


DiffRetriever: Parallel Representative Tokens for Retrieval with Diffusion Language Models

Hugging Face Daily Papers · 5d ago

This paper introduces DiffRetriever, a method that uses diffusion language models to generate multiple representative tokens in parallel for efficient information retrieval, outperforming autoregressive baselines in speed and accuracy.


DFlash: Block Diffusion for Flash Speculative Decoding

Papers with Code Trending · 2026-02-05

DFlash is a new speculative decoding framework that uses a lightweight block diffusion model for parallel token drafting, achieving over 6x acceleration compared to autoregressive methods. It significantly outperforms existing state-of-the-art methods like EAGLE-3 while maintaining high output quality.
