tree-drafting


JetSpec: Breaking the Scaling Ceiling of Speculative Decoding with Parallel Tree Drafting

Hugging Face Daily Papers · 4d ago

JetSpec is a speculative decoding framework that combines efficient forward drafting with causal conditioning to improve LLM inference speed and acceptance rates, achieving up to 9.64x speedup on MATH-500 and 4.58x on conversational workloads.


SpecBlock: Block-Iterative Speculative Decoding with Dynamic Tree Drafting

arXiv cs.CL · 2026-05-11

SpecBlock is a block-iterative speculative decoding method that combines path dependence with efficient drafting to accelerate LLM inference. The paper reports higher speedup than existing methods such as EAGLE-3 at a lower drafting cost.
