causal-conditioning

JetSpec: Breaking the Scaling Ceiling of Speculative Decoding with Parallel Tree Drafting

Hugging Face Daily Papers · 4d ago

JetSpec is a speculative decoding framework that combines efficient forward drafting with causal conditioning to improve LLM inference speed and acceptance rates, achieving up to 9.64x speedup on MATH-500 and 4.58x on conversational workloads.
