acceleration-collapse


Mistletoe: Stealthy Acceleration-Collapse Attacks on Speculative Decoding

arXiv cs.CL · 8h ago

This paper identifies a new vulnerability in model-based speculative decoding for large language models: small perturbations can lower the draft-token acceptance rate without affecting output quality, which collapses the acceleration that speculative decoding is meant to provide. The authors propose Mistletoe, an attack that jointly optimizes acceptance degradation and semantic preservation, and they report significant speedup reductions across various systems.
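To see why a lower acceptance rate translates into lost acceleration, here is a minimal sketch based on the standard speculative-decoding analysis rather than on anything taken from the Mistletoe paper. It assumes each draft token is accepted independently with probability `alpha`, that the draft model proposes `gamma` tokens per step, and that one draft-model pass costs a fraction `cost_ratio` of a target-model pass. The function names and the example values are illustrative assumptions.

```python
def expected_tokens_per_step(alpha: float, gamma: int) -> float:
    """Expected tokens emitted per target-model verification pass.

    Assumes (simplification) i.i.d. acceptance with probability `alpha`
    for each of the `gamma` drafted tokens.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if gamma < 0:
        raise ValueError("gamma must be non-negative")
    if alpha == 1.0:
        # Every draft token is accepted, plus one token from the target model.
        return float(gamma + 1)
    # Closed form of the geometric sum 1 + alpha + ... + alpha**gamma.
    return (1.0 - alpha ** (gamma + 1)) / (1.0 - alpha)


def estimated_speedup(alpha: float, gamma: int, cost_ratio: float) -> float:
    """Estimated wall-time speedup over plain autoregressive decoding.

    `cost_ratio` is the assumed cost of one draft pass relative to one
    target pass; each step costs gamma draft passes plus one target pass.
    """
    step_cost = gamma * cost_ratio + 1.0
    return expected_tokens_per_step(alpha, gamma) / step_cost


if __name__ == "__main__":
    # Hypothetical acceptance rates: a healthy one and a degraded one.
    for alpha in (0.8, 0.2):
        speedup = estimated_speedup(alpha, gamma=4, cost_ratio=0.05)
        print(f"alpha={alpha:.1f} estimated speedup={speedup:.2f}x")
```

Under these assumptions, the estimated speedup falls toward 1x as `alpha` drops, even though the target model still verifies every token and the output distribution is unchanged. That is the sense in which an attack on acceptance can be stealthy with respect to output quality.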
