Tag
BlockPilot is an instance-adaptive policy that predicts the optimal block size for diffusion-based speculative decoding, achieving significant speedup with minimal overhead.