VIA-SD introduces a multi-tier speculative decoding framework that uses intra-model routing to reduce verification costs and achieves significant speedups over traditional approaches.