time-to-first-token

EarlyTom: Early Token Compression Completes Fast Video Understanding

Hugging Face Daily Papers · 2026-05-28

EarlyTom is a training-free framework that compresses visual tokens early in the vision encoder, reducing time-to-first-token (TTFT) and computational cost while maintaining accuracy. It achieves up to a 2.65x reduction in TTFT.
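For readers unfamiliar with the metric, the following is a minimal sketch of how TTFT and a speedup ratio such as "2.65x" could be measured. It does not reproduce EarlyTom's compression method. The names `measure_ttft`, `speedup`, and `fake_model` are hypothetical, and `fake_model` only simulates prefill latency with a sleep.

```python
import time
from typing import Iterable, Iterator, Tuple


def measure_ttft(stream: Iterable[str]) -> Tuple[str, float]:
    """Return the first token and the seconds elapsed before it arrived."""
    start = time.perf_counter()
    first = next(iter(stream))  # blocks until the first token is produced
    return first, time.perf_counter() - start


def speedup(baseline_ttft: float, optimized_ttft: float) -> float:
    """Ratio of baseline TTFT to optimized TTFT (values above 1 mean faster)."""
    return baseline_ttft / optimized_ttft


def fake_model(prefill_seconds: float) -> Iterator[str]:
    """Hypothetical stand-in for a streaming model (assumption, not a real API).

    Sleeps to mimic the prefill stage, then yields tokens one at a time.
    """
    time.sleep(prefill_seconds)
    yield "hello"
    yield "world"


if __name__ == "__main__":
    # Simulated latencies only; these are not results from the paper.
    _, baseline = measure_ttft(fake_model(0.20))
    _, optimized = measure_ttft(fake_model(0.08))
    print(f"TTFT speedup: {speedup(baseline, optimized):.2f}x")
```

Because the generator body only starts running on the first `next` call, the simulated prefill delay falls inside the timed window, which mirrors how TTFT covers everything up to the first emitted token.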
