cross-modal-fusion


Stage-adaptive Token Selection for Efficient Omni-modal LLMs

Hugging Face Daily Papers · 2026-05-19 Cached

SEATS is a training-free, stage-adaptive token selection method that reduces computational overhead in omni-modal LLMs by progressively pruning redundant visual and audio tokens. It achieves a 9.3x FLOPs reduction and a 4.8x prefill speedup while preserving 96.3% of performance.
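To make the idea of progressive, stage-wise pruning concrete, here is a minimal generic sketch. It is not the SEATS algorithm: the summary above does not say how tokens are scored or how stages are scheduled. The function names, the per-token importance scores, and the keep ratios are all hypothetical, and a real method would likely recompute scores at each stage rather than reuse fixed ones.

```python
from typing import List, Sequence


def prune_tokens(scores: Sequence[float], keep_ratio: float) -> List[int]:
    """Return positions of the highest-scoring tokens, in their original order.

    `scores` is a hypothetical per-token importance score (for example, attention
    received from text tokens); the summary does not specify the criterion.
    """
    n_keep = max(1, round(len(scores) * keep_ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])


def progressive_prune(
    scores: Sequence[float], stage_keep_ratios: Sequence[float]
) -> List[List[int]]:
    """Apply pruning once per stage, each stage keeping a fraction of the survivors.

    Returns the surviving original token indices after every stage.
    Simplification: the same fixed scores are reused at every stage.
    """
    kept = list(range(len(scores)))
    history: List[List[int]] = []
    for ratio in stage_keep_ratios:
        local = prune_tokens([scores[i] for i in kept], ratio)
        kept = [kept[j] for j in local]  # map back to original indices
        history.append(kept)
    return history


# Hypothetical scores for eight visual/audio tokens, pruned in two stages.
example_scores = [0.9, 0.1, 0.5, 0.7, 0.3, 0.2, 0.8, 0.4]
stages = progressive_prune(example_scores, stage_keep_ratios=[0.5, 0.5])
print(stages)  # [[0, 2, 3, 6], [0, 6]]
```

Pruning in stages rather than all at once lets early layers see more context before later layers work on a much shorter sequence, which is where savings in attention compute would come from under this kind of scheme.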
