latency-energy


INAR-VL: Input-Aware Routing for Edge-Cloud Vision-Language Inference

arXiv cs.LG · 2026-05-20

INAR-VL proposes a lightweight routing system for edge-cloud vision-language inference that dynamically selects between edge and cloud models based on query complexity, achieving significant latency and energy reductions while preserving near-cloud accuracy.
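
As a rough illustration of input-aware routing in general, the sketch below sends a query to the edge or the cloud by comparing a complexity score against a threshold. It is not INAR-VL's actual method: the `Query` fields, the `complexity_score` heuristic, its weights, and the `threshold` value are all hypothetical and chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass
class Query:
    text: str          # the user's prompt
    image_pixels: int  # hypothetical proxy for visual input size


def complexity_score(q: Query) -> float:
    """Hypothetical heuristic in [0, 1]: longer prompts and larger images score higher."""
    text_term = min(len(q.text.split()) / 50.0, 1.0)
    image_term = min(q.image_pixels / (1024 * 1024), 1.0)
    return 0.5 * text_term + 0.5 * image_term


def route(q: Query, threshold: float = 0.5) -> str:
    """Return "cloud" for queries at or above the threshold, otherwise "edge"."""
    return "cloud" if complexity_score(q) >= threshold else "edge"


simple = Query("what color is the car", 224 * 224)
hard = Query(" ".join(["word"] * 60), 2048 * 2048)
print(route(simple), route(hard))
```

A real router would presumably learn its scoring function from data rather than use fixed weights. Raising the threshold in this sketch keeps more queries on the edge, which trades accuracy for lower latency and energy use.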
