Tag
INAR-VL proposes a lightweight routing system for edge-cloud vision-language inference that sends each query to either the edge model or the cloud model according to its estimated complexity, significantly reducing latency and energy consumption while keeping accuracy close to that of the cloud model.
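
The routing idea can be illustrated with a minimal sketch. This is not INAR-VL's actual router: the `Router` class, the `word_count_score` estimator, and the 0.5 threshold are all hypothetical names and values chosen for illustration, and the toy score looks only at the text of the query, ignoring the image.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Router:
    """Hypothetical threshold router; not the method described in the source."""

    score: Callable[[str], float]  # assumed complexity estimator returning a value in [0, 1]
    threshold: float = 0.5         # assumed cut-off, chosen arbitrarily here

    def route(self, query: str) -> str:
        # Queries scored above the threshold go to the cloud model,
        # everything else stays on the edge model.
        return "cloud" if self.score(query) > self.threshold else "edge"


def word_count_score(query: str) -> float:
    # Toy proxy for complexity: longer questions count as harder, capped at 1.0.
    return min(len(query.split()) / 20.0, 1.0)


router = Router(score=word_count_score, threshold=0.5)
```

With these assumptions, a short question such as "What color is the car?" scores 0.25 and stays on the edge model, while a 15-word question scores 0.75 and is sent to the cloud. A real router would presumably use a learned estimator over both image and text instead of a word count.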