Tag
ProxyKV is a cross-model proxy pruning framework that offloads importance scoring to a lightweight proxy model. It achieves high-precision KV cache pruning with much lower prefilling overhead, while matching KVZip accuracy across the Llama-3.1, Qwen-2.5, and Qwen-3 model families.
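
The idea of scoring with one model and pruning another can be illustrated with a minimal sketch. This is not ProxyKV's actual implementation: the function names `select_keep_indices` and `prune_kv`, the `keep_ratio` parameter, and the toy scores are all hypothetical, and the sketch assumes the proxy and target models share a tokenizer so that token positions line up one to one.

```python
from typing import List, Sequence, Tuple


def select_keep_indices(proxy_scores: Sequence[float], keep_ratio: float) -> List[int]:
    """Pick the highest-scoring token positions, returned in original order.

    `proxy_scores` stands in for per-token importance produced by the small
    proxy model (hypothetical; how the scores are computed is not shown here).
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n = len(proxy_scores)
    if n == 0:
        return []
    budget = max(1, int(n * keep_ratio))  # always keep at least one token
    # Rank positions by descending score; ties go to the earlier position.
    ranked = sorted(range(n), key=lambda i: (-proxy_scores[i], i))
    # Restore positional order so the pruned cache stays sequence-ordered.
    return sorted(ranked[:budget])


def prune_kv(keys: Sequence, values: Sequence, keep: Sequence[int]) -> Tuple[list, list]:
    """Keep only the selected positions of the target model's KV entries."""
    return [keys[i] for i in keep], [values[i] for i in keep]


# Toy data: six cached tokens in the large model, scored by the proxy.
proxy_scores = [0.9, 0.1, 0.4, 0.8, 0.05, 0.6]
keys = [f"k{i}" for i in range(6)]
values = [f"v{i}" for i in range(6)]

keep = select_keep_indices(proxy_scores, keep_ratio=0.5)
pruned_keys, pruned_values = prune_kv(keys, values, keep)
print(keep, pruned_keys)
```

The point of the split is that the expensive model never has to compute importance scores itself; it only receives the list of positions to keep. A real system would apply the selection per layer and per head on tensors rather than on Python lists.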