confidence-aware

CONF-KV: Confidence-Aware KV Cache Eviction with Mixed-Precision Storage for Long-Horizon LLM

Hugging Face Daily Papers · 2026-05-24

CONF-KV is a KV-cache management system that uses model uncertainty to dynamically adjust which cache entries are retained, improving memory efficiency for long-context LLM inference while keeping perplexity degradation within 1.5-2.1 points.
