High-Fidelity KV Cache Summarization Using Entropy and Low-Rank Reconstruction

Hacker News Top · 2026-04-19

Proposes an SRC pipeline that uses entropy-based selection and low-rank reconstruction to summarize the KV cache rather than prune tokens from it, reducing VRAM use for million-token LLM contexts while avoiding catastrophic attention errors.
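
The summary gives no implementation details, so the following is only a minimal sketch of the general idea under stated assumptions, not the proposed SRC pipeline itself. It assumes each cached token already has a score (here, the Shannon entropy of its attention distribution), keeps the `keep` lowest-scoring tokens exactly, and stores the remaining tokens as rank-`rank` truncated-SVD factors instead of discarding them. The function names `row_entropy`, `summarize_kv`, and `reconstruct`, the "lowest entropy is kept" rule, and the choice of truncated SVD are all hypothetical.

```python
import numpy as np


def row_entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of each row of a row-stochastic matrix."""
    p = np.clip(probs, eps, 1.0)  # avoid log(0)
    return -(p * np.log(p)).sum(axis=-1)


def summarize_kv(kv, scores, keep, rank):
    """Keep `keep` tokens exactly and compress the rest to rank-`rank` factors.

    kv:     (num_tokens, dim) array of cached key or value vectors.
    scores: (num_tokens,) per-token scores; lowest scores are kept exactly
            (an assumed selection rule, not one taken from the source).
    """
    order = np.argsort(scores)            # ascending by score
    kept_idx = np.sort(order[:keep])      # tokens stored verbatim
    rest_idx = np.sort(order[keep:])      # tokens stored as a low-rank summary
    u, s, vt = np.linalg.svd(kv[rest_idx], full_matrices=False)
    return {
        "kept_idx": kept_idx,
        "kept": kv[kept_idx],
        "rest_idx": rest_idx,
        "left": u[:, :rank] * s[:rank],   # (num_rest, rank)
        "right": vt[:rank],               # (rank, dim)
    }


def reconstruct(summary, num_tokens):
    """Rebuild an approximate (num_tokens, dim) cache from a summary."""
    dim = summary["right"].shape[1]
    out = np.empty((num_tokens, dim))
    out[summary["kept_idx"]] = summary["kept"]
    out[summary["rest_idx"]] = summary["left"] @ summary["right"]
    return out
```

In this sketch the compressed part saves memory only when `rank * (num_rest + dim)` is smaller than `num_rest * dim`, and the reconstruction is exact only when the compressed rows really have rank at most `rank`; otherwise it is the best rank-`rank` approximation in the least-squares sense.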
