low-rank-reconstruction


High-Fidelity KV Cache Summarization Using Entropy and Low-Rank Reconstruction

Hacker News Top · 2026-04-19

Proposes an SRC pipeline that uses entropy-based selection and low-rank reconstruction to summarize the KV cache rather than prune tokens, reducing VRAM use for million-token LLM contexts while avoiding catastrophic attention errors.
