minimax-risk

The risk of KV cache compression

arXiv cs.LG · 2d ago

This paper gives a theoretical characterization of the minimax risk of KV cache compression in transformers, provides design principles for accurate compression under causal masking, and instantiates them in a practical algorithm with promising results on LongBench.
