error-accumulation

Tag

Cards List
#error-accumulation

KVarN: Variance-Normalized KV-Cache Quantization Mitigates Error Accumulation in Reasoning Tasks

Hugging Face Daily Papers · 4d ago Cached

KVarN is a calibration-free KV-cache quantizer that uses Hadamard rotation and dual-scaling variance normalization to reduce error accumulation during autoregressive decoding in large language models, achieving state-of-the-art 2-bit precision on reasoning benchmarks.

0 favorites 0 likes
← Back to home

Submit Feedback