Tag
KVarN is a calibration-free KV-cache quantizer that uses Hadamard rotation and dual-scaling variance normalization to reduce error accumulation during autoregressive decoding in large language models, achieving state-of-the-art 2-bit precision on reasoning benchmarks.