weight-quantization


QAM-W: Joint 2D Codebook Quantization for LLM Weights via Hadamard Rotation and Activation-Aware Scaling

arXiv cs.LG · 2026-05-27

Introduces QAM-W, a joint 2D codebook quantization method for LLM weights that combines Hadamard rotation with activation-aware scaling. It achieves near-BF16 perplexity at 5–6 bits per weight and matches SmoothQuant W8A8 quality with 32% fewer weight bits.
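To make the summary's terms concrete, here is a minimal sketch of the general idea of rotating a weight matrix with a Hadamard matrix and then quantizing consecutive pairs of weights against a 2D codebook. It is not the paper's implementation: the uniform-grid codebook, the 1024-entry codebook size, the helper names (`hadamard`, `bits_per_weight`, `quantize_pairs`), and the random test matrix are all assumptions chosen for illustration, and activation-aware scaling is omitted entirely. The bit accounting ignores any overhead for storing the codebook or scales.

```python
import math
import numpy as np

def hadamard(n):
    """Orthonormal Sylvester-construction Hadamard matrix; n must be a power of two."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = np.ones((1, 1))
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h / math.sqrt(n)

def bits_per_weight(codebook_size, dim=2):
    """Index bits per weight when one codebook index covers `dim` weights."""
    return math.log2(codebook_size) / dim

def quantize_pairs(w, codebook):
    """Return, for each consecutive pair of weights, the index of the nearest codebook entry."""
    pairs = w.reshape(-1, 2)
    dist = ((pairs[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dist.argmin(axis=1)

# Hypothetical weight matrix (rows = outputs, columns = inputs).
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))

# Rotate the input dimension; H is orthonormal, so H @ H.T is the identity.
H = hadamard(16)
W_rot = W @ H

# Assumed codebook: a 32 x 32 uniform grid, i.e. 1024 two-dimensional entries.
levels = np.linspace(-3.0, 3.0, 32)
codebook = np.stack(np.meshgrid(levels, levels, indexing="ij"), axis=-1).reshape(-1, 2)

idx = quantize_pairs(W_rot, codebook)
W_hat = codebook[idx].reshape(W_rot.shape) @ H.T  # dequantize, then undo the rotation
rel_err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)

# Arithmetic behind "32% fewer weight bits" relative to 8-bit weights.
target_bits = 8 * (1 - 0.32)  # 5.44 bits per weight
```

Under these assumptions, a 1024-entry 2D codebook costs `bits_per_weight(1024)` = 5 index bits per weight, and 32% fewer bits than 8-bit weights works out to 5.44 bits per weight, which sits inside the 5–6 bit range quoted in the summary.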
