This paper proposes Saliency-Aware Regularized Quantization Calibration (SARQC), a unified framework that improves Post-Training Quantization (PTQ) for LLMs by adding a regularization term to the calibration objective that keeps quantized weights close to their full-precision values, improving generalization and downstream performance.
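The blurb doesn't give SARQC's exact objective, but the general shape of a proximity-regularized calibration loss can be sketched as below. The `fake_quantize` helper, the reconstruction term, and the `lambda_reg` weighting are illustrative assumptions, not the paper's actual formulation.

```python
import torch
import torch.nn.functional as F

def fake_quantize(w: torch.Tensor, num_bits: int = 4) -> torch.Tensor:
    """Uniform symmetric fake-quantization with a straight-through estimator."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = (w.abs().max() / qmax).clamp(min=1e-8).detach()
    w_q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
    # Straight-through estimator: forward pass uses w_q, gradients flow to w.
    return w + (w_q - w).detach()

def sarqc_style_loss(w: torch.Tensor, w_fp: torch.Tensor, x: torch.Tensor,
                     lambda_reg: float = 0.1, num_bits: int = 4) -> torch.Tensor:
    """Layer reconstruction error plus a weight-proximity regularizer.

    w    : weights being calibrated (initialized from w_fp)
    w_fp : frozen full-precision reference weights
    x    : a batch of calibration activations
    """
    w_q = fake_quantize(w, num_bits)
    recon = F.mse_loss(x @ w_q.T, x @ w_fp.T)   # match full-precision outputs
    proximity = F.mse_loss(w_q, w_fp)           # stay close to original weights
    return recon + lambda_reg * proximity
```

The straight-through estimator keeps the rounding step differentiable, so a loss of this shape can be minimized over calibration data with ordinary gradient descent.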
Researchers from MIT CSAIL and other institutions introduced CompreSSM, a technique that compresses state-space models (SSMs) during training by pruning redundant components early, yielding faster training and smaller models without sacrificing performance.
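CompreSSM's pruning criterion isn't specified in this summary; the sketch below illustrates the general idea with a toy diagonal SSM layer whose state dimensions are scored by parameter magnitude (a placeholder criterion) and pruned early in training.

```python
import torch
import torch.nn as nn

class PrunableSSMLayer(nn.Module):
    """Toy diagonal state-space layer whose state dimensions can be pruned.

    The magnitude-based importance score is a placeholder, not
    CompreSSM's actual criterion.
    """
    def __init__(self, d_model: int, d_state: int):
        super().__init__()
        self.A = nn.Parameter(torch.randn(d_state))                   # diagonal state transition
        self.B = nn.Parameter(torch.randn(d_state, d_model) * 0.02)   # input projection
        self.C = nn.Parameter(torch.randn(d_model, d_state) * 0.02)   # output projection

    @torch.no_grad()
    def prune_states(self, keep_ratio: float) -> None:
        """Drop the lowest-scoring state dimensions early in training."""
        score = self.B.abs().sum(dim=1) * self.C.abs().sum(dim=0)
        k = max(1, int(keep_ratio * score.numel()))
        idx = torch.topk(score, k).indices.sort().values
        self.A = nn.Parameter(self.A[idx].clone())
        self.B = nn.Parameter(self.B[idx].clone())
        self.C = nn.Parameter(self.C[:, idx].clone())
```

After a brief warmup, a call like `layer.prune_states(0.5)` would halve the state dimension; the optimizer must then be rebuilt over the new, smaller parameter set before training continues.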