Introducing cyankiwi AWQ 4-bit quantization: 26.05 update

Reddit r/LocalLLaMA 2026/05/14 18:49 Models

quantization awq 4-bit llama open-source model-optimization benchmark

Summary

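Cyankiwi has released an updated version of its AWQ 4-bit quantization method. The method optimizes the scaling factors and the quantization range jointly, and on Llama-3 models it reaches a lower KL divergence than the other 4-bit quantizations it was benchmarked against.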
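For readers unfamiliar with the metric, here is a minimal sketch of how a KL divergence against a reference model can be computed from logits. The post does not say how the KLD was aggregated, so the direction KL(reference || quantized), the per-token mean, and the use of nats are all assumptions, and `mean_token_kld` is a hypothetical helper name.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the last (vocabulary) axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def mean_token_kld(ref_logits, quant_logits):
    """Mean over token positions of KL(P_ref || P_quant), in nats.

    Both inputs have shape (num_tokens, vocab_size). The direction and the
    averaging scheme are assumptions, not taken from the post.
    """
    log_p = log_softmax(np.asarray(ref_logits, dtype=np.float64))
    log_q = log_softmax(np.asarray(quant_logits, dtype=np.float64))
    per_token = (np.exp(log_p) * (log_p - log_q)).sum(axis=-1)
    return float(per_token.mean())

# Toy stand-ins for the BF16 logits and a slightly perturbed quantized model.
rng = np.random.default_rng(0)
ref = rng.normal(size=(8, 100))
quant = ref + 0.05 * rng.normal(size=ref.shape)
print(f"mean token KLD: {mean_token_kld(ref, quant):.6f}")
```

A value of 0 means the quantized model reproduces the reference distribution exactly at every position; larger values mean the output distributions have drifted further apart.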

In standard AWQ, the per-channel scaling factors and the quantization range are chosen in two steps: first the scaling factors, then the quantization parameters. The two are not independent, though. The rounding error produced by one depends on the choice of the other, so optimizing them sequentially costs quality.

Our cyankiwi AWQ 26.05 update fits the scaling factors and the quantization range jointly against a reconstruction objective.

Using Llama-3 as the example, we benchmarked the cyankiwi AWQ 26.05 update against the mainstream 4-bit methods listed below, measuring KL divergence (KLD) relative to the BF16 baseline on GPQA Diamond responses.

Result: cyankiwi achieves the lowest KLD on all three base models. Lower is better.

# Llama-3.2-3B-Instruct

|Quantized model|Method|KLD|
|:-|:-|:-|
|**cyankiwi/Llama-3.2-3B-Instruct-AWQ-INT4**|**cyankiwi AWQ INT4**|**0.00510**|
|unsloth/Llama-3.2-3B-Instruct-unsloth-bnb-4bit|unsloth BNB NF4|0.00785|
|unsloth/Llama-3.2-3B-Instruct-bnb-4bit|BNB NF4|0.00896|
|nvidia/Meta-Llama-3.2-3B-Instruct-ONNX-INT4|AWQ INT4|0.01494|
|casperhansen/llama-3.2-3b-instruct-awq|AWQ INT4|0.02437|

# Llama-3.1-8B-Instruct

|Quantized model|Method|KLD|
|:-|:-|:-|
|**cyankiwi/Llama-3.1-8B-Instruct-AWQ-INT4**|**cyankiwi AWQ INT4**|**0.00478**|
|RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w4a16|GPTQ INT4|0.00729|
|unsloth/Meta-Llama-3.1-8B-Instruct-unsloth-bnb-4bit|unsloth BNB NF4|0.00769|
|unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit|BNB NF4|0.00835|
|RedHatAI/Llama-3.1-8B-Instruct-NVFP4|SmoothQuant NVFP4|0.01059|
|nvidia/Llama-3.1-8B-Instruct-NVFP4|NVFP4|0.01190|

# Llama-3.3-70B-Instruct

|Quantized model|Method|KLD|
|:-|:-|:-|
|**cyankiwi/Llama-3.3-70B-Instruct-AWQ-INT4**|**cyankiwi AWQ INT4**|**0.02826**|
|unsloth/Llama-3.3-70B-Instruct-unsloth-bnb-4bit|unsloth BNB NF4|0.04444|
|casperhansen/llama-3.3-70b-instruct-awq|AWQ INT4|0.04859|
|unsloth/Llama-3.3-70B-Instruct-bnb-4bit|BNB NF4|0.06879|
|nvidia/Llama-3.3-70B-Instruct-NVFP4|NVFP4|0.08307|
|RedHatAI/Llama-3.3-70B-Instruct-quantized.w4a16|GPTQ INT4|0.09272|

https://preview.redd.it/uicubbg6951h1.png?width=6400&format=png&auto=webp&s=2f7f1d4e46c9953f00c68518b3c2aa058fc34e32
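The gap between sequential and joint selection can be illustrated on a single toy linear layer. The sketch below is not cyankiwi's algorithm, which the post does not describe in detail. It assumes an AWQ-style per-input-channel scale of the form `mean(|x|) ** alpha`, a symmetric per-row INT4 quantizer with a clip ratio, a plain grid search, and a mean-squared output reconstruction error. All function names are hypothetical.

```python
import numpy as np

def quantize_int4(w, clip):
    """Symmetric per-row INT4 fake-quantization; `clip` shrinks the range."""
    max_abs = np.abs(w).max(axis=1, keepdims=True) * clip
    step = np.maximum(max_abs, 1e-12) / 7.0
    return np.clip(np.round(w / step), -8, 7) * step

def recon_error(x, w, alpha, clip):
    """Output MSE after scaling input channels by s and quantizing w * s."""
    s = np.maximum(np.abs(x).mean(axis=0), 1e-8) ** alpha  # one scale per input channel
    w_q = quantize_int4(w * s, clip) / s                    # fold 1/s back into the weights
    return float(np.mean((x @ w.T - x @ w_q.T) ** 2))

def sequential_search(x, w, alphas, clips):
    """Two-step selection: pick alpha at full range, then pick clip."""
    alpha = min(alphas, key=lambda a: recon_error(x, w, a, 1.0))
    clip = min(clips, key=lambda c: recon_error(x, w, alpha, c))
    return alpha, clip, recon_error(x, w, alpha, clip)

def joint_search(x, w, alphas, clips):
    """Pick (alpha, clip) together over the full grid."""
    pairs = ((a, c) for a in alphas for c in clips)
    alpha, clip = min(pairs, key=lambda p: recon_error(x, w, p[0], p[1]))
    return alpha, clip, recon_error(x, w, alpha, clip)

# Toy calibration data: 64 input channels, a few with large activations.
rng = np.random.default_rng(0)
channel_gain = np.where(rng.random(64) < 0.1, 10.0, 1.0)
x = rng.normal(size=(256, 64)) * channel_gain
w = rng.normal(size=(32, 64))

alphas = np.linspace(0.0, 1.0, 11)
clips = np.linspace(0.5, 1.0, 11)

seq_alpha, seq_clip, seq_err = sequential_search(x, w, alphas, clips)
joint_alpha, joint_clip, joint_err = joint_search(x, w, alphas, clips)
print(f"sequential: alpha={seq_alpha:.1f} clip={seq_clip:.2f} err={seq_err:.6f}")
print(f"joint:      alpha={joint_alpha:.1f} clip={joint_clip:.2f} err={joint_err:.6f}")
```

Because the pair found by the sequential search is itself a point on the joint grid, the joint search can never do worse on this objective. How much it actually gains depends on the data, and a production method would presumably use something cheaper than an exhaustive grid.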
