Zyphra/ZAYA1-8B

Hugging Face Models Trending · 2026/05/04 19:45 · Model

zyphra zaya1-8b mixture-of-experts reasoning open-weight mathematics

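Summary

Zyphra has released ZAYA1-8B, a Mixture-of-Experts (MoE) model with 8.4 billion total parameters, of which 760 million are active. The model is highly efficient and performs strongly on math and code reasoning tasks.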

Tags: safetensors, zaya, license:apache-2.0, eval-results, region:us


Cached: 2026/05/08 08:52


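Source: https://huggingface.co/Zyphra/ZAYA1-8B

ZAYA1-8B is a small Mixture-of-Experts (MoE) language model with 760 million active parameters and 8.4 billion total parameters, trained end to end by Zyphra. By combining a novel architecture with innovations in pretraining and post-training, ZAYA1-8B sets a new standard for intelligence efficiency in its parameter class.

ZAYA1-8B excels at detailed long-form reasoning, particularly on math and code tasks. In these domains it performs well above its weight class, and its inference efficiency and small size make it well suited to test-time compute schemes.

Because its total parameter count is small, ZAYA1-8B can also be deployed on-device for local LLM applications.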
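To put the on-device claim in rough numbers, the sketch below estimates weight storage from the parameter counts quoted above. It is a back-of-the-envelope calculation, not a measurement: it counts weights only and ignores the KV cache, activations, and runtime overhead, and the function name is purely illustrative.

```python
# Parameter counts as quoted in the model description.
TOTAL_PARAMS = 8.4e9    # all experts must be resident in memory
ACTIVE_PARAMS = 0.76e9  # parameters actually used per token

# Bytes per parameter for two common storage dtypes.
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Weight storage in gigabytes (1 GB = 1e9 bytes); weights only."""
    return num_params * bytes_per_param / 1e9


for name, width in BYTES_PER_PARAM.items():
    print(f"{name}: ~{weight_memory_gb(TOTAL_PARAMS, width):.1f} GB of weights")

# Share of the model that participates in each forward pass.
print(f"active fraction per token: {ACTIVE_PARAMS / TOTAL_PARAMS:.1%}")
```

Under these assumptions the weights take about 16.8 GB in bfloat16, while only about 9% of the parameters are used for any given token. Memory is driven by the total count and per-token compute by the active count.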

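For more details, see our technical report (https://arxiv.org/abs/2605.05365) and blog post (https://www.zyphra.com/post/zaya1-8b).

This is the post-trained reasoning version of ZAYA1-8B. The pretrained base model is available here (https://huggingface.co/Zyphra/ZAYA1-reasoning-base).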

Performance

ZAYA1-8B performs very strongly, especially on challenging math, reasoning, and coding benchmarks. On math benchmarks it is competitive with frontier-scale reasoning models many times its size.

ZAYA_ttc_paper_light_no_dsv32_lcb_no_o4_hmmt_feb_dsv32_925_claude45_base_labels_matched_gap_transparent (https://cdn-uploads.huggingface.co/production/uploads/65c05e75c084467acab2f84a/f5tbexK3BumixnJuBZxo_.png)

western_os_comparison_transparent_barchart (https://cdn-uploads.huggingface.co/production/uploads/65c05e75c084467acab2f84a/W8bn6ZAocWKFuicjtjesv.png)

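We first compare ZAYA1-8B against the SOTA Qwen3 and Qwen3.5 model series at roughly the same parameter count, as well as the recently released Gemma4 models. We then compare it against a range of larger open-weight models.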

In-class comparison against open-source reasoning models

Parameter counts in the column headers are given as active / total.

| Category | Benchmark | ZAYA1-8B (0.7B / 8.0B) | Qwen3-4B-Thinking-2507 (4.0B / 4.0B) | Qwen3.5-4B (4.0B / 4.0B) | Gemma-4-E4B-it (4.0B / 8.0B*) |
|---|---|---|---|---|---|
| Math | AIME’26 | 89.1 | 77.5 | 84.5 | 50.3 |
| Math | HMMT Feb.’26 | 71.6 | 60.8 | 63.6 | 32.1 |
| Math | IMO-AnswerBench | 59.3 | 50.9 | 48.7 | 27.3 |
| Math | APEX-shortlist | 32.2 | 16.9 | – | 6.1 |
| Code | LiveCodeBench-v6 | 65.8 | 54.2 | – | 54.2 |
| Knowledge | GPQA-Diamond | 71.0 | 66.5 | 76.2 | 57.4 |
| Knowledge | MMLU-Pro | 74.2 | 74.3 | 79.1 | 70.2 |
| Instruction | IFEval | 85.58 | 86.8 | 89.8 | 88.50 |
| Instruction | IFBench | 52.56 | 52.9 | 59.2 | 42.67 |
| Style & chat | EQBench | 72.95 | 79.6 | 79.5 | 80.15 |
| Style & chat | Creative Writing v3 | 62.97 | 58.6 | 72.9 | 83.75 |
| Agentic | BFCL-v4 | 39.22 | 49.7 | 45.2 | 31.7 |
| Agentic | τ2 | 43.12 | 52.9 | 82.1 | 37.7 |

Scaling comparison against larger open-source reasoning models

| Model | Active params | Total params | AIME’26 | HMMT’26 | LCB-v6 | IFEval | GPQA-D | MMLU-Pro |
|---|---|---|---|---|---|---|---|---|
| ZAYA1-8B | 0.7B | 8B | 89.1 | 71.6 | 63.8 | 85.8 | 71.0 | 74.2 |
| Arcee-Trinity-Mini | 3B | 26B | 59.6 | 36.9 | 33.3 | 62.0 | 46.8 | 70.6 |
| N3-Nano-30B | 3B | 30B | 90.1 | 75.5 | 64.6 | 92.8 | 75.1 | 78.9 |
| OLMo-3.1-32B-Think | 32B | 32B | 78.9 | 50.6 | 58.3 | 93.2 | 59.6 | 75.8 |
| Qwen3-Next-80B-A3B-Think | 3B | 80B | 90.2 | 79.3 | 67.8 | 88.5 | 76.7 | 82.6 |
| Intellect-3 | 12B | 106B | 86.3 | 72.2 | 66.8 | 81.2 | 74.6 | 82.3 |
| Mistral-Small-4-119B | 6B | 119B | 86.4 | 70.6 | 57.9 | 84.0 | 77.2 | 81.6 |

All numbers were obtained on the Zyphra evaluation harness. Models are sorted by total parameter count.

Quickstart

Prerequisites

We recommend installing the following libraries in a fresh Python environment (tested on Python 3.12).

To use ZAYA1-8B, install the zaya1-pr branch of our vLLM fork (this command triggers a full build of vLLM from source):

pip install "vllm @ git+https://github.com/Zyphra/vllm.git@zaya1-pr"

If you want to run the model with transformers, also install the zaya1 branch of our transformers fork:

pip install "transformers @ git+https://github.com/Zyphra/transformers.git@zaya1"
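Once the fork is installed, loading the model might look like the sketch below. This is an assumption rather than a documented recipe: it presumes the fork exposes ZAYA1-8B through the standard `AutoTokenizer` / `AutoModelForCausalLM` interface with a chat template, that `torch` and `accelerate` are available (for `device_map="auto"`), and that `top_k=0` is the right way to mirror the "top-k -1" sampling recommendation. The helper names `build_messages` and `generate` are illustrative only.

```python
def build_messages(prompt: str) -> list[dict]:
    """Wrap a user prompt in the chat-message format used by chat templates."""
    return [{"role": "user", "content": prompt}]


def generate(prompt: str, max_new_tokens: int = 512) -> str:
    """Generate a reply with transformers (assumes the Zyphra fork is installed)."""
    # Imported lazily so build_messages works without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Zyphra/ZAYA1-8B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # assumption: same dtype as the vLLM example
        device_map="auto",           # assumption: requires the accelerate package
    )

    inputs = tokenizer.apply_chat_template(
        build_messages(prompt),
        add_generation_prompt=True,
        return_tensors="pt",
        return_dict=True,
    ).to(model.device)

    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=1.0,
        top_p=0.95,
        top_k=0,  # 0 disables top-k filtering in transformers
    )
    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("What is 17 * 23?")` downloads the weights on first use, so expect a long first run.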

Deployment

To start a vLLM server, run the following command:

vllm serve Zyphra/ZAYA1-8B --port 8010 \
   --mamba-cache-dtype float32 --dtype bfloat16 \
   --reasoning-parser qwen3 --enable-auto-tool-choice --tool-call-parser zaya_xml

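For parallel deployment, we recommend data parallelism (DP) combined with expert parallelism (EP), because the branch above does not support tensor parallelism (TP) for CCA. When running on 8 GPUs, add the flags -dp 8 -ep to run with DP=EP=8.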
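As an illustration of how the parallel flags combine with the base command, the following sketch assembles the `vllm serve` argument list for a given GPU count. It uses only the flags shown in this section. The helper `build_serve_command` is hypothetical, and treating a single GPU as "no extra flags" is an assumption.

```python
import shlex


def build_serve_command(num_gpus: int = 1, port: int = 8010) -> list[str]:
    """Return the vllm serve argv, adding DP+EP flags for multi-GPU runs."""
    cmd = [
        "vllm", "serve", "Zyphra/ZAYA1-8B",
        "--port", str(port),
        "--mamba-cache-dtype", "float32",
        "--dtype", "bfloat16",
        "--reasoning-parser", "qwen3",
        "--enable-auto-tool-choice",
        "--tool-call-parser", "zaya_xml",
    ]
    if num_gpus > 1:
        # DP = EP = num_gpus; tensor parallelism is intentionally not used.
        cmd += ["-dp", str(num_gpus), "-ep"]
    return cmd


# Print a copy-pasteable command line for an 8-GPU node.
print(shlex.join(build_serve_command(num_gpus=8)))
```

The resulting list can be passed directly to `subprocess.run` if you prefer to launch the server from Python.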

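For our evaluations and for general use, we recommend temperature 1.0, top-p 0.95, and top-k -1. For agentic and coding use cases, we recommend top-p 0.6.

Once the server is running, you can query the model with curl, as in the following example: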

curl http://localhost:8010/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "Zyphra/ZAYA1-8B",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello. How is it going?"}
        ]
    }'
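The same request can be sent from Python with the recommended sampling settings applied. The sketch below uses only the standard library and rests on some assumptions: the server is listening on port 8010 as in the command above, the vLLM endpoint accepts `top_k` as an extra field in the request body, and agentic or coding requests change only top-p while keeping the other settings. The helper names are illustrative.

```python
import json
import urllib.request

CHAT_URL = "http://localhost:8010/v1/chat/completions"


def build_payload(prompt: str, agentic: bool = False) -> dict:
    """Build a chat-completions body with the recommended sampling settings."""
    return {
        "model": "Zyphra/ZAYA1-8B",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": 1.0,
        # top-p 0.95 for general use, 0.6 for agentic and coding use cases.
        "top_p": 0.6 if agentic else 0.95,
        "top_k": -1,  # assumption: accepted by the server as an extra parameter
    }


def query(prompt: str, agentic: bool = False, url: str = CHAT_URL) -> str:
    """POST the payload to a running server and return the reply text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt, agentic)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running, `query("Hello. How is it going?")` mirrors the curl call, and `query(..., agentic=True)` switches to the lower top-p.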
