meituan-longcat/LongCat-2.0

Hugging Face Models Trending 2026/06/30 03:47 Model

large-language-model mixture-of-experts long-context ai-asic model-release open-source

Summary

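LongCat-2.0 is a large-scale MoE language model with 1.6 trillion total parameters, of which about 48B are activated per token. It was trained on an AI ASIC supercomputing cluster with 1M-context data and performs strongly on coding and agentic tasks.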

Tags: region:us

Cached: 2026/06/30 23:30

Source: https://huggingface.co/meituan-longcat/LongCat-2.0

LongCat-2.0

Hugging Face (https://huggingface.co/meituan-longcat)

WeChat Official Account (https://github.com/meituan-longcat/LongCat-2.0/blob/main/figures/wechat_official_accounts.png)

Follow on Twitter (https://x.com/Meituan_LongCat)

License (https://huggingface.co/meituan-longcat/LICENSE)

Tech Blog 📄 (https://longcat.chat/blog/longcat-2.0)

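We introduce LongCat-2.0, a large-scale MoE language model with 1.6 trillion total parameters, of which approximately 48 billion are activated per token. It is a significant scale-up over previous LongCat models and comes with several architectural improvements.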
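To put the two parameter counts in proportion, the short sketch below computes the share of the model that is active for any single token. It uses only the figures quoted above; since the active count is stated as approximate, the result is approximate too. The constant and function names are illustrative and do not come from the LongCat release.

```python
# Figures quoted in the model card; the active count is approximate.
TOTAL_PARAMS = 1_600_000_000_000   # 1.6 trillion total parameters
ACTIVE_PARAMS = 48_000_000_000     # ~48 billion activated per token


def active_fraction(total_params: int, active_params: int) -> float:
    """Return the fraction of parameters used for a single token."""
    if total_params <= 0:
        raise ValueError("total_params must be positive")
    return active_params / total_params


fraction = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
print(f"Active per token: {fraction:.1%} of total parameters")
```

Roughly 3% of the weights take part in each token's forward pass, which is why per-token compute in an MoE model tracks the active count and not the total.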

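The complete training pipeline and large-scale deployment run entirely on AI ASIC supercomputing clusters. Pre-training consumed millions of accelerator-hours in total and processed more than 35 trillion tokens with no rollbacks or unrecoverable loss spikes, which demonstrates our ability to conduct frontier-scale training on alternative hardware platforms.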

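To strengthen the model on long-horizon tasks, we introduce LongCat Sparse Attention and train LongCat-2.0 on hundreds of billions of tokens of 1M-context data. Combined with dedicated post-training optimization, LongCat-2.0 delivers strong performance on coding and agentic tasks.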
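The model card does not describe how LongCat Sparse Attention selects keys, so the following is only a generic back-of-the-envelope sketch of why sparse attention matters at this context length. It counts query-key score entries per head per layer for dense attention and for a hypothetical scheme in which each query attends to a fixed budget of keys. The sequence length of 1,000,000 (for "1M") and the budget of 2,048 keys are assumptions chosen for illustration, not published LongCat values.

```python
def dense_score_count(seq_len: int) -> int:
    """Query-key pairs scored by full (non-causal) dense attention."""
    return seq_len * seq_len


def sparse_score_count(seq_len: int, keys_per_query: int) -> int:
    """Pairs scored when each query attends to at most keys_per_query keys."""
    return seq_len * min(seq_len, keys_per_query)


SEQ_LEN = 1_000_000       # assumption: "1M" read as one million tokens
KEYS_PER_QUERY = 2_048    # hypothetical budget, not a LongCat value

dense = dense_score_count(SEQ_LEN)
sparse = sparse_score_count(SEQ_LEN, KEYS_PER_QUERY)
print(f"dense:  {dense:,} scores per head per layer")
print(f"sparse: {sparse:,} scores per head per layer")
print(f"reduction: {dense / sparse:.1f}x")
```

Dense cost grows quadratically with sequence length, while a fixed per-query budget grows linearly, so the gap widens as the context gets longer.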

🏋️ Model weights coming soon. Stay tuned!