LiquidAI/LFM2.5-230M

Hugging Face Models Trending 2026/06/24 23:14 Model

liquid-ai small-language-model edge-inference on-device hybrid-model open-source reinforcement-learning

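Summary

Liquid AI has released LFM2.5-230M, a compact 230M-parameter hybrid model optimized for on-device deployment. It offers fast edge inference (213 tok/s on a Galaxy S25 Ultra) and was built with reinforcement learning for agentic tasks.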

Task: text-generation Tags: transformers, safetensors, lfm2, text-generation, liquid, lfm2.5, edge, conversational, en, ar, zh, fr, de, ja, ko, es, pt, it, arxiv:2511.23404, base_model:LiquidAI/LFM2.5-230M-Base, base_model:finetune:LiquidAI/LFM2.5-230M-Base, license:other, endpoints_compatible, region:us

Cached: 2026/06/26 05:21

LiquidAI/LFM2.5-230M · Hugging Face

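Source: https://huggingface.co/LiquidAI/LFM2.5-230M Liquid AI

LFM2.5 is a family of hybrid models designed for on-device deployment. It builds on the LFM2 architecture with extended pre-training and reinforcement learning.

Our most compact model to date: 230M parameters that punch above their size, bringing real capability to the tightest memory and compute budgets.
Fast edge inference: best-in-class throughput from low-cost CPUs to production GPUs, decoding at 213 tok/s on a Galaxy S25 Ultra and 42 tok/s on a Raspberry Pi 5.
Built for agentic tasks: distilled from LFM2.5-350M and refined with multi-stage reinforcement learning, making it well suited to tool use and data extraction.

For more information about LFM2.5-230M, see our blog post (https://www.liquid.ai/blog/lfm2-5-230m).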
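
To put the quoted decode rates in perspective, the sketch below converts them into wall-clock time for a response of a given length. It is plain arithmetic on the two figures above; it assumes a steady decode rate and ignores prompt prefill, so real latency will be somewhat higher. The helper names are illustrative, not part of any Liquid AI tooling.

```python
# Decode rates quoted in the model card, in tokens per second.
DECODE_RATES = {"Galaxy S25 Ultra": 213.0, "Raspberry Pi 5": 42.0}


def decode_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Seconds needed to decode num_tokens at a steady rate (prefill excluded)."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


def estimate_all(num_tokens: int) -> dict:
    """Decode-time estimate for every device listed in DECODE_RATES."""
    return {
        device: decode_seconds(num_tokens, rate)
        for device, rate in DECODE_RATES.items()
    }


if __name__ == "__main__":
    for device, seconds in estimate_all(512).items():
        print(f"{device}: about {seconds:.1f} s for 512 tokens")
```

At these rates a 512-token answer takes roughly 2.4 s on the phone and roughly 12.2 s on the Raspberry Pi 5.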

lfm2_5_230m_benchmarks (https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/4UpNxlgfKjfgT5ByIVph0.png)

🗒️ Model Details

Model	Parameters	Description
LFM2.5-230M-Base (https://huggingface.co/LiquidAI/LFM2.5-230M-Base)	230M	Pre-trained base model for fine-tuning
LFM2.5-230M (https://huggingface.co/LiquidAI/LFM2.5-230M)	230M	General-purpose instruction-tuned model

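LFM2.5-230M is a general-purpose, text-only model with the following characteristics:

Parameters: 230M
Layers: 14 (8 double-gated LIV convolution blocks + 6 GQA blocks)
Training budget: 19T tokens
Context length: 32,768 tokens
Vocabulary size: 65,536
Knowledge cutoff: mid-2024
Supported languages: English, Arabic, Chinese, French, German, Italian, Japanese, Korean, Portuguese, Spanish
Generation parameters:
- temperature: 0.1
- top_k: 50
- repetition_penalty: 1.05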
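
As a rough guide to what 230M parameters means for memory, the sketch below multiplies the parameter count by the bytes each weight occupies at a few common precisions. This is a back-of-the-envelope estimate, not a measured figure: it covers weights only, and it ignores activations, attention caches, quantization scales, and any layers a given export keeps at higher precision.

```python
PARAMS = 230_000_000  # parameter count stated in the model details

# Bytes per weight at common storage precisions.
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_megabytes(num_params: int, dtype: str) -> float:
    """Approximate weight storage in MB (10**6 bytes) for the given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: about {weight_megabytes(PARAMS, dtype):.0f} MB")
```

In bfloat16, the dtype used in the Transformers quick start, the weights alone come to about 460 MB.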

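Model	Description
LFM2.5-230M (https://huggingface.co/LiquidAI/LFM2.5-230M)	Original model checkpoint in native format. Best for fine-tuning or inference with Transformers, vLLM, and SGLang.
LFM2.5-230M-GGUF (https://huggingface.co/LiquidAI/LFM2.5-230M-GGUF)	Quantized format for llama.cpp and compatible tools. Optimized for edge inference and local deployment.
LFM2.5-230M-ONNX (https://huggingface.co/LiquidAI/LFM2.5-230M-ONNX)	ONNX Runtime format for cross-platform deployment.
LFM2.5-230M-MLX (https://huggingface.co/LiquidAI/LFM2.5-230M-MLX-8bit)	MLX format for Apple Silicon. Optimized for fast inference on Mac devices.

We recommend it for data extraction and lightweight on-device agentic workflows. It is not recommended for reasoning-heavy tasks such as advanced math, code generation, or creative writing.

Chat Template

LFM2.5 uses a ChatML-like format. See the chat template documentation (https://docs.liquid.ai/lfm/key-concepts/chat-template) for details. Example: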

<|startoftext|><|im_start|>system
You are a helpful assistant trained by Liquid AI.<|im_end|>
<|im_start|>user
What is C. elegans?<|im_end|>
<|im_start|>assistant

You can use tokenizer.apply_chat_template() (https://huggingface.co/docs/transformers/en/chat_templating#using-applychattemplate) to format messages automatically.
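
To make the format concrete, here is a minimal hand-written renderer that reproduces the example prompt above from a list of messages. It is only an illustration of the layout: the function name `render_chat` is hypothetical, the exact whitespace is inferred from the example, and the template shipped with the tokenizer remains the authoritative version.

```python
BOS = "<|startoftext|>"


def render_chat(messages, add_generation_prompt=True):
    """Render messages in the ChatML-like layout shown in the example.

    Each message is a dict with "role" and "content" keys. Newline placement
    is an assumption read off the example, not taken from the real template.
    """
    parts = [BOS]
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    print(render_chat([
        {"role": "system", "content": "You are a helpful assistant trained by Liquid AI."},
        {"role": "user", "content": "What is C. elegans?"},
    ]))
```

In practice, prefer `tokenizer.apply_chat_template()`, which stays in sync with the model's own template.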

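Tool Use

LFM2.5 supports function calling in four steps:

Function definition: provide the list of tools as JSON objects in the system prompt, or use tokenizer.apply_chat_template() (https://huggingface.co/docs/transformers/en/chat_extras#passing-tools) with tools=... set.
Function call: by default, LFM2.5 writes Python-style function calls (a Python list placed between the <|tool_call_start|> and <|tool_call_end|> special tokens) as the assistant answer. You can override this behavior by asking the model in the system prompt to output JSON function calls.
Function execution: execute the call and return the result using the tool role.
Final answer: LFM2.5 interprets the tool output and returns a plain-text answer to the original prompt.

See the tool use documentation (https://docs.liquid.ai/lfm/key-concepts/tool-use) for the complete guide. Example: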
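
The first step can be sketched as a small helper that serializes tool definitions into the system prompt. The "List of tools: " prefix is copied from the example in this section; the helper name `tools_system_prompt` is hypothetical, and passing `tools=...` to `tokenizer.apply_chat_template()` is the alternative the card itself mentions.

```python
import json


def tools_system_prompt(tools):
    """Build a system prompt that lists the available tools as JSON."""
    return "List of tools: " + json.dumps(tools)


# Tool definition taken from the example in this section.
TOOLS = [{
    "name": "get_candidate_status",
    "description": "Retrieves the current status of a candidate in the recruitment process",
    "parameters": {
        "type": "object",
        "properties": {
            "candidate_id": {
                "type": "string",
                "description": "Unique identifier for the candidate",
            }
        },
        "required": ["candidate_id"],
    },
}]

if __name__ == "__main__":
    messages = [
        {"role": "system", "content": tools_system_prompt(TOOLS)},
        {"role": "user", "content": "What is the current status of candidate ID 12345?"},
    ]
    print(messages[0]["content"])
```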

<|startoftext|><|im_start|>system
List of tools: [{"name": "get_candidate_status", "description": "Retrieves the current status of a candidate in the recruitment process", "parameters": {"type": "object", "properties": {"candidate_id": {"type": "string", "description": "Unique identifier for the candidate"}}, "required": ["candidate_id"]}}]<|im_end|>
<|im_start|>user
What is the current status of candidate ID 12345?<|im_end|>
<|im_start|>assistant
<|tool_call_start|>[get_candidate_status(candidate_id="12345")]<|tool_call_end|>Checking the current status of candidate ID 12345.<|im_end|>
<|im_start|>tool
[{"candidate_id": "12345", "status": "Interview Scheduled", "position": "Clinical Research Associate", "date": "2023-11-20"}]<|im_end|>
<|im_start|>assistant
The candidate with ID 12345 is currently in the "Interview Scheduled" stage for the position of Clinical Research Associate, with an interview date set for 2023-11-20.<|im_end|>
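
Steps two and three of the exchange above can be sketched as follows: extract the Python-style call list from the assistant text, run each call against a registry of local functions, and wrap the results in a tool-role message. This is a minimal sketch under stated assumptions: it handles keyword arguments with literal values only, `get_candidate_status` is a stub standing in for a real backend, and the names `parse_tool_calls`, `run_tool_calls`, and `REGISTRY` are hypothetical. It parses with `ast` rather than `eval`, so model output is never executed as code.

```python
import ast
import json
import re

TOOL_CALL_RE = re.compile(
    r"<\|tool_call_start\|>(.*?)<\|tool_call_end\|>", re.DOTALL
)


def parse_tool_calls(assistant_text):
    """Return [(function_name, kwargs), ...] found in the assistant text."""
    calls = []
    for payload in TOOL_CALL_RE.findall(assistant_text):
        tree = ast.parse(payload.strip(), mode="eval")
        if not isinstance(tree.body, ast.List):
            raise ValueError("expected a Python list of calls")
        for node in tree.body.elts:
            if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
                raise ValueError("expected plain function calls")
            if node.args:
                raise ValueError("positional arguments are not handled in this sketch")
            # literal_eval accepts only literals, so no code is executed here.
            kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
            calls.append((node.func.id, kwargs))
    return calls


def get_candidate_status(candidate_id):
    # Hypothetical stub; a real implementation would query a backend.
    return {"candidate_id": candidate_id, "status": "Interview Scheduled"}


REGISTRY = {"get_candidate_status": get_candidate_status}


def run_tool_calls(assistant_text):
    """Execute every parsed call and return a tool-role message."""
    results = [
        REGISTRY[name](**kwargs)
        for name, kwargs in parse_tool_calls(assistant_text)
    ]
    return {"role": "tool", "content": json.dumps(results)}
```

Appending the returned message to the conversation and generating again yields the final plain-text answer, as in the last assistant turn of the example.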

🏃 Inference

LFM2.5 is supported by many inference frameworks. See the inference documentation (https://docs.liquid.ai/lfm/inference/transformers) for the full list.

Quick start with Transformers (compatible with transformers>=5.0.0):

from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_id = "LiquidAI/LFM2.5-230M"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    dtype="bfloat16",
#   attn_implementation="flash_attention_2" <- uncomment on compatible GPU
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

prompt = "What is C. elegans?"

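# Format the conversation with the model's chat template and tokenize it.
input_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    return_tensors="pt",
    tokenize=True,
    return_dict=True,  # explicit, so that indexing by "input_ids" below is valid
)["input_ids"].to(model.device)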

output = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.1,
    top_k=50,
    repetition_penalty=1.05,
    max_new_tokens=512,
    streamer=streamer,
)

🔧 Fine-Tuning

We recommend fine-tuning LFM2.5 on your specific use case for best results.

📊 Performance

Benchmarks

Model	GPQA Diamond	MMLU-Pro	IFEval	IFBench	Multi-IF
LFM2.5-230M	25.41	20.25	71.71	38.40	37.70
LFM2.5-350M	30.64	20.01	76.96	40.69	44.92
LFM2-350M	27.58	19.29	64.96	18.20	32.92
Granite 4.0-H-350M	22.32	13.14	61.27	17.22	28.70
Granite 4.0-350M	25.91	12.84	53.48	15.98	24.21
Qwen3.5-0.8B (Instruct)	27.41	37.42	59.94	22.87	41.68
Gemma 3 1B IT	23.89	14.04	63.49	20.33	44.25

Model	CaseReportBench	BFCLv3	BFCLv4	τ2-Bench Telecom	τ2-Bench Retail
LFM2.5-230M	22.51	43.26	21.03	5.26	13.68
LFM2.5-350M	32.45	44.11	21.86	18.86	17.84
LFM2-350M	11.67	22.95	12.29	10.82	5.56
Granite 4.0-H-350M	12.44	43.07	13.28	13.74	6.14
Granite 4.0-350M	0.84	39.58	13.73	2.92	6.14
Qwen3.5-0.8B (Instruct)	13.83	35.08	18.70	12.57	6.14
Gemma 3 1B IT	2.28	16.61	7.17	9.36	6.43
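
Since LFM2.5-230M is described as distilled from LFM2.5-350M, one way to read the two tables is to ask how much of the larger model's score the smaller one retains on each benchmark. The sketch below computes that ratio from the numbers above. The "retention" ratio is an illustrative reading aid introduced here, not a metric reported by Liquid AI.

```python
# Scores copied from the two benchmark tables above.
LFM25_230M = {
    "GPQA Diamond": 25.41, "MMLU-Pro": 20.25, "IFEval": 71.71,
    "IFBench": 38.40, "Multi-IF": 37.70, "CaseReportBench": 22.51,
    "BFCLv3": 43.26, "BFCLv4": 21.03,
    "tau2-Bench Telecom": 5.26, "tau2-Bench Retail": 13.68,
}
LFM25_350M = {
    "GPQA Diamond": 30.64, "MMLU-Pro": 20.01, "IFEval": 76.96,
    "IFBench": 40.69, "Multi-IF": 44.92, "CaseReportBench": 32.45,
    "BFCLv3": 44.11, "BFCLv4": 21.86,
    "tau2-Bench Telecom": 18.86, "tau2-Bench Retail": 17.84,
}


def retention(student, teacher):
    """Student score divided by teacher score, per benchmark."""
    return {name: student[name] / teacher[name] for name in student}


if __name__ == "__main__":
    ranked = sorted(retention(LFM25_230M, LFM25_350M).items(), key=lambda kv: kv[1])
    for name, ratio in ranked:
        print(f"{name}: {ratio:.0%}")
```

On these numbers the smaller model keeps over 90% of the 350M score on IFEval, IFBench, BFCLv3, and BFCLv4, while τ2-Bench Telecom shows the largest drop.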

CPU Inference

Image (https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/TCR-MfPtX3YTPvRzxWcG3.png)

GPU Inference

Image (https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/emlcz4gf2wendPhKQWEBN.png)

📬 Contact

Have questions or want to get in touch? Join our Discord community (https://discord.com/invite/liquid-ai)
If you are interested in custom solutions that include edge deployment, contact our sales team (https://www.liquid.ai/contact).

Citation

@article{liquidAI2026230M,
  author = {Liquid AI},
  title = {LFM2.5-230M: Built to Run Anywhere},
  journal = {Liquid AI Blog},
  year = {2026},
  note = {www.liquid.ai/blog/lfm2-5-230m},
}

@article{liquidai2025lfm2,
  title={LFM2 Technical Report},
  author={Liquid AI},
  journal={arXiv preprint arXiv:2511.23404},
  year={2025}
}
