@liquidai: Introducing LFM2.5-230M: our smallest model yet, built to run fast anywhere (CPUs, NPUs, and GPUs) to enable agentic ta…
Summary
Liquid AI releases LFM2.5-230M, a 230M-parameter model optimized for fast inference on CPUs, NPUs, and GPUs, targeting agentic tasks on devices like phones and robots.
Cached at: 06/25/26, 03:25 PM
Introducing LFM2.5-230M: our smallest model yet, built to run fast anywhere (CPUs, NPUs, and GPUs) to enable agentic tasks on phones, robots, home and network automation devices.
230M parameters, built on the LFM2 architecture
Pre-trained on 19T tokens, with a 32K context extension
Post-trained with distillation from LFM2.5-350M
213 tok/s decode speed on Galaxy S25 Ultra (CPU)
42 tok/s on a Raspberry Pi 5 (CPU)
Competes with and often beats models more than twice its size on instruction following, data extraction, and tool use.
Use it for large-scale data extraction pipelines or lightweight on-device agentic workloads.
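To put the quoted figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers from the post; the 256-token reply length and the function names are hypothetical, and it assumes a constant decode rate while ignoring prompt processing (prefill), so real latency will differ.

```python
# Figures quoted in the announcement.
PARAMS = 230e6            # 230M parameters
PRETRAIN_TOKENS = 19e12   # 19T pre-training tokens
DECODE_TOK_PER_S = {
    "Galaxy S25 Ultra (CPU)": 213.0,
    "Raspberry Pi 5 (CPU)": 42.0,
}


def decode_seconds(n_tokens: int, tok_per_s: float) -> float:
    """Seconds to generate n_tokens at a constant decode rate (prefill ignored)."""
    if tok_per_s <= 0:
        raise ValueError("tok_per_s must be positive")
    return n_tokens / tok_per_s


def tokens_per_parameter(tokens: float, params: float) -> float:
    """Ratio of pre-training tokens to model parameters."""
    return tokens / params


if __name__ == "__main__":
    reply_len = 256  # hypothetical reply length in tokens, not from the post
    for device, rate in DECODE_TOK_PER_S.items():
        secs = decode_seconds(reply_len, rate)
        print(f"{device}: {secs:.1f} s for {reply_len} tokens")
    ratio = tokens_per_parameter(PRETRAIN_TOKENS, PARAMS)
    print(f"{ratio:,.0f} pre-training tokens per parameter")
```

Under these assumptions, a short reply takes on the order of a second on the phone and several seconds on the Raspberry Pi 5, which is the scale at which the on-device agentic use the post describes becomes practical.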
Similar Articles
Liquid AI releases LFM2.5-8B-A1B
Liquid AI released LFM2.5-8B-A1B, an edge model with a 128K context window, 38T tokens of pre-training, and large-scale reinforcement learning, capable of tool calling and complex tasks while fitting on an entry-level laptop.
@noctus91: I recently switched from Qwen 3.5 9B to LFM2.5-8B-A1B by @liquidai, and it's quickly become my default local model in H…
A user shares their positive experience switching from Qwen 3.5 9B to Liquid AI's new LFM2.5-8B-A1B model, praising its speed and reliability for agentic tasks while noting coding remains a weakness. The model is an 8B MoE with 1.5B active parameters and 128K context, optimized for devices and server-side use.
LiquidAI/LFM2.5-8B-A1B-GGUF
LiquidAI releases a GGUF quantized version of their LFM2.5-8B-A1B model, with instructions for use across multiple inference engines.
@LottoLabs: A very cool model for the GPU poor bros Trained on an ungodly amount of tokens for a 8b a1b model Gonna be super fast e…
LottoLabs announces LiquidAI's LFM2.5-8B-A1B-GGUF model, an 8B parameter model trained on a massive token count and optimized for fast inference on limited GPU hardware, with support for llama.cpp, Ollama, vLLM, and more.
New LFM2.5 8b A1b model!!
Introducing LFM2.5 8b A1b, a new AI model with performance on par with Nemotron 3 Nano but at higher speed. Support is being added to SmallCode for non-standard tool calls.