@seclink: Xiaomi announces the MiMo-V2.5-Pro-UltraSpeed internal beta: full performance at a peak speed of 1000 tokens/s, fully unleashing the productivity of coding agents. Trial resources are limited, approvals are capped daily, and priority goes to professional institutions.
Summary
Xiaomi has launched an internal beta of its MiMo-V2.5-Pro-UltraSpeed model, which reaches a peak speed of 1000 tokens/s and is aimed at boosting the productivity of coding agents. Trial resources are limited, approvals are capped daily, and priority goes to professional institutions.
Cached at: 06/10/26, 03:53 PM
Xiaomi has a big release: applications are now open for the MiMo-V2.5-Pro-UltraSpeed internal beta.
Full performance at a peak speed of 1000 tokens/s, removing the productivity ceiling for coding agents.
Trial resources are limited, daily approvals are capped, and priority is given to professional institutions.
https://t.co/lg7NsRTwKp
Similar Articles
Xiaomi just claimed 1,000+ tps on a 1T model using a standard 8-GPU server
Xiaomi released MiMo-V2.5-Pro-UltraSpeed in collaboration with TileRT, achieving over 1000 tokens/s decode speed on a 1-trillion-parameter model, enabling real-time AI interaction and accelerating coding agents and reasoning tasks.
@zephyr_z9: This is super big I think this is the first useful speculative decoding method deployed on a big quasi frontier model M…
Xiaomi MiMo releases MiMo-V2.5-Pro-UltraSpeed, achieving over 1,000 tokens per second on a 1 trillion parameter model using speculative decoding, the first practical deployment of such speed at scale.
@heyshrutimishra: OH MY GOD CHINA JUST MATCHED USA FRONTIER CODING AI AT 40-60% LOWER TOKEN COST. XIAOMI JUST DROPPED MiMo-V2.5-Pro score…
Xiaomi released MiMo-V2.5-Pro, a coding AI scoring 73.7 on SWE-Bench Pro (near Claude Opus 4.6's 77.1) at 40-60% lower token cost than US frontier models.
@seclink: Xiaomi has released mimo-code; you can download the code and start using it
Xiaomi released MiMoCode, an open-source AI coding agent with cross-session memory, available on GitHub and installable via one-line command or npm.
China's Xiaomi MiMo Is Now 15X Faster Than ChatGPT and Claude (4 minute read)
Xiaomi achieved over 1,000 tokens per second inference on its trillion-parameter MiMo-V2.5-Pro-UltraSpeed model using commodity 8-GPU nodes via FP4 quantization and DFlash speculative decoding, outpacing GPT-5.5 and Claude Opus by over 10x.