Xiaomi has updated its MiMo model series. The release includes mimo-v2.5-asr (speech recognition with support for multiple dialects and lyric transcription), mimo-v2.5-pro (a trillion parameters and a 1M-token context window), mimo-v2.5 (full-modal perception), and a TTS series. Xiaomi says the updates significantly improve agent performance and speech recognition in complex acoustic scenarios.
An open-source voice synthesis model with 2 billion parameters, trained on 2 million hours of data, has been released. It supports 30 languages and 9 Chinese dialects, lets users describe the target voice in natural language, can clone a voice from a 3-second recording, outputs 48 kHz studio-quality audio, and is free for commercial use under the Apache-2.0 license.