OpenMOSS-Team/MOSS-TTS-Nano-100M

Hugging Face Models Trending 04/02/26, 03:13 PM Models

text-to-speech multilingual lightweight open-source cpu-friendly speech-generation

Summary

MOSS-TTS-Nano is an open-source multilingual speech generation model with only 0.1B parameters, designed for real-time TTS that runs directly on a CPU without a GPU. Released by the OpenMOSS team and MOSI.AI, it enables simple local deployment for web serving and product integration.

Task: text-to-speech Tags: pytorch, moss_tts_nano, text-to-speech, custom_code, zh, en, de, es, fr, ja, it, he, ko, ru, fa, ar, pl, pt, cs, da, sv, hu, el, tr, arxiv:2603.18090, arxiv:2602.10934, license:apache-2.0, region:us

Original Article

Cached at: 04/20/26, 02:45 PM

OpenMOSS-Team/MOSS-TTS-Nano-100M · Hugging Face

Source: https://huggingface.co/OpenMOSS-Team/MOSS-TTS-Nano-100M

MOSS-TTS-Nano is an open-source multilingual tiny speech generation model from MOSI.AI and the OpenMOSS team. With only 0.1B parameters, it is designed for realtime speech generation, can run directly on CPU without a GPU, and keeps the deployment stack simple enough for local demos, web serving, and lightweight product integration.

News

2026.4.10: We release MOSS-TTS-Nano. A demo Space is available at OpenMOSS-Team/MOSS-TTS-Nano. You can also view the demo and more details at openmoss.github.io/MOSS-TTS-Nano-Demo/.

Demo

Online Demo: https://openmoss.github.io/MOSS-TTS-Nano-Demo/
Hugging Face Space: OpenMOSS-Team/MOSS-TTS-Nano

Introduction

MOSS-TTS-Nano concept

MOSS-TTS-Nano focuses on the part of TTS deployment that matters most in practice: small footprint, low latency, good enough quality for realtime products, and simple local setup. It uses a pure autoregressive Audio Tokenizer + LLM pipeline and keeps the inference workflow friendly for both terminal users and web-demo users.

Main Features

Tiny model size: only 0.1B parameters
Native audio format: 48 kHz, 2-channel output
Multilingual: supports Chinese, English, and more
Pure autoregressive architecture: built on Audio Tokenizer + LLM
Streaming inference: low realtime latency and fast first audio
CPU friendly: streaming generation can run on a 4-core CPU
Long-text capable: supports long input with automatic chunked voice cloning
Open-source deployment: direct `python infer.py`, `python app.py`, and packaged CLI support
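
The "automatic chunked voice cloning" feature implies that long input is split into smaller pieces before synthesis. The source does not describe how that split is done, so the following is only a minimal sketch of one common approach, greedy sentence packing. The function name `chunk_text` and the `max_chars` limit are hypothetical and are not taken from the repository.

```python
import re


def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars characters.

    Hypothetical helper for illustration; not the repository's chunking logic.
    """
    # Split after sentence-ending punctuation (Latin and CJK), keeping the mark.
    # Note: this naive rule also splits inside decimals such as "3.14".
    pieces = re.split(r"(?<=[.!?。！？])\s*", text)
    sentences = [p.strip() for p in pieces if p.strip()]

    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        # A single over-long sentence is kept whole rather than cut mid-sentence.
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("One. Two. Three.", max_chars=9):
        print(chunk)
```

Each chunk could then be synthesized with the same reference audio so the voice stays consistent across the whole text. Sentences are joined with a space here, which suits Latin-script text; CJK text would more naturally be joined without one.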

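Supported Languages

MOSS-TTS-Nano currently supports 20 languages:

| Language | Code | Flag | Language | Code | Flag | Language | Code | Flag |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Chinese | zh | 🇨🇳 | English | en | 🇺🇸 | German | de | 🇩🇪 |
| Spanish | es | 🇪🇸 | French | fr | 🇫🇷 | Japanese | ja | 🇯🇵 |
| Italian | it | 🇮🇹 | Hungarian | hu | 🇭🇺 | Korean | ko | 🇰🇷 |
| Russian | ru | 🇷🇺 | Persian (Farsi) | fa | 🇮🇷 | Arabic | ar | 🇸🇦 |
| Polish | pl | 🇵🇱 | Portuguese | pt | 🇵🇹 | Czech | cs | 🇨🇿 |
| Danish | da | 🇩🇰 | Swedish | sv | 🇸🇪 | Greek | el | 🇬🇷 |
| Turkish | tr | 🇹🇷 | | | | | | |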

Quickstart

Environment Setup

We recommend a clean Python environment first, then installing the project in editable mode so the `moss-tts-nano` command becomes available locally. The examples below intentionally keep arguments minimal and rely on the repository defaults. By default, the code loads `OpenMOSS-Team/MOSS-TTS-Nano` and `OpenMOSS-Team/MOSS-Audio-Tokenizer-Nano`.

Using Conda

conda create -n moss-tts-nano python=3.12 -y
conda activate moss-tts-nano

git clone https://github.com/OpenMOSS/MOSS-TTS-Nano.git
cd MOSS-TTS-Nano

pip install -r requirements.txt
pip install -e .

If `WeTextProcessing` fails to install from `requirements.txt`, try installing it manually in the same environment:

conda install -c conda-forge pynini=2.1.6.post1 -y
pip install git+https://github.com/WhizZest/WeTextProcessing.git
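
After the install steps, a quick sanity check can confirm that the environment looks usable. The sketch below only assumes facts stated above: the editable install exposes a `moss-tts-nano` command, and `pynini` is the dependency installed through conda-forge. The helper name `check_environment` is hypothetical.

```python
import importlib.util
import shutil


def check_environment() -> dict[str, bool]:
    """Report whether the CLI and the pynini dependency can be found.

    Hypothetical helper for illustration; it does not load any model.
    """
    return {
        # The packaged CLI should be on PATH after `pip install -e .`.
        "moss-tts-nano on PATH": shutil.which("moss-tts-nano") is not None,
        # pynini is the package installed via conda-forge in the fallback step.
        "pynini importable": importlib.util.find_spec("pynini") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```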

Voice Clone with `infer.py`

This repository keeps the direct Python entrypoint for local inference. The example below uses voice clone mode, which is the main recommended workflow for MOSS-TTS-Nano.

python infer.py \
  --prompt-audio-path assets/audio/zh_1.wav \
  --text "欢迎关注模思智能、上海创智学院与复旦大学自然语言处理实验室。"

This writes audio to `generated_audio/infer_output.wav` by default.
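
To call the same entrypoint from another Python script, one option is to wrap the command in `subprocess`. This is a minimal sketch that uses only the two flags shown above (`--prompt-audio-path`, `--text`) and the default output path. The helper names `build_infer_command` and `synthesize` are hypothetical, and the sketch assumes it is run with the cloned repository as `repo_dir`.

```python
import subprocess
import sys
from pathlib import Path


def build_infer_command(prompt_audio: str, text: str) -> list[str]:
    """Build the argument list for the infer.py call shown in this section."""
    return [
        sys.executable, "infer.py",
        "--prompt-audio-path", prompt_audio,
        "--text", text,
    ]


def synthesize(prompt_audio: str, text: str, repo_dir: str = ".") -> Path:
    """Run infer.py inside repo_dir and return the documented default output path."""
    # check=True raises CalledProcessError if infer.py exits with an error.
    subprocess.run(build_infer_command(prompt_audio, text), cwd=repo_dir, check=True)
    return Path(repo_dir) / "generated_audio" / "infer_output.wav"


if __name__ == "__main__":
    print(build_infer_command("assets/audio/zh_1.wav", "Hello from MOSS-TTS-Nano."))
```

Because every call writes to the same default path, a later call presumably overwrites the earlier file; copy the result elsewhere if several utterances are generated in a loop.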

Local Web Demo with `app.py`

You can launch the local FastAPI demo for browser-based testing:

python app.py

Then open http://127.0.0.1:18083 in your browser.
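
Model loading may take a moment, so a script that starts the demo may want to wait until the server answers before opening the page. The sketch below polls the address given above using only the standard library. The helper name `wait_for_server` and its timeout values are hypothetical, and no specific endpoint beyond the root URL is assumed.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url: str = "http://127.0.0.1:18083",
                    timeout_s: float = 60.0,
                    interval_s: float = 1.0) -> bool:
    """Poll url until it answers or timeout_s elapses; return True if it answered."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval_s):
                return True
        except urllib.error.HTTPError:
            # The server replied, even if with an error status, so it is up.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused or timed out: not listening yet, retry shortly.
            time.sleep(interval_s)
    return False


if __name__ == "__main__":
    print("server ready" if wait_for_server(timeout_s=5.0) else "server not reachable")
```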

CLI Command: `moss-tts-nano generate`

After `pip install -e .`, you can call the packaged CLI directly:

moss-tts-nano generate \
  --prompt-speech assets/audio/zh_1.wav \
  --text "欢迎关注模思智能、上海创智学院与复旦大学自然语言处理实验室。"

Useful notes:

`moss-tts-nano generate` writes to `generated_audio/moss_tts_nano_output.wav` by default.
`--prompt-speech` is the friendly alias for the reference audio path used by voice cloning.
`--text-file` is supported for long-form synthesis.
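
To confirm that a generated file matches the 48 kHz, 2-channel format listed under Main Features, the WAV header can be inspected with Python's standard `wave` module. This is a small sketch; the helper name `describe_wav` is hypothetical, and it assumes the output is integer PCM, since `wave` cannot read floating-point WAV files.

```python
import wave


def describe_wav(path: str) -> dict:
    """Return channel count, sample rate, and duration of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "duration_s": frames / rate,
        }


if __name__ == "__main__":
    # Default output path of `moss-tts-nano generate`, as documented above.
    print(describe_wav("generated_audio/moss_tts_nano_output.wav"))
```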

CLI Command: `moss-tts-nano serve`

You can also launch the web demo through the packaged CLI:

moss-tts-nano serve

This command forwards to `app.py`, keeps the model loaded in memory, and serves the local browser demo plus HTTP generation endpoints.

MOSS-Audio-Tokenizer-Nano

Introduction

MOSS-Audio-Tokenizer is the unified discrete audio interface for the entire MOSS-TTS family. It is built on the Cat (Causal Audio Tokenizer with Transformer) architecture, a CNN-free audio tokenizer composed entirely of causal Transformer blocks. It serves as the shared audio backbone for MOSS-TTS, MOSS-TTS-Nano, MOSS-TTSD, MOSS-VoiceGenerator, MOSS-SoundEffect, and MOSS-TTS-Realtime, providing a consistent audio representation across the full product family.

To further improve perceptual quality while reducing inference cost, we trained MOSS-Audio-Tokenizer-Nano, a lightweight tokenizer with approximately 20 million parameters designed for high-fidelity audio compression. It supports 48 kHz input and output as well as stereo audio, which helps reduce compression loss and improve listening quality. It can compress 48 kHz stereo audio into a 12.5 Hz token stream and uses RVQ with 16 codebooks, enabling high-fidelity reconstruction across variable bitrates from 0.125 kbps to 4 kbps.
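
The figures above can be related with some simple arithmetic. The sketch below computes how many audio samples one token frame covers and the bitrate of an RVQ token stream. The 10 bits per code (1024-entry codebooks) is an assumption, chosen because a single codebook at 12.5 Hz then gives exactly the stated lower bound of 0.125 kbps; the source does not state the codebook size.

```python
def samples_per_token_frame(sample_rate_hz: float, frame_rate_hz: float) -> float:
    """Number of audio samples (per channel) covered by one token frame."""
    return sample_rate_hz / frame_rate_hz


def rvq_bitrate_bps(frame_rate_hz: float, n_codebooks: int, bits_per_code: int) -> float:
    """Bitrate of an RVQ stream: frames per second x codebooks x bits per code."""
    return frame_rate_hz * n_codebooks * bits_per_code


# ASSUMPTION: 1024-entry codebooks (10 bits each); not stated in the source.
BITS_PER_CODE = 10

if __name__ == "__main__":
    print(samples_per_token_frame(48_000, 12.5))    # 3840.0 samples per frame
    print(rvq_bitrate_bps(12.5, 1, BITS_PER_CODE))  # 125.0 bps = 0.125 kbps
    print(rvq_bitrate_bps(12.5, 16, BITS_PER_CODE)) # 2000.0 bps = 2 kbps
    print(48_000 * 2 * 16)                          # 1536000 bps for 16-bit stereo PCM
```

Under this assumption, all 16 codebooks yield 2 kbps per token stream. The stated 4 kbps upper bound would then correspond to two such streams (for example, one per stereo channel) or to larger codebooks; that reading is a guess and is not confirmed by the source. Using fewer codebooks at decode time is what makes the bitrate variable.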

To learn more about setup, advanced usage, and evaluation metrics, please visit the MOSS-Audio-Tokenizer Repository.

Figure: Architecture of MOSS-Audio-Tokenizer-Nano

Model Weights

License

This repository will follow the license specified in the root LICENSE file. If you are reading this before that file is published, please treat the repository as not yet licensed for redistribution.

Citation

If you use the MOSS-TTS work in your research or product, please cite:

@misc{openmoss2026mossttsnano,
  title={MOSS-TTS-Nano},
  author={OpenMOSS Team},
  year={2026},
  howpublished={GitHub repository},
  url={https://github.com/OpenMOSS/MOSS-TTS-Nano}
}

@misc{gong2026mossttstechnicalreport,
  title={MOSS-TTS Technical Report},
  author={Yitian Gong and Botian Jiang and Yiwei Zhao and Yucheng Yuan and Kuangwei Chen and Yaozhou Jiang and Cheng Chang and Dong Hong and Mingshu Chen and Ruixiao Li and Yiyang Zhang and Yang Gao and Hanfu Chen and Ke Chen and Songlin Wang and Xiaogui Yang and Yuqian Zhang and Kexin Huang and ZhengYuan Lin and Kang Yu and Ziqi Chen and Jin Wang and Zhaoye Fei and Qinyuan Cheng and Shimin Li and Xipeng Qiu},
  year={2026},
  eprint={2603.18090},
  archivePrefix={arXiv},
  primaryClass={cs.SD},
  url={https://arxiv.org/abs/2603.18090}
}

@misc{gong2026mossaudiotokenizerscalingaudiotokenizers,
  title={MOSS-Audio-Tokenizer: Scaling Audio Tokenizers for Future Audio Foundation Models}, 
  author={Yitian Gong and Kuangwei Chen and Zhaoye Fei and Xiaogui Yang and Ke Chen and Yang Wang and Kexin Huang and Mingshu Chen and Ruixiao Li and Qingyuan Cheng and Shimin Li and Xipeng Qiu},
  year={2026},
  eprint={2602.10934},
  archivePrefix={arXiv},
  primaryClass={cs.SD},
  url={https://arxiv.org/abs/2602.10934}, 
}

Similar Articles

OpenMOSS-Team/MOSS-TTS-v1.5 · Hugging Face

@lmsysorg: SGLang-Omni now serves MOSS-TTS-Local Transformer v1.5 from @Open_MOSS on day 0! This is an open 48 kHz stereo TTS mode…

@MosiAI_Official: MOSS-TTS Local Transformer v1.5 is here. Clone any voice. Speak any language. Hear every detail. 30+ languages, 48 kHz …

GitHub - pwilkin/openmoss: OpenMOSS pure C++ pipeline based on GGML

@Gorden_Sun: ZONOS2: Open-source MoE TTS model. 8B total parameters, 0.9B activated parameters. Supports multilingual, voice cloning, Chinese, and Chinese results are good. Model:
