@FeitengLi: A 99M-parameter TTS runs on CPU, faster than a 2B model on an A100. Supertone's newly open-sourced supertonic-3 uses ONNX Runtime, is fully local, and can run in a browser, on a phone, and even on a Raspberry Pi.


Summary

Supertone released Supertonic 3, an open-weight TTS model with about 99M parameters that runs faster on CPU than a 2B model on an A100 GPU. It supports 31 languages and uses ONNX Runtime for fully local inference.

A 99M-parameter TTS runs on CPU, faster than a 2B model on an A100. Supertone's newly open-sourced supertonic-3 uses ONNX Runtime, is fully local, and can run in a browser, on a phone, and even on a Raspberry Pi. https://t.co/brEESjEY0t

Cached at: 05/15/26, 11:08 PM

# Supertone/supertonic-3 · Hugging Face

Source: https://huggingface.co/Supertone/supertonic-3

## Supertonic 3 | Lightning Fast, On-Device, Accurate TTS

[Figure: Supertonic 3 Preview (https://huggingface.co/Supertone/supertonic-3/blob/main/img/Supertonic3_HeroImage.png)]

- Demo (https://huggingface.co/spaces/Supertone/supertonic-3)
- Code (https://github.com/supertone-inc/supertonic)
- Python SDK (https://pypi.org/project/supertonic/)

Supertonic is a lightweight text-to-speech system for local inference. It runs with ONNX Runtime entirely on your device, with no cloud call required for synthesis. Supertonic 3 expands the open-weight release from 5 to 31 languages, improves reading stability, and reduces repeat/skip failures.

## Quick Start

Install the Python SDK and generate speech immediately. On first run, the SDK downloads the model assets from Hugging Face.

```
pip install supertonic
```

```python
from supertonic import TTS

# On first run, this downloads the model assets from Hugging Face.
tts = TTS(auto_download=True)
style = tts.get_voice_style(voice_name="M1")

text = "A gentle breeze moved through the open window while everyone listened to the story."
wav, duration = tts.synthesize(text, voice_style=style, lang="en")

tts.save_audio(wav, "output.wav")
print(f"Generated {duration:.2f}s of audio")
```

## What’s New in Supertonic 3

- 31 languages: expanded from the 5-language Supertonic 2 release.
- More stable reading: fewer repeat and skip failures, especially on short and long utterances.
- Higher speaker similarity: improved similarity across the shared-language set compared with Supertonic 2.
- Expression tags: supports simple tags.

## Performance Highlights

Supertonic 3 is designed for practical on-device inference: compact enough to run locally, while staying competitive with much larger open TTS systems.

### Reading Accuracy

[Figure: Supertonic 3 reading accuracy compared with measured model ranges and VoxCPM2]

Across measured languages, Supertonic 3 stays within a competitive WER/CER range against much larger open TTS models such as VoxCPM2, while preserving a lightweight on-device deployment path. Asterisked languages use CER; the others use WER.

### Supertonic 2 to Supertonic 3

[Figure: Supertonic 2 and Supertonic 3 comparison]

Compared with Supertonic 2, Supertonic 3 reduces repeat and skip failures, improves speaker similarity across the shared-language set, and expands language coverage from 5 to 31 languages.

### Runtime Footprint

[Figure: Supertonic CPU runtime compared with GPU baselines]

Supertonic 3 runs fast on CPU, even compared with larger baselines measured on an A100 GPU, and uses substantially less memory. It does not require a GPU, which makes local, browser, and edge deployment much easier.

### Model Size

[Figure: Model size comparison]

At about 99M parameters across the public ONNX assets, Supertonic 3 is much smaller than 0.7B to 2B class open TTS systems. The smaller model size is a practical advantage for download size, startup time, and on-device inference.
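The runtime and size claims above can be sanity-checked locally with two back-of-envelope calculations. The sketch below is illustrative only: `real_time_factor`, `estimate_weight_megabytes`, and `fake_synthesize` are hypothetical helper names, not part of the Supertonic SDK. It assumes a synthesis callable that returns `(samples, duration_in_seconds)`, and uses a stand-in so the example runs without the model. The bytes-per-parameter values are also assumptions, since the model card does not state the numeric precision of the ONNX assets.

```python
import time
from typing import Callable, Sequence, Tuple

# Hypothetical signature: text in, (audio samples, duration in seconds) out.
Synthesizer = Callable[[str], Tuple[Sequence[float], float]]


def real_time_factor(synthesize: Synthesizer, text: str, repeats: int = 3) -> float:
    """Best wall-clock synthesis time divided by the audio duration.

    A value below 1.0 means audio is generated faster than it plays back.
    """
    best = float("inf")
    duration = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        _, duration = synthesize(text)
        best = min(best, time.perf_counter() - start)
    if duration <= 0:
        raise ValueError("synthesizer reported a non-positive duration")
    return best / duration


def estimate_weight_megabytes(num_params: int, bytes_per_param: int) -> float:
    """Rough weight size in MB (10**6 bytes), ignoring file-format overhead."""
    return num_params * bytes_per_param / 1e6


def fake_synthesize(text: str) -> Tuple[Sequence[float], float]:
    """Stand-in for a real TTS call: pretends each character is 0.05 s of audio."""
    time.sleep(0.001)
    return [0.0], 0.05 * len(text)


if __name__ == "__main__":
    print(f"RTF: {real_time_factor(fake_synthesize, 'hello world'):.4f}")
    # Assumed storage widths; the actual precision of the assets is not stated.
    for label, width in (("float32", 4), ("float16", 2), ("int8", 1)):
        size = estimate_weight_megabytes(99_000_000, width)
        print(f"{label}: about {size:.0f} MB")
```

To measure the real model, one could wrap the Quick Start call in a lambda that fixes `voice_style` and `lang` and pass that in place of `fake_synthesize`, provided it returns the duration as a plain number of seconds.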
## Supported Languages

| Code | Language | Code | Language | Code | Language | Code | Language |
|------|----------|------|----------|------|----------|------|----------|
| en | English | ko | Korean | ja | Japanese | ar | Arabic |
| bg | Bulgarian | cs | Czech | da | Danish | de | German |
| el | Greek | es | Spanish | et | Estonian | fi | Finnish |
| fr | French | hi | Hindi | hr | Croatian | hu | Hungarian |
| id | Indonesian | it | Italian | lt | Lithuanian | lv | Latvian |
| nl | Dutch | pl | Polish | pt | Portuguese | ro | Romanian |
| ru | Russian | sk | Slovak | sl | Slovenian | sv | Swedish |
| tr | Turkish | uk | Ukrainian | vi | Vietnamese | | |

## License

This project’s sample code is released under the MIT License. See the GitHub repository (https://github.com/supertone-inc/supertonic) for details.

The accompanying model is released under the OpenRAIL-M License. See the LICENSE (https://huggingface.co/Supertone/supertonic-3/blob/main/LICENSE) file in this repository for details.

This model was trained using PyTorch, which is licensed under the BSD 3-Clause License but is not redistributed with this project. See the PyTorch license (https://docs.pytorch.org/FBGEMM/general/License.html) for details.

Copyright (c) 2026 Supertone Inc.
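The 31 language codes listed under Supported Languages can be kept in a small lookup table and checked before a synthesis call. This is a minimal sketch: `SUPPORTED_LANGUAGES` and `resolve_language` are hypothetical names, not part of the Supertonic SDK, and the model card does not say whether the SDK validates the `lang` argument itself.

```python
# Codes and names copied from the Supported Languages table of the model card.
SUPPORTED_LANGUAGES = {
    "en": "English", "ko": "Korean", "ja": "Japanese", "ar": "Arabic",
    "bg": "Bulgarian", "cs": "Czech", "da": "Danish", "de": "German",
    "el": "Greek", "es": "Spanish", "et": "Estonian", "fi": "Finnish",
    "fr": "French", "hi": "Hindi", "hr": "Croatian", "hu": "Hungarian",
    "id": "Indonesian", "it": "Italian", "lt": "Lithuanian", "lv": "Latvian",
    "nl": "Dutch", "pl": "Polish", "pt": "Portuguese", "ro": "Romanian",
    "ru": "Russian", "sk": "Slovak", "sl": "Slovenian", "sv": "Swedish",
    "tr": "Turkish", "uk": "Ukrainian", "vi": "Vietnamese",
}


def resolve_language(code: str) -> str:
    """Normalize a language code and fail early if it is not in the table."""
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        known = ", ".join(sorted(SUPPORTED_LANGUAGES))
        raise ValueError(f"unsupported language {code!r}; expected one of: {known}")
    return normalized


if __name__ == "__main__":
    code = resolve_language(" KO ")
    print(code, SUPPORTED_LANGUAGES[code])
```

The returned code could then be passed as the `lang=` argument shown in Quick Start, so a typo surfaces as a clear error before any model work starts.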
