Supertone/supertonic-3

Hugging Face Models Trending 2026/05/06 20:46 Model

text-to-speech on-device-ai open-source multilingual tts-model hugging-face

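Summary

Supertonic 3 is a lightweight, open-weight text-to-speech model designed for fast on-device inference. It expands language support to 31 languages and improves stability and expression-tag support.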

Task: text-to-speech
Tags: supertonic, onnx, text-to-speech, speech-synthesis, tts, multilingual, on-device, en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi, license:openrail, region:us


Cached at: 2026/05/10 12:34

Source: https://huggingface.co/Supertone/supertonic-3

## Supertonic 3 | Lightning-Fast, On-Device, Accurate TTS

[Image: Supertonic 3 Preview (https://huggingface.co/Supertone/supertonic-3/blob/main/img/Supertonic3_HeroImage.png)]

Demo (https://huggingface.co/spaces/Supertone/supertonic-3) | Code (https://github.com/supertone-inc/supertonic) | Python SDK (https://pypi.org/project/supertonic/)

Supertonic is a lightweight text-to-speech (TTS) system for local inference. It runs entirely on your device through ONNX Runtime, and synthesis requires no cloud calls. Supertonic 3 expands the open-weight release from 5 supported languages to 31, improves reading stability, and reduces repeat/skip errors.

## Quick Start

Install the Python SDK and generate speech right away. On first run, the SDK downloads the model assets from Hugging Face.

```
pip install supertonic
```

```python
from supertonic import TTS

# Model assets are downloaded from Hugging Face on first run.
tts = TTS(auto_download=True)
style = tts.get_voice_style(voice_name="M1")

text = "A gentle breeze moved through the open window while everyone listened to the story."
wav, duration = tts.synthesize(text, voice_style=style, lang="en")

tts.save_audio(wav, "output.wav")
print(f"Generated {duration:.2f}s of audio")
```

## What's New in Supertonic 3

- 31 languages: expanded from the 5-language release of Supertonic 2.
- More stable reading: fewer repeat and skip errors, especially on short and long utterances.
- Higher speaker similarity: improved over Supertonic 2 on the shared language set.
- Expression tags: support for simple tags such as `<break>`, `<emphasis>`, and `<say-as>`.

## Performance Highlights

Supertonic 3 targets practical on-device inference: small enough to run locally, yet still competitive with much larger open-source TTS systems.

### Reading Accuracy

[Figure: Supertonic 3 reading accuracy compared with the measured model range and VoxCPM2]

Across the measured languages, Supertonic 3 stays within a competitive WER/CER (word error rate / character error rate) range against much larger open-source TTS models such as VoxCPM2, while keeping a lightweight on-device deployment path. Languages marked with an asterisk use CER; all others use WER.

### Supertonic 2 to Supertonic 3

[Figure: Supertonic 2 compared with Supertonic 3]

Compared with Supertonic 2, Supertonic 3 reduces repeat and skip errors, improves speaker similarity on the shared language set, and expands language coverage from 5 to 31 languages.

### Runtime Footprint

[Figure: Supertonic CPU runtime compared with GPU baselines]

Supertonic 3 runs fast on a CPU and uses far less memory, even when compared with larger baseline models measured on an A100 GPU. It does not require a GPU, which makes local, browser, and edge deployment easier.

### Model Size

[Figure: Model size comparison]

Across its public ONNX assets, Supertonic 3 has about 99M parameters, far fewer than open-source TTS systems in the 0.7B to 2B class. The smaller model brings practical advantages in download size, startup time, and on-device inference.

## Supported Languages

| Code | Language | Code | Language | Code | Language | Code | Language |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `en` | English | `ko` | Korean | `ja` | Japanese | `ar` | Arabic |
| `bg` | Bulgarian | `cs` | Czech | `da` | Danish | `de` | German |
| `el` | Greek | `es` | Spanish | `et` | Estonian | `fi` | Finnish |
| `fr` | French | `hi` | Hindi | `hr` | Croatian | `hu` | Hungarian |
| `id` | Indonesian | `it` | Italian | `lt` | Lithuanian | `lv` | Latvian |
| `nl` | Dutch | `pl` | Polish | `pt` | Portuguese | `ro` | Romanian |
| `ru` | Russian | `sk` | Slovak | `sl` | Slovenian | `sv` | Swedish |
| `tr` | Turkish | `uk` | Ukrainian | `vi` | Vietnamese | | |

## License

The sample code in this project is released under the MIT License. See the GitHub repository (https://github.com/supertone-inc/supertonic) for details.

The accompanying model is released under the OpenRAIL-M License. See the LICENSE (https://huggingface.co/Supertone/supertonic-3/blob/main/LICENSE) file in this repository for details.

This model was trained with PyTorch, which is licensed under the BSD 3-Clause License but is not redistributed with this project. See the PyTorch license (https://docs.pytorch.org/FBGEMM/general/License.html) for details.

Copyright (c) 2026 Supertone Inc.
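Returning to the Quick Start call and the Supported Languages list: the sketch below checks a language code against the 31 listed codes before passing it to `tts.synthesize`. The names `SUPPORTED_LANGS`, `check_lang`, and `synthesize_checked` are hypothetical helpers written for illustration and are not part of the Supertonic SDK. The only SDK call used is `synthesize(text, voice_style=..., lang=...)`, as it appears in the Quick Start. Whether the SDK performs its own validation is not stated in the model card, so treat this as an optional guard rather than a required step.

```python
# Hypothetical helpers (not part of the Supertonic SDK).
# Language codes are copied from the Supported Languages list on this page.
SUPPORTED_LANGS = frozenset(
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi".split()
)


def check_lang(lang: str) -> str:
    """Return the normalized code, or raise ValueError if it is not listed."""
    code = lang.strip().lower()
    if code not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return code


def synthesize_checked(tts, text, voice_style, lang):
    """Validate `lang`, then call tts.synthesize as shown in the Quick Start.

    `tts` is assumed to be an object exposing
    synthesize(text, voice_style=..., lang=...), such as the SDK's TTS instance.
    """
    return tts.synthesize(text, voice_style=voice_style, lang=check_lang(lang))
```

With the SDK installed, a call would look like `wav, duration = synthesize_checked(tts, text, style, "en")`, reusing `tts` and `style` from the Quick Start. An unlisted code such as `"zh"` raises `ValueError` before any synthesis work starts.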

Similar articles

supertone-inc/supertonic

GitHub Trending (daily)

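Supertonic is an open-source, on-device text-to-speech (TTS) system designed for local inference with minimal overhead. Version 3 is now available, supporting 31 languages with improved accuracy.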

@akshay_pachaar: This TTS model generates speech 167x faster than a human can listen to it. Supertonic is an on-device TTS engine that uses ONNX for cross-platform inference…

X AI KOLs Following

Supertonic is a new open-source TTS engine that runs on-device via ONNX, supports 31 languages, outpaces ElevenLabs on speed, and runs even on a Raspberry Pi with no GPU.

@JafarNajafov: Supertonic just crushed ElevenLabs. A text-to-speech model that runs entirely locally. No cloud. No API keys. No...

X AI KOLs Timeline

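This post highlights Supertonic, an open-source text-to-speech model that runs entirely on the local device. Compared with cloud-based services such as ElevenLabs and OpenAI, it performs well on both speed and formatting accuracy.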

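@GoJun315: A locally run open-source TTS that takes out ElevenLabs. Supertonic is a speech synthesis model that runs entirely locally, with no network connection and zero API fees. - Only 99M parameters, 167x faster than real time on an M4 Pro, and it runs on a Raspberry Pi too - Supports 31 languages, covering…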

X AI KOLs Timeline

Supertonic is a lightning-fast, on-device TTS model with 99M parameters, supporting 31 languages. It runs locally with no API costs, outperforms cloud TTS on accuracy for numbers, phone numbers, and technical terms, and can be installed via Python, Node.js, Rust, Go, and more.

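@FeitengLi: A 99M-parameter TTS running on a CPU is faster than a 2B model running on an A100. Supertone's newly open-sourced supertonic-3 runs on ONNX Runtime and is fully local: it runs in the browser, on phones, and on a Raspberry Pi.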

X AI KOLs Timeline

Supertone released Supertonic 3, an open-source TTS model with 99M parameters that runs faster on CPU than a 2B model on A100, supporting 31 languages and ONNX Runtime for fully local inference.
