@LinearUncle: Recommending MOSS-TTS, an open-source voice cloning repository from the Chinese company Mosi. You read a passage aloud, it clones your voice, and you can then have that voice read any text. See the post details for how I used it in practice: it works great, and the result can be indistinguishable from the real thing. https://github.com/OpenMOS…
Summary
MOSS-TTS is an open-source voice cloning model from Mosi. Users clone a voice by reading a short passage aloud, then use the cloned voice to synthesize any text as realistic speech.
Cached at: 06/20/26, 06:21 PM
Architecture of MOSS-TTS-Nano
Architecture of MOSS Audio Tokenizer
Similar Articles
@FakeMaidenMaker: Explosive! This open-source project converts text to human-like speech for free, can clone anyone's voice, and lets you adjust timbre with text! It has garnered 30K stars on GitHub. It comes from ModelBest's OpenBMB; VoxCPM previously topped both the GitHub and HuggingFace charts. Do...
VoxCPM2 is an open-source speech synthesis model from OpenBMB, using a tokenizer-free diffusion autoregressive architecture, supporting 30 languages, voice design, and controllable voice cloning. It can clone a voice with just one sentence, or create a brand new voice using text, outputting 48kHz high-quality audio, and is commercially usable.
@Honcia13: Open-source TTS is going crazy! A new weapon for scam compounds? Tsinghua's OpenBMB just released VoxCPM2: 2 billion parameters + 2 million hours of multilingual training data, 48kHz studio-quality sound! The most intense part: no tokenizer needed at all. It performs diffusion autoregression directly in a continuous latent space, maximizing detail retention!
Tsinghua University's OpenBMB has released VoxCPM2, an open-source multilingual TTS model with 2 billion parameters. It generates speech by tokenizer-free diffusion autoregression in a continuous latent space, and offers 48kHz studio-quality audio along with voice cloning and voice design.
@Gorden_Sun: NetEase Youdao has open-sourced Confucius4-TTS, a 1.3B TTS model with multilingual support and voice cloning. Good results, very fast. Github: https://github.com/netease-youdao/Confucius4-TTS… Online demo: …
NetEase Youdao has open-sourced Confucius4-TTS, a 1.3B-parameter model that supports zero-shot voice cloning and cross-lingual speech synthesis in 14 languages. It is fast and produces excellent results.
@denziideng: Another AI voice cloning 'dimensional reduction attack'... The CosyVoice I shared before can clone in 3 seconds, which I thought was already scary enough. But today's tool is even more lethal — after casually recording 1 minute of my own voice for training, it directly replicates tone, mannerisms, emotions, breathing, and pauses. It's almost like the soul of the original person possessed it! C...
GPT-SoVITS is an open-source AI voice cloning tool that supports high-fidelity zero-shot cloning (from a 5-second voice sample) and few-shot cloning (from 1 minute of training audio), as well as cross-lingual inference, and ships with a complete WebUI toolchain. With 57.8k stars on GitHub, it has become the leading open-source project in the voice cloning field.
@hisevenih: The AI voice community is blown away. This open-source black tech on GitHub takes AI voice to an insane level, truly achieving "one sentence, one voice". Remember this project name: VoxCPM2. It has already gained 20K stars on GitHub. Most incredibly, it doesn't even need reference audio…
VoxCPM2, an open-source project on GitHub, can generate a target voice from a single sentence without any reference audio. It has gained 20K stars.