@LinearUncle: Recommending an open-source voice cloning repository from a Chinese company called Mosi: MOSS-TTS. You read a passage, it clones your voice, then you can use your voice to read any text. Check the post details to see how I used it in practice—it works great and can be indistinguishable from the real thing. https://github.com/OpenMOS…

X AI KOLs Timeline Models

Summary

MOSS-TTS is an open-source voice cloning model released by Mosi, a Chinese company. A user reads a short passage aloud to clone their voice, and the cloned voice can then be used to read any text with realistic results.

Recommending an open-source voice cloning repository from a Chinese company called Mosi: MOSS-TTS. You read a passage, it clones your voice, and then you can have your voice read any text. Check the post details to see how I used it in practice. The result is impressive and can pass for the real thing.

https://github.com/OpenMOSS/MOSS-TTS…

Model download: https://huggingface.co/OpenMOSS-Team/MOSS-TTS-Local-Transformer-v1.5…

Practical steps:

1. Enter the following prompt in Codex to have it install and run the model:

```
read https://huggingface.co/OpenMOSS-Team/MOSS-TTS-Local-Transformer-v1.5…, install and run it locally
```

2. Codex starts downloading the model, but the download is very slow, so tell it to use aria2 instead.
3. After the download completes, ask it for the steps to clone my voice, and follow them.
4. After cloning the voice, ask it to generate an audio file of Li Bai's "Quiet Night Thought" read in my voice.

I listened to the result and it's excellent! It sounds very much like my voice!

Follow me for actual hands-on practice, not just simple reposts!
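The post does not show the aria2 command Codex ended up using in step 2. As a rough sketch only, a resumable multi-connection download could be assembled like this. The helper name `build_aria2_command`, the placeholder URL, and the output paths are all hypothetical; the flags (`-c`, `-x`, `-s`, `-d`, `-o`) are standard `aria2c` options.

```python
from __future__ import annotations

import shlex


def build_aria2_command(url: str, out_dir: str, filename: str,
                        connections: int = 8) -> list[str]:
    """Build an aria2c argument list for a resumable, multi-connection download."""
    # aria2c limits --max-connection-per-server to the range 1..16.
    if not 1 <= connections <= 16:
        raise ValueError("connections must be between 1 and 16")
    return [
        "aria2c",
        "-c",                    # resume a partially downloaded file
        "-x", str(connections),  # max connections per server
        "-s", str(connections),  # split the download into this many pieces
        "-d", out_dir,           # directory to save into
        "-o", filename,          # output file name
        url,
    ]


# Placeholder URL (assumption): substitute the real file URL from the model page.
cmd = build_aria2_command(
    "https://example.com/path/to/model.safetensors",
    "models",
    "model.safetensors",
)
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once `aria2c` is installed; printing it first makes it easy to inspect before running.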

Cached at: 06/20/26, 06:21 PM

Architecture of MOSS-TTS-Nano

Architecture of MOSS Audio Tokenizer

Similar Articles

@FakeMaidenMaker: Explosive! This open-source project converts text to human-like speech for free, can clone anyone's voice, and lets you adjust the timbre with a text description! It has garnered 30K stars on GitHub. It comes from ModelBest (面壁智能) and OpenBMB, whose VoxCPM previously topped both the GitHub and HuggingFace charts. Do...

X AI KOLs Timeline

VoxCPM2 is an open-source speech synthesis model from OpenBMB, using a tokenizer-free diffusion autoregressive architecture, supporting 30 languages, voice design, and controllable voice cloning. It can clone a voice with just one sentence, or create a brand new voice using text, outputting 48kHz high-quality audio, and is commercially usable.

@Honcia13: Open-source TTS is going crazy! A new weapon for scam compounds? Tsinghua's OpenBMB just released VoxCPM2: 2 billion parameters, trained on 2 million hours of multilingual data, with 48kHz studio-quality sound! The most intense part: no tokenizer is needed at all. It performs diffusion autoregression directly in a continuous latent space, maximizing detail retention!

X AI KOLs Timeline

Tsinghua University's OpenBMB has released VoxCPM2, an open-source multilingual TTS model with 2 billion parameters. It generates speech by diffusion autoregression in a continuous latent space, without a tokenizer, and offers 48kHz studio-quality audio along with voice cloning and voice design capabilities.

@denziideng: Another AI voice cloning 'dimensional reduction attack'... The CosyVoice I shared before can clone in 3 seconds, which I thought was already scary enough. But today's tool is even more lethal — after casually recording 1 minute of my own voice for training, it directly replicates tone, mannerisms, emotions, breathing, and pauses. It's almost like the soul of the original person possessed it! C...

X AI KOLs Timeline

GPT-SoVITS is an open-source AI voice cloning tool that supports zero-shot (5-second voice) and few-shot (1-minute training) high-fidelity voice cloning, cross-lingual inference, and comes with a complete WebUI toolchain. It has garnered 57.8k stars on GitHub, becoming the leading open-source project in the voice cloning field.