Higgs Audio v3 TTS 4B. Built for voice chat. Supports 100 languages and inline control.

Reddit r/LocalLLaMA 06/04/26, 10:26 PM Models

Summary

Higgs Audio v3 is a 4B-parameter TTS model designed for voice chat applications. It supports 100 languages and inline control.

Similar Articles

@Prince_Canuma: mlx-audio v0.4.3 is here A massive release across models, server, and DX → 6 new TTS models: Higgs Audio v2 (voice clon…

X AI KOLs Timeline

mlx-audio v0.4.3 adds 6 new TTS models, including Higgs Audio v2 and OmniVoice (646+ languages), along with server improvements such as concurrent requests and continuous batching, ~3x faster Voxtral Realtime at 4-bit, and slimmer dependencies for Apple Silicon.

Aratako/Irodori-TTS-500M-v3

Hugging Face Models Trending

Irodori-TTS-500M-v3 is a Japanese TTS model based on a Rectified Flow Diffusion Transformer. It supports zero-shot voice cloning and emoji-based control of style and sound effects.

Gemini 3.1 Flash TTS

Simon Willison's Blog

Google released Gemini 3.1 Flash TTS, a new text-to-speech model accessible via the Gemini API that supports advanced prompt-based control for detailed voice direction, accents, and speaking styles. The model enables sophisticated audio generation including multi-speaker conversations and character-specific vocal performances.

Gemini 3.5 Live Translate

Product Hunt

Gemini 3.5 Live Translate is a new audio model for real-time speech-to-speech translation.

OpenMOSS-Team/MOSS-TTS-v1.5 · Hugging Face

Reddit r/LocalLLaMA

MOSS-TTS v1.5 is an updated open-source text-to-speech model with improved multilingual synthesis (supporting 31 languages), more stable zero-shot voice cloning, and explicit inline pause control.