Higgs Audio v3 TTS 4B. Built for voice chat. Supports 100 languages and inline control.
Summary
Higgs Audio v3 is a 4B-parameter TTS model designed for voice chat applications. It supports 100 languages and inline control.
Similar Articles
@Prince_Canuma: mlx-audio v0.4.3 is here. A massive release across models, server, and DX → 6 new TTS models: Higgs Audio v2 (voice clon…
mlx-audio v0.4.3 ships with 6 new TTS models, including Higgs Audio v2 and OmniVoice (646+ languages). It also brings server improvements such as concurrent requests and continuous batching, roughly 3x faster Voxtral Realtime on 4-bit, and slimmer dependencies for Apple Silicon.
Aratako/Irodori-TTS-500M-v3
Irodori-TTS-500M-v3 is a Japanese TTS model based on a Rectified Flow Diffusion Transformer. It supports zero-shot voice cloning and emoji-based control of style and sound effects.
Gemini 3.1 Flash TTS
Google released Gemini 3.1 Flash TTS, a new text-to-speech model accessible via the Gemini API that supports advanced prompt-based control for detailed voice direction, accents, and speaking styles. The model enables sophisticated audio generation including multi-speaker conversations and character-specific vocal performances.
Gemini 3.5 Live Translate
Gemini 3.5 Live Translate is a new audio model for real-time speech-to-speech translation.
OpenMOSS-Team/MOSS-TTS-v1.5 · Hugging Face
MOSS-TTS v1.5 is an updated open-source text-to-speech model with improved multilingual synthesis (supporting 31 languages), more stable zero-shot voice cloning, and explicit inline pause control.