multi-speaker

#multi-speaker

@MosiAI_Official: MOSS-Transcribe-Diarize-0.9B is now open source on @huggingface. Built with an end-to-end audio-to-structured-transcrip…

X AI KOLs Following · 2026-07-09

MOSS-Transcribe-Diarize-0.9B is an open-source end-to-end audio understanding model for long-form multi-speaker transcription, diarization, and timestamp generation, released by Mosi AI under Apache 2.0.


#multi-speaker

Fish Audio S2 Technical Report

Papers with Code Trending · 2026-03-09

Fish Audio S2 is an open-source text-to-speech system featuring multi-speaker capabilities, multi-turn generation, and instruction-following control, backed by a production-ready inference engine with low latency.


#multi-speaker

VibeVoice Technical Report

Papers with Code Trending · 2025-08-26

VibeVoice is a Microsoft model that synthesizes long-form multi-speaker speech using next-token diffusion and a continuous speech tokenizer. According to the report, the tokenizer compresses audio heavily while preserving fidelity, and the model supports up to 90 minutes of audio with multiple speakers.

