@NielsRogge: Just added this blog as a project page to https://paperswithcode.co/paper/2410.00037…! Hope that more people can learn …

X AI KOLs Following 06/18/26, 04:11 PM Papers

full-duplex-voice voice-models open-source moshi blog tutorial papers-with-code

Summary

NielsRogge added a blog post explaining the Moshi full-duplex voice model as a project page on Papers With Code, hoping it helps more people learn about state-of-the-art full-duplex voice models.

Just added this blog as a project page to https://t.co/6yQIMR6Ltn! Hope that more people can learn about state-of-the-art full-duplex voice models this way :) https://t.co/IcZwVCjIp2

Original Article

Cached at: 06/18/26, 06:10 PM

rohit (@bicro_): Moshi is one of the best open source full-duplex voice models out there. The architecture is dense, so we spent a few days studying it and wrote up what we learned, with diagrams to make it click faster.

Let us know if it was helpful 🤠

Similar Articles

@kyutai_labs: New paper: Multi-Faceted Interactivity Alignment in Full-Duplex Speech Models We use RL to post-train speech models (Mo…

X AI KOLs Following

Kyutai Labs released a new paper on using reinforcement learning to post-train speech models (Moshi and PersonaPlex) for more human-like interaction, including when to respond, wait, or give listening cues.

@NielsRogge: Introducing a revival of PapersWithCode! As @ilyasut said, we're back to the "age of research". Hence, it's important t…

X AI KOLs Following

NielsRogge announces a revival of PapersWithCode, featuring SOTA per domain, leaderboards, and methods parsed at scale using AI agents.

MoshiRAG: Asynchronous Knowledge Retrieval for Full-Duplex Speech Language Models

arXiv cs.CL

MoshiRAG combines a compact full-duplex speech language model with asynchronous retrieval-augmented generation to improve factuality while maintaining real-time interactivity. The approach leverages natural temporal gaps in conversation to retrieve external knowledge without disrupting the natural flow of dialogue.

@NielsRogge: Introducing Conference Papers on http://paperswithcode.co! Browse all accepted @CVPR 2026 papers easily by domain, from…

X AI KOLs Following

Papers With Code introduces a feature to browse all CVPR 2026 accepted papers by domain, with links to code, project pages, and Hugging Face artifacts, including Oral and Spotlight papers.

Synchronization and Turn-Taking in Full-Duplex Speech Dialogue Models

arXiv cs.CL

This paper analyzes synchronization and turn-taking dynamics in full-duplex speech dialogue models by simulating conversations between two instances of the Moshi model, measuring representational alignment via CKA and predicting turn boundaries with LSTM probes.
