@rohanpaul_ai: Thinking Machines is replacing turn-taking AI with always-present AI. They just announced TML-Interaction-Small, a 276B…

X AI KOLs Following Models

Summary

Thinking Machines announced TML-Interaction-Small, a 276B MoE model designed for real-time, always-on interaction with sub-0.4s latency and integrated multimodal processing.

Thinking Machines is replacing turn-taking AI with always-present AI. They just announced TML-Interaction-Small, a 276B-parameter MoE model with 12B active parameters that treats conversation as a live stream instead of a stop-start chat box.

Most AI voice systems still behave like walkie-talkies: you speak, they wait, they answer, then their view of the world freezes while they talk. Thinking Machines changes that by slicing audio, video, and text into 200ms micro-turns, so the model can listen, watch, speak, draw, search, and call tools while the interaction is still happening. This is why the demos feel different: the model can interrupt when context demands it, keep talking while listening, react to visual cues, track elapsed time, and hand harder work to a background model without vanishing from the conversation. The architecture is also cleaner than many current real-time systems because interactivity is trained into the model itself rather than patched together from voice activity detectors, turn detectors, separate speech models, and timing rules.

The early numbers are strong: 0.40s turn-taking latency, 77.8 on FD-bench V1.5 interaction quality, and 43.4% on Audio MultiChallenge, meaning it is not just fast but still retains useful reasoning and instruction-following ability. The model can notice timing, silence, overlap, gestures, screen changes, and uncertainty as part of the same context.
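
To make the micro-turn idea concrete, here is a minimal sketch of what a full-duplex loop of this kind could look like: every ~200ms slice of input is folded into one shared context while the model keeps deciding whether to speak, stay silent, or yield. This is not Thinking Machines' implementation; the class, the function names (MicroTurnAgent-style helpers like capture_chunk and model_step), and the timing decisions are invented for illustration.

```python
# Hypothetical sketch of a full-duplex micro-turn loop: every ~200ms the agent
# folds the latest audio/video chunk into its context, then decides whether to
# keep speaking or stay silent. Names and logic are illustrative, not the TML API.
import asyncio
import time
from dataclasses import dataclass, field

MICRO_TURN_S = 0.2  # 200ms slices, per the announcement


@dataclass
class Context:
    events: list = field(default_factory=list)  # interleaved input/output events

    def add(self, stream: str, payload: str) -> None:
        self.events.append((time.monotonic(), stream, payload))


async def capture_chunk(stream: str) -> str:
    """Stand-in for reading 200ms of audio or video from a device."""
    await asyncio.sleep(MICRO_TURN_S)
    return f"<{stream} frame>"


async def model_step(ctx: Context) -> str | None:
    """Stand-in for one decoder step over the shared context.
    Returns the next speech chunk, or None to stay silent."""
    recent_inputs = [e for e in ctx.events[-8:] if e[1] != "speech_out"]
    return "<speech chunk>" if recent_inputs else None  # placeholder decision


async def run(ctx: Context, seconds: float) -> None:
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        # Listening and watching never pause, even while the model is speaking.
        audio, video = await asyncio.gather(
            capture_chunk("audio"), capture_chunk("video")
        )
        ctx.add("audio_in", audio)
        ctx.add("video_in", video)
        chunk = await model_step(ctx)
        if chunk is not None:
            ctx.add("speech_out", chunk)  # emitted while the next slice is captured


if __name__ == "__main__":
    asyncio.run(run(Context(), seconds=2.0))
```

The point of the sketch is the loop shape: input capture and output emission share the same iteration, so there is no moment where the model stops perceiving in order to respond.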

Similar Articles

Interaction Models

Hacker News Top

Thinking Machines AI announces a research preview of interaction models, a new architecture designed for native, real-time human-AI collaboration across audio, video, and text. By replacing turn-based interfaces with a multi-stream, micro-turn design, the model aims to keep humans actively in the loop while delivering state-of-the-art intelligence and responsiveness.
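
One way to picture the multi-stream, micro-turn design described above: each short slice from every stream is tagged with its source and time offset and spliced into a single shared sequence, so silence and overlap become visible parts of the context rather than gaps between turns. This is a speculative illustration; the tag format, the Slice type, and the serialize function are invented for this sketch, not taken from the release.

```python
# Speculative illustration of serializing multiple live streams into one
# micro-turn sequence. Tags like <audio@0.4s> are invented for this sketch.
from dataclasses import dataclass


@dataclass
class Slice:
    stream: str      # "audio", "video", "text", or "speech_out"
    t_start: float   # seconds since the session began
    payload: str     # placeholder for an encoded frame / transcript / tokens


def serialize(slices: list[Slice]) -> list[str]:
    """Interleave slices by time so timing, silence, and overlap are explicit."""
    out: list[str] = []
    for s in sorted(slices, key=lambda s: s.t_start):
        out.append(f"<{s.stream}@{s.t_start:.1f}s>")
        out.append(s.payload if s.payload else "<silence>")
    return out


# Example: the user talks over the model's own speech at t=0.4s, which shows up
# as overlapping slices in the same sequence instead of a dropped turn.
session = [
    Slice("speech_out", 0.2, "Sure, the chart shows"),
    Slice("audio", 0.4, "wait, go back"),
    Slice("speech_out", 0.4, ""),               # model yields: emitted silence
    Slice("video", 0.4, "<user points at screen>"),
]
print(serialize(session))
```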

Micro Language Models Enable Instant Responses

Hugging Face Daily Papers

Researchers introduce 8M-30M parameter micro language models that instantly generate the first few words on-device before cloud models complete responses, enabling responsive AI on ultra-constrained devices like smartwatches.
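
The mechanism described, a tiny on-device model emitting the opening words immediately while the cloud model is still generating, can be sketched as a draft-then-handoff pattern. The sketch below is a guess at the general idea under that assumption; tiny_first_words and cloud_complete are made-up stand-ins, not the paper's API, and the latencies are placeholders.

```python
# Illustrative draft-then-handoff pattern: a tiny on-device model streams the
# first few words instantly while the cloud model is still working, then the
# cloud output takes over. Names and latencies are placeholders.
import asyncio


async def tiny_first_words(prompt: str) -> str:
    """Stand-in for an 8M-30M parameter on-device model: fast, short output."""
    await asyncio.sleep(0.05)            # ~tens of ms on-device
    return "Sure, here's"


async def cloud_complete(prompt: str) -> str:
    """Stand-in for the full cloud model: slower, full-quality answer."""
    await asyncio.sleep(0.8)             # network + large-model latency
    return "Sure, here's tomorrow's forecast: sunny with a high of 18 C."


async def respond(prompt: str) -> None:
    cloud_task = asyncio.create_task(cloud_complete(prompt))
    draft = await tiny_first_words(prompt)
    print(draft, end="", flush=True)     # user sees words almost instantly
    full = await cloud_task
    # Continue from where the draft left off if the cloud text agrees;
    # a real system would need a correction path when it does not.
    if full.startswith(draft):
        print(full[len(draft):])
    else:
        print("\n" + full)


asyncio.run(respond("What's the weather tomorrow?"))
```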

tencent/HY-Embodied-0.5

Hugging Face Models Trending

Tencent releases HY-Embodied-0.5, a suite of foundation models designed for embodied AI agents featuring a Mixture-of-Transformers (MoT) architecture with efficient 2B and powerful 32B variants for real-world robot control and spatial-temporal reasoning.