@FinanceYF5: Alibaba just released Wan Streamer. AI agents can now see you, hear you, and respond to you in real time via video. This is no longer just a "voice mode."

X AI KOLs Following Products

Summary

Alibaba has launched Wan Streamer, an AI agent capable of seeing, hearing, and responding in real time via video.


Cached at: 06/30/26, 03:35 AM

Alibaba has just released Wan Streamer.

AI agents can now see you, hear you, and respond to you in real time via video.

This is no longer just “voice mode” 🤯. https://t.co/isZOeoTyaD

Similar Articles

@FinanceYF5: Meta AI is transforming from a 'chat box' into an always-on perception layer. Alexandr Wang mentioned that the Muse Spark update includes voice conversations, real-time camera AI, and a gradual transition into glasses. The point is not just another voice assistant, but AI beginning to see, hear, and understand the world in front of you.

X AI KOLs Following

Meta AI is evolving from a chat box into an always-on perception layer, adding voice conversations, real-time camera AI capabilities, and gradually moving into glasses form, enabling AI to see, hear, and understand the world in front of the user.

@IndieDevHailey: Tencent's new Marvis truly embeds AI into the operating system! My computer finally understands human language. Previously, I had to manually search for files, spend ages tweaking settings, and keep an eye on scheduled tasks… Now with just one sentence, the AI does it all for me. This is Tencent's latest Marvis — a true system-level AI assistant.

X AI KOLs Timeline

Tencent has released a system-level AI assistant called Marvis, which can directly access operating system resources. It handles file searches, system settings changes, and scheduled tasks via natural language commands, and supports multi-agent collaboration and a real-time display of token consumption.

@FinanceYF5: First test details of OpenAI's new voice model, Bidi 1, have surfaced. Bidirectional voice design: it listens while you speak, and you can interrupt mid-way to switch tasks immediately; it no longer jumps in when you pause. It also supports real-time translation, and its context memory is much stronger than the current Advanced Voice. It's now being pushed to a small group, ChatGPT …

X AI KOLs Following

First test details of OpenAI's new voice model, Bidi 1, have surfaced: it supports bidirectional voice design, real-time translation, and stronger context memory, and is currently being pushed to a small group on ChatGPT.