@FinanceYF5: Alibaba just released Wan Streamer. AI agents can now see you, hear you, and respond to you in real time via video. This is no longer just a "voice mode."

X AI KOLs Following 06/29/26, 09:05 AM Products

Summary

Alibaba has launched Wan Streamer, an AI agent capable of seeing, hearing, and responding in real time via video.

Original Article

Cached at: 06/30/26, 03:35 AM

Alibaba has just released Wan Streamer.

AI agents can now see you, hear you, and respond to you in real time via video.

This is no longer just “voice mode” 🤯. https://t.co/isZOeoTyaD

Similar Articles

@FinanceYF5: Meta AI is transforming from a 'chat box' into an always-on perception layer. Alexandr Wang mentioned that the Muse Spark update includes voice conversations, real-time camera AI, and a gradual transition into glasses. The point is not just another voice assistant, but AI beginning to see, hear, and understand the world in front of you.

X AI KOLs Following

Meta AI is evolving from a chat box into an always-on perception layer, adding voice conversations and real-time camera AI and gradually moving into a glasses form factor, so that the AI can see, hear, and understand the world in front of the user.

@IndieDevHailey: Tencent's new Marvis truly embeds AI into the operating system! My computer finally understands human language. Previously, I had to manually search for files, spend ages tweaking settings, and keep an eye on scheduled tasks… Now with just one sentence, the AI does it all for me. This is Tencent's latest Marvis — a true system-level AI assistant.

X AI KOLs Timeline

Tencent has released a system-level AI assistant called Marvis, which can directly access operating system resources. It handles file searches, system settings changes, and scheduled tasks via natural-language commands, and supports multi-agent collaboration and a real-time display of token consumption.

@FinanceYF5: 1/ Voice Agent Upgraded: OpenAI Launches GPT-Realtime-2, Bringing GPT-5-Level Reasoning to Real-Time Voice API. Voice assistants no longer just "understand and respond" — they can think while listening, and problem-solve while chatting.

X AI KOLs Following

OpenAI has launched GPT-Realtime-2, integrating GPT-5-level reasoning into the real-time voice API, enabling voice assistants to think and solve problems in real time during conversations.

@FinanceYF5: First test impressions of OpenAI's new voice model Bidi 1. Bidirectional voice design: it listens while you speak, and you can interrupt midway to switch tasks immediately; it no longer jumps into the conversation whenever you pause. It also supports real-time translation, and its context memory is much stronger than the current Advanced Voice. It's now being pushed to a small group, ChatGPT …

X AI KOLs Following

Early test reports on OpenAI's new voice model Bidi 1 describe a bidirectional voice design, real-time translation, and stronger context memory. It is currently being pushed to a small group on ChatGPT.

@FinanceYF5: This AI is impressive. LingBot-Map can convert real-time video streams into real-time 3D reconstruction. 20 FPS code + model

X AI KOLs Following

LingBot-Map is an AI model that converts live video streams into real-time 3D reconstructions, running at 20 FPS, with complete code and model provided.
