voice-assistant

MIST: Multimodal Interactive Speech-based Tool-calling Conversational Assistants for Smart Homes

arXiv cs.CL · 3d ago

The paper introduces MIST, a synthetic dataset and framework for training multimodal voice assistants to control IoT devices in smart homes. It highlights significant performance gaps between open-weight and closed-weight models on complex, speech-based tool-calling tasks.

Built a practical voice-first AI tool for ADHD/executive dysfunction — one-tap brain dump → structured reminders & tasks (not a full autonomous agent)

Reddit r/AI_Agents · 3d ago

The author introduces SAVI, an iOS app designed for ADHD users that converts voice brain dumps into structured tasks and reminders using AI models such as Whisper and GPT-4o.

Built a JARVIS-style assistant with wake word, vision mode, local voice cloning, and LLM-generated system commands

Reddit r/ArtificialInteligence · 5d ago

A developer built a JARVIS-style personal assistant called CYBER with wake word activation, local voice cloning via XTTS v2, vision mode, and LLM-generated system commands, all running locally without cloud dependencies.

Cardamom

Product Hunt · 6d ago

Cardamom is an AI-powered phone ordering system designed for takeout-heavy restaurants.

Parloa builds service agents customers want to talk to

OpenAI Blog · 6d ago

Parloa has evolved its platform into an AI Agent Management Platform (AMP) built on GPT-5.4, enabling enterprises to design, simulate, and deploy voice and text service agents without coding.

EchoChain: A Full-Duplex Benchmark for State-Update Reasoning Under Interruptions

arXiv cs.CL · 2026-04-21

EchoChain is a new benchmark for evaluating AI models' ability to revise in-progress responses when users interrupt mid-generation. The benchmark identifies three failure patterns (contextual inertia, interruption amnesia, objective displacement) and finds that none of the evaluated real-time voice models exceeds a 50% pass rate.

ARKAD Wallet

Product Hunt · 2026-04-13

ARKAD Wallet is a personal finance product that lets users talk to their finances to manage their money better.
