Tag
OpenLess is an open-source voice input tool that supports macOS and Windows, capable of converting speech to text and automatically polishing it, especially suitable for writing AI prompts.
Sam Altman announces the release of GPT-Realtime-2 to the API, highlighting a significant advancement in voice interaction with AI for handling complex context.
Interfaze AI introduces a specialized model that surpasses general LLMs on deterministic developer tasks including OCR, object detection, web scraping, speech-to-text, and classification.
Harker 2.0 is a private speech-to-text application designed for macOS.
OpenAI introduced next-generation audio models for the API, including improved speech-to-text (gpt-4o-transcribe, gpt-4o-mini-transcribe) and customizable text-to-speech models that enable developers to build more intelligent and expressive voice agents with enhanced accuracy across challenging scenarios.