Microsoft has released MAI-Voice-2, an expressive text-to-speech system supporting voice cloning in 15 languages.
Ax is a JS/TS library providing high-level abstractions for LLM usage (signatures, agents, flows, optimizers). It now introduces axIR, allowing compilation to Python, Java, C++, and Go, making the same programming model available across multiple languages.
KrillinAI is an open-source tool that integrates the full workflow of video downloading, subtitle translation, AI dubbing, and video compositing. It supports context-aware translation, voice cloning, automatic layout, and cover generation, and works with multiple AI models, making it suitable for creating and distributing multilingual audio and video content.
Scribe2SRT is an open-source speech-to-subtitle tool built on PySide6 and the ElevenLabs API. It supports multiple languages and optimizes formatting to quickly generate high-quality SRT subtitles.
FindMyAI is a free AI search engine that recommends the best AI tool for any task without requiring signup. It supports 18 languages and aims to help users quickly find suitable AI tools.
Supertonic is a fast, on-device TTS model with 99M parameters that supports 31 languages. It runs locally with no API costs, is described as outperforming cloud TTS on accuracy for numbers, phone numbers, and technical terms, and can be installed via Python, Node.js, Rust, Go, and more.
OpenAI Codex is a GPT-3 descendant trained on natural language and billions of lines of source code, capable of generating working code across 15+ programming languages with 3.5x more context memory than GPT-3, now available in private beta via API.
OpenAI released GPT Realtime-2 and two accompanying models during Build Hour, improving the intelligence and naturalness of voice interaction. GPT Realtime-2 supports a 128k context window, parallel tool calls, and dynamic voice cloning. The session demonstrated production-grade applications such as voice-driven shopping assistants and analytics dashboards.