Tag
A user recommends Read Frog as a superior alternative to Immersive Translate, highlighting its context-aware translation, cost efficiency, and support for various AI models and Edge TTS.
Pair Translate is an open-source browser translation extension that supports word-level, paragraph-level, and full-page translation. It offers multiple translation engines, including Google, DeepL, ChatGPT, and Gemini, supports bilingual display, and is available on the Chrome, Firefox, and Edge stores.
OpenAI released the GPT-Realtime-2 voice model, featuring GPT-5-level reasoning capabilities and a 128,000-token context window. It supports real-time translation from over 70 input languages into 13 output languages and achieves 96.6% accuracy on the Big Bench Audio Intelligence benchmark. Greg Brockman called it a milestone in voice translation.
Lingo.dev v1 is a localization engineering platform designed to ensure consistent translations.
A user documents how closed models (GPT-4o→5.3, Gemini) degraded and censored Chinese novel translations, while local Gemma 4 31B now outperforms them with natural, uncensored output.
Researchers from National Taiwan University propose replacing fixed translation-based prompting strategies in multilingual LLMs with lightweight learned classifiers that route each instance to either native or translation-based prompting. Their analysis across 10 languages and 4 benchmarks shows no single strategy is universally optimal, with translation benefiting low-resource languages most, and the learned routing achieving statistically significant improvements over fixed strategies.
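The per-instance routing idea can be illustrated with a minimal Python sketch. The summary above does not say which features or classifier the researchers use, so everything here is an assumption made for illustration: the `LOW_RESOURCE` language set, the three features, the toy training labels, and the function names `features`, `train`, and `route` are all hypothetical. The classifier is a small logistic regression trained with plain stochastic gradient descent.

```python
import math

# Hypothetical set of low-resource language codes (illustrative only).
LOW_RESOURCE = {"sw", "yo", "te"}


def features(text, lang):
    """Map an instance to a small feature vector: bias, low-resource flag, scaled length."""
    return [1.0, 1.0 if lang in LOW_RESOURCE else 0.0, min(len(text), 200) / 200.0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, labels, epochs=500, lr=0.5):
    """Fit logistic regression weights with stochastic gradient descent.

    A label of 1 means translation-based prompting worked better for that
    instance; 0 means native prompting worked better.
    """
    weights = [0.0] * len(examples[0])
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
            # Gradient step on the log-likelihood of this single example.
            weights = [w + lr * (y - p) * xi for w, xi in zip(weights, x)]
    return weights


def route(weights, text, lang):
    """Pick a prompting strategy for one instance."""
    p = sigmoid(sum(w * xi for w, xi in zip(weights, features(text, lang))))
    return "translate" if p >= 0.5 else "native"


# Toy training data (fabricated for the sketch, not taken from the paper).
TRAIN = [
    ("Habari ya asubuhi, rafiki", "sw", 1),
    ("Bawo ni, se daadaa ni", "yo", 1),
    ("Namaskaram, ela unnaru", "te", 1),
    ("Bonjour, comment allez-vous", "fr", 0),
    ("Guten Morgen, wie geht es", "de", 0),
    ("Buenos dias, como estas", "es", 0),
]

WEIGHTS = train([features(t, l) for t, l, _ in TRAIN], [y for _, _, y in TRAIN])
```

Calling `route(WEIGHTS, "Asante sana kwa msaada", "sw")` sends a Swahili instance to the translation-based path, while a French instance stays on the native path. In a real system the labels would come from comparing both strategies on held-out data for each instance, and the features would presumably be richer than a language flag and a length.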
Google has updated Gemini 2.5 Flash Native Audio to improve live voice agent capabilities, including sharper function calling, better instruction following, and smoother conversation context retrieval. The update also introduces live speech translation in the Google Translate app beta, preserving intonation across 70+ languages.
Minnesota's Enterprise Translation Office has successfully deployed ChatGPT to improve translation speed and accuracy while maintaining cultural relevance, with a full rollout completed in July and promising pilot projects underway for real-time voice interpretation.
OpenAI introduces Whisper, an end-to-end encoder-decoder Transformer model trained on large-scale diverse audio data for robust multilingual speech recognition, language identification, and speech-to-English translation. Whisper achieves 50% fewer errors than specialized models on diverse datasets and outperforms supervised benchmarks on speech translation despite not being fine-tuned to specific datasets.