Jason releases a new real-time translation model, currently available for trial via the API.
Sam Altman announces the release of GPT-Realtime-2 to the API, highlighting a significant advance in voice AI's ability to handle complex context.
Google introduces Gemini 3.1 Flash-Lite, a high-speed, cost-efficient AI model available in preview via Google AI Studio and Vertex AI, designed for high-volume developer workloads.
OpenAI releases GPT-5.1, a new model in the GPT-5 series that dynamically adapts thinking time based on task complexity, offering 2-3x faster performance than GPT-5 while maintaining frontier intelligence. The release includes extended prompt caching (24-hour retention), new coding tools (apply_patch and shell), and a 'no reasoning' mode for latency-sensitive applications.
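The new controls in this release are exposed through request parameters. A minimal sketch of a Responses API payload, assuming the field names (`reasoning.effort`, the tool `type` strings) follow OpenAI's existing API conventions; they are not confirmed by this item, so verify against the current API reference:

```python
def gpt51_request(prompt: str, low_latency: bool = False) -> dict:
    """Build a hypothetical Responses API payload for GPT-5.1.

    Field names are assumptions based on OpenAI's Responses API
    conventions; check the live API reference before sending.
    """
    payload = {
        "model": "gpt-5.1",
        "input": prompt,
        # The release adds apply_patch and shell as built-in coding tools.
        "tools": [{"type": "apply_patch"}, {"type": "shell"}],
    }
    if low_latency:
        # 'none' skips the reasoning phase for latency-sensitive calls,
        # per the release notes.
        payload["reasoning"] = {"effort": "none"}
    return payload
```

The payload can then be sent with the official SDK or any HTTP client; building it separately makes the latency/quality tradeoff an explicit flag at the call site.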
OpenAI releases GPT-5.1 Instant and GPT-5.1 Thinking, upgraded versions of the GPT-5 series with improved conversational abilities, better instruction following, adaptive reasoning, and enhanced tone controls. The models are rolling out to ChatGPT users starting with paid subscribers, with API availability coming later this week.
Google releases Gemini 2.5 Computer Use model via the Gemini API, enabling developers to build AI agents that can interact with user interfaces through clicking, typing, and scrolling. The model outperforms alternatives on web and mobile control benchmarks with lower latency and is available in preview on Google AI Studio and Vertex AI.
OpenAI releases GPT-5-Codex, an optimized version of GPT-5 specialized for agentic software engineering tasks, available via API and across Codex's integrated development environment with improved code review capabilities and long-form task execution.
OpenAI releases GPT-5 in their API platform, a state-of-the-art model achieving 74.9% on SWE-bench Verified and excelling at coding, agentic tasks, and long-context reasoning. The release includes three model sizes (gpt-5, gpt-5-mini, gpt-5-nano) and new API features like verbosity control, minimal reasoning mode, and custom tools.
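The three sizes and the new controls can be sketched as a payload builder. Parameter names (`text.verbosity`, `reasoning.effort: "minimal"`) are assumptions drawn from the announcement's feature list; confirm them against the current API reference:

```python
def gpt5_request(prompt: str, size: str = "gpt-5-mini") -> dict:
    """Hypothetical Responses API payload showing the new GPT-5 controls.

    Field names are assumptions based on the announced features
    (verbosity control, minimal reasoning mode); verify before use.
    """
    assert size in {"gpt-5", "gpt-5-mini", "gpt-5-nano"}
    return {
        "model": size,
        "input": prompt,
        # Verbosity trades answer length against cost.
        "text": {"verbosity": "low"},
        # 'minimal' reasoning favors speed over deliberation.
        "reasoning": {"effort": "minimal"},
    }
```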
OpenAI announces GPT-5, their most advanced model yet, unifying capabilities from GPT-4o, o-series reasoning, agents, and advanced math, with immediate rollout to Team users and API access for developers. The release marks a major milestone with 700 million weekly ChatGPT users and 5 million paid business users already leveraging OpenAI's technology.
DeepMind introduces AlphaGenome, an AI model that predicts how DNA sequence variants impact gene regulation and biological processes across diverse cell types and tissues. The model processes up to 1 million base pairs and is available via API for non-commercial research, with the full paper published in Nature.
Google releases early access to Gemini 2.5 Pro Preview (I/O edition) with significantly improved coding capabilities for building interactive web apps, now leading the WebDev Arena Leaderboard with a +147 Elo improvement.
Google announces Gemini 2.5 Flash, a new hybrid reasoning model available in preview through the Gemini API. The model features toggleable thinking capabilities, fine-grained thinking budgets for quality-cost-latency tradeoffs, and maintains fast inference speeds while improving performance over 2.0 Flash.
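The toggleable thinking and fine-grained budgets can be sketched as a request builder. The key names mirror the google-genai SDK's `ThinkingConfig`; treat them as assumptions and confirm against the current Gemini API docs:

```python
def gemini_flash_request(prompt: str, thinking_budget: int = 0) -> dict:
    """Sketch of a Gemini 2.5 Flash request with a thinking budget.

    Key names are assumptions mirroring the google-genai SDK; verify
    against the current Gemini API documentation.
    """
    return {
        "model": "gemini-2.5-flash",
        "contents": prompt,
        "config": {
            # A budget of 0 toggles thinking off; a larger token budget
            # buys quality at the cost of latency and spend.
            "thinking_config": {"thinking_budget": thinking_budget},
        },
    }
```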
OpenAI launches GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano models via API with major improvements in coding (54.6% on SWE-bench), instruction following, and 1M token context windows at lower costs. GPT-4.5 Preview will be deprecated on July 14, 2025.
OpenAI introduces next-generation audio models for the API, including improved speech-to-text (gpt-4o-transcribe, gpt-4o-mini-transcribe) and customizable text-to-speech models that enable developers to build more intelligent and expressive voice agents with enhanced accuracy across challenging scenarios.
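Selecting between the two transcription models is a cost/accuracy choice at request time. A sketch of the request fields, assuming the layout of the existing /v1/audio/transcriptions endpoint (the model IDs come from the announcement; the field layout is an assumption):

```python
def transcription_request(audio_path: str, mini: bool = False) -> dict:
    """Hypothetical form fields for OpenAI's speech-to-text endpoint.

    Field layout mirrors the existing /v1/audio/transcriptions endpoint
    and is an assumption; only the model IDs come from the announcement.
    """
    return {
        # The mini variant trades some accuracy for lower cost.
        "model": "gpt-4o-mini-transcribe" if mini else "gpt-4o-transcribe",
        "file": audio_path,  # sent as a multipart upload by an HTTP client
        "response_format": "text",
    }
```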
Google expands Gemini 2.0 Flash native image generation capabilities to all developers, enabling multimodal text and image output for storytelling, conversational image editing, and applications requiring world understanding and text rendering.
OpenAI launches new tools for building agents including the Responses API, built-in tools (web search, file search, computer use), Agents SDK, and observability features designed to simplify agentic application development.
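Wiring the built-in tools into a Responses API call can be sketched as follows. The tool `type` strings are assumptions based on the announcement's tool names, and the vector store ID is a placeholder; verify exact names in the API reference:

```python
def agent_request(prompt: str) -> dict:
    """Sketch of a Responses API call using the built-in agent tools.

    Tool type strings are assumptions from the announced tool names;
    the vector store ID is a placeholder, not a real resource.
    """
    return {
        "model": "gpt-4o",
        "input": prompt,
        "tools": [
            {"type": "web_search"},
            # file_search requires an existing vector store to query.
            {"type": "file_search", "vector_store_ids": ["vs_example"]},
        ],
    }
```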
Google announces general availability of Gemini 2.0 Flash-Lite with improved performance over 1.5 Flash, simplified pricing, and a 1 million token context window. The model is now available in Google AI Studio and Vertex AI for production use, with developers already building voice AI, data analytics, and video editing applications.
Google announces general availability of Gemini 2.0 Flash via API, introduces experimental Gemini 2.0 Pro for advanced coding and reasoning tasks, and releases Gemini 2.0 Flash-Lite as a cost-efficient option. All models support multimodal input with text output and are available through Google AI Studio, Vertex AI, and the Gemini app.
OpenAI releases o3-mini, a cost-efficient reasoning model with strong STEM capabilities, available in ChatGPT and API with support for function calling, structured outputs, and three reasoning effort levels. The model matches o1 performance in math and coding while being faster and cheaper, with free plan users gaining access to a reasoning model for the first time.
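The three reasoning effort levels map to a single request parameter. A sketch assuming the parameter name `reasoning_effort` on the Chat Completions endpoint (the three levels come from the announcement; the exact parameter name should be confirmed in the API reference):

```python
def o3_mini_request(messages: list, effort: str = "medium") -> dict:
    """Hypothetical Chat Completions payload for o3-mini.

    The parameter name 'reasoning_effort' is an assumption; the three
    levels come from the announcement.
    """
    assert effort in {"low", "medium", "high"}
    return {
        "model": "o3-mini",
        "messages": messages,
        # Higher effort spends more reasoning tokens for harder STEM tasks.
        "reasoning_effort": effort,
    }
```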
OpenAI releases o1 model to API with production-ready features including function calling, structured outputs, vision capabilities, and 60% lower latency than o1-preview. Additional developer tools include Realtime API improvements, Preference Fine-Tuning, and new Go and Java SDKs.