Tag
OpenAI's GPT-5 Pro helped immunologist Derya Unutmaz solve a three-year-old mystery about how glucose affects T cell specialization by suggesting that deoxyglucose interferes with IL-2 protein construction, leading to increased inflammatory Th17 cells.
Multiple AI model releases are delayed: GPT-5.6 now expected mid-July, DeepMind's 3.5 Pro postponed, while OpenAI's Bidi voice model and Claude Sonnet 5 for enterprises see progress.
A user expresses disappointment with GPT-5.6, claiming it is not better than GLM-5.2.
A user asks which upcoming OpenAI feature is more exciting: the rumored GPT-5.6 model or bidirectional voice mode (BiDi), which allows real-time simultaneous listening and speaking.
A comparison of various GPT and Claude Opus model versions on the Minebench (Minecraft) benchmark, with detailed judgments between GPT-5.5 and Fable 5 on specific builds.
This article provides a comprehensive introduction to the features and usage of the OpenAI Codex desktop application, including project management, skill/plugin system, automation, and multi-task parallel development strategies. It also offers practical cases and risk warnings, aiming to help users efficiently utilize AI agents for parallel development.
Jiayuan Zhang shared his initial experience with the M3 model's coding ability, stating that it is a qualitative improvement compared to m2.7, but the 1-shot results are not as comprehensive as Opus 4.6/4.7 and GPT5.5.
Expresses excitement over the upcoming competition between GPT-5.6 and Mythos, asserting GPT-5.6 will outperform on price/performance.
YacineMTB argues that GPT 5.5 (likely a typo) surpasses Anthropic's Opus models, suggesting users are switching away from Opus. Dylan Field criticizes Opus 4.8 for degraded curiosity and increased sycophancy.
Cerebras CFO announces that the company is internally running GPT5.4 and GPT5.5 on its chips and will release the models to the public soon, promising high-speed AI inference.
Chinese students are buying heavily discounted GPT-5.4/5.5 and Claude API access from proxy sellers on Xianyu/Taobao, enabling them to burn 100M+ tokens daily for about $1 and engage in extensive vibecoding.
GPT 5.5 fails to solve Jane Street Puzzles that its predecessor could not handle either, suggesting continued limitations in AI reasoning.
LightOn achieves GPT-5-level deep research retrieval performance using a 150M-parameter late-interaction model, a remarkable feat.
This field report argues that commercial AI hallucinations are structural issues of 'compression' rather than alignment failures, citing high error rates in GPT-5.5, Gemini 3.1, and Claude Opus 4.7. It also details a significant source code leak involving Anthropic's Claude Code, revealing internal benchmarks for unreleased models and hidden features.
Conductor has updated its default coding harness to use Codex with GPT-5.5, making it the standard agent for new users.
The paper introduces IndustryBench, a benchmark evaluating LLMs on industrial procurement QA in Chinese against national standards, highlighting safety compliance gaps. It reveals that extended reasoning often lowers safety-adjusted scores and reshuffles model rankings when safety violations are considered.
Sam Altman teases a future AI model, referred to as '5.5', describing it as having an autistic genius nature and unusual naming conventions.
A Chinese MIT student released WindsurfAPI, an open-source proxy that converts Windsurf's internal AI models (including Claude Opus 4.7 and GPT-5 series) into standard OpenAI and Anthropic compatible APIs for free, featuring account pooling and rate-limit isolation.
DHH praises the performance and efficiency of GPT-5.5 on low reasoning settings, noting it surpasses Opus and Kimi.
OpenAI has launched GPT-Realtime-2, integrating GPT-5-level reasoning into the real-time voice API, enabling voice assistants to think and solve problems in real time during conversations.