Moonshot AI founder Yang Zhilin released a 40-minute video detailing how the Kimi K2 model was trained for only $4.6 million. In a real-time programming competition among eight models, Kimi K2 took first place, defeating GPT-5.5 and others and demonstrating how a small team can overturn the traditional compute-stacking paradigm through architecture optimization.
The article discusses Google's internal strategic adjustment in the face of competition from OpenAI and Anthropic. Google saw some success with Gemini 3 but concluded that the decisive battle among large models will be fought over code-writing ability, and the adjustment reflects its urgency to catch up.
The article discusses how Thoughty Machines has significantly outperformed competitors such as GDM and OpenAI in real-time AI capabilities, or redefined what those capabilities are.
DeepSeek, a Chinese AI model built by a quant hedge fund, reportedly delivers GPT-4-level performance at roughly 5% of the training cost, causing significant market disruption, including a $600B drop in NVIDIA's market cap. A free course lasting 1 hour 50 minutes has been released that teaches users how to run DeepSeek V4 locally and via API.
Opus 4.7 has taken the #1 spot on the LLM Debate Benchmark, surpassing Sonnet 4.6 by 106 BT points with an undefeated record of 51 wins, 4 ties, and 0 losses in side-swapped matchups. The model wins by identifying and controlling the central hinge of each debate, forcing opponents to argue on its terms.
Google has formed a dedicated strike team to improve its coding AI models and is ramping up its agentic AI efforts amid competitive pressure from Anthropic. The move signals an intensifying race in AI coding capabilities among the major AI labs.