Jane Street's Head of Technology presents code that purportedly generated $13B in profit, offering a template for building your own AI-powered hedge fund.
Discussion around the new Qwen3.6-27B release examines how far MoE models have closed the performance gap with dense models, especially on coding tasks and long-context workloads, concluding that dense still leads overall.
Qwen releases Qwen3.6-27B, a 27B dense model claiming flagship-level coding performance surpassing the larger Qwen3.5-397B-A17B MoE, with impressive SVG generation demos.
The open-source Kimi K2.6 model surpasses Claude Opus 4.6 on SWE-Bench and supports 12+ hour autonomous coding sessions with 4,000+ tool calls.
A developer reports that small-active-parameter MoE models like Qwen3.6-35B-A3B exhibit lower coherence and require more guidance than the dense Qwen3.5-27B, making them hard to slot into agentic workflows.
A drop-in tool that lets AI coding agents read and use Figma-based design systems directly.
Kimi K2.6 is released as an open-source model that achieves state-of-the-art performance on long-horizon coding and agent swarm benchmarks.
Alibaba releases Qwen3.6-Max-Preview, a flagship model optimized for agentic coding tasks.
The author argues that running numerous AI agents in parallel and perpetual context-switching is overrated, advocating instead for deep focus on one or two agents at a time to produce finished, high-quality work.
Chamath Palihapitiya argues that AI agents are erasing the '10x engineer' distinction by making the most efficient code paths obvious to everyone, comparing it to how AI removed the mystery from optimal chess moves.
A 30-minute workshop by the creator of Claude Code covering 'vibe-coding' techniques and Claude usage patterns.
Alibaba releases Qwen3.6-35B-A3B-FP8, an open-weight quantized variant of Qwen3.6 with 35B parameters and 3B activated via MoE, featuring improved agentic coding capabilities and thinking preservation for iterative development.
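If the weights land on the Hugging Face Hub under the naming above, loading them should look like any other open-weight release. Here is a minimal sketch using transformers, where the repo id "Qwen/Qwen3.6-35B-A3B-FP8" is an assumption based on the announcement's naming, not a confirmed listing:

```python
# Minimal sketch: loading an open-weight FP8 MoE checkpoint with Hugging Face
# transformers. The repo id below is assumed from the announcement's naming
# and may not match the actual Hub listing.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3.6-35B-A3B-FP8"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # defer to the checkpoint's shipped quantization config
    device_map="auto",   # shard layers across available GPUs
)

prompt = "Write a Python function that parses RFC 3339 timestamps."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because only ~3B of the 35B parameters are activated per token, decode-time compute tracks the active count rather than the total, which is what makes A3B-style models attractive for local agentic loops despite the coherence concerns noted above.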
GLM-5.1 is a next-generation flagship AI model optimized for agentic engineering with significantly stronger coding capabilities, achieving state-of-the-art performance on SWE-Bench Pro and demonstrating superior long-horizon task handling through extended iteration and tool use.
OpenAI releases GPT-5.4 mini and nano, smaller, faster variants of GPT-5.4 designed for high-volume workloads, offering significant improvements in coding, reasoning, and multimodal understanding while running 2x+ faster.
OpenAI releases GPT-5.3-Codex, its most capable agentic coding model, combining frontier coding performance with advanced reasoning and featuring interactive long-running task execution plus novel safeguards for high-capability cybersecurity use.
OpenAI releases GPT-5 in their API platform, a state-of-the-art model achieving 74.9% on SWE-bench Verified and excelling at coding, agentic tasks, and long-context reasoning. The release includes three model sizes (gpt-5, gpt-5-mini, gpt-5-nano) and new API features like verbosity control, minimal reasoning mode, and custom tools.
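For the API features named above, here is a short sketch of how the new controls surface in the Responses API; the parameter shapes follow the launch documentation, but treat them as indicative rather than authoritative:

```python
# Sketch: calling GPT-5 via the OpenAI Responses API with the new controls
# mentioned in the release: verbosity control and minimal reasoning mode.
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5-mini",
    input="Summarize the failure modes of optimistic locking.",
    text={"verbosity": "low"},        # terser output than the default
    reasoning={"effort": "minimal"},  # new minimal reasoning mode
)
print(response.output_text)
```

Minimal reasoning trades some deliberation for latency, which pairs naturally with the smaller gpt-5-mini and gpt-5-nano tiers on high-volume workloads.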
OpenAI announces GPT-5 capabilities for coding and design tasks, demonstrating advanced applications of the latest model across software development and creative design workflows.
Google releases Gemini 2.5 Pro Preview (I/O edition) with significantly improved coding capabilities, ranking #1 on the WebDev Arena leaderboard for frontend development and enabling advanced features like video-to-code generation.
Google opens early access to Gemini 2.5 Pro Preview (I/O edition) with significantly improved coding capabilities for building interactive web apps; it now leads the WebDev Arena leaderboard after a +147 Elo improvement.
OpenAI launches GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano via API, with major improvements in coding (54.6% on SWE-bench Verified), instruction following, and 1M-token context windows at lower cost. GPT-4.5 Preview will be deprecated on July 14, 2025.
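As a rough illustration of what the 1M-token window enables, a single call can carry an entire codebase dump as context. The file path here is hypothetical, and in practice you would budget tokens rather than read input blindly:

```python
# Sketch: a long-context GPT-4.1 call over a large document. Model ids are
# from the launch post; the input file is a hypothetical codebase dump.
from openai import OpenAI

client = OpenAI()

with open("large_codebase_dump.txt") as f:  # illustrative input
    big_doc = f.read()

resp = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[
        {"role": "system", "content": "Answer questions about the provided code."},
        {"role": "user", "content": big_doc + "\n\nWhere is the retry logic implemented?"},
    ],
)
print(resp.choices[0].message.content)
```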