All articles, most recently crawled first.
A paper introduces a unified recipe (SU-01) that combines reverse-perplexity curriculum, two-stage reinforcement learning, and test-time scaling to achieve gold-medal-level performance on IMO and IPhO problems using a 30B-A3B backbone.
Motus Tracing is a fully open-source observability layer for AI agents that captures every model call, tool call, sandbox interaction, and error, providing a unified span model for local development and cloud deployment with zero setup cost.
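The article above doesn't include Motus Tracing's actual schema; purely as a hedged illustration, a unified span model covering model calls, tool calls, sandbox interactions, and errors might look like the following sketch (all field and type names are hypothetical, not the project's real API):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical sketch of a unified span record covering the event
# kinds the article lists: model calls, tool calls, sandbox
# interactions, and errors. Not Motus Tracing's actual schema.
@dataclass
class Span:
    trace_id: str                      # groups all spans of one agent run
    span_id: str
    parent_id: str | None              # None for the root span
    kind: str                          # "model_call" | "tool_call" | "sandbox" | "error"
    name: str                          # e.g. model name or tool name
    started_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    ended_at: datetime | None = None
    attributes: dict = field(default_factory=dict)  # tokens, cost, exit code, ...

# A model call and the tool call it triggered share a trace_id,
# so local and cloud backends can render the same trace tree.
root = Span("t-1", "s-1", None, "model_call", "gpt-4o",
            attributes={"prompt_tokens": 512})
child = Span("t-1", "s-2", "s-1", "tool_call", "web_search")
```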
An initial DFlash implementation by Zai_org has been integrated into ZML AI, with plans to include it in zml/llmd.
Santiago (@svpino) highlights MiniMax-M2.7, a 230B open-weight model that rivals top proprietary models like Opus 4.6 and GPT-5.4, achieving 440+ tokens/s inference on SambaNova at low cost.
Openclaude v0.11.0 has been released, featuring a free frontier-grade LLM accessible via OpenGateway without requiring an API key or signup.
A tutorial on agent hooks, which extend agent frameworks and CLIs with custom controls that enforce deterministic behavior in code instead of relying on prompt instructions.
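To make the idea concrete, here is a minimal, framework-agnostic sketch of the pattern: a pre-tool-call hook that deterministically vetoes or rewrites an action before it runs. All names are hypothetical, not any specific framework's API.

```python
# Minimal sketch of the hook pattern the tutorial describes:
# deterministic controls enforced in code rather than in the prompt.
# All names here are hypothetical stand-ins.
from typing import Callable

BLOCKED_TOOLS = {"delete_file", "send_email"}

def pre_tool_call_hook(tool_name: str, args: dict) -> dict:
    """Runs before every tool call; may veto or rewrite the call."""
    if tool_name in BLOCKED_TOOLS:
        raise PermissionError(f"tool '{tool_name}' is blocked by policy")
    if tool_name == "shell" and "rm -rf" in args.get("command", ""):
        raise PermissionError("destructive shell command rejected")
    return args  # returning args unchanged lets the call proceed

def run_tool(tool_name: str, args: dict,
             hooks: list[Callable[[str, dict], dict]]) -> str:
    for hook in hooks:
        args = hook(tool_name, args)   # hooks run deterministically, in order
    return f"executing {tool_name} with {args}"

print(run_tool("shell", {"command": "ls -la"}, [pre_tool_call_hook]))
```

Unlike a prompt instruction, the hook cannot be argued around by the model: the policy is enforced on every call, every time.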
Eric Jang releases AutoGo, a from-scratch tutorial for implementing AlphaGo, including code and a playable bot, demonstrating that frontier capabilities can now be replicated affordably.
Steven Brunton announces his new book 'Optimization: A Bootcamp for Machine Learning, Inverse Problems, and Control', with pre-order available and accompanying free PDF, YouTube videos, and Python code.
OpenAgents is an open platform for using and hosting language agents in everyday life, featuring agents for data analysis, plugin integration, and web browsing, with open code and a demo.
A new tool built on Claude Code enables autonomous testing of iOS apps by navigating every screen, testing flows, reading debug logs, and producing structured bug reports from a single prompt.
A user shares their experience optimizing Qwen3.6-27B inference speed on a Mac using different quantization methods (Unsloth Q5, MLX 6-bit + DFlash, MTPLX 4-bit), ultimately reaching 43 tok/s.
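For readers who want to run a similar measurement, a rough sketch using mlx_lm's load/generate helpers follows; the model repo id is a hypothetical stand-in for whichever MLX-quantized checkpoint you use, and exact speeds depend on hardware, quantization, and the mlx_lm version installed.

```python
# Rough sketch of measuring generation speed with mlx_lm on a Mac.
# The repo id below is hypothetical; substitute a real MLX-quantized
# checkpoint. Check your installed mlx_lm version for exact options.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Qwen3.6-27B-6bit")  # hypothetical repo id
text = generate(
    model,
    tokenizer,
    prompt="Explain KV caching in one paragraph.",
    max_tokens=256,
    verbose=True,  # prints prompt and generation tokens/sec
)
```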
Five years into his tenure as Amazon CEO, Andy Jassy is aggressively investing in AI infrastructure, committing billions to partnerships with OpenAI and Anthropic while cutting costs and pleasing Wall Street, steering the company through what he calls its greatest challenge yet.
Slash Financial launches Twin, an AI agent that autonomously initiates payments from business accounts, raising liability and data control concerns as agentic commerce advances.
A new AI tool generates 3D objects by writing code, producing objects with separate, functional parts rather than monolithic blobs. It is free and open-source on GitHub.
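The article doesn't show the generated code, but the reason code generation yields separable parts is straightforward: each part can be emitted as its own named object in a standard mesh format instead of one fused mesh. A hand-written sketch in OBJ format (shapes and file name invented for the example):

```python
# Illustrative sketch of why code-generated 3D objects can have
# separate functional parts: each part is written as its own named
# object ("o ...") in the OBJ file, so editors can select and move
# parts independently. Shapes and file name are made up.
def cube(cx, cy, cz, s):
    """Return the 8 corner vertices of an axis-aligned cube."""
    h = s / 2
    return [(cx + dx * h, cy + dy * h, cz + dz * h)
            for dx in (-1, 1) for dy in (-1, 1) for dz in (-1, 1)]

FACES = [(1, 2, 4, 3), (5, 7, 8, 6), (1, 5, 6, 2),
         (3, 4, 8, 7), (1, 3, 7, 5), (2, 6, 8, 4)]  # 1-based corner indices

def write_part(f, name, verts, offset):
    f.write(f"o {name}\n")                 # named part, selectable in editors
    for x, y, z in verts:
        f.write(f"v {x} {y} {z}\n")
    for face in FACES:
        f.write("f " + " ".join(str(i + offset) for i in face) + "\n")
    return offset + len(verts)

with open("lamp.obj", "w") as f:
    off = write_part(f, "base", cube(0, 0, 0, 2.0), 0)
    off = write_part(f, "pole", cube(0, 1.25, 0, 0.5), off)
```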
Four student-founded AI startups won $100,000 investments at the Cornell Tech Startup Awards, addressing AI exam fraud, financial AI safety, medical device regulation, and automated contract reasoning.
OpenAI reorganizes, making cofounder Greg Brockman permanent head of product strategy and merging ChatGPT, Codex, and its API into a unified product team, as part of a broader leadership shake-up ahead of a potential IPO.
A developer successfully ran the Gemma4 26B MoE model on an Apple MacBook Air M5 using MLX with turboquant and a custom kernel, achieving faster prompt processing and generation speeds than llama.cpp with lower memory usage. The implementation includes instructions for local deployment.
A method that dynamically allocates compute budget to hard problems achieves performance near GPT-5.4-xHigh on the HLE benchmark using a Qwen-35B-A3B backbone.
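The summary doesn't spell out the allocation rule; one common recipe with this flavor is adaptive self-consistency, where easy problems stop after a few agreeing samples and hard problems receive the remaining sampling budget. A minimal sketch, with sample() a hypothetical stand-in for an LLM call and all thresholds invented:

```python
# Hedged sketch of dynamic compute allocation via adaptive
# self-consistency: easy problems stop after a few agreeing samples,
# hard problems spend the full sampling budget. The sample()
# callable standing in for an LLM call is hypothetical.
from collections import Counter
import random

def solve_adaptive(problem, sample, min_samples=3, max_samples=32,
                   agreement=0.8):
    """Sample answers until one holds an `agreement` share of votes."""
    votes = Counter()
    for n in range(1, max_samples + 1):
        votes[sample(problem)] += 1
        if n >= min_samples:
            answer, count = votes.most_common(1)[0]
            if count / n >= agreement:
                return answer, n        # easy: stop early, save compute
    return votes.most_common(1)[0][0], max_samples  # hard: full budget

# Toy usage with fake samplers: confident on easy, noisy on hard.
easy = lambda p: "42"
hard = lambda p: random.choice(["41", "42", "43"])
print(solve_adaptive("easy problem", easy))   # stops after ~3 samples
print(solve_adaptive("hard problem", hard))   # typically runs to 32
```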
A paper introduces Orthrus, a method that injects a trainable diffusion attention module into a frozen autoregressive transformer to achieve up to 7.8× tokens per forward pass and ~6× wall-clock speedup on MATH-500, with a provably identical output distribution to the base Qwen3-8B model. The approach requires minimal additional parameters and training, and avoids the TTFT penalty of external drafters.
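The summary doesn't give Orthrus's verification rule, but the standard way draft-and-verify schemes guarantee an output distribution identical to the base model is the speculative-sampling accept/reject step (Leviathan et al., 2023), sketched below with generic stand-ins: q for a drafter (Orthrus's internal diffusion module, in the paper's setting) and p for the frozen base model.

```python
# Sketch of the standard speculative-sampling acceptance rule that
# makes multi-token drafting provably match the base model's output
# distribution. p_probs/q_probs are stand-in next-token distributions,
# not Orthrus's actual implementation.
import numpy as np

def accept_or_resample(p_probs, q_probs, draft_token, rng):
    """Accept draft_token w.p. min(1, p/q); else resample from (p - q)+."""
    p, q = p_probs[draft_token], q_probs[draft_token]
    if rng.random() < min(1.0, p / q):
        return draft_token, True
    residual = np.maximum(p_probs - q_probs, 0.0)  # leftover mass of p
    residual /= residual.sum()
    return rng.choice(len(p_probs), p=residual), False

# Toy check: over many trials the emitted tokens follow p exactly,
# even though every draft is sampled from q.
rng = np.random.default_rng(0)
p_probs = np.array([0.7, 0.2, 0.1])     # base model's next-token dist
q_probs = np.array([0.4, 0.4, 0.2])     # drafter's next-token dist
counts = np.zeros(3)
for _ in range(100_000):
    draft = rng.choice(3, p=q_probs)
    tok, _ = accept_or_resample(p_probs, q_probs, draft, rng)
    counts[tok] += 1
print(counts / counts.sum())            # ≈ [0.7, 0.2, 0.1]
```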
The article critiques Git, arguing that it is not as well designed as commonly perceived, and links to a discussion on Lobste.rs.