A live demonstration of an AI agent training a coding agent from a single prompt, with a recap of all the artifacts produced.
Two recent arXiv papers found that GPT-5.4 and Claude Opus 4.6 employ a metaprogramming strategy when handling unfamiliar programming languages: they generate the target code with Python and debug it locally rather than writing the target-language code directly. According to the papers, this strategy is key to distinguishing top-tier agents from average ones, and strategy sophistication matters more than model parameter scale.
pi-fusion is an extension of pi that improves performance at lower cost by fanning out to multiple models in parallel and fusing their results. It also supports prompt rewriting and session archiving.
aronprins announces updates to Claude Loop and Codex Loop, adding parallel wave support for faster autonomous coding workflows using Claude Code.
The sixth article in the series explains subagents in detail: what they are, how they work, and their role in coding agents, including tool-call and runtime mechanisms, as well as the scenarios suited to each subagent type (fresh child, forked child, partial fork).
Inception Labs released Mercury 2, a diffusion language model that generates roughly 1,000 tokens per second and outperforms Google's DiffusionGemma on the AIME 2026 benchmark with a score of 90% versus 69.1%, though DiffusionGemma is free and open-weight while Mercury 2 is a paid, closed-weight API model.
An article explaining how to build a Claude Code-like coding agent using LangChain's Deep Agents library, covering the architecture and implementation.
A major version update to Hermes Agent (from NousResearch) adds support for Cursor's Composer mode, requires an X Premium subscription, and significantly improves coding capabilities.
Magic Context is an open-source CortexKit plugin that gives coding agents self-managing context and long-term project memory across OpenCode and Pi, allowing persistent sessions and automatic memory capture.
Magnitude is a coding agent that runs entirely on open models, costing 60% less than Claude Code with no drop in performance. It is available via npm as a CLI tool.
An engineer at Cognition shares internal tips for using Devin, including the 'Agent Fan Out' technique where a master agent spins up parallel child agents to solve tasks independently.
Learn how to use Claude Code with GLM-5.2 via Hugging Face Inference Providers. GLM-5.2 is free for 6 hours on several providers like Together AI, Fireworks, and DeepInfra.
AMD has released GAIA 0.21.2, introducing gaia-bash, an AI-powered bash scripting assistant for writing, reviewing, testing, and debugging shell scripts on AMD hardware. It supports multiple interfaces including TUI, CLI, pipe mode, REST API, and MCP stdio server.
A fine-tuned version of Gemma-4-12B, optimized for local coding and agentic tasks, achieving ~3.5x improvement over the base model on the tau2-bench telecom benchmark.
Poolside releases Laguna M.1, a 225B parameter Mixture-of-Experts model with 23B activated parameters per token, designed for agentic coding and long-horizon tasks. It achieves competitive results on SWE-bench benchmarks and is released under an Apache 2.0 license.
A detailed analysis of three open-source tools (rtk, headroom, and caveman) designed to reduce LLM token costs for coding agents, finding that real-world savings are much lower than claimed.
pi-vcc is an open-source tool that provides pure algorithmic conversation compression for the Pi coding agent, achieving 35-99% token reduction without LLM calls, with lossless history search via vcc_recall.
This article summarizes practical experience from a Hacker News discussion about using local models (mainly Qwen 3.6 35B-A3B) as primary coding tools, covering configurations, effectiveness (roughly 50-75% of what frontier models achieve), key techniques (such as preserve_thinking), and the differing positions users take.
GLM-5.2, a 753B-parameter open-source model released under an MIT license, offers frontier-level coding capabilities and a massive context window. Its distillation potential promises significant improvements for local AI setups.
Z.AI introduces GLM-5.2, a flagship model designed for long-horizon tasks with a solid 1M-token context, improved coding capabilities, and an MIT open-source license, showing competitive performance against leading models like Opus 4.8 and GPT-5.5.