Tag
This article reviews the full architectural layering of AI agent memory as of mid-2026, including rule files, persistent profiles, historical recall, and evidence chains. It explains how each memory layer is stored, when it is loaded, and how it is governed, emphasizing the key role of memory in letting agents do work that compounds across sessions.
A thread explaining the 5 core mental shifts needed to transition from traditional software engineering to agent engineering, emphasizing why conventional patterns like hard-coded routes and binary tests fail with AI agents.
Many AI agent implementations fail because they treat agents like chatbots, relying on chat history for state rather than using deterministic data structures. The article advocates for separating reasoning (LLM), actions (tools), workflow progress (state machine), and external triggers (webhooks) to build reliable business agents.
A reflection on the landmark 'Attention Is All You Need' paper, highlighting how removing recurrence and relying solely on attention mechanisms revolutionized AI and led to modern LLMs like GPT and Claude.
A developer documents the architecture of an AI agent runtime built for a SaaS company, focusing on safety, tool execution, state management, and separation of reasoning from execution.
This paper introduces Tapered Language Models (TLMs), an architectural principle that allocates more parameters to earlier layers and fewer to later layers, consistently improving perplexity and downstream performance across multiple architectures without extra cost.
A detailed technical article explaining how modern web browsers work, focusing on Chromium's architecture including networking, parsing, rendering, and security features.
A detailed thread explaining the high-level architecture of the SWE AI coding agent, showing how a GitHub issue flows through ingestion, an orchestrator, model gateway, tools, code intelligence, sandbox environment, PR builder, guardrails, and observability to autonomously produce a pull request.
Blog post by Xingyao Wang explaining why OpenHands V1 chose a different architecture from Claude Managed Agents, arguing that reliability comes from implementation details rather than topology.
An exploration of push vs. pull memory models for AI agents, arguing that the pull model offers better scalability and efficiency for agent memory management.
The article argues that production agent harnesses should not be monolithic frameworks but rather a stack of independent, replaceable workers connected by a shared trigger primitive, outlining 15 core responsibilities and how the iii engine implements this approach.
Raymond Chen follows up on his previous article about stack limit checking on ARM64, addressing a detail about the unconventional use of the x15 register in stack probe functions and comparing register usage across multiple architectures.
WIRED profiles Stewart Brand, the 87-year-old tech visionary behind the Whole Earth Catalog, and the specially designed house he and his wife built to accommodate his aging and illness, embodying his ethos of self-sufficiency and maintenance.
Architectural Digest and WIRED collaborate on a special issue exploring how homes are changing due to climate, technology, and evolving needs, featuring insights on resilient design, smart home challenges, and aging in place.
A broad overview of OpenAI Codex, detailing its architecture (five entry points), its three layers of extensibility (MCP, Skills, Plugins), a side-by-side comparison with Claude Code, Cursor, and Devin, and seven best practices that can be adopted directly.
Introduces awesome-architecture, an open-source architecture knowledge base containing 25 real system templates (e-commerce, news feed, payments, and AI topics such as RAG and agents) plus supporting tutorials, intended to help developers shift from coding to architectural thinking.
Proposes a transformer with nonuniform, hourglass-shaped width allocation that outperforms uniform baselines in language modeling while reducing FLOPs and KV cache size.
Google shares a free, comprehensive example of a long-running AI agent that pauses, resumes, and never loses context. The example simulates new-employee onboarding and teaches three architectural patterns.
The author reflects on the paper 'Self-Revising Discovery Systems for Science', which proposes a new agentic architecture using strongly-typed DAGs, schema migrations via Kan extensions, and an MDL gate to distinguish genuine discovery from simple retrieval or search.
The author argues that an AI agent is best understood as a folder of markdown files containing business knowledge and instructions, separate from the model and harness, enabling portability between rapidly improving harnesses.