@wsl8297: Dug up a real treasure open-source project on GitHub: 12-Factor Agents. It distills the question of 'How to build AI Agent applications that feel like engineering and can go to production' into 12 core design principles, already with 11k+ stars on GitHub. These principles are not pulled out of thin air...

X AI KOLs Timeline 05/17/26, 10:56 AM Tools

ai-agents design-principles llm open-source engineering best-practices github

Summary

12-Factor Agents is an open-source project that condenses the core design principles for building production-grade AI Agent applications into 12 actionable engineering methodologies, covering key aspects like context management, tool calling, state modeling, etc., and has gained 11k+ GitHub stars.



Cached at: 05/17/26, 03:35 PM

While browsing GitHub, I stumbled upon a true treasure of an open-source project: 12-Factor Agents. It distills the question “How do you build an AI Agent application that is truly engineering-ready and production-deployable?” into 12 core design principles, and has already garnered 11k+ Stars on GitHub.

These principles aren’t the result of guesswork; they are an actionable methodology refined through in-depth conversations with hundreds of technical founders. They cover critical areas such as context management, tool invocation, state modeling, control flow design, error consolidation, and modular architecture. This isn’t about concepts—it’s about engineering practices you can directly implement.

GitHub: https://github.com/humanlayer/12-factor-agents

What you’ll gain:

12 Core Principles: End-to-end guidance from natural language interaction to tool orchestration, state management, and error handling.
Production-Grade Design Patterns: Architecture patterns and best practices derived from real-world projects.
Modular Thinking: Break agents into reusable, composable independent modules.
Practical Examples: Complete workflows plus common pitfalls and solutions.
Framework-Agnostic: Not tied to any specific tech stack—applicable no matter how you build.

Includes clear visual diagrams and in-depth explanations, perfect for learning while building. I hope it helps you take your agent from a demo to a production-ready AI product.

humanlayer/12-factor-agents

Source: https://github.com/humanlayer/12-factor-agents

12-Factor Agents - Principles for building reliable LLM applications

In the spirit of 12 Factor Apps.

The source for this project is public at https://github.com/humanlayer/12-factor-agents, and I welcome your feedback and contributions. Let’s figure this out together!

Missed the AI Engineer World’s Fair? Catch the talk here

Looking for Context Engineering? Jump straight to factor 3

Want to contribute to npx/uvx create-12-factor-agent - check out the discussion thread

Hi, I’m Dex. I’ve been hacking on AI agents for a while.

I’ve tried every agent framework out there, from the plug-and-play crew/langchains to the “minimalist” smolagents of the world to the “production grade” langgraph, griptape, etc.

I’ve talked to a lot of really strong founders, in and out of YC, who are all building really impressive things with AI. Most of them are rolling the stack themselves. I don’t see a lot of frameworks in production customer-facing agents.

I’ve been surprised to find that most of the products out there billing themselves as “AI Agents” are not all that agentic. A lot of them are mostly deterministic code, with LLM steps sprinkled in at just the right points to make the experience truly magical.

Agents, at least the good ones, don’t follow the “here’s your prompt, here’s a bag of tools, loop until you hit the goal” (source) pattern. Rather, they are comprised of mostly just software.

So, I set out to answer:

What are the principles we can use to build LLM-powered software that is actually good enough to put in the hands of production customers?

Welcome to 12-factor agents. As every Chicago mayor since Daley has consistently plastered all over the city’s major airports, we’re glad you’re here.

Special thanks to @iantbutler01, @tnm, @hellovai, @stantonk, @balanceiskey, @AdjectiveAllison, @pfbyjy, @a-churchill, and the SF MLOps community for early feedback on this guide.

The Short Version: The 12 Factors

Even if LLMs continue to get exponentially more powerful, there will be core engineering techniques that make LLM-powered software more reliable, more scalable, and easier to maintain.

How We Got Here: A Brief History of Software
Factor 1: Natural Language to Tool Calls
Factor 2: Own your prompts
Factor 3: Own your context window
Factor 4: Tools are just structured outputs
Factor 5: Unify execution state and business state
Factor 6: Launch/Pause/Resume with simple APIs
Factor 7: Contact humans with tool calls
Factor 8: Own your control flow
Factor 9: Compact Errors into Context Window
Factor 10: Small, Focused Agents
Factor 11: Trigger from anywhere, meet users where they are
Factor 12: Make your agent a stateless reducer
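To make Factor 4 a little more concrete, here is a minimal sketch of one way to read "tools are just structured outputs": the model emits JSON, and ordinary code parses, validates, and dispatches it. The intent names (`create_issue`, `done`), the dataclasses, and the helpers `parse_tool_call` and `dispatch` are all hypothetical and are not taken from the guide.

```python
import json
from dataclasses import dataclass

@dataclass
class CreateIssue:
    title: str

@dataclass
class Done:
    final_answer: str

# Hypothetical mapping from intent name to the dataclass describing its payload.
SCHEMAS = {"create_issue": CreateIssue, "done": Done}

def parse_tool_call(raw: str):
    """Turn the model's JSON text into a typed object, or raise ValueError."""
    data = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("tool call must be a JSON object")
    intent = data.pop("intent", None)
    if intent not in SCHEMAS:
        raise ValueError(f"unknown intent: {intent!r}")
    try:
        return SCHEMAS[intent](**data)
    except TypeError as exc:  # missing or unexpected fields
        raise ValueError(f"bad payload for {intent}: {exc}") from exc

def dispatch(call):
    # Plain deterministic code decides what each structured output means.
    if isinstance(call, CreateIssue):
        return f"created issue: {call.title}"
    if isinstance(call, Done):
        return call.final_answer
    raise ValueError(f"unhandled call: {call!r}")

# Stand-in for text returned by a model.
raw = '{"intent": "create_issue", "title": "Fix login bug"}'
outcome = dispatch(parse_tool_call(raw))
```

In this reading the model never executes anything itself. It only proposes a typed value, so validation and side effects stay in code you control.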


How we got here

For a deeper dive on my agent journey and what led us here, check out A Brief History of Software - a quick summary here:

The promise of agents

We’re gonna talk a lot about Directed Graphs (DGs) and their Acyclic friends, DAGs. I’ll start by pointing out that…well…software is a directed graph. There’s a reason we used to represent programs as flow charts.

010-software-dag

From code to DAGs

Around 20 years ago, we started to see DAG orchestrators become popular. We’re talking classics like Airflow, Prefect, some predecessors, and some newer ones like dagster, inngest, windmill. These followed the same graph pattern, with the added benefit of observability, modularity, retries, administration, etc.

015-dag-orchestrators
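As a rough illustration of the graph pattern with retries, here is a toy runner built on Python's standard `graphlib`. It is only a sketch and not how any of the orchestrators named above work internally; `run_dag`, the three task functions, and the `max_retries` parameter are hypothetical.

```python
from graphlib import TopologicalSorter

def run_dag(tasks, deps, max_retries=2):
    """Run each task after its dependencies, retrying a task that raises."""
    results = {}
    # static_order() yields every node after all of its predecessors.
    for name in TopologicalSorter(deps).static_order():
        for attempt in range(max_retries + 1):
            try:
                results[name] = tasks[name](results)
                break
            except Exception:
                if attempt == max_retries:
                    raise  # out of retries, surface the failure
    return results

# Hypothetical three-step pipeline: extract -> transform -> load.
calls = {"transform": 0}

def extract(results):
    return [1, 2, 3]

def transform(results):
    calls["transform"] += 1
    if calls["transform"] == 1:
        raise RuntimeError("transient failure")  # fails once, then succeeds
    return [x * 10 for x in results["extract"]]

def load(results):
    return sum(results["transform"])

tasks = {"extract": extract, "transform": transform, "load": load}
# Each key maps to the set of tasks it depends on.
deps = {"extract": set(), "transform": {"extract"}, "load": {"transform"}}

results = run_dag(tasks, deps)
```

Every node and edge here is written by hand, which is the part the agent approach described next promises to remove.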

The promise of agents

I’m not the first person to say this, but my biggest takeaway when I started learning about agents was that you get to throw the DAG away. Instead of software engineers coding each step and edge case, you can give the agent a goal and a set of transitions:

025-agent-dag

And let the LLM make decisions in real time to figure out the path

026-agent-dag-lines

The promise here is that you write less software, you just give the LLM the “edges” of the graph and let it figure out the nodes. You can recover from errors, you can write less code, and you may find that LLMs find novel solutions to problems.

Agents as loops

As we’ll see later, it turns out this doesn’t quite work. Let’s dive one step deeper - with agents you’ve got a loop of three steps, repeated until the LLM signals that it is finished:

LLM determines the next step in the workflow, outputting structured json (“tool calling”)
Deterministic code executes the tool call
The result is appended to the context window
Repeat until the next step is determined to be “done”

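initial_event = {"message": "..."}

async def run_agent(llm, execute_step, initial_event):
    # llm and execute_step are supplied by the caller.
    # The context starts as just the triggering event.
    context = [initial_event]

    while True:
        # The LLM picks the next step as structured output (a tool call).
        next_step = await llm.determine_next_step(context)
        context.append(next_step)

        # Stop once the LLM decides the work is finished.
        if next_step.intent == "done":
            return next_step.final_answer

        # Deterministic code executes the tool call; the result joins the context.
        result = await execute_step(next_step)
        context.append(result)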

Our initial context is just the starting event (maybe a user message, maybe a cron fired, maybe a webhook, etc), and we ask the llm to choose the next step (tool) or to determine that we’re done. Here’s a multi-step example:

027-agent-loop-animation (https://github.com/user-attachments/assets/3beb0966-fdb1-4c12-a47f-ed4e8240f8fd)

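For readers who want to step through the loop without a model, here is a runnable variant under some assumptions: `ScriptedLLM` replays a fixed list of steps instead of calling an LLM, the `add` tool is invented, and `max_steps` is an added guard that the loop described above does not have.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Step:
    intent: str
    args: dict = field(default_factory=dict)
    final_answer: Optional[str] = None

class ScriptedLLM:
    """Stand-in for a model: replays a fixed list of steps."""
    def __init__(self, steps):
        self._steps = iter(steps)

    async def determine_next_step(self, context):
        return next(self._steps)

async def execute_step(step):
    # Deterministic tool implementations, keyed by intent.
    if step.intent == "add":
        return {"result": step.args["a"] + step.args["b"]}
    raise ValueError(f"unknown intent: {step.intent}")

async def run_agent(llm, initial_event, max_steps=10):
    context = [initial_event]
    for _ in range(max_steps):  # bounded, unlike a bare `while True`
        next_step = await llm.determine_next_step(context)
        context.append(next_step)
        if next_step.intent == "done":
            return next_step.final_answer, context
        context.append(await execute_step(next_step))
    raise RuntimeError("agent did not finish within max_steps")

llm = ScriptedLLM([
    Step("add", {"a": 2, "b": 3}),
    Step("done", final_answer="2 + 3 = 5"),
])
answer, context = asyncio.run(run_agent(llm, {"message": "what is 2 + 3?"}))
```

After the run, `context` holds the initial event, the `add` step, its result, and the final `done` step, in that order. The scripted answer is hard-coded, so this shows the control flow only, not model behavior.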

Why 12-factor agents?

At the end of the day, this approach just doesn’t work as well as we want it to.

In building HumanLayer, I’ve talked to at least 100 SaaS builders (mostly technical founders) looking to make their existing product more agentic. The journey usually goes something like:

Decide you want to build an agent
Product design, UX mapping, what problems to solve
Want to move fast, so grab $FRAMEWORK and get to building
Get to 70-80% quality bar
Realize that 80% isn’t good enough for most customer-facing features
Realize that getting past 80% requires reverse-engineering the framework, prompts, flow, etc.
Start over from scratch

Random Disclaimers

DISCLAIMER: I’m not sure the exact right place to say this, but here seems as good as any: this is BY NO MEANS meant to be a dig on either the many frameworks out there, or the pretty dang smart people who work on them. They enable incredible things and have accelerated the AI ecosystem. I hope that one outcome of this post is that agent framework builders can learn from the journeys of myself and others, and make frameworks even better. Especially for builders who want to move fast but need deep control.

DISCLAIMER 2: I’m not going to talk about MCP. I’m sure you can see where it fits in.

DISCLAIMER 3: I’m using mostly TypeScript, for reasons, but all this stuff works in Python or any other language you prefer.

Anyways back to the thing…

Design Patterns for great LLM applications

After digging through hundreds of AI libraries and working with dozens of founders, my instinct is this:

There are some core things that make agents great
Going all in on a framework and building what is essentially a greenfield rewrite may be counter-productive
There are some core principles that make agents great, and you will get most/all of them if you pull in a framework
BUT, the fastest way I’ve seen for builders to get high-quality AI software in the hands of customers is to take small, modular concepts from agent building, and incorporate them into their existing product
These modular concepts from agents can be defined and applied by most skilled software engineers, even if they don’t have an AI background

The fastest way I’ve seen for builders to get good AI software in the hands of customers is to take small, modular concepts from agent building, and incorporate them into their existing product

The 12 Factors (again)

How We Got Here: A Brief History of Software
Factor 1: Natural Language to Tool Calls
Factor 2: Own your prompts
Factor 3: Own your context window
Factor 4: Tools are just structured outputs
Factor 5: Unify execution state and business state
Factor 6: Launch/Pause/Resume with simple APIs
Factor 7: Contact humans with tool calls
Factor 8: Own your control flow
Factor 9: Compact Errors into Context Window
Factor 10: Small, Focused Agents
Factor 11: Trigger from anywhere, meet users where they are
Factor 12: Make your agent a stateless reducer
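The list gives only the titles, so the following is one possible reading of Factors 5, 6, and 12 taken together, not the guide's own implementation: the agent is a pure function from (state, event) to a new state, and because that state is plain data, pause and resume reduce to saving and loading it. The event types and the `agent_reducer` name are hypothetical.

```python
import json

def agent_reducer(state: dict, event: dict) -> dict:
    """Return a new state; never mutate the input and keep no hidden state."""
    events = state["events"] + [event]  # builds a new list
    status = state["status"]
    if event["type"] == "human_approval_requested":
        status = "paused"
    elif event["type"] == "human_response":
        status = "running"
    elif event["type"] == "done":
        status = "finished"
    return {"events": events, "status": status}

initial = {"events": [], "status": "running"}
s1 = agent_reducer(initial, {"type": "tool_result", "data": "ok"})
s2 = agent_reducer(s1, {"type": "human_approval_requested"})

# Because the state is plain data, pausing is serializing it...
saved = json.dumps(s2)
# ...and resuming is loading it and feeding in the next event.
s3 = agent_reducer(json.loads(saved), {"type": "human_response", "approved": True})
```

Here the event list doubles as execution state and business record, which is roughly what "unify execution state and business state" suggests. A real system would likely persist `saved` in a database rather than in a local variable.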

Honorable Mentions / other advice

Factor 13: Pre-fetch all the context you might need

Related Resources

Contribute to this guide here
I talked about a lot of this on an episode of the Tool Use podcast in March 2025
I write about some of this stuff at The Outer Loop
I do webinars about Maximizing LLM Performance with @hellovai
We build OSS agents with this methodology under got-agents/agents
We ignored all our own advice and built a framework for running distributed agents in kubernetes: humanlayer/kubechain
