Tag
The author describes rewriting their AI agent infrastructure on DBOS durable execution to improve reliability after cascading failures, and asks the community about similar experiences, tool choices, and build-vs-buy decisions.
This blog post argues that SQLite, combined with Litestream for async backups, provides a simple and effective approach to durable execution for many workflow systems, especially AI agents, without needing a separate orchestration tier or network database.
A tutorial on building a durable execution engine from scratch using Go and Postgres, inspired by Kubernetes the Hard Way.
Cursor shares key lessons from building cloud agents, emphasizing that providing a full development environment is critical for agent output quality, and that long-running agents require durable execution and enterprise-like infrastructure.
Google introduces Agent Executor, an open-source distributed runtime for reliable long-running agent workflows, featuring durable execution, secure isolation, and session consistency.
The article argues that AI agent development should rely on stable execution primitives rather than rigid frameworks, which change frequently as new orchestration patterns emerge. It emphasizes durable steps, persistent state, parallel coordination, event-driven flow, and observability as a way to avoid costly rewrites when best practices evolve.
Armin Ronacher (pocoo) shares his production experience with Absurd, a durable execution system built entirely on Postgres, highlighting improvements like decomposed steps, task results, and a CLI tool called absurdctl.