We hit the retry problem hard enough that we open-sourced a fix

Reddit r/AI_Agents Tools

Summary

Replaysafe is an open-source npm library that ensures idempotent retries by fingerprinting operations, preventing duplicate side effects in AI agent workflows. It integrates with popular frameworks like LangGraph and CrewAI.

If you've been running agents in production, you know the drill: agent crashes mid-task, you retry, and suddenly the customer has two charges, two welcome emails, two CRM entries. The hard part isn't retrying. It's knowing what **already** happened. We are building a small library that wraps any non-idempotent call (charging a card, sending an email, hitting an API) and fingerprints it, `hash(type + target + input)`. Before executing, it checks if that exact operation was already done. If yes, returns the cached result. If not, runs it and remembers. It has circuit breakers for retry storms and rollback hooks for partial failures. Works with LangGraph, CrewAI, Inngest, n8n, Airflow - whichever framework you're using. It's called Replaysafe. Open source (AGPL), just an npm package, no infrastructure needed. Curious what recovery patterns are working for others here, this is still early and we're learning from what people actually need.
Original Article

Similar Articles