Sotis: detect + intercept agent meltdowns (loops, edit storms) live, inside your LangGraph/ReAct loop

Reddit r/AI_Agents 06/03/26, 06:23 AM Tools

agent-loops reliability langgraph react python-library open-source tool-call-monitoring

Summary

Sotis is a Python library that detects and intervenes in agent meltdowns (loops, edit storms) inside LangGraph/ReAct loops. It uses sliding-window Shannon entropy and loop detection to spot the meltdown, then rolls back the workspace and restarts the agent so the run recovers cleanly.

Anyone who's built agents for long, multi-step tasks has watched one melt down: it loops the same tool call, floods its own context with error traces, thrashes on the same file, and spirals until the run collapses, burning tokens the whole way. I got tired of watching it happen, so I built Sotis to catch it.

Sotis is a small Python library that sits inside your agent's loop and watches the tool-call stream in real time. When it detects a meltdown (using sliding-window Shannon entropy plus exact and semantic loop detection), it doesn't just log it, it intervenes: it rolls the workspace files back to the last good checkpoint, distills the bloated context into a short resumption prompt, and restarts the agent from there.

It's training-free, adds under 0.2 ms per step, and you wire it in either as a LangGraph guard node (SotisLangGraphGuard) or as a wrapper around a custom ReAct loop (SotisGuard). It works with any OpenAI-compatible provider; I've tested OpenAI, Anthropic, Groq, OpenRouter, and local models via Ollama.

A couple of honest caveats so nobody's misled. First, it's a library for agents you build yourself, not a plugin for closed agents like Claude Code or Codex (they don't expose a hook into their loop, so the rollback can't work there). Second, it bounds the failure rather than guaranteeing success: in my live runs it reliably caught the spiral and rolled back the damage, but a weak model still won't magically finish the task; you get a clean, recoverable failure instead of an unbounded one. The default entropy threshold (1.5 bits) also false-positives on agents that legitimately use many different tools in a short window. It's a config knob, and I'm genuinely unsure 1.5 is the right default, so I'd love opinions on that.

There's a 40-second demo GIF at the top of the repo (a real Llama-3.3-70B agent getting intercepted 3 times live on a dashboard) plus raw unedited transcripts for several models.
It's based on the reliability framework in arXiv:2603.29231, is MIT licensed, and has 127 tests. Install with: pip install sotis. Would really value feedback from anyone who's fought agent loops in production.
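
To make the detection idea concrete, here is a minimal sketch of sliding-window Shannon entropy plus exact loop detection over a tool-call stream. It is not Sotis's code or API: ToolCallMonitor, observe, the window size, and max_repeats are hypothetical names and values picked for illustration, and semantic loop detection, rollback, and restart are left out entirely. The choice to flag a window when its entropy rises above the threshold is an assumption inferred from the caveat that 1.5 bits false-positives on agents using many different tools in a short window.

```python
import math
from collections import Counter, deque


def shannon_entropy(items):
    """Shannon entropy, in bits, of the empirical distribution of `items`."""
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # sum of p * log2(1/p) over the observed symbols
    return sum((c / total) * math.log2(total / c) for c in counts.values())


class ToolCallMonitor:
    """Hypothetical monitor for illustration only (not the Sotis API)."""

    def __init__(self, window=8, entropy_threshold=1.5, max_repeats=3):
        self.calls = deque(maxlen=window)      # sliding window of recent calls
        self.entropy_threshold = entropy_threshold
        self.max_repeats = max_repeats

    def observe(self, tool, args):
        """Record one tool call; return a reason string if it looks like a meltdown.

        `args` is expected to be a string, e.g. the JSON-serialized arguments.
        """
        self.calls.append((tool, args))
        recent = list(self.calls)

        # Exact loop: the last `max_repeats` calls are identical (same tool, same args).
        tail = recent[-self.max_repeats:]
        if len(tail) == self.max_repeats and len(set(tail)) == 1:
            return "exact_loop"

        # Entropy check on tool names only, once the window is full.
        # Assumption: entropy ABOVE the threshold is treated as the warning sign.
        if len(recent) == self.calls.maxlen:
            if shannon_entropy(name for name, _ in recent) > self.entropy_threshold:
                return "high_entropy"
        return None


if __name__ == "__main__":
    looping = ToolCallMonitor(window=4)
    for _ in range(3):
        verdict = looping.observe("read_file", '{"path": "a.py"}')
    print(verdict)  # exact_loop

    varied = ToolCallMonitor(window=4)
    for tool in ["search", "read_file", "write_file", "run_tests"]:
        verdict = varied.observe(tool, "{}")
    print(verdict)  # high_entropy: 4 distinct tools in a window of 4 is 2.0 bits
```

The second run shows why the threshold is a judgment call: four distinct tools in a four-call window give 2.0 bits, which crosses 1.5 even though that could be a perfectly healthy search, read, write, test sequence. A larger window or a higher threshold trades fewer false positives for slower detection.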

Similar Articles

I built a local control system for agent failures, fixes, evals, and gates to make autoresearch-style self-improvement loops work in real agent codebases

Why self-reflection ReAct loops fail on long-horizon tasks, and the AgentOS verification architecture we built to fix it.

AI agent development

Remember, Don't Re-read: Stateful ReAct Agents for Token-Efficient Autonomous Experimentation

I built a multi-agent network that mutates its own software locally. To stop infinite logic loops, I had to code a digital "suffering" threshold.
