Tag
A blog post describing how the author's production AI agent (PiQ) ended up with a broken hash chain after a server restart, and how the author built a workflow for detection, human-in-the-loop resolution, and persistent audit trails, turning the failure into a feature.
PatchBoard replaces natural-language dialogue in LLM multi-agent systems with validated JSON Patch mutations over a shared structured state, achieving higher success rates and significantly lower token usage on ALFWorld benchmarks.
This survey examines computational nondeterminism in financial AI systems, covering tabular models, graph networks, and LLM-based workflows, and proposes a layered evaluation framework for auditability.
An experienced practitioner shares hard-won lessons from deploying 25+ AI agents to production, arguing that memory, orchestration, and auditability matter far more than model choice. The article details common failure modes like context loss and silent cost loops, and recommends a stack including Claude Sonnet 4, Pydantic AI, and dedicated memory layers like Octopodas.
Yohei Nakajima publishes a paper proposing ActiveGraph, a runtime where the event log is the source of truth and agents coordinate through persistent replayable state, enabling auditability, forking, and causal lineage.
This paper introduces five governance metrics that quantify policy compliance at the decision-rationale level for LLMs in regulated financial workflows. It finds that mechanical enforcement (operating outside the model's interpretive loop) reduces non-informative deferrals by 73% and reveals a governance-task decoupling: under stress, text-only governance degrades on both the governance and task dimensions, while mechanical enforcement preserves governance quality even as task performance drops.
This paper introduces Preregistered Belief Revision Contracts (PBRC), a protocol-level mechanism for multi-agent systems (including LLM-based agents) that separates open communication from admissible belief changes by publicly fixing evidence triggers and revision operators. The work addresses dangerous conformity effects in agent deliberation and provides formal guarantees that social-only pressure cannot drive false consensus.