Tag
Sakana AI releases Fugu, a multi-agent orchestration system with only 0.6B parameters. By intelligently splitting tasks and coordinating multiple models, it achieves state-of-the-art performance while bypassing traditional parameter scaling. This marks the transition of multi-agent orchestration from a lab curiosity to a practical productivity tool.
Sakana's Fugu Ultra model orchestration system outperformed other models in a live coding test for a trading desk UI, though at 17x higher cost, demonstrating its strength in visual polish and multi-agent coordination.
Sakana Fugu, a multi-agent orchestration system, reportedly matches the performance of established systems Fable and Mythos.
This tweet introduces a 9-step guide to Claude Code Dynamic Workflows, emphasizing structured loops and best practices for multi-agent workflows, including manual review, worktree isolation, and automatic rework. It argues that these practices are the key to turning agent swarms from toys into productivity tools.
This article reviews the design highlights and shortcomings of the OpenClaw Agent framework and shares the author's experience designing a better agent framework, FastClaw, emphasizing principles such as cloud-native design, a lightweight footprint, and multi-tenancy.
A comprehensive guide to 15 AI agent design patterns for production systems, explaining when to use each pattern and common pitfalls.
Sakana AI released Fugu Ultra, a multi-agent orchestration system accessible via a single model API, achieving performance competitive with Fable and Mythos models.
Elie Bakouch critiques Sakana AI's Fugu system as a closed-source orchestration layer over closed-source models, arguing it lacks transparency and true AI sovereignty, with technical limitations in routing and cost efficiency.
Sakana AI announced Sakana Fugu, a multi-agent orchestration system accessible via a single model API, with the Fugu Ultra model matching frontier performance without export control risks.
Sakana Fugu dynamically orchestrates a diverse pool of top models to tackle complex, multi-step tasks via a single API, building on Sakana AI's ICLR 2026 papers on learned orchestration to achieve frontier-level performance without single-vendor dependency.
This article shares hard-won lessons from building real-time voice AI agents, highlighting the importance of proper turn-taking, VAD handling, billing awareness, and avoiding echo loops.
A developer built a 6-agent AI system for satellite collision avoidance in 4 days for a hackathon and shares the lessons learned.
This paper evaluates multi-agent orchestration architectures (DAG Plan and Execute, ReAct) at enterprise scale and introduces a Task Manager for continuous event-driven operation, showing improvements in latency and correctness.
Proposes Multi-Agent Transactive Memory (MATM), a framework for population-level storage and retrieval of agent-generated trajectories to improve task performance and reduce interaction steps in interactive environments like ALFWorld and WebArena.
AgentFinVQA is a multi-agent pipeline for financial chart question answering that decomposes queries into planning, OCR, legend grounding, visual inspection, and verification steps, recording each step in a traceable Model Evaluation Packet. It achieves significant accuracy gains over zero-shot baselines while enabling on-premise deployment and auditability.
This paper models multi-agent LLM deliberation as a closed-loop dynamical system in which each agent has a hidden internal belief (an anchor) that continually pulls on its opinion. It shows that this anchor can be recovered from deliberation data alone, which explains phenomena such as opinions escaping the convex hull of initial beliefs.
A solo-built multi-agent cognitive architecture uses hyperbolic geometry on a Poincaré ball manifold, variational free energy for belief updating, and wave interference for memory retrieval, allowing personality to emerge from memory interactions rather than scripting.
AgentHub is a local-first multi-agent collaborative desktop app that turns AI collaboration into a chat-like experience, supporting task distribution, file review, and human approval, built on Next.js and Electron.
Google's Gemma team released a demo for Gemma 4 26B that runs 10 parallel agents locally at 100+ tokens/second, enabling tasks like coding SVG galleries and parallel translation, all free and open-source.
This paper investigates when process-level coordination control (leadership) benefits multi-agent LLM teams, using behavioral signatures and ablations. It finds that leadership only improves accuracy under specific conditions (unreliable initial consensus, recoverable tasks, and insufficient undirected interaction), aligning with contingency theory from team science.