@rohanpaul_ai: Sakana Fugu Ultra just beat the other models on visual polish in a live trading-desk coding test, got close to GLM 5.2,…

X AI KOLs Following Models

Summary

Sakana's Fugu Ultra model orchestration system outperformed other models in a live coding test for a trading desk UI, though at 17x higher cost, demonstrating its strength in visual polish and multi-agent coordination.

Sakana Fugu Ultra just beat the other models on visual polish in a live trading-desk coding test, got close to GLM 5.2, but at 17x the cost. Test was done on atomic[.]chat, a desktop app that runs LLMs locally. Fugu produced the richest interface, with multiple panels, watchlists, charts, tape-style activity, status labels, and a more finished product feel. To note that Fugu Ultra is an orchestration layer that assembles and routes subtasks across a pool of models through one OpenAI-compatible endpoint. So Fugu is a learned coordinator model inside a multi-agent system. When you send a prompt, Fugu decides whether to answer alone or hand pieces of the job to other models, then it gathers the outputs and produces one final response.
Original Article
View Cached Full Text

Cached at: 06/23/26, 09:47 AM

Sakana Fugu Ultra just beat the other models on visual polish in a live trading-desk coding test, got close to GLM 5.2, but at 17x the cost.

Test was done on atomic[.]chat, a desktop app that runs LLMs locally.

Fugu produced the richest interface, with multiple panels, watchlists, charts, tape-style activity, status labels, and a more finished product feel.

To note that Fugu Ultra is an orchestration layer that assembles and routes subtasks across a pool of models through one OpenAI-compatible endpoint.

So Fugu is a learned coordinator model inside a multi-agent system.

When you send a prompt, Fugu decides whether to answer alone or hand pieces of the job to other models, then it gathers the outputs and produces one final response.

atomic.chat (@atomic_chat_hq): Sakana Fugu surprisingly performed near GLM 5.2 level but 17× more expensive!

We gave the same prompt to 4 models: build a complete live Trader Desk with both frontend and backend components, real-time market data fetched from external APIs for 8 symbols, and a custom dark-theme

Similar Articles

@sashimikun_void: @serenaa_ge Deepswe benchmark pls

X AI KOLs Following

Sakana AI announced Sakana Fugu, a multi-agent orchestration system accessible via a single model API, with the Fugu Ultra model matching frontier performance without export control risks.

Sakana Fugu

Hacker News Top

Sakana Fugu dynamically orchestrates a diverse pool of top models to tackle complex, multi-step tasks via a single API, leveraging their ICLR 2026 papers on learned orchestration to achieve frontier-level performance without single-vendor dependency.

Sakana Fugu (3 minute read)

TLDR AI

Sakana AI introduces AB-MCTS, an inference-time scaling algorithm that enables multiple frontier AI models (Gemini 2.5 Pro, o4-mini, DeepSeek-R1-0528) to cooperate, significantly outperforming individual models on the ARC-AGI-2 benchmark.