Sakana Fugu: Multi-Agent Orchestration via Single API

Sakana Fugu is a multi-agent system that dynamically orchestrates top models via one OpenAI-compatible API, outperforming frontier models on coding and reasoning benchmarks. It uses learned coordination from ICLR 2026 papers TRINITY and Conductor, avoiding hand-designed workflows.

3 min readJun 22, 2026

Sakana Fugu: Multi-Agent Orchestration via Single API

What Is Sakana Fugu?

Sakana Fugu is a multi-agent system that dynamically orchestrates a pool of expert models to solve complex, multi-step tasks. It exposes a single OpenAI-compatible API, so you don't need to manage multiple providers or endpoints. Fugu learns how to coordinate agents using reinforcement learning, rather than relying on human-designed workflows.

How It Works

Fugu is built on two ICLR 2026 papers: TRINITY and Conductor. TRINITY uses a lightweight evolved coordinator to assign roles (Thinker, Worker, Verifier) to different LLMs across multiple turns. Conductor uses reinforcement learning to discover natural-language coordination strategies, designing agent communication patterns and focused prompts.

This learned orchestration means Fugu can dynamically assemble agents from a pool and coordinate them through non-obvious but efficient collaboration patterns. You don't need domain knowledge to prescribe team organization or roles.

Two Models: Fugu and Fugu Ultra

Fugu comes in two variants:

Fugu: Balanced performance and latency. Ideal for everyday coding, code review (via Codex), and responsive chatbots.
Fugu Ultra: Optimized for performance. Uses a deeper pool of expert agents for hard, high-stakes problems like Kaggle competitions, paper reproduction, and cybersecurity analysis.

Both are accessible via the same API endpoint, and you can opt specific agents out to meet data, privacy, or compliance requirements.

Benchmark Performance

Fugu models surpass publicly accessible frontier models and compete with Fable 5 and Mythos Preview. Here are key benchmark scores:

Benchmark	Fugu	Fugu Ultra	Opus 4.8	Gemini 3.1 Pro	GPT 5.5
SWE Bench Pro*	59.0	73.7	69.2	54.2	58.6
TerminalBench 2.1	80.2	82.1	74.6	70.3	78.2
LiveCodeBench	92.9	93.2	87.8	88.5	85.3
Humanity's Last Exam	47.2	50.0	49.8	44.4	41.4
GPQA-D	95.5	95.5	92.0	94.3	93.6

*Fugu Ultra leads in most benchmarks, especially SWE Bench Pro (73.7) and TerminalBench 2.1 (82.1).

Qualitative Results

In an experiment where an AI agent autonomously improved a small GPT's training recipe (AutoResearch), Fugu Ultra achieved the best mean bits-per-byte (BPB) of 0.9774 ± 0.0019 over 123 experiments on a single H100 GPU, outperforming Model C (0.9781), Model B (0.9793), and Model A (0.9822). Its best single run reached 0.9748.

API Usage

Fugu provides an OpenAI-compatible API. To switch between Fugu and Fugu Ultra, just change the model name in your request:

import openai

client = openai.OpenAI(api_key=&#34;your-key&#34;, base_url=&#34;https://api.sakana.ai/v1&#34;)

response = client.chat.completions.create(
    model=&#34;fugu-ultra&#34;,  # or &#34;fugu&#34;
    messages=[{&#34;role&#34;: &#34;user&#34;, &#34;content&#34;: &#34;Write a Python function to merge two sorted lists.&#34;}]
)

print(response.choices[0].message.content)

Why It Matters

Fugu abstracts away the complexity of multi-agent orchestration. Instead of manually routing tasks to different LLMs, you get a single endpoint that learns the best collaboration pattern for each request. This reduces API complexity and improves cost-performance. It also offers flexibility in agent selection for compliance.

Availability

Currently not available in the EU/EEA due to GDPR compliance work. Available elsewhere via API.

Next Steps

Try the Fugu API today at sakana.ai/fugu/. Start with the Fugu model for everyday tasks, and switch to Fugu Ultra for high-stakes problems. If you're building multi-agent systems, this could replace your hand-rolled orchestration.

Editor's Take

I've spent months hand-crafting multi-agent pipelines with different LLMs, and it's a maintenance nightmare. Fugu's learned orchestration approach is exactly what I've been waiting for. I'll be testing it on my next code review automation project. If it delivers on the benchmarks, it could replace a lot of brittle custom code.

— DevDigest Editorial

Key Takeaways

•Use Fugu for complex multi-step tasks like code review, paper reproduction, and cybersecurity analysis.
•Switch between Fugu and Fugu Ultra by changing the model name in the API request.
•Opt out specific agents from the pool to meet data privacy or compliance requirements.

Why It Matters

Fugu eliminates the need to manually orchestrate multiple LLMs for complex tasks. It offers frontier-level performance through a single API, with learned coordination that adapts to each request. For developers building AI-powered workflows, this reduces integration overhead and improves reliability.

#ai#api#multi-agent#sakana-fugu#llm-orchestration

Get the weekly digest

Every Sunday - top tech stories, industry breakthroughs, and developer tools delivered to your inbox.

No spam, unsubscribe anytime.

Sakana Fugu: Multi-Agent Orchestration via Single API

What Is Sakana Fugu?

How It Works

Two Models: Fugu and Fugu Ultra

Benchmark Performance

Qualitative Results

API Usage

Why It Matters

Availability

Next Steps

Editor's Take

Key Takeaways

Why It Matters

Get the weekly digest

You might also like

AI-Native Orgs: The Middle Layer Is Getting Eaten

CivBench: Testing AI on Civilization VI Reveals Strategic Blind Spots

Qontour Scraped Koenig’s Book, Replaced Art with DALL-E, Added GPT-4 Word Generator

Napkin Math: B200 GPU Serves 300-800 Users with 32B LLM

CivBench: Testing AI on Civilization VI Reveals Strategic Blind Spots

Gravity SMTP Plugin Bug Exposes API Keys on 100K WordPress Sites