Independent guide. Last checked June 22, 2026.
Sakana Fugu: the multi-agent AI orchestration system that works like a single model
Sakana Fugu is a new kind of AI product: one API that coordinates a team of expert models behind the scenes. Instead of picking one model and hoping it does everything well, Sakana Fugu decides which specialist to call for each part of your task, then combines their work into one answer. This guide covers what Sakana Fugu is, how it works, what it costs, and whether it fits your needs.
What Is Sakana Fugu?
Sakana Fugu is the flagship AI product from Sakana AI, the Tokyo-based research lab. At its core, Sakana Fugu is a multi-agent orchestration system that presents itself as a single model. Instead of forcing you to manually pick one model, one provider, and one fallback strategy for every task, Sakana Fugu gives you one API that coordinates a team of specialized AI agents behind the scenes.
The simplest way to think about it: Sakana Fugu is like a smart project manager. You hand it a task. It decides whether to handle it alone or break it into pieces and delegate each piece to the best available specialist. Then it checks their work, combines the results, and returns one polished answer. All of that complexity stays hidden behind a single, familiar API.
A model that calls other models
Sakana Fugu is itself a trained language model that has learned when to delegate, which agents to call, and how to combine their work.
One API, one endpoint
You point your existing tools at the Sakana Fugu endpoint with an API key. No SDK migration is required.
Not a simple router
Unlike if-else routing rules, Fugu's coordination is learned through training, not hard-coded by a developer.
How Sakana Fugu orchestration works
Every Sakana Fugu request goes through a single intelligent decision process. Here is the lifecycle from your prompt to the final answer.
Your prompt arrives
One request hits one OpenAI-compatible endpoint, just like calling any other model.
Fugu decides
The orchestrator reads your task and decides: answer directly, or break it into subtasks.
Assemble a team
If the task is complex, Fugu picks the best specialist model for each subtask, balancing quality and cost.
Coordinate and verify
Fugu delegates work, checks results, and can even re-do steps if something looks wrong.
One polished answer
All the specialists' work is synthesized into a single, reliable response you can use.
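The lifecycle above can be pictured with a toy control-flow sketch. It is not Sakana AI's implementation: Fugu's coordinator is a trained model, whereas the hypothetical `decide`, `SPECIALISTS`, and `verify` names below use hard-coded rules purely to make the decide, delegate, verify, and synthesize steps visible.

```python
from typing import Callable

# Hypothetical specialists: plain functions standing in for expert models.
SPECIALISTS: dict[str, Callable[[str], str]] = {
    "code": lambda task: f"[code] {task}",
    "research": lambda task: f"[research] {task}",
}

def decide(prompt: str) -> list[tuple[str, str]]:
    """Return (specialist, subtask) pairs; an empty list means "answer directly".

    A real coordinator learns this decision; this is a toy keyword rule.
    """
    plan = []
    for part in prompt.split(" and "):
        kind = "code" if "bug" in part or "code" in part else "research"
        plan.append((kind, part.strip()))
    return plan if len(plan) > 1 else []

def verify(result: str) -> bool:
    """Toy check: any non-empty result counts as acceptable."""
    return bool(result.strip())

def orchestrate(prompt: str, max_retries: int = 1) -> str:
    plan = decide(prompt)
    if not plan:
        return f"[direct] {prompt}"          # simple task: no delegation
    results = []
    for kind, subtask in plan:
        result = SPECIALISTS[kind](subtask)  # delegate to a specialist
        retries = 0
        while not verify(result) and retries < max_retries:
            result = SPECIALISTS[kind](subtask)  # redo a step that looks wrong
            retries += 1
        results.append(result)
    return "\n".join(results)                # synthesize one answer

print(orchestrate("fix the bug and summarize the paper"))
```

The point of the sketch is the shape of the loop, not the rules inside it. In Fugu, according to the guide's sources, those decisions are learned rather than written by hand.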
The technology behind Sakana Fugu comes from two ICLR 2026 research papers from Sakana AI: Trinity (an evolved LLM coordinator) and The Conductor (learning to orchestrate agents using reinforcement learning). Fugu can even read its own output and decide whether to try a better coordination strategy, a capability called recursive orchestration, without any retraining.
Fugu vs Fugu Ultra
Sakana Fugu ships as two models behind the same API. Switching between them is a single parameter change. Here is how to choose.
Fugu (Balanced)
Fugu balances strong performance with low latency, making it the right default for everyday work. Use it for coding help, interactive chat, code review, and tasks where you want a quick, solid answer. You can also opt specific agents out of the pool for compliance needs.
- Everyday coding and interactive work
- Lower latency, responsive replies
- Leads on long-context reasoning tasks
- Opt agents out for compliance
Fugu Ultra (Max Quality)
Fugu Ultra is tuned for maximum answer quality on hard, multi-step problems. It coordinates a deeper pool of expert agents when accuracy and depth matter most, at the cost of response time. Early users rely on it for AI research, security analysis, and patent investigation.
- Hard, multi-step reasoning tasks
- Deeper expert-agent coordination
- Research, security, and patent work
- Stands shoulder-to-shoulder with top frontier models
Rule of thumb: default to Fugu for interactive, latency-sensitive work. Reach for Fugu Ultra when correctness and depth outweigh speed. Both are included on every subscription tier.
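The rule of thumb can be written as a small helper. This is a sketch under assumptions: `pick_model` is a hypothetical function, the only model ID this guide shows is `fugu-ultra-20260615`, and the balanced model's ID is read from a hypothetical `SAKANA_FUGU_MODEL` environment variable with a placeholder default, so confirm real IDs in the console.

```python
import os

# The only model ID that appears in this guide's example.
ULTRA_MODEL = "fugu-ultra-20260615"

def pick_model(latency_sensitive: bool, hard_multistep: bool) -> str:
    """Default to Fugu; use Ultra only when depth outweighs speed."""
    if hard_multistep and not latency_sensitive:
        return ULTRA_MODEL
    # Placeholder default: the real balanced model ID is an assumption here.
    return os.environ.get("SAKANA_FUGU_MODEL", "fugu")

print(pick_model(latency_sensitive=False, hard_multistep=True))
```

A hard task that is also latency-sensitive stays on Fugu in this sketch, which mirrors the guide's advice to default to Fugu for interactive work.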
Sakana Fugu Pricing
Sakana Fugu offers both monthly subscriptions and pay-as-you-go token billing. Every plan includes access to both Fugu and Fugu Ultra. Official pricing is subject to change, so always confirm current rates before buying.
Standard
$20/mo
Light personal usage. Includes both Fugu and Fugu Ultra. Great for individuals exploring and experimenting.
Pro
$100/mo
10x the usage of Standard. Ideal for regular coding, review, and research sessions across the week.
Max
$200/mo
20x the usage of Standard. Built for heavy, long-running workloads and frequent sessions.
Pay-as-you-go token pricing (Fugu Ultra)
When multiple agents are active, fees are never stacked: you pay a single rate based on the top-tier model involved.
| Per 1M tokens | Standard context | Above 272K context |
|---|---|---|
| Input | $5.00 | $10.00 |
| Output | $30.00 | $45.00 |
| Cached input | $0.50 | $1.00 |
These are the official per-1M-token Fugu Ultra rates; subscription usage limits are separate.
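To turn the table into a per-request figure, a rough estimator might look like the sketch below. The rates are copied from the table. Everything else is an assumption: `estimate_cost` is a hypothetical helper, and it assumes that once total input exceeds 272K tokens the higher rates apply to every token in the request, which the guide does not spell out.

```python
# Fugu Ultra pay-as-you-go rates in USD per 1M tokens, copied from the table.
RATES = {
    "standard": {"input": 5.00, "output": 30.00, "cached_input": 0.50},
    "long": {"input": 10.00, "output": 45.00, "cached_input": 1.00},
}
LONG_CONTEXT_THRESHOLD = 272_000  # tokens

def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption: when uncached plus cached input exceeds the threshold,
    the higher rates apply to the whole request.
    """
    total_input = input_tokens + cached_tokens
    tier = "long" if total_input > LONG_CONTEXT_THRESHOLD else "standard"
    rate = RATES[tier]
    cost = (
        input_tokens * rate["input"]
        + output_tokens * rate["output"]
        + cached_tokens * rate["cached_input"]
    ) / 1_000_000
    return round(cost, 6)

# 100K fresh input, 50K cached input, 5K output at standard-context rates.
print(estimate_cost(100_000, 5_000, cached_tokens=50_000))  # 0.675
```

Because output is priced at six times the input rate, long answers dominate the bill well before long prompts do.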
Launch offer: subscribe before the end of July 2026 to get a free second month at your initial subscription tier, per Sakana AI's public announcement. Confirm on the official site before committing.
Sakana Fugu Benchmarks
Sakana AI positions Fugu Ultra against the best publicly accessible frontier models and claims it stands shoulder-to-shoulder with export-controlled models like Fable 5 and Mythos Preview. These numbers come from the vendor, so use them as directional indicators, not guarantees.
| Benchmark | What it tests | Fugu | Fugu Ultra |
|---|---|---|---|
| SWE Bench Pro | Real software engineering tasks | 59.0 | 73.7 |
| LiveCodeBench v6 | Competitive coding evaluation | 92.9 | 93.2 |
| GPQA Diamond | Graduate-level scientific reasoning | 95.5 | 95.5 |
| Humanity's Last Exam | Broad hard reasoning benchmark | 47.2 | 50.0 |
| AutoResearch (BPB) | Agentic ML research (lower is better) | - | 0.9774 |
Read benchmarks critically. Independent commentators note that comparisons sometimes mix evaluation setups, and Fugu's agent pool includes undisclosed mixes of closed and open models. Treat headline numbers as a starting point, then test on your own real tasks before making a decision.
How to Use the Sakana Fugu API
Because Sakana Fugu speaks the standard OpenAI Chat Completions format, anyone already using GPT, Claude, or Gemini can swap in a Fugu endpoint with minimal changes. The official site describes it as OpenAI-compatible.
- Get access from the Sakana AI console and generate an API key.
- Set your base URL from the console; do not copy it from old blog posts.
- Start with Fugu for everyday work, then escalate difficult tasks to Fugu Ultra.
- Log tokens, latency, failures, and answer quality before scaling up.
- Use agent opt-outs if you have data, privacy, or compliance constraints.
import os

from openai import OpenAI

# Read the API key and base URL from the environment; take both from the console.
client = OpenAI(
    api_key=os.environ["SAKANA_API_KEY"],
    base_url=os.environ["SAKANA_BASE_URL"],
)

# Send one chat request to Fugu Ultra.
response = client.chat.completions.create(
    model="fugu-ultra-20260615",
    messages=[
        {"role": "user", "content": "Review this pull request and list risks."}
    ],
)
print(response.choices[0].message.content)
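To act on the advice to log tokens, latency, and failures, a thin wrapper around the call is one option. The sketch below is an assumption-laden illustration: `measured_call` is a hypothetical helper, and it assumes the response follows the OpenAI Chat Completions shape with `usage.prompt_tokens` and `usage.completion_tokens`. It is demonstrated with a fake sender so it runs without network access.

```python
import time
from types import SimpleNamespace
from typing import Any, Callable

def measured_call(send: Callable[..., Any], **kwargs: Any) -> dict:
    """Call `send(**kwargs)` and record latency, token usage, and failures.

    `send` would typically be `client.chat.completions.create`; any callable
    returning an object with a `usage` attribute works.
    """
    start = time.perf_counter()
    record: dict = {"model": kwargs.get("model"), "ok": False}
    try:
        response = send(**kwargs)
        usage = getattr(response, "usage", None)
        record.update(
            ok=True,
            input_tokens=getattr(usage, "prompt_tokens", None),
            output_tokens=getattr(usage, "completion_tokens", None),
            response=response,
        )
    except Exception as exc:  # record the failure instead of crashing the caller
        record["error"] = repr(exc)
    record["latency_s"] = time.perf_counter() - start
    return record

# Offline demonstration with a fake sender, so no request is made.
def fake_send(**kwargs: Any) -> SimpleNamespace:
    return SimpleNamespace(usage=SimpleNamespace(prompt_tokens=12, completion_tokens=34))

log = measured_call(fake_send, model="fugu-ultra-20260615", messages=[])
print(log["ok"], log["input_tokens"], log["output_tokens"])
```

Answer quality is not captured here; that still needs a human or task-specific check alongside these numbers.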
Sakana Fugu Use Cases
What early users reach for Sakana Fugu to do, drawn from Sakana AI's launch material and developer reports.
Code review and coding
Large diffs, multi-file reasoning, regression risk detection, and test suggestions. One engineer reported that Fugu Ultra surfaced more than 20 issues where typical tools found about three.
AI research and paper reproduction
Reproducing papers, running autonomous ML research loops, and experiment planning with deeper verification.
Cybersecurity analysis
Coordinated multi-model reasoning applied to vulnerability investigation where depth beats speed.
Literature and patent search
Deep, multi-step investigation across documents where careful reasoning matters more than a fast answer.
Chatbots and agent products
Stable persona maintenance across long sessions, useful for products that need consistent behavior.
Cost-aware multi-step pipelines
Best when you do not know in advance which model is best per subtask and you are cost-sensitive on a many-call pipeline.
Why multi-agent orchestration matters
Sakana Fugu is not only a performance play. Its central argument is about resilience: avoiding reliance on a single AI provider. Recent disruptions made this concrete: when export controls were imposed on certain frontier models, access to them changed overnight.
Sakana Fugu is built around an entirely swappable agent pool. If one provider restricts access, Fugu dynamically routes around the disruption. Over time the pool grows as newer, more capable models are added. By orchestrating many models, Sakana AI argues it delivers a realistic blueprint for AI flexibility: frontier capability that does not depend on any one vendor staying available.
No gatekeeping
New capable models can join the pool without needing permission from any single vendor.
Route around outages
If a provider is down or restricted, Fugu reallocates work to available agents automatically.
Future-proof access
Frontier-level results without betting your critical workflow on one company staying available.
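The swappable-pool idea can be illustrated with a toy selector. This is a conceptual sketch, not Fugu's routing, which is learned and not exposed: the `Agent` class, the `route` function, the provider labels, and the scores are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Agent:
    name: str        # hypothetical agent label
    provider: str    # hypothetical provider label
    score: float     # made-up suitability score for the current subtask
    available: bool = True

def route(pool: list[Agent], opted_out: frozenset[str] = frozenset()) -> Agent:
    """Pick the best-scoring agent that is available and not opted out."""
    candidates = [a for a in pool if a.available and a.provider not in opted_out]
    if not candidates:
        raise RuntimeError("no eligible agent left in the pool")
    return max(candidates, key=lambda a: a.score)

pool = [
    Agent("agent-a", "provider-x", 0.9, available=False),  # simulated outage
    Agent("agent-b", "provider-y", 0.8),
    Agent("agent-c", "provider-z", 0.7),
]
print(route(pool).name)                                       # agent-b
print(route(pool, opted_out=frozenset({"provider-y"})).name)  # agent-c
```

The second call shows the cost of restrictions: every outage or exclusion shrinks the candidate list and can push work to a lower-scoring agent.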
Sakana Fugu vs OpenRouter
Sakana Fugu sits in a crowded space of model routers and multi-agent frameworks. The key question is whether you want to buy orchestration or build it yourself.
Sakana Fugu
Model-shaped orchestration API with learned routing. Good when you want one endpoint and coordinated multi-agent results.
OpenRouter
Unified access to many models. Good when you want explicit provider choice and price comparison across models.
LangGraph, CrewAI, AutoGen
Frameworks for building your own agent workflows. Good when control and full observability matter most.
Single frontier model
Simple, predictable, easy to audit. Good for straightforward tasks where one model is consistently good enough.
| Approach | How routing is decided | Visibility | Best when |
|---|---|---|---|
| Sakana Fugu | Learned coordinator LLM (trained, recursive) | Black-box | You want frontier results without vendor lock-in |
| Manual / single model | You pick one model and hope | Full | Governance demands one vendor |
| If-else routers | Hand-coded rules | Full | Simple, predictable routing |
| Agent frameworks | Human-designed agent graphs | Full | You already have a hand-tuned workflow |
| OpenRouter-style | Marketplace plus price and latency routing | Partial | You want broad model access and fallback |
Sakana Fugu security, privacy, and compliance
The important question is not "Is Sakana Fugu enterprise-ready?" but rather: can you control which agents and providers participate, what data is sent, how logs are handled, and what happens when a request needs restricted content?
Sakana AI publicly positions Fugu as flexible in agent selection, including the ability to opt out of specific providers or models. This matters for regulated teams in healthcare, finance, legal, defense, and education who may need to exclude certain providers or keep sensitive workloads inside approved boundaries.
- Which providers or agents can be excluded from the model pool?
- Can sensitive prompts be separated by environment?
- What logs are retained and for how long?
- How are API keys rotated and scoped?
- Can you reproduce important outputs for audits?
- What human review is required for high-risk decisions?
Recursive orchestration: how Fugu improves its own answers
One of Sakana Fugu's most distinctive features is recursion. Because Fugu is itself a language model, it can call instances of itself, reading its own intermediate output, judging whether its first coordination strategy was effective, and launching a corrective pass if needed.
This is the mechanism behind test-time scaling in Sakana Fugu. A relatively small coordinator, by iterating on what it just produced and re-delegating, can reach answers that neither it nor any single worker could produce in a single pass. The trade-off is honest: recursive scaling means recursive cost. For easy tasks, Fugu can answer directly and cheaply. For hard tasks, it can escalate depth. The art is in Fugu learning when escalation pays off.
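The "recursive scaling means recursive cost" trade-off shows up clearly in a bounded refinement loop. The sketch below is illustrative only: `refine`, `attempt`, and `good_enough` are hypothetical names, and the judging step is a plain function rather than a learned coordinator.

```python
from typing import Callable

def refine(
    task: str,
    attempt: Callable[[str, int], str],
    good_enough: Callable[[str], bool],
    max_passes: int = 3,
) -> tuple[str, int]:
    """Re-run `attempt` until `good_enough` accepts the answer or the pass
    budget runs out.

    Returns the last answer and the number of passes used, a rough proxy
    for cost because each extra pass is another paid round of work.
    """
    answer = attempt(task, 0)
    passes = 1
    while not good_enough(answer) and passes < max_passes:
        # Feed the previous output back in, as a corrective pass would.
        answer = attempt(f"{task}\nPrevious attempt:\n{answer}", passes)
        passes += 1
    return answer, passes

# Toy worker: the first pass yields a draft, later passes yield the final answer.
toy_attempt = lambda task, n: "draft" if n == 0 else "final"
toy_check = lambda answer: answer == "final"

print(refine("summarize the report", toy_attempt, toy_check))               # ('final', 2)
print(refine("summarize the report", toy_attempt, toy_check, max_passes=1))  # ('draft', 1)
```

The `max_passes` cap is the lever: a tighter budget bounds spend but may return an answer the checker would have rejected.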
Why this matters: the architecture compounds over time. Every new capable model entering the open ecosystem can join Fugu's pool. Unlike a monolithic model that needs expensive retraining to improve, Fugu improves incrementally as the broader ecosystem does.
Getting started with Sakana Fugu
A practical path from zero to your first orchestrated request.
Create an account and get an API key
Sign up via the Sakana AI console and generate a Sakana Fugu API key. Choose a subscription tier or pay-as-you-go depending on expected volume.
Point your client at the Fugu endpoint
Set your OpenAI-compatible client's base URL to the Fugu endpoint and supply your key. No SDK migration is needed.
Pick a model
Start with Fugu for interactive work or Fugu Ultra for hard, multi-step tasks. Switching later is a one-field change.
Send a real task and measure
Run a representative workload (a code review, a research question, or a pipeline step) and compare quality, latency, and cost against your current setup.
Tune the pool and scale
If you have compliance needs, opt specific agents out of the pool. Then scale your tier to match real usage patterns.
Sakana Fugu Limitations
Do not integrate only because launch benchmarks look strong. Production fit depends on governance, latency, observability, and region support. Here are the honest trade-offs to weigh.
- Underlying routing details are not exposed the way a self-built agent graph exposes them; routing is a black box.
- Fugu Ultra can trade response time for deeper coordination, so latency varies per request.
- EU and EEA availability is blocked while compliance work continues; confirm region support.
- Simple tasks may be cheaper through direct model calls, because orchestration adds overhead.
- Vendor-reported benchmarks need validation on your own workload before you trust them.
- Per-request cost is harder to predict than a fixed single-model call, because dynamic routing varies cost.
- Fugu is a young commercial product; confirm SLAs, uptime, and data residency directly.
The road to Sakana Fugu
Fugu is the product form of a long research thread at Sakana AI, tracing back to evolutionary model merging and agentic AI research.
Evolutionary model merging
Showed diverse open-source models can be combined to produce capabilities none possessed individually.
The AI Scientist and ShinkaEvolve
Demonstrated coordinated AI agents executing the full cycle of scientific research and evolutionary search over LLM-generated programs.
AB-MCTS
Showed multiple frontier models cooperating through tree search can substantially outperform any individual model on hard reasoning.
April 2026: Fugu beta opens
Sakana AI opens beta applications for Sakana Fugu as an OpenAI-compatible API.
ICLR 2026: Trinity and Conductor published
The two papers formalize learned orchestration, the academic grounding under Fugu.
Launch: Fugu and Fugu Ultra go GA
Sakana releases Fugu Ultra, claiming parity with export-controlled frontier models.
Which Sakana Fugu plan fits you?
Match a plan to your real usage pattern. Start small, measure, then scale up only if real usage justifies it.
Solo and hobby
Standard - $20/mo
You experiment a few times a week, build side projects, or want to learn orchestration hands-on. Both models included.
Working professional
Pro - $100/mo
You run regular coding, review, and research sessions across the week and want 10x the headroom of Standard.
Heavy and team
Max - $200/mo
You run long, heavy, frequent workloads and want 20x Standard usage in a predictable monthly bill.
Spiky and enterprise
Pay-as-you-go
Your volume is bursty or very large. Token billing flexes with demand at a single rate based on the top-tier model in play.
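The plan choice reduces to simple arithmetic on the published multiples. The sketch below assumes usage can be expressed in multiples of Standard usage. The actual quota behind "1x" is not stated in this guide, and `cheapest_plan` is a hypothetical helper.

```python
# Plans from this guide: (name, monthly price in USD, usage relative to Standard).
PLANS = [("Standard", 20, 1), ("Pro", 100, 10), ("Max", 200, 20)]

def cheapest_plan(needed_usage: float) -> str:
    """Return the cheapest plan whose usage multiple covers `needed_usage`.

    `needed_usage` is measured in multiples of Standard usage. Falls back to
    pay-as-you-go when even Max is not enough.
    """
    for name, _price, multiple in sorted(PLANS, key=lambda plan: plan[1]):
        if multiple >= needed_usage:
            return name
    return "Pay-as-you-go"

for name, price, multiple in PLANS:
    print(f"{name}: ${price / multiple:.2f} per Standard-equivalent unit")

print(cheapest_plan(5))   # Pro
print(cheapest_plan(25))  # Pay-as-you-go
```

On these numbers Pro and Max cost the same per unit of usage, half of Standard's rate, so the step from Pro to Max buys headroom rather than a discount.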
Why "Fugu"? The pufferfish behind the name
Sakana means fish in Japanese, fitting for a Tokyo lab. Fugu is the Japanese pufferfish, a delicacy famous for requiring expertise to prepare safely. The metaphor works: a pufferfish inflates by drawing on what is around it, and Sakana Fugu inflates its capability by drawing on a pool of surrounding models.
Behind one calm surface, a single API, sits something that can expand dramatically when the task demands it. The branding encodes Sakana AI's core conviction that powerful AI will be collaborative ecosystems rather than isolated monoliths.
One surface, many parts
Like the pufferfish, Fugu presents a single face to the world while coordinating many internal systems.
Expands on demand
For simple tasks, Fugu stays lean. For complex tasks, it inflates its capabilities by pulling in more agents.
Sakana Fugu: myths vs reality
Myth: It is just an if-else router
Reality: Fugu is a trained language model that learned coordination strategies, not hand-coded routing logic. That is the whole point of the Trinity and Conductor research.
Myth: You need Fable 5 access to get top results
Reality: Sakana claims Fugu Ultra reaches comparable results by orchestrating publicly accessible models, even though Fable 5 and Mythos are not in its pool.
Myth: More agents always means stacked fees
Reality: On pay-as-you-go, when multiple agents are active you pay a single rate based on the top-tier model involved, with no stacking.
Myth: It is always cheaper than a single model
Reality: The savings claim depends on how often it recurses. Recursive scaling means recursive cost, so test it on your workload before trusting headline savings.
The Sakana Fugu roadmap
Where Sakana AI says Fugu is heading, based on its launch statements. Roadmap items are stated intentions and may change.
- Expand the expert-agent pool, including open models and Sakana AI's own models, to strengthen coordination for long-running tasks.
- Fast model refresh: Sakana aims to train and evaluate updated Fugu models within roughly two weeks of a new frontier model's public release.
- More user control over how Fugu works on your behalf, beyond today's agent opt-outs.
- Incremental improvement: because Fugu uses learned orchestration, it gets better as the ecosystem does, without expensive monolithic retraining.
Sakana Fugu: key takeaways
Sakana Fugu is a learned orchestration model that behaves like a single model while coordinating a pool of frontier LLMs behind one OpenAI-compatible API.
It ships as Fugu (balanced) and Fugu Ultra (max quality), switchable with one parameter, with subscription and pay-as-you-go pricing.
Sakana claims Fugu Ultra reaches frontier-level performance comparable to export-controlled models, without single-vendor lock-in.
The trade-offs are black-box routing, variable cost, and latency overhead, so validate on your own workload before committing.
Who Sakana Fugu is for
AI engineers and developers
Building coding assistants, agents, or pipelines who want frontier results without juggling multiple provider keys.
ML researchers
Running agentic research loops, paper reproduction, and hard reasoning tasks where coordinated models can outperform any single one.
Enterprises and CTOs
Concerned about export controls and single-vendor risk who need a resilient, swappable-pool hedge.
Technical founders
Shipping AI products who want one stable API surface that improves as the model ecosystem improves.
Security teams
Applying multi-model reasoning to vulnerability analysis where depth beats speed.
Analysts and IP professionals
Doing literature and patent investigation that rewards careful multi-step reasoning.
The verdict on Sakana Fugu
Sakana Fugu is one of the most interesting product bets of 2026 because it reframes the frontier question. Instead of "how do we build a bigger model," it asks "how do we build a smarter coordinator." If the learned-orchestration thesis holds at production scale, it points toward a future where capability comes from coordination rather than raw size, and where no single provider's restrictions can fully cut you off.
The honest caveats remain: routing you cannot fully audit, costs that flex with recursion, latency that orchestration inevitably adds, and a young product whose marketing claims deserve independent testing. But the direction is genuinely novel, the research grounding is real, and the resilience argument is timely.
Who should try it first: developers and researchers running multi-step coding, reasoning, and research tasks who are already comfortable with multiple models, and who want one stable API that gets better as the ecosystem does.
Sakana Fugu evaluation checklist
Use the launch benchmarks to choose what to test, then judge Sakana Fugu on your own production-shaped tasks.
Define the task class
Separate coding review, research synthesis, security analysis, patent search, and chatbot workloads before comparing models.
Choose real baselines
Compare Fugu and Fugu Ultra against your current single model, manual router, or agent framework, not against a generic leaderboard.
Run messy prompts
Include long context, missing files, contradictory instructions, and follow-up corrections. Fugu is designed for messy multi-step work.
Measure four numbers
Track answer quality, latency, token cost, and human correction rate. A better answer that arrives too late may still lose.
Check governance fit
Confirm provider opt-outs, data-use settings, regional availability, logging, and procurement requirements before production use.
Start with a canary
Route a small percentage of real tasks through Fugu first, then expand only if quality gains justify latency and cost.
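A canary split is often done with a deterministic hash so the same request or user always lands on the same side. The sketch below is one common approach, not something Sakana AI prescribes; `use_canary` and the request ID format are hypothetical.

```python
import hashlib

def use_canary(request_id: str, percent: float) -> bool:
    """Deterministically send roughly `percent`% of request IDs to the canary."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # buckets 0..9999
    return bucket < percent * 100

# Route about 5% of traffic through Fugu and keep the rest on the baseline.
sample = [f"req-{i}" for i in range(10_000)]
share = sum(use_canary(rid, 5) for rid in sample) / len(sample)
print(f"canary share: {share:.1%}")
```

Hashing rather than random sampling keeps a multi-turn session on one backend, which makes quality and latency comparisons cleaner.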
Sakana Fugu enterprise checklist
Enterprise value depends less on headline benchmark numbers and more on control, repeatability, and failure behavior.
| Question | Why it matters | Decision signal |
|---|---|---|
| Can you use it in your region? | Official FAQ says EU and EEA are not supported yet. | Skip EU production until availability changes. |
| Can you opt out providers? | Some teams cannot send data to specific underlying providers. | Fugu supports opt-outs; Fugu Ultra uses a fixed pool. |
| Can you audit routing? | Security and regulated teams may need to know which provider handled data. | Fugu does not expose full routing details today. |
| What happens on latency spikes? | Deeper orchestration can improve quality while slowing response time. | Use Fugu for interactive flows and Ultra for hard batch work. |
| How will spend be capped? | Recursive orchestration can raise token volume. | Use per-workflow budgets and log actual input/output tokens. |
| Who owns support? | AI platform incidents need a clear operator, not only a model API key. | Assign escalation paths before routing user-facing traffic. |
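The per-workflow budget suggested in the table can be enforced client-side. The sketch below is a minimal illustration: `WorkflowBudget` is a hypothetical class, and its default rates are the standard-context Fugu Ultra rates quoted earlier in this guide, which may change.

```python
class WorkflowBudget:
    """Track spend for one workflow and refuse calls once a cap is reached."""

    def __init__(self, cap_usd: float) -> None:
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def can_spend(self, estimated_usd: float) -> bool:
        """Check an estimate against the remaining budget before calling."""
        return self.spent_usd + estimated_usd <= self.cap_usd

    def record(self, input_tokens: int, output_tokens: int,
               input_rate: float = 5.00, output_rate: float = 30.00) -> float:
        """Add the actual cost of a finished call (rates are USD per 1M tokens)."""
        cost = (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
        self.spent_usd += cost
        return cost

budget = WorkflowBudget(cap_usd=1.00)
budget.record(input_tokens=100_000, output_tokens=10_000)  # costs $0.80
print(budget.can_spend(0.10), budget.can_spend(0.30))       # True False
```

Recording actual token counts from each response, rather than estimates, matters here because recursion can make real usage diverge from what the prompt size suggests.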
Tips for using Sakana Fugu well
Good orchestration still needs good operating discipline.
- Default to Fugu; escalate to Ultra when quality matters more than response time.
- Measure cost on your own workload, because recursive calls and verification steps change token usage.
- Use provider opt-outs deliberately for compliance, knowing that each restriction can reduce orchestration flexibility.
- Treat routing as a black box and design logs, fallbacks, and QA checks around observable outputs.
- Keep simple tasks simple. Cheap local or single frontier models may beat Fugu for short, clean prompts.
- Save benchmark prompts so future Fugu releases can be retested against the same decision set.
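Saving a fixed prompt set can be as simple as a JSON file kept under version control. The sketch below is one possible layout, with hypothetical function names and field names (`id`, `task_class`, `prompt`); adapt it to whatever your evaluation harness expects.

```python
import json
from pathlib import Path

def save_prompt_set(path: Path, prompts: list[dict]) -> None:
    """Write the prompt set as readable, diff-friendly JSON."""
    path.write_text(json.dumps(prompts, indent=2, ensure_ascii=False), encoding="utf-8")

def load_prompt_set(path: Path) -> list[dict]:
    """Load a previously saved prompt set for a retest."""
    return json.loads(path.read_text(encoding="utf-8"))

prompts = [
    {"id": "cr-001", "task_class": "code review",
     "prompt": "Review this pull request and list risks."},
]
```

Keeping the task class with each prompt makes it easier to compare a new Fugu release per workload type instead of as one blended score.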
Sakana Fugu source notes
This guide is independent, so it separates official facts from analysis and avoids fake authority signals.
Official product facts
Pricing, model names, OpenAI-compatible API claims, availability notes, and provider controls should be checked against Sakana AI's official product and console pages.
Benchmark caveats
The benchmark table summarizes public launch claims. Treat it as a test plan input, not a guarantee for your own codebase or research workflow.
No invented trust signals
This page does not claim fake reviews, ratings, offices, user counts, or official affiliation.
Sakana Fugu FAQ
What is Sakana Fugu in simple terms?
Sakana Fugu is a multi-agent AI orchestration system that behaves like a single model. It is a language model trained to call other models, deciding per request whether to answer alone or coordinate a team, all through one OpenAI-compatible API.
How much does Sakana Fugu cost?
Subscriptions are $20/mo (Standard), $100/mo (Pro, 10x usage), and $200/mo (Max, 20x usage), each including both Fugu and Fugu Ultra. Pay-as-you-go bills by tokens; Fugu Ultra is $5 input, $30 output, $0.50 cached input per 1M tokens, with higher rates above 272K context.
What is the difference between Fugu and Fugu Ultra?
Fugu balances performance and latency for everyday work. Fugu Ultra prioritizes answer quality on hard, multi-step reasoning by coordinating a deeper pool of agents, at the cost of speed. Switching between them is one parameter change.
Is the Sakana Fugu API OpenAI-compatible?
Yes. You point an existing OpenAI-format client at the Fugu endpoint with your API key; no SDK migration is required.
How does Sakana Fugu compare to Fable 5 and Mythos?
Sakana claims Fugu Ultra stands shoulder-to-shoulder with Fable 5 and Mythos Preview on rigorous benchmarks, while avoiding export-control risk. Neither Fable 5 nor Mythos is in Fugu's pool since they are not publicly accessible.
Is Sakana Fugu worth it?
It is worth it for teams wanting frontier results without single-vendor lock-in, running coding, reasoning, or research workloads, and able to tolerate black-box routing. It is less suited to workloads needing fully auditable routing, sub-100ms latency, or single-vendor governance.
Can I exclude certain models for compliance?
Yes. Fugu lets you opt specific agents out of its pool to meet data, privacy, and compliance constraints. This trades some flexibility for governance control.
Who makes Sakana Fugu?
Sakana AI, a Tokyo-based AI research lab. Fugu is the product form of its research on model merging, agentic AI, and learned orchestration (Trinity and Conductor, ICLR 2026).
Is SakanaFugu.com the official site?
No. SakanaFugu.com is an independent guide and is not affiliated with Sakana AI. Always confirm current pricing, model IDs, and availability on official sources.
Understanding the Sakana Fugu cost model
How to reason about what Sakana Fugu actually costs you in practice.
Sakana's economic pitch is that orchestration can be cheaper than calling the single most expensive model for everything, by using a smaller coordinator to decompose a task, calling a frontier model only for the hard subtask, running parallel workers and voting only when uncertainty is high, and avoiding the most expensive model unless necessary. The honest caveat: recursion adds cost. The savings depend on how often Ultra has to recurse to beat the underlying models.
Decompose cheaply
A small coordinator splits the task before any expensive call is made.
Escalate selectively
Frontier models are invoked only for the subtasks that truly need them.
Single-rate billing
When multiple agents run, you pay one rate based on the top-tier model; fees are not stacked.
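The "recursion adds cost" caveat can be made concrete with a break-even calculation. This is a deliberately simplified model with hypothetical function names and illustrative dollar figures, not Sakana AI's pricing logic: it treats a task as one cheap coordinator step plus a number of paid escalation passes.

```python
def orchestrated_cost(coordinator: float, escalation: float, passes: int) -> float:
    """Cost of one task: a cheap coordinator step plus `passes` paid escalations."""
    return coordinator + escalation * passes

def break_even_passes(coordinator: float, escalation: float, single_model: float) -> float:
    """Number of escalation passes at which orchestration costs the same as
    one call to the single most expensive model."""
    return (single_model - coordinator) / escalation

# Illustrative figures only (USD per task), not measured prices.
COORDINATOR, ESCALATION, SINGLE_MODEL = 0.01, 0.04, 0.10

print(orchestrated_cost(COORDINATOR, ESCALATION, passes=2) < SINGLE_MODEL)  # True
print(orchestrated_cost(COORDINATOR, ESCALATION, passes=3) < SINGLE_MODEL)  # False
```

With these made-up figures the break-even sits a little above two passes, so whether orchestration saves money depends on how often your real tasks need a third.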