
Duet vs Claude Code: Cloud Agent vs CLI
Side-by-side comparison covering pricing, tier limits, and how each tool controls per-task spend.
Guide
Duet Team

A developer ran 23 subagents on a code-quality project. Three days later, the bill hit $47,000. Not a typo. Forty-seven thousand dollars on AI-generated code reviews.
That's the extreme end. But the pattern is everywhere. A 49-subagent typescript-checks run clocked $8,000 to $15,000. Someone left Claude Code running overnight with a looping script and woke up to $6,000 gone. One developer burned through $15,000 in eight months on API billing. That same usage would have cost $800 on a Max subscription.
Then there's enterprise scale. Uber burned through its entire 2026 AI coding budget in four months. Per-engineer bills ranged from $150 to $2,000 a month. Their COO called it a "head-exploding moment." Microsoft responded by canceling most of its Claude Code licenses entirely.
The common thread in every one of these stories? People running the most expensive model for every task. Opus for commit messages. Opus for formatting. Opus for boilerplate CRUD that Haiku could handle in its sleep.
You don't need to stop using Claude Code. You need to stop using the wrong model.
Claude comes in three tiers. Each one exists for a reason.
$0.25 per million input tokens. $1.25 per million output tokens. That's 60x cheaper than Opus.
Haiku is built for speed and volume. It handles simple, well-defined tasks where reasoning depth doesn't matter. Commit messages, boilerplate generation, formatting, string manipulation, quick Q&A about your codebase. The kind of work that makes up a surprising chunk of every coding session.
Most developers never even try Haiku for these tasks. They should.
$3 per million input tokens. $15 per million output tokens. Five times cheaper than Opus, and the quality gap is smaller than most people think.
Sonnet handles 70 to 80 percent of typical development work at near-Opus quality. Landing pages, API routes, test writing, refactoring, content writing, standard code review. If the task has a clear goal and doesn't require multi-step architectural reasoning, Sonnet gets it done.
This is the model you should be using by default. Not Opus.
$15 per million input tokens. $75 per million output tokens. The most capable and the most expensive.
Opus earns its price on tasks where getting it wrong costs more than the tokens. Architecture design, debugging subtle multi-file issues, complex database migrations, security audits, framework upgrades. Tasks that require deep reasoning across long context windows.
The mistake isn't using Opus. The mistake is using Opus for everything.
Here's the cheat sheet. Every common task mapped to the right model.
For each task below, Best means it's the right model for the job. Good means it works but you're overpaying. Overkill means you're lighting money on fire.
| Task | Haiku ($0.25/$1.25) | Sonnet ($3/$15) | Opus ($15/$75) |
|---|---|---|---|
| Commit messages | Best (~$0.01) | Overkill (~$0.12) | Way overkill (~$0.60) |
| Formatting/linting | Best | Overkill | Way overkill |
| Boilerplate/CRUD | Good | Best | Overkill |
| Simple Q&A | Best | Overkill | Way overkill |
| Regex/string ops | Best | Good | Overkill |
| Task | Haiku | Sonnet ($3/$15) | Opus ($15/$75) |
|---|---|---|---|
| Landing pages/UI | Weak | Best (~$0.15-0.50) | Good but 5x cost |
| Blog/content writing | Weak | Best | Good for complex long-form |
| Test writing | Weak | Best | Good for integration suites |
| Feature development | Weak | Best (most features) | Best for multi-system features |
| Code review (standard PRs) | Weak | Best | Overkill |
| Bug fixes (clear repro) | Weak | Best | Overkill |
| Refactoring | Weak | Best (targeted) | Best for large-scale refactors |
| Task | Haiku | Sonnet | Opus ($15/$75) |
|---|---|---|---|
| Architecture design | No | Risky | Best |
| Multi-file debugging | No | Struggles | Best |
| Database migrations | No | OK for simple | Best |
| Security audits | No | Misses subtleties | Best |
| Complex reasoning (100K+ context) | No | Degrades | Best |
| Framework upgrades | No | OK for minor bumps | Best |
The pattern is clear. Most tasks fall in the simple or standard category. That means most of your token spend should be on Haiku and Sonnet, not Opus.
Don't want to think about model selection?
Duet analyzes every task and picks the right model automatically. Your code gets Opus when it needs it, Haiku when it doesn't.
Not everyone wants to build a model router. Here's every method, graded by technical skill required. Start at your level and work up.
These work for anyone, even if you've never opened a terminal.
These five steps alone will cut most developers' bills by 50% or more. If you do nothing else, do these.
You know your way around a config file. You've edited settings.json before.
claude --model claude-haiku-4 for simple tasks.You're comfortable writing bash scripts and configuring development tools.
This is a lot of plumbing to build yourself.
Duet ships with model routing, cost tracking, and usage reports out of the box. No hooks, no scripts, no maintenance.
You want automated, intelligent routing with zero manual intervention.
Let's make the savings tangible.
All-Opus approach: 2 million tokens across planning, implementation, and testing. Cost: roughly $120 to $160.
Routed approach: Planning on Opus (300K tokens, ~$25). Implementation on Sonnet (1.2M tokens, ~$22). Boilerplate and test scaffolds on Haiku (500K tokens, ~$0.75). Total: roughly $48. Savings: 65-70%.
All-Opus approach: 5 million tokens. Cost: roughly $350 to $450.
Routed approach: All writing on Sonnet (4.5M tokens, ~$72). Formatting and metadata on Haiku (500K tokens, ~$0.75). Total: roughly $73. Savings: 80%+.
All-Opus approach: 3 million tokens. Cost: roughly $200 to $250.
Routed approach: Standard PRs on Sonnet (2.4M tokens, ~$40). Complex architectural PRs on Opus (600K tokens, ~$50). Total: roughly $90. Savings: 55-65%.
In every scenario, quality stays the same on the tasks that matter. You're not cutting corners. You're cutting waste.
Want these savings without building the router?
Duet's model routing is tuned on thousands of real tasks. You get the cost savings from day one.
Building a model router works. We just showed you how. But building it is one thing. Maintaining it across model updates, pricing changes, new model releases, and evolving team workflows is another.
Duet has model routing built in. Every task gets analyzed in real-time. Simple tasks go to faster, cheaper models. Complex tasks get escalated to Opus when the stakes justify it. No configuration. No classifier to train. No cost-tracking scripts to maintain.
If you'd rather ship than build infrastructure, Duet handles model routing out of the box.
Model routing, handled.
Duet picks the right model for every task. Your team ships faster, your bill stays predictable.
No. For 70 to 80 percent of typical development tasks, Sonnet produces equivalent results. Opus shines on complex reasoning and multi-step tasks. For everything else, you're paying 5x more for the same output.
50 to 70 percent on typical development workflows. Content-heavy workflows can see 80%+ savings. The exact number depends on your task mix, but the pattern holds across every scenario we've tested.
Not if you route correctly. Each model gets full context for its task. The key is matching complexity to capability, not using the same model for everything out of habit.
Yes. Use the --model flag to specify per-session, or build a wrapper script that routes based on task type. Level 3 in our skill breakdown covers this in detail.
Extended thinking uses more tokens but produces better results on complex tasks. The routing principle still applies: route extended thinking tasks to Opus, use standard mode for Sonnet and Haiku tasks. Don't enable extended thinking on simple tasks.
For simple tasks like formatting, boilerplate, commit messages, and quick completions, yes. For anything requiring reasoning about code logic or multi-step problem solving, use Sonnet or Opus.
Duet's router has been tuned on thousands of real tasks and updates automatically with new models and pricing. Building your own works but requires ongoing maintenance, classifier tuning, and keeping up with Anthropic's model releases. Duet handles all of that automatically.

Side-by-side comparison covering pricing, tier limits, and how each tool controls per-task spend.

How Duet stacks up against OpenAI Codex for teams evaluating multi-model coding tools.

How to set up Claude Code for your team with pooled usage, shared skills, and predictable billing.

Speed up client emails and renewal follow-ups with an AI drafting system that keeps communication consistent.

Reduce carrier portal rekeying with AI that extracts ACORD data and powers automated carrier submissions across portals.

Claude Code MCP servers connect your AI agent to GitHub, Slack, Notion, databases, and more. Step-by-step setup, configuration, and the best MCP servers for coding workflows.