Claude Haiku vs Sonnet vs Opus: Pricing, Benchmarks, and Which to Pick (March 2026)

Haiku 4.5 costs $1/$5. Sonnet 4.6 costs $3/$15. Opus 4.6 costs $5/$25. Haiku scores 73.3% on SWE-bench Verified, Sonnet 79.6%, Opus 80.8%. Full three-way comparison with decision framework.

March 5, 2026 · 9 min read

Summary

Anthropic sells three tiers of Claude. Haiku is the cheapest ($1/$5 per MTok), runs at 95-150 tok/s, and scores 73.3% on SWE-bench Verified. Sonnet sits in the middle ($3/$15), runs at 52.8 tok/s, and scores 79.6%. Opus is the top tier ($5/$25), runs at 45.3 tok/s, and scores 80.8%. The sweet spot for most coding work is Sonnet 4.6. Use Haiku for high-volume automated tasks. Use Opus when you need every percentage point of accuracy on complex reasoning.

Haiku 4.5: Best for high-volume pipelines, code review, and subagent tasks where speed and cost matter more than peak accuracy.

Sonnet 4.6: Best for production coding. 97-99% of Opus quality at 60% of the cost.

Opus 4.6: Best for complex multi-file refactoring, architectural decisions, and tasks where accuracy is worth a 5x premium over Haiku.

Pricing at a Glance

| Rate | Haiku 4.5 | Sonnet 4.6 | Opus 4.6 |
|---|---|---|---|
| Input (per MTok) | $1.00 | $3.00 | $5.00 |
| Output (per MTok) | $5.00 | $15.00 | $25.00 |
| Cache write (5-min) | $1.25 | $3.75 | $6.25 |
| Cache read | $0.10 | $0.30 | $0.50 |
| Batch API input | $0.50 | $1.50 | $2.50 |
| Batch API output | $2.50 | $7.50 | $12.50 |

Haiku is 5x cheaper than Opus on output tokens. Sonnet is roughly 1.7x cheaper than Opus. For a coding session generating 50K output tokens: Haiku costs $0.25, Sonnet costs $0.75, Opus costs $1.25. At 1,000 sessions per day, the monthly difference between all-Haiku and all-Opus is $30,000.
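
The arithmetic is easy to script. A minimal sketch with the prices hard-coded from the table above (the dictionary keys are labels for this example, not API model IDs):

```python
# Per-MTok prices from the table above: (input, output) in USD.
PRICES = {
    "haiku-4.5": (1.00, 5.00),
    "sonnet-4.6": (3.00, 15.00),
    "opus-4.6": (5.00, 25.00),
}

def session_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one session, ignoring cache and batch discounts."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

# 50K output tokens per session, 1,000 sessions/day, 30-day month:
for model in PRICES:
    monthly = session_cost(model, 0, 50_000) * 1_000 * 30
    print(f"{model}: ${monthly:,.0f}/month")
# haiku-4.5: $7,500/month · sonnet-4.6: $22,500/month · opus-4.6: $37,500/month
```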

Cache reads are the great equalizer

All three models support prompt caching. A cached input costs $0.10/MTok on Haiku, $0.30/MTok on Sonnet, and $0.50/MTok on Opus. If you are sending the same codebase context across sessions, the input cost gap narrows dramatically. The output cost gap remains.
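
Caching is enabled per content block in the Messages API. A minimal sketch using the Anthropic Python SDK (the model ID and file path are placeholders; substitute your current model and context source):

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

codebase_context = open("repo_context.txt").read()  # the large, reused prefix

response = client.messages.create(
    model="claude-sonnet-4-6",  # placeholder model ID
    max_tokens=1024,
    system=[{
        "type": "text",
        "text": codebase_context,
        # Mark the shared prefix as cacheable. Later calls that resend this
        # exact prefix pay the cache-read rate instead of the full input rate.
        "cache_control": {"type": "ephemeral"},
    }],
    messages=[{"role": "user", "content": "Summarize the auth module."}],
)
# usage reports cache_creation_input_tokens / cache_read_input_tokens
print(response.usage)
```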

Coding Benchmarks

| Benchmark | Haiku 4.5 | Sonnet 4.6 | Opus 4.6 |
|---|---|---|---|
| SWE-bench Verified | 73.3% | 79.6% | 80.8% |
| Terminal-Bench 2.0 | 41.0% | 59.1% | 65.4% |
| HumanEval | 92.0% | 96.8% | 97.6% |
| OSWorld-Verified | ~60% | 72.5% | 72.7% |

On SWE-bench Verified, the Haiku-to-Opus gap is 7.5 points. On Terminal-Bench 2.0, the gap is 24.4 points. The difference between models is small on code generation tasks (HumanEval is nearly saturated) and large on agentic tasks requiring autonomous problem-solving.

Haiku 4.5 at 73.3% on SWE-bench Verified matches the performance of Claude Sonnet 4 from the previous generation. If Sonnet 4 was good enough for your coding pipeline six months ago, Haiku 4.5 is good enough now at one-third the cost.

General Benchmarks

| Benchmark | Haiku 4.5 | Sonnet 4.6 | Opus 4.6 |
|---|---|---|---|
| GPQA Diamond | ~62% | 78.2% | 83.3% |
| MATH | ~80% | ~88% | ~91% |
| MMLU Pro | ~65% | ~78% | ~82% |
| Agentic tool use | Good | Strong | Strongest |

The reasoning gap is more pronounced than the coding gap. On GPQA Diamond (PhD-level science questions), the Haiku-to-Opus spread is 21 points. This matters for tasks that require domain expertise beyond pure code generation, like scientific computing, financial modeling, or medical informatics.

Speed Comparison

| Metric | Haiku 4.5 | Sonnet 4.6 | Opus 4.6 |
|---|---|---|---|
| Output speed | 95-150 tok/s | 52.8 tok/s | 45.3 tok/s |
| Relative speed | 3x Opus | 1.2x Opus | 1x (baseline) |
| Time to first token | ~1s | ~5s | ~12s |
| 200K context support | Yes | Yes | Yes |
| 1M context (beta) | No | Yes | Yes |

Haiku runs 3x faster than Opus on raw output speed. For multi-agent pipelines where you spawn 10-20 subagents in parallel, the combination of Haiku's speed and cost makes it the natural choice for worker agents. A coordinator agent running Opus delegates to Haiku subagents, getting Opus-level planning with Haiku-level execution speed.
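
A minimal sketch of that coordinator/worker split, assuming plain Messages API calls (the model IDs, planning prompt, and task string are illustrative):

```python
import anthropic
from concurrent.futures import ThreadPoolExecutor

client = anthropic.Anthropic()
COORDINATOR = "claude-opus-4-6"  # placeholder model IDs
WORKER = "claude-haiku-4-5"

def ask(model: str, prompt: str) -> str:
    """One blocking Messages API call; returns the reply text."""
    msg = client.messages.create(
        model=model,
        max_tokens=2048,
        messages=[{"role": "user", "content": prompt}],
    )
    return msg.content[0].text

task = "Add type hints to every module in ./src"  # illustrative task

# One Opus call plans the work...
plan = ask(COORDINATOR, "Break this task into independent subtasks, one per line:\n" + task)
subtasks = [line.strip() for line in plan.splitlines() if line.strip()]

# ...then fast, cheap Haiku workers execute the subtasks in parallel.
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(lambda sub: ask(WORKER, sub), subtasks))
```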

Context Windows

All three models support 200K token context windows. Sonnet 4.6 and Opus 4.6 also support a 1M token extended context window in beta (using the context-1m-2025-08-07 header). Haiku 4.5 is limited to 200K.

Extended context (beyond 200K) uses premium pricing: $10/$37.50 per MTok on Opus, $6/$22.50 on Sonnet. For codebases that exceed 200K tokens, Sonnet's 1M context is more cost-effective than Opus's 1M context while delivering nearly identical coding performance.
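
Extended context is opt-in per request via the beta header named above. A sketch of the call shape, assuming the standard Python SDK (model ID and file path are placeholders):

```python
import anthropic

client = anthropic.Anthropic()
huge_context = open("monorepo_dump.txt").read()  # context exceeding 200K tokens

response = client.messages.create(
    model="claude-sonnet-4-6",  # placeholder ID; Haiku rejects >200K requests
    max_tokens=4096,
    messages=[{
        "role": "user",
        "content": huge_context + "\n\nMap the module dependencies.",
    }],
    # Opt in to the 1M-token context beta; tokens past 200K bill at premium rates.
    extra_headers={"anthropic-beta": "context-1m-2025-08-07"},
)
```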

When to Use Haiku

High-volume code review

Reviewing every PR in a monorepo. At $1/$5 per MTok and 95-150 tok/s, Haiku can review 100 PRs per hour at a fraction of the cost of running Sonnet or Opus on the same workload.
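
For this kind of workload, the Batch API's 50% discount stacks with Haiku's base price. A sketch of queuing one review per PR via the Message Batches endpoint (model ID, prompt, and PR data are illustrative):

```python
import anthropic

client = anthropic.Anthropic()
prs = {"pr-101": "...diff text...", "pr-102": "...diff text..."}  # PR id -> diff

# Batched requests process asynchronously (up to 24h) at half the token price.
batch = client.messages.batches.create(
    requests=[
        {
            "custom_id": pr_id,
            "params": {
                "model": "claude-haiku-4-5",  # placeholder model ID
                "max_tokens": 1024,
                "messages": [{"role": "user", "content": f"Review this diff:\n{diff}"}],
            },
        }
        for pr_id, diff in prs.items()
    ],
)
print(batch.id, batch.processing_status)  # poll until ended, then fetch results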

Subagent tasks in multi-agent systems

File search, code indexing, test execution, linting. Tasks where speed matters more than deep reasoning. Haiku's 3x speed advantage over Opus makes it ideal for the worker layer.

Code completion and suggestions

Inline code completion in IDEs where latency is critical. Haiku's ~1s time to first token, versus ~12s for Opus, makes it the only tier viable for real-time autocomplete.
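
Latency is easy to verify yourself. A sketch that streams a completion and measures time to first token (the model ID is a placeholder):

```python
import time
import anthropic

client = anthropic.Anthropic()
start = time.perf_counter()
first_token_at = None

with client.messages.stream(
    model="claude-haiku-4-5",  # placeholder model ID
    max_tokens=256,
    messages=[{"role": "user", "content": "Complete this function:\ndef fib(n):"}],
) as stream:
    for text in stream.text_stream:
        if first_token_at is None:
            # Wall-clock delay before the first streamed text chunk arrives.
            first_token_at = time.perf_counter() - start

print(f"time to first token: {first_token_at:.2f}s")
```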

Documentation generation

Generating docstrings, README files, and API documentation from code. Haiku performs well on these structured output tasks and processes large codebases quickly.

When to Use Sonnet

Feature implementation

Building new features from a spec. Sonnet's 79.6% SWE-bench score means it solves the vast majority of real coding tasks correctly on the first attempt.

Bug fixing

Given a stack trace and context, Sonnet resolves bugs at nearly the same rate as Opus. The 1.2-point SWE-bench gap is noise for single-issue fixes.

Test generation

Writing unit tests, integration tests, and E2E tests. Sonnet understands test patterns well and generates thorough coverage at a 40% discount vs Opus.

Production API backends

Cost-sensitive applications that need strong coding capability. Sonnet's $3/$15 pricing makes it viable for high-volume production workloads where Opus would be prohibitively expensive.

When to Use Opus

Multi-file refactoring

Renaming types across 15+ files, restructuring modules, migrating APIs. Opus maintains consistency across large change sets better than Sonnet.

Architecture decisions

Evaluating system design trade-offs, choosing between patterns, planning migration strategies. Opus explores more solution paths before committing.

Complex debugging

Race conditions, memory leaks, distributed system failures. Tasks requiring deep causal reasoning across multiple components. Opus's 65.4% Terminal-Bench score (vs Sonnet's 59.1%) reflects this edge.

Agent orchestration

Serving as the coordinator agent in multi-agent pipelines. Opus plans the work, Haiku executes it. The cost of one Opus call is amortized across many cheaper Haiku subagent calls.

Multi-Model Routing

The strongest pattern is not picking one model. It is routing tasks to the right tier based on complexity. Morph's API analyzes each request and selects the model that gives the best quality-to-cost ratio.

| Task type | Recommended model | Why |
|---|---|---|
| Code completion | Haiku | Speed (95-150 tok/s), cost ($0.25/session) |
| Feature implementation | Sonnet | Quality (79.6% SWE-bench), cost ($0.75/session) |
| Complex refactoring | Opus | Accuracy (80.8% SWE-bench, 65.4% Terminal-Bench) |
| Code review | Haiku | Volume (review 100 PRs/hr cheaply) |
| Test generation | Sonnet | Balance of quality and throughput |
| Architecture planning | Opus | Deep reasoning, multiple solution paths |

Teams using this routing approach typically see 40-60% cost reduction compared to all-Opus workflows with less than 2% quality degradation on aggregate task success rates.
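
To make the idea concrete, here is a toy version of such a router. It is a hand-rolled heuristic over two crude signals, not Morph's actual routing logic, which uses richer complexity analysis:

```python
# Hypothetical heuristic router: map crude complexity signals to a tier.
def route(task: str, files_touched: int) -> str:
    complex_markers = ("refactor", "architecture", "migrate", "race condition")
    if files_touched > 10 or any(m in task.lower() for m in complex_markers):
        return "opus"    # large change sets and deep-reasoning tasks
    if files_touched > 1 or any(m in task.lower() for m in ("implement", "fix", "test")):
        return "sonnet"  # feature work, bug fixes, test generation
    return "haiku"       # completions, formatting, single-file review

assert route("inline completion for editor", 0) == "haiku"
assert route("fix flaky login test", 1) == "sonnet"
assert route("refactor module boundaries", 15) == "opus"
```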

Route across all three Claude tiers automatically

Morph selects Haiku, Sonnet, or Opus per task based on complexity signals. Get Opus quality where it matters and Haiku speed everywhere else.

FAQ

What is the difference between Claude Haiku, Sonnet, and Opus?

Claude Haiku is Anthropic's fastest and cheapest model ($1/$5 per million tokens), best for high-volume tasks. Sonnet is the balanced mid-tier ($3/$15), matching or exceeding previous Opus versions. Opus is the most capable model ($5/$25), strongest on complex reasoning and multi-step coding. The current versions are Haiku 4.5, Sonnet 4.6, and Opus 4.6.

How much does each Claude model cost?

Claude Haiku 4.5: $1 input / $5 output per million tokens. Claude Sonnet 4.6: $3 input / $15 output per million tokens. Claude Opus 4.6: $5 input / $25 output per million tokens. All three offer batch API discounts of 50% and prompt caching that reduces repeated input costs by 90%.

Which Claude model is best for coding?

For most coding tasks, Sonnet 4.6 offers the best value. It scores 79.6% on SWE-bench Verified (vs Opus's 80.8%) at 40% lower cost. Opus 4.6 is worth the premium for complex multi-file refactoring and architectural reasoning. Haiku 4.5 scores 73.3% on SWE-bench and works well for code review, simple completions, and high-volume automated tasks.

How fast is each Claude model?

Claude Haiku 4.5 is the fastest at 95-150 tokens per second. Claude Sonnet 4.6 runs at approximately 52.8 tokens per second. Claude Opus 4.6 runs at approximately 45.3 tokens per second. Haiku is roughly 2-3x faster than both Sonnet and Opus on raw output speed.

Is Claude Haiku good enough for production coding?

Yes. Haiku 4.5 scores 73.3% on SWE-bench Verified, matching the performance of Claude Sonnet 4 (the previous generation mid-tier). At $1/$5 per million tokens and 95-150 tok/s, it's well-suited for code review, test generation, documentation, and as a fast subagent in multi-agent coding pipelines.

Can I mix Claude models in one application?

Yes, and this is the recommended approach for cost optimization. Route simple tasks (code completion, formatting, reviews) to Haiku, medium tasks (feature implementation, bug fixes) to Sonnet, and complex tasks (architecture, multi-file refactoring) to Opus. Morph's routing layer does this automatically based on task complexity signals.

What context window does each Claude model support?

All three current Claude models support 200K tokens by default. Sonnet 4.6 and Opus 4.6 also support a 1M token extended context window in beta (using the context-1m-2025-08-07 header). Haiku 4.5 supports 200K tokens. Extended context uses premium pricing for tokens beyond the 200K threshold.

How do Anthropic's Claude models compare to GPT and Gemini?

On SWE-bench Verified, Claude Opus 4.6 (80.8%) leads GPT-5.2 (80.0%) and Gemini 3 Pro (76.2%). Claude Sonnet 4.6 (79.6%) beats both GPT-5.2 and Gemini at a lower price point. Claude Haiku 4.5 (73.3%) competes with models 3-5x its price. Claude's main advantage in coding is consistent tool use and long-context reliability.
