Claude AI Model Comparison: Every Model from Haiku to Opus 4.6 (March 2026)

Complete comparison of all Claude AI models: Opus 4.6, Sonnet 4.6, Haiku 4.5, and previous generations. Pricing from $1 to $25 per million tokens. Benchmarks, speed, and which model to use for what.

March 5, 2026 · 1 min read

Summary

Anthropic sells three Claude tiers. Haiku 4.5 is the cheapest at $1/$5 per million tokens, runs at 95-150 tok/s, and handles most automated tasks. Sonnet 4.6 is the sweet spot at $3/$15, scoring within 1.2 points of Opus on SWE-bench Verified. Opus 4.6 is the top tier at $5/$25, leading every benchmark. For most users: start with Sonnet, upgrade to Opus for hard problems, downgrade to Haiku for volume.

  • Haiku 4.5: $1/$5 per MTok, 95-150 tok/s
  • Sonnet 4.6: $3/$15 per MTok, 52.8 tok/s
  • Opus 4.6: $5/$25 per MTok, 45.3 tok/s

Current Models (March 2026)

These are the three models you should use. Previous generations remain available but are outperformed on every metric by these current versions.

Haiku 4.5

Released October 2025

  • 73.3% SWE-bench Verified
  • 41.0% Terminal-Bench 2.0
  • 95-150 tok/s output
  • $1/$5 per MTok
  • 200K context window
  • Best for: volume, speed, subagents

Sonnet 4.6

Released February 2026

  • 79.6% SWE-bench Verified
  • 59.1% Terminal-Bench 2.0
  • 52.8 tok/s output
  • $3/$15 per MTok
  • 200K (1M beta) context
  • Best for: production coding, most tasks

Opus 4.6

Released February 2026

  • 80.8% SWE-bench Verified
  • 65.4% Terminal-Bench 2.0
  • 45.3 tok/s output
  • $5/$25 per MTok
  • 200K (1M beta) context
  • Best for: complex reasoning, architecture

Full Model History

| Model | Released | SWE-bench Verified | Status |
|---|---|---|---|
| Claude 3 Haiku | March 2024 | ~40% | Available, superseded |
| Claude 3 Sonnet | March 2024 | ~45% | Available, superseded |
| Claude 3 Opus | March 2024 | ~49% | Available, superseded |
| Claude 3.5 Sonnet | June 2024 | ~50% | Available, superseded |
| Claude 3.5 Haiku | October 2024 | ~45% | Available, superseded |
| Claude 4 Sonnet | Mid 2025 | ~72% | Available, superseded |
| Claude 4.5 Haiku | October 2025 | 73.3% | Current |
| Claude 4.5 Sonnet | Late 2025 | 77.2% | Available, superseded |
| Claude 4.6 Sonnet | February 2026 | 79.6% | Current |
| Claude 4.6 Opus | February 2026 | 80.8% | Current |

The trajectory is clear. Each generation's Sonnet matches or exceeds the previous generation's Opus. Haiku 4.5 at 73.3% outperforms Claude 4 Sonnet. Sonnet 4.6 at 79.6% exceeds Claude 4.5 Opus. This pattern makes upgrading to current models the obvious choice, as you get better performance at the same or lower price point.

Deprecation warning

Anthropic periodically deprecates older model versions. Claude 3 models and early Claude 3.5 models may be removed from the API. If you are still using these, migrate to the current generation. The newer models are strictly better.

Benchmark Comparison

| Benchmark | Haiku 4.5 | Sonnet 4.6 | Opus 4.6 |
|---|---|---|---|
| SWE-bench Verified | 73.3% | 79.6% | 80.8% |
| Terminal-Bench 2.0 | 41.0% | 59.1% | 65.4% |
| HumanEval | 92.0% | 96.8% | 97.6% |
| GPQA Diamond | ~62% | 78.2% | 83.3% |
| OSWorld-Verified | ~60% | 72.5% | 72.7% |
| MATH | ~80% | ~88% | ~91% |
| MMLU Pro | ~65% | ~78% | ~82% |

On coding-specific benchmarks (SWE-bench, HumanEval), the three models are closer together than on reasoning benchmarks (GPQA, MATH). This makes sense: code generation is a well-constrained task where smaller models can match larger ones. Open-ended reasoning favors models with larger compute budgets.

Pricing Comparison

| Pricing | Haiku 4.5 | Sonnet 4.6 | Opus 4.6 |
|---|---|---|---|
| Input (per MTok) | $1.00 | $3.00 | $5.00 |
| Output (per MTok) | $5.00 | $15.00 | $25.00 |
| Cache read (per MTok) | $0.10 | $0.30 | $0.50 |
| Batch input | $0.50 | $1.50 | $2.50 |
| Batch output | $2.50 | $7.50 | $12.50 |
| Extended context input | N/A | $6.00 | $10.00 |
| Extended context output | N/A | $22.50 | $37.50 |

The cost ratio between tiers is consistent across input and output: Haiku is 5x cheaper than Opus, and Sonnet is ~1.7x cheaper than Opus. With prompt caching enabled, cached input costs drop to $0.10/$0.30/$0.50 per MTok respectively. Caching is the single most impactful cost optimization regardless of which model you choose.
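To make the caching math concrete, here is a small sketch of per-request cost using the rates from the table above. The model keys and the 50K-token system prompt are illustrative assumptions, not official identifiers:

```python
# Rate table from the pricing section above (USD per million tokens).
# Model keys are shorthand labels, not official API model IDs.
RATES = {
    "haiku-4.5":  {"input": 1.00, "output": 5.00,  "cache_read": 0.10},
    "sonnet-4.6": {"input": 3.00, "output": 15.00, "cache_read": 0.30},
    "opus-4.6":   {"input": 5.00, "output": 25.00, "cache_read": 0.50},
}

def request_cost(model, input_tokens, output_tokens, cached_tokens=0):
    """Cost in USD for one request; cached_tokens bill at the cache-read rate."""
    r = RATES[model]
    fresh = input_tokens - cached_tokens
    return (fresh * r["input"]
            + cached_tokens * r["cache_read"]
            + output_tokens * r["output"]) / 1_000_000

# Example: a 50K-token prompt on Sonnet 4.6, uncached vs. fully cached.
uncached = request_cost("sonnet-4.6", 50_000, 1_000)
cached = request_cost("sonnet-4.6", 50_000, 1_000, cached_tokens=50_000)
print(f"${uncached:.4f} vs ${cached:.4f}")  # → $0.1650 vs $0.0300
```

At a fully cached 50K-token prompt, the input portion drops from $0.15 to $0.015, so output tokens dominate the bill.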

Speed and Latency

| Metric | Haiku 4.5 | Sonnet 4.6 | Opus 4.6 |
|---|---|---|---|
| Output (tok/s) | 95-150 | 52.8 | 45.3 |
| Time to first token | ~1s | ~5s | ~12s |
| Max effort TTFT | N/A | 102.4s | 12.3s |
| 1K token generation | ~7s | ~19s | ~22s |

Haiku is 2-3x faster than Sonnet and 3x faster than Opus on raw output speed. For latency-sensitive applications (code completion, real-time suggestions, chatbots), Haiku is the only viable option. For batch processing where total throughput matters more than per-request latency, Sonnet offers the best throughput-to-quality ratio.
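The figures above reduce to a back-of-envelope latency model: generation time is tokens divided by throughput, and total wall time adds the time to first token. This sketch assumes Haiku runs at the top of its 95-150 tok/s range:

```python
# Rough latency model from the speed table above.
# model: (time-to-first-token in seconds, output tokens/sec)
SPEED = {
    "haiku-4.5":  (1.0, 150.0),
    "sonnet-4.6": (5.0, 52.8),
    "opus-4.6":   (12.0, 45.3),
}

def gen_time(model: str, tokens: int) -> float:
    """Seconds spent streaming `tokens` output tokens."""
    return tokens / SPEED[model][1]

def total_time(model: str, tokens: int) -> float:
    """Wall-clock seconds including time to first token."""
    return SPEED[model][0] + gen_time(model, tokens)

for m in SPEED:
    print(f"{m}: {total_time(m, 1000):.1f}s for 1K tokens")
```

The 1K-generation row in the table matches the `gen_time` term; for short responses, TTFT dominates Opus's total latency.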

Which Model for What

| Use Case | Model | Reasoning |
|---|---|---|
| Code completion / autocomplete | Haiku | Speed critical, 1s TTFT |
| Feature implementation | Sonnet | Best quality-to-cost ratio |
| Bug fixing (single file) | Sonnet | Near-Opus accuracy, faster |
| Multi-file refactoring | Opus | Maintains cross-file consistency |
| Architecture decisions | Opus | Deeper reasoning, more paths explored |
| Test generation | Sonnet | 96% pass rate, 40% cheaper than Opus |
| Code review (at scale) | Haiku | High volume, low cost per review |
| Documentation | Haiku | Structured output, speed matters |
| Agent coordinator | Opus | Plans work for cheaper subagents |
| Agent worker / subagent | Haiku | Executes tasks from coordinator cheaply |
| Research / analysis | Opus | GPQA: 83.3% vs Sonnet's 78.2% |
| Customer-facing chatbot | Sonnet | Balance of quality and response time |
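The routing table above can be sketched as a simple lookup. The task-type keys and model labels below are hypothetical placeholders (not official Anthropic model IDs), defaulting to the balanced tier for anything unrecognized:

```python
# Minimal task-type router following the use-case table above.
# Model strings are shorthand labels, not official API identifiers.
ROUTES = {
    "autocomplete":  "haiku-4.5",
    "code_review":   "haiku-4.5",
    "documentation": "haiku-4.5",
    "subagent":      "haiku-4.5",
    "feature":       "sonnet-4.6",
    "bugfix":        "sonnet-4.6",
    "tests":         "sonnet-4.6",
    "chatbot":       "sonnet-4.6",
    "refactor":      "opus-4.6",
    "architecture":  "opus-4.6",
    "research":      "opus-4.6",
    "coordinator":   "opus-4.6",
}

def pick_model(task_type: str) -> str:
    # Unknown task types fall back to Sonnet, the balanced default.
    return ROUTES.get(task_type, "sonnet-4.6")

print(pick_model("autocomplete"))   # haiku-4.5
print(pick_model("architecture"))   # opus-4.6
```

Production routers typically classify the task first (often with Haiku itself), but a static map like this captures the decision logic.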

Route between Claude models automatically

Morph selects the optimal Claude model per task. Opus reasoning where it matters, Haiku speed where it doesn't.

Claude vs Competitors

| Model | SWE-bench Verified | Price (Input/Output per MTok) | Speed (tok/s) |
|---|---|---|---|
| Claude Opus 4.6 | 80.8% | $5/$25 | 45.3 |
| GPT-5.2 | 80.0% | $2/$10 | ~65 |
| GPT-5.3-Codex | ~78% (Pro: 56.8%) | $2/$10 | ~65 |
| Gemini 3 Pro | 76.2% | $1.25/$5 | ~80 |
| Claude Sonnet 4.6 | 79.6% | $3/$15 | 52.8 |
| Claude Haiku 4.5 | 73.3% | $1/$5 | 95-150 |
| DeepSeek V4 | ~74% | $0.27/$1.10 | ~60 |

Claude Opus 4.6 leads SWE-bench Verified. GPT-5.2 is the closest competitor at 80.0% and is significantly cheaper per token. Gemini 3 Pro offers the best raw price-to-performance ratio for coding. Claude's advantage is in tool use reliability, long-context behavior, and agentic coding workflows where the model needs to operate autonomously across multiple steps.

FAQ

How many Claude AI models are there?

Anthropic has released models across four Claude generations: Claude 3 (March 2024), Claude 3.5 (June 2024), Claude 4/4.5 (2025), and Claude 4.6 (February 2026). Each generation includes three tiers: Haiku (fast/cheap), Sonnet (balanced), and Opus (most capable). The current recommended models are Haiku 4.5, Sonnet 4.6, and Opus 4.6.

What is the best Claude model in 2026?

Claude Opus 4.6 is the most capable model, scoring 80.8% on SWE-bench Verified and 83.3% on GPQA Diamond. For the best value, Sonnet 4.6 delivers 97-99% of Opus's quality at 40% lower cost ($3/$15 vs $5/$25 per million tokens). For high-volume tasks, Haiku 4.5 is 5x cheaper than Opus with strong performance.

What is the difference between Claude generations (3, 3.5, 4, 4.5, 4.6)?

Each generation brought significant improvements. Claude 3 Opus scored ~49% on SWE-bench. Claude 3.5 Sonnet improved to ~50%. Claude 4 Sonnet reached ~72%. Claude 4.5 Sonnet reached 77.2%, and Claude Sonnet 4.6 now scores 79.6%. The trajectory shows Sonnet-tier models in each new generation matching or exceeding the Opus tier from the previous generation.

How much does each Claude model cost?

Current pricing per million tokens (input/output): Haiku 4.5 costs $1/$5, Sonnet 4.6 costs $3/$15, Opus 4.6 costs $5/$25. All models support prompt caching (90% input cost reduction) and batch API (50% discount). Extended context beyond 200K tokens uses premium pricing.

Should I use the latest Claude model or an older one?

Use the latest models. Sonnet 4.6 is strictly better than Sonnet 4.5 on every benchmark at the same price. Opus 4.6 matches or exceeds Opus 4.5. Haiku 4.5 matches the previous Sonnet 4 at one-third the cost. There is no performance reason to use older models, and Anthropic may deprecate them over time.

What is the context window for Claude models?

All current Claude models support 200K tokens by default. Opus 4.6 and Sonnet 4.6 support an extended 1M token context window in beta. Claude 3 models had 200K context. The 1M context uses premium pricing for tokens beyond 200K.

How does Claude compare to GPT and Gemini?

On SWE-bench Verified (coding): Opus 4.6 (80.8%) leads GPT-5.2 (80.0%) and Gemini 3 Pro (76.2%). On GPQA Diamond (reasoning): Opus 4.6 (83.3%) competes with GPT-5.2 (~85%). Claude's advantages are tool use reliability, long-context performance, and coding agent workflows. GPT's advantages are speed and multimodal breadth.

Can I use multiple Claude models together?

Yes. Multi-model routing is the recommended approach for production systems. Route simple tasks to Haiku ($1/$5), standard tasks to Sonnet ($3/$15), and complex tasks to Opus ($5/$25). Morph's API does this automatically. Teams using this pattern save 40-60% compared to all-Opus workloads.
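To sanity-check the savings figure, here is a sketch under an assumed traffic mix (60% Haiku, 30% Sonnet, 10% Opus) and assumed average request sizes; both are illustrative, not measured numbers:

```python
# Blended cost of a routed workload vs. sending everything to Opus.
# The traffic mix and token counts below are assumptions for illustration.
PRICES = {"haiku": (1, 5), "sonnet": (3, 15), "opus": (5, 25)}  # $/MTok in, out
MIX = {"haiku": 0.6, "sonnet": 0.3, "opus": 0.1}  # share of requests per tier

def cost_per_request(tier, in_tok=2_000, out_tok=500):
    """USD cost of one average request on the given tier."""
    pin, pout = PRICES[tier]
    return (in_tok * pin + out_tok * pout) / 1_000_000

routed = sum(share * cost_per_request(t) for t, share in MIX.items())
all_opus = cost_per_request("opus")
print(f"savings: {1 - routed / all_opus:.0%}")  # → savings: 60%
```

With this mix the routed workload costs $0.009 per request against $0.0225 all-Opus, a 60% saving; heavier Sonnet/Opus mixes land nearer the 40% end of the quoted range.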
