Summary
Anthropic sells three Claude tiers. Haiku 4.5 is the cheapest at $1/$5 per million tokens, runs at 95-150 tok/s, and handles most automated tasks. Sonnet 4.6 is the sweet spot at $3/$15, scoring within 1.2 points of Opus on SWE-bench Verified. Opus 4.6 is the top tier at $5/$25, leading every benchmark. For most users: start with Sonnet, upgrade to Opus for hard problems, downgrade to Haiku for volume.
Current Models (March 2026)
These are the three models you should use. Previous generations remain available but are outperformed on every metric by these current versions.
Haiku 4.5
Released October 2025
- 73.3% SWE-bench Verified
- 41.0% Terminal-Bench 2.0
- 95-150 tok/s output
- $1/$5 per MTok
- 200K context window
- Best for: volume, speed, subagents
Sonnet 4.6
Released February 2026
- 79.6% SWE-bench Verified
- 59.1% Terminal-Bench 2.0
- 52.8 tok/s output
- $3/$15 per MTok
- 200K (1M beta) context
- Best for: production coding, most tasks
Opus 4.6
Released February 2026
- 80.8% SWE-bench Verified
- 65.4% Terminal-Bench 2.0
- 45.3 tok/s output
- $5/$25 per MTok
- 200K (1M beta) context
- Best for: complex reasoning, architecture
Full Model History
| Model | Released | SWE-bench Verified | Status |
|---|---|---|---|
| Claude 3 Haiku | March 2024 | ~40% | Available, superseded |
| Claude 3 Sonnet | March 2024 | ~45% | Available, superseded |
| Claude 3 Opus | March 2024 | ~49% | Available, superseded |
| Claude 3.5 Sonnet | June 2024 | ~50% | Available, superseded |
| Claude 3.5 Haiku | October 2024 | ~45% | Available, superseded |
| Claude 4 Sonnet | Mid 2025 | ~72% | Available, superseded |
| Claude 4.5 Haiku | October 2025 | 73.3% | Current |
| Claude 4.5 Sonnet | Late 2025 | 77.2% | Available, superseded |
| Claude 4.6 Sonnet | February 2026 | 79.6% | Current |
| Claude 4.6 Opus | February 2026 | 80.8% | Current |
The trajectory is clear: each generation's Sonnet matches or exceeds the previous generation's top tier. Haiku 4.5 (73.3%) outperforms Claude 4 Sonnet (~72%), and Sonnet 4.6 (79.6%) exceeds Claude 4.5 Sonnet (77.2%). This pattern makes upgrading to current models the obvious choice: you get better performance at the same or a lower price point.
Benchmark Comparison
| Benchmark | Haiku 4.5 | Sonnet 4.6 | Opus 4.6 |
|---|---|---|---|
| SWE-bench Verified | 73.3% | 79.6% | 80.8% |
| Terminal-Bench 2.0 | 41.0% | 59.1% | 65.4% |
| HumanEval | 92.0% | 96.8% | 97.6% |
| GPQA Diamond | ~62% | 78.2% | 83.3% |
| OSWorld-Verified | ~60% | 72.5% | 72.7% |
| MATH | ~80% | ~88% | ~91% |
| MMLU Pro | ~65% | ~78% | ~82% |
On coding-specific benchmarks (SWE-bench, HumanEval), the three models are closer together than on reasoning benchmarks (GPQA, MATH). This makes sense: code generation is a well-constrained task where smaller models can match larger ones. Open-ended reasoning favors models with larger compute budgets.
Pricing Comparison
| | Haiku 4.5 | Sonnet 4.6 | Opus 4.6 |
|---|---|---|---|
| Input (per MTok) | $1.00 | $3.00 | $5.00 |
| Output (per MTok) | $5.00 | $15.00 | $25.00 |
| Cache read (per MTok) | $0.10 | $0.30 | $0.50 |
| Batch input | $0.50 | $1.50 | $2.50 |
| Batch output | $2.50 | $7.50 | $12.50 |
| Extended context input | N/A | $6.00 | $10.00 |
| Extended context output | N/A | $22.50 | $37.50 |
The price ratios are consistent across input and output: Haiku is 5x cheaper than Opus, and Sonnet is ~1.7x cheaper than Opus. With prompt caching enabled, cached input drops to $0.10/$0.30/$0.50 per MTok respectively, a 90% reduction. Caching is the single most impactful cost optimization regardless of which model you choose.
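The caching math is worth making concrete. A minimal sketch, using the per-MTok rates from the pricing table above; the model keys and request sizes here are illustrative, not API identifiers:

```python
# Cost sketch for prompt caching. Rates come from the pricing table;
# the dict keys and the 50K/2K/1K request shape are made-up examples.
RATES = {  # (input, output, cache_read) in $ per million tokens
    "haiku-4.5":  (1.00, 5.00, 0.10),
    "sonnet-4.6": (3.00, 15.00, 0.30),
    "opus-4.6":   (5.00, 25.00, 0.50),
}

def request_cost(model, new_input_tok, output_tok, cached_tok=0):
    """Dollar cost of one request, splitting cached vs fresh input tokens."""
    inp, out, cache = RATES[model]
    return (cached_tok * cache + new_input_tok * inp + output_tok * out) / 1_000_000

# A 50K-token system prompt reused across calls, plus 2K fresh input, 1K output:
uncached = request_cost("sonnet-4.6", 52_000, 1_000)
cached = request_cost("sonnet-4.6", 2_000, 1_000, cached_tok=50_000)
print(f"uncached ${uncached:.3f} vs cached ${cached:.3f}")
# → uncached $0.171 vs cached $0.036
```

For a large reused prompt, the cache read rate dominates, which is why caching outweighs model choice as a cost lever.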
Speed and Latency
| | Haiku 4.5 | Sonnet 4.6 | Opus 4.6 |
|---|---|---|---|
| Output (tok/s) | 95-150 | 52.8 | 45.3 |
| Time to first token | ~1s | ~5s | ~12s |
| Max effort TTFT | N/A | 102.4s | 12.3s |
| 1K token generation | ~7s | ~19s | ~22s |
Haiku is roughly 2-3x faster than both Sonnet and Opus on raw output speed, with a much lower time to first token. For latency-sensitive applications (code completion, real-time suggestions, chatbots), Haiku is the only practical choice. For batch processing, where total throughput matters more than per-request latency, Sonnet offers the best throughput-to-quality ratio.
Which Model for What
| Use Case | Model | Reasoning |
|---|---|---|
| Code completion / autocomplete | Haiku | Speed critical, 1s TTFT |
| Feature implementation | Sonnet | Best quality-to-cost ratio |
| Bug fixing (single file) | Sonnet | Near-Opus accuracy, faster |
| Multi-file refactoring | Opus | Maintains cross-file consistency |
| Architecture decisions | Opus | Deeper reasoning, more paths explored |
| Test generation | Sonnet | 96% pass rate, 40% cheaper than Opus |
| Code review (at scale) | Haiku | High volume, low cost per review |
| Documentation | Haiku | Structured output, speed matters |
| Agent coordinator | Opus | Plans work for cheaper subagents |
| Agent worker / subagent | Haiku | Executes tasks from coordinator cheaply |
| Research / analysis | Opus | GPQA: 83.3% vs Sonnet's 78.2% |
| Customer-facing chatbot | Sonnet | Balance of quality and response time |
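The table above amounts to a lookup: classify the task, pick the tier. A minimal routing sketch; the task labels and tier names are illustrative stand-ins (not API model IDs), and the assignments mirror the table, with unknown tasks defaulting to Sonnet per the "start with Sonnet" guidance:

```python
# Task-to-tier routing, mirroring the use-case table above.
# Task labels and tier names are illustrative, not official identifiers.
TIER_FOR_TASK = {
    "autocomplete": "haiku",
    "code_review": "haiku",
    "documentation": "haiku",
    "subagent": "haiku",
    "feature": "sonnet",
    "bugfix": "sonnet",
    "tests": "sonnet",
    "chatbot": "sonnet",
    "refactor": "opus",
    "architecture": "opus",
    "coordinator": "opus",
    "research": "opus",
}

def pick_tier(task: str) -> str:
    # Unrecognized tasks fall back to Sonnet, the "start here" tier.
    return TIER_FOR_TASK.get(task, "sonnet")
```

In production this classification step would itself be a cheap model call or a heuristic, but the shape of the decision is the same.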
Route between Claude models automatically
Morph selects the optimal Claude model per task. Opus reasoning where it matters, Haiku speed where it doesn't.
Claude vs Competitors
| Model | SWE-bench Verified | Price (Input/Output per MTok) | Speed (tok/s) |
|---|---|---|---|
| Claude Opus 4.6 | 80.8% | $5/$25 | 45.3 |
| GPT-5.2 | 80.0% | $2/$10 | ~65 |
| GPT-5.3-Codex | ~78% (Pro: 56.8%) | $2/$10 | ~65 |
| Gemini 3 Pro | 76.2% | $1.25/$5 | ~80 |
| Claude Sonnet 4.6 | 79.6% | $3/$15 | 52.8 |
| Claude Haiku 4.5 | 73.3% | $1/$5 | 95-150 |
| DeepSeek V4 | ~74% | $0.27/$1.10 | ~60 |
Claude Opus 4.6 leads SWE-bench Verified. GPT-5.2 is the closest competitor at 80.0% and is significantly cheaper per token. Gemini 3 Pro offers the best raw price-to-performance ratio for coding. Claude's advantage is in tool use reliability, long-context behavior, and agentic coding workflows where the model needs to operate autonomously across multiple steps.
FAQ
How many Claude AI models are there?
Anthropic has released models across four Claude generations: Claude 3 (March 2024), Claude 3.5 (June 2024), Claude 4/4.5 (2025), and Claude 4.6 (February 2026). Each generation includes three tiers: Haiku (fast/cheap), Sonnet (balanced), and Opus (most capable). The current recommended models are Haiku 4.5, Sonnet 4.6, and Opus 4.6.
What is the best Claude model in 2026?
Claude Opus 4.6 is the most capable model, scoring 80.8% on SWE-bench Verified and 83.3% on GPQA Diamond. For the best value, Sonnet 4.6 delivers 97-99% of Opus's quality at 40% lower cost ($3/$15 vs $5/$25 per million tokens). For high-volume tasks, Haiku 4.5 is 5x cheaper than Opus with strong performance.
What is the difference between Claude generations (3, 3.5, 4, 4.5, 4.6)?
Each generation brought significant improvements. Claude 3 Opus scored ~49% on SWE-bench. Claude 3.5 Sonnet improved to ~50%. Claude 4 Sonnet reached ~72%. Claude 4.5 Sonnet reached 77.2%, and Claude Sonnet 4.6 now scores 79.6%. The trajectory shows Sonnet-tier models in each new generation matching or exceeding the Opus tier from the previous generation.
How much does each Claude model cost?
Current pricing per million tokens (input/output): Haiku 4.5 costs $1/$5, Sonnet 4.6 costs $3/$15, Opus 4.6 costs $5/$25. All models support prompt caching (90% input cost reduction) and batch API (50% discount). Extended context beyond 200K tokens uses premium pricing.
Should I use the latest Claude model or an older one?
Use the latest models. Sonnet 4.6 is strictly better than Sonnet 4.5 on every benchmark at the same price. Opus 4.6 matches or exceeds Opus 4.5. Haiku 4.5 matches the previous Sonnet 4 at one-third the cost. There is no performance reason to use older models, and Anthropic may deprecate them over time.
What is the context window for Claude models?
All current Claude models support 200K tokens by default. Opus 4.6 and Sonnet 4.6 support an extended 1M token context window in beta. Claude 3 models had 200K context. The 1M context uses premium pricing for tokens beyond 200K.
How does Claude compare to GPT and Gemini?
On SWE-bench Verified (coding): Opus 4.6 (80.8%) leads GPT-5.2 (80.0%) and Gemini 3 Pro (76.2%). On GPQA Diamond (reasoning): Opus 4.6 (83.3%) competes with GPT-5.2 (~85%). Claude's advantages are tool use reliability, long-context performance, and coding agent workflows. GPT's advantages are speed and multimodal breadth.
Can I use multiple Claude models together?
Yes. Multi-model routing is the recommended approach for production systems. Route simple tasks to Haiku ($1/$5), standard tasks to Sonnet ($3/$15), and complex tasks to Opus ($5/$25). Morph's API does this automatically. Teams using this pattern save 40-60% compared to all-Opus workloads.
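The savings range can be sanity-checked with a blended-rate calculation. A sketch under an assumed 60/30/10 Haiku/Sonnet/Opus task mix (the mix is hypothetical; prices are from the pricing table):

```python
# Blended output cost of routed traffic vs an all-Opus baseline.
# The 60/30/10 mix is an assumption for illustration; prices are per MTok.
PRICES = {"haiku": (1.0, 5.0), "sonnet": (3.0, 15.0), "opus": (5.0, 25.0)}
MIX = {"haiku": 0.6, "sonnet": 0.3, "opus": 0.1}

def blended_price(idx):
    """Weighted-average price; idx 0 = input, 1 = output."""
    return sum(share * PRICES[m][idx] for m, share in MIX.items())

routed_out = blended_price(1)            # $10/MTok vs $25/MTok all-Opus
savings = 1 - routed_out / PRICES["opus"][1]
print(f"output savings: {savings:.0%}")  # → output savings: 60%
```

A mix weighted more heavily toward Sonnet lands nearer the low end of the quoted 40-60% range, so the claim is consistent with plausible traffic shapes.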