You want one answer: which AI coding agent is best. On the public Terminal-Bench 2.1 leaderboard, Codex CLI with GPT-5.5 is #1 at 83.4%, Claude Code with Opus 4.8 is #2 at 78.9%, and Gemini CLI with Gemini 3.1 Pro is at 70.7%. For openness, OpenCode (172,198 stars, MIT) is the most-starred open source agent. The best agent depends on whether you optimize for benchmark ceiling, cost, or running your own model. Below: 11 agents ranked on all three, with exact prices, install commands, and verified scores. Updated June 9, 2026.
Best AI coding agent by goal
Verified June 9, 2026. Scores from Terminal-Bench 2.1 (tbench.ai), prices from vendor pages.
Highest benchmark
Codex CLI + GPT-5.5
83.4% Terminal-Bench 2.1, #1
Deepest reasoning
Claude Code + Opus 4.8
78.9% Terminal-Bench, 69.2% SWE-bench Pro
Most open source
OpenCode (MIT)
172,198 stars, 75-plus providers
Free, model-agnostic: OpenCode, Cline, Aider, Kilo Code, Gemini CLI (60 req/min, 1,000/day free). Best IDE flow: Cursor (Pro $20/mo). Cheapest paid default: GitHub Copilot Pro ($10/mo).
Terminal-Bench 2.1 leaderboard (the agent benchmark that matters)
Terminal-Bench measures an agent driving a real terminal to complete development tasks: editing files, running commands, fixing failures. It tests the agent and model together, which is the right unit, because the same model scores differently inside different agents. Scores below are from the public tbench.ai leaderboard as of June 9, 2026.
Terminal-Bench 2.1 (agent + model)
Percentage of terminal development tasks completed. Higher is better.
Codex CLI with GPT-5.5 leads at 83.4%. Claude Code with Opus 4.8 is second at 78.9%, ahead of Opus 4.7 at 69.7%. Source: tbench.ai, June 9, 2026.
Two leaderboards disagree on the top model, and that is fine because they test different things. Terminal-Bench rewards driving a terminal end to end. SWE-bench Pro rewards fixing real GitHub issues. On SWE-bench Pro, Claude Opus 4.8 scores 69.2% (up 4.9 points from Opus 4.7's 64.3%), outperforming GPT-5.5 and Gemini 3.1 Pro. On the self-reported SWE-bench Verified leaderboard at llm-stats.com, Claude Opus 4.8 sits at 88.6% and Claude Opus 4.7 at 87.6%. Read benchmarks as the agent-plus-model pair, not the model alone.
Pricing, side by side
Open source agents are free as tools; you pay for model tokens. Subscription agents bundle model access into a plan with usage windows or credits. Prices verified from vendor pages on June 9, 2026.
| Agent | License / source | Entry price | How you pay for models |
|---|---|---|---|
| Claude Code | Proprietary (repo for issues) | Pro $17/mo annual or $20/mo; Max from $100/mo | Bundled. 5-hour rolling window plus weekly cap shared across claude.ai and Claude Code |
| OpenAI Codex CLI | Apache-2.0, 89,991 stars | ChatGPT Plus $20/mo; Pro from $100/mo (5x and 20x) | Bundled per 5-hour window, or BYO OpenAI API key at per-token rates |
| Cursor | Proprietary IDE | Hobby $0; Pro $20/mo | Pro includes ~$20 of API-rate usage; Pro+ $60, Ultra $200; separate Auto + Composer pool |
| GitHub Copilot | Proprietary | Free $0; Pro $10/mo | Credit-based since June 1, 2026; Pro = 1,500 credits ($15 value); 1 credit = $0.01 |
| OpenCode | MIT, 172,198 stars | Free | BYOK across 75-plus providers; ChatGPT Plus / Copilot / GitLab Duo usable as backends |
| Cline | Apache-2.0, 62,996 stars | Free | BYOK any provider, or local via Ollama / LM Studio; no markup |
| Aider | Apache-2.0, 45,945 stars | Free | BYOK per run, e.g. anthropic / deepseek / openai-compatible |
| Kilo Code | MIT, 19,968 stars | Free | Kilo Gateway $0/mo at exact provider rates, no markup; Kilo Pass $19/$49/$199/mo |
| Gemini CLI | Apache-2.0, 105,104 stars | Free | 60 req/min, 1,000/day free with personal Google account; or API key |
| Goose | Apache-2.0, 48,542 stars (AAIF) | Free | 15-plus providers; reuse Claude/ChatGPT/Gemini subs via ACP |
| Kiro | Proprietary (AWS) | Free 50 credits/mo; Pro $20/mo | Credit-based: Pro 1,000, Pro+ $40 = 2,000, Power $200 = 10,000; overage $0.04/credit |
GitHub Copilot moved to credit billing on June 1, 2026
Copilot replaced premium request units with GitHub AI Credits (1 credit = $0.01). Pro $10/mo = 1,500 credits ($15 value), Pro+ $39/mo = 7,000 credits ($70), the new Max $100/mo = 20,000 credits ($200). Basic code completions never consume credits and stay unlimited on paid plans. Note: as of June 2026, new sign-ups for Copilot Student, Pro, Pro+, and Max are paused while the billing change rolls out.
Claude Code
Best for reasoning depth on hard problems, in the terminal.
Anthropic's terminal-native agent. With Opus 4.8 it scores 78.9% on Terminal-Bench 2.1 (#2 overall) and Opus 4.8 leads SWE-bench Pro at 69.2%. The repo anthropics/claude-code has 131,380 stars but is proprietary (the repo is for issues and docs, no open-source license).
Install
Install Claude Code
# Native install (recommended)
curl -fsSL https://claude.ai/install.sh | bash # macOS / Linux / WSL
# Windows PowerShell:
# irm https://claude.ai/install.ps1 | iex
# Alternatives
brew install --cask claude-code
winget install Anthropic.ClaudeCode
npm install -g @anthropic-ai/claude-code # Node 18+
# Add an MCP server
claude mcp add --transport http notion https://mcp.notion.com/mcpPricing and limits
Claude Pro is $17/mo billed annually ($200 up front) or $20/mo monthly and includes Claude Code; Max starts from $100/mo. Usage runs on a 5-hour rolling session window plus a weekly cap that covers all models over 7 days, and it is shared across claude.ai, Claude Desktop, and Claude Code on the same subscription. The free Claude.ai plan does not include Claude Code. It also runs via Amazon Bedrock, Google Vertex AI, and Microsoft Foundry. System requirements: macOS 13+, Windows 10 1809+, Ubuntu 20.04+/Debian 10+/Alpine 3.19+, 4GB+ RAM.
Long sessions stay coherent with built-in auto-compaction. Compare directly at Claude Code vs Codex and Claude Code vs Cursor.
OpenAI Codex CLI
Best benchmark ceiling. #1 on Terminal-Bench 2.1.
OpenAI's open-source agent (openai/codex, 89,991 stars, Apache-2.0). With GPT-5.5 it tops Terminal-Bench 2.1 at 83.4%. Surfaces include the CLI, an IDE extension for VS Code / Cursor / Windsurf, the Codex Web cloud agent at chatgpt.com/codex, a desktop app, and iOS, with automatic code review and Slack integration in the cloud.
Install
Install Codex CLI
curl -fsSL https://chatgpt.com/codex/install.sh | sh # macOS / Linux
npm install -g @openai/codex
brew install --cask codex
codex # run, then "Sign in with ChatGPT"
# /model # switch model (GPT-5.4, GPT-5.3-Codex, others)Pricing and limits
Codex requires a ChatGPT Plus, Pro, Business, Edu, or Enterprise account to sign in with ChatGPT. Per 5-hour window: Plus ($20/mo) allows 15 to 80 local messages, 5 cloud tasks, and 5 code reviews; Pro 5x allows 80 to 400; Pro 20x allows 300 to 1,600 (Pro from $100/mo). GPT-5.5 usage averages 5 to 45 credits per message. You can also auth with an OpenAI API key and pay per-token rates, with no cloud features in API-key mode.
Cursor
Best IDE flow, with a separate lower-cost agent pool.
A VS Code fork built around an agent loop. Individual plans: Pro $20/mo includes about $20 of API-rate usage; Pro+ $60/mo includes $70; Ultra $200/mo includes $400. Cursor's in-house Composer line (Composer 2.5) draws from a separate, more generous Auto + Composer pool designed for everyday agentic coding at lower cost than frontier API models. The Hobby tier is free with limited Agent requests and Tab completions, no card required. Paid plans add frontier models, MCPs, cloud agents, and Bugbot reviews on usage-based billing.
GitHub Copilot
Cheapest paid default. Works in every major IDE plus a CLI.
Free $0 with limited chat and agent usage plus 2,000 code completions/mo. Pro $10/mo = 1,500 credits ($15 value), Pro+ $39/mo = 7,000 credits ($70), and the new Max $100/mo = 20,000 credits ($200). Credits consume on token usage at published per-model rates: Claude Opus 4.5 through 4.8 bill at $5 in / $25 out per 1M tokens, Sonnet 4 through 4.6 at $3/$15, GPT-5.5 at $5/$30, Gemini 3.1 Pro at $2/$12. Basic code completions and next-edit suggestions never consume credits and stay unlimited on paid plans.
Install the CLI
Install GitHub Copilot CLI
npm install -g @github/copilot # Node 22+
brew install copilot-cli
winget install GitHub.Copilot
# supports MCP servers and a /model switchCompare at Copilot vs Claude Code and Cline vs Copilot.
OpenCode
The most-starred open source coding agent. 75-plus providers.
anomalyco/opencode (moved from sst/opencode) has 172,198 stars under MIT, ahead of Gemini CLI (105k) and OpenAI Codex (90k). Terminal-native, it supports 75-plus LLM providers via the AI SDK and the Models.dev catalog, plus local models through Ollama, LM Studio, and llama.cpp. OpenCode Zen is the team's curated, tested model list for agentic coding.
Install and add a custom provider
Install OpenCode
curl -fsSL https://opencode.ai/install | bash
npm install -g opencode-ai
brew install anomalyco/tap/opencodeCustom OpenAI-compatible provider (JSON config)
{
"provider": {
"myprovider": {
"npm": "@ai-sdk/openai-compatible",
"options": { "baseURL": "https://api.myprovider.com/v1" },
"models": { "my-model": {} }
}
}
}Subscription backends in OpenCode
Per OpenCode's docs, ChatGPT Plus, GitHub Copilot, and GitLab Duo subscriptions are usable as model backends, while Anthropic explicitly prohibits using Claude Pro or Max subscriptions with third-party tools like OpenCode.
Cline
In-IDE open source agent with Plan and Act approval modes.
cline/cline: 62,996 stars, Apache-2.0, free, every model your choice (Claude, GPT, Gemini, any OpenAI-compatible endpoint, BYOK, or local via Ollama / LM Studio). It runs in VS Code, JetBrains (Early Access), Cursor, and Windsurf, plus a CLI installed with npm i -g cline. Local-inference RAM guidance: 16 to 32GB for small or quantized models, 32 to 64GB for mid-size coding models, 64GB+ for larger models, with the Use Compact Prompt setting recommended for local runs.
See Cline alternatives, Cline vs Cursor, and the head-to-head below.
Aider
Git-native terminal pair programming. Auto-commit per edit.
Aider-AI/aider: 45,945 stars, Apache-2.0. The terminal pair-programming pioneer that thinks in git, every edit a commit. Its last repo push was May 22, 2026, a visibly slower cadence than OpenCode or Cline, which push daily, and its model guidance still recommends 2025-era models (Gemini 2.5 Pro, DeepSeek R1/V3, Claude 3.7 Sonnet, o3/o4-mini, GPT-4.1) rather than current frontier models.
Install and run Aider
python -m pip install aider-install && aider-install
# or one-liner:
curl -LsSf https://aider.chat/install.sh | sh
aider --model sonnet --api-key anthropic=<key>
aider --model deepseek --api-key deepseek=<key>Related: Aider vs Cline, OpenCode vs Aider, Morph vs Aider diff.
Gemini CLI and Google Antigravity
The free terminal agent (1,000 requests/day) and Google's IDE-plus-CLI harness.
google-gemini/gemini-cli: 105,104 stars, Apache-2.0. The free tier allows 60 requests per minute and 1,000 per day with a personal Google account (OAuth login serves a managed Gemini 3 mix of flash and pro; an API key lets you pin a specific model). With Gemini 3.1 Pro it scores 70.7% on Terminal-Bench 2.1.
Install Gemini CLI
npx @google/gemini-cli
npm install -g @google/gemini-cli
brew install gemini-cli
# MCP servers configured in ~/.gemini/settings.jsonGoogle Antigravity 2.0 (announced at I/O 2026, May 19) split into a unified harness with two surfaces: a redesigned desktop app and a new standalone CLI, adding specialized subagents for parallel tasks, terminal sandboxing, credential masking, and hardened Git policies. Gemini 3.5 Flash is the new default model (Terminal-Bench 2.1 = 76.2%, described as 4x faster output than other frontier models). Google AI Pro is $19.99/mo with higher Antigravity rate limits; Google AI Ultra starts at $99.99/mo. In early June 2026 Google reset all quota counters to zero and shipped a refreshed Flash build to fix post-launch issues.
Compare at Gemini CLI vs Claude Code, Gemini CLI vs Codex, and Antigravity vs Claude Code.
Goose, Kilo Code, and Kiro
Goose
aaif-goose/goose: 48,542 stars, Apache-2.0, built in Rust, now governed by the Agentic AI Foundation at the Linux Foundation. Desktop app plus CLI plus API, 15-plus providers, and 70-plus MCP extensions. It can reuse existing Claude, ChatGPT, or Gemini subscriptions via ACP and positions itself as general-purpose: not just code, also research, writing, automation, and data analysis. Install: curl -fsSL https://github.com/aaif-goose/goose/releases/download/stable/download_cli.sh | bash. See Goose vs Claude Code.
Kilo Code
Kilo-Org/kilocode: 19,968 stars, MIT (the domain kilocode.ai now redirects to kilo.ai). The extension is free and open source. Kilo Gateway is $0/mo plus usage at exact provider rates with no markup; Kilo Pass subscriptions run $19/$49/$199/mo with up to 50% bonus credits; Teams is $15/user/mo. BYOK works for Anthropic, OpenAI, Google, Azure, and Bedrock keys with no Kilo plan required. See Kilo Code vs Claude Code.
Kiro
AWS's IDE agent. Free $0 = 50 credits/mo with open-weight models and Claude Sonnet 4.5; Pro $20/mo = 1,000 credits; Pro+ $40/mo = 2,000; Power $200/mo = 10,000; overage $0.04/credit billed month-end, with no rollover. Team plans add centralized billing, usage analytics, and SSO via AWS IAM Identity Center. New users get $20 credited toward a first upgrade. See Kiro vs Claude Code.
Cline vs OpenCode
| Cline | OpenCode | |
|---|---|---|
| GitHub stars / license | 62,996 / Apache-2.0 | 172,198 / MIT |
| Surface | VS Code, JetBrains, Cursor, Windsurf extension + CLI | Terminal-native + CLI |
| Model providers | Any provider, BYOK, local Ollama / LM Studio | 75-plus via AI SDK; local Ollama / LM Studio / llama.cpp |
| Control model | Plan and Act modes, approval before each change | Plan-first, curated OpenCode Zen model list |
| Pick it if | You want the agent inside your IDE with step approval | You want a CLI agent and the widest provider list |
Both are free and BYOK. OpenCode wins on community size and provider breadth; Cline wins if you want the agent embedded in VS Code or JetBrains with explicit per-change approval. Full breakdown: OpenCode vs Cline.
Kilo Code vs OpenCode
Both are MIT-licensed and free. Kilo Code (19,968 stars) is a VS Code and JetBrains extension with BYOK and a $0/mo Kilo Gateway at exact provider rates with no markup, plus optional Kilo Pass subscriptions ($19/$49/$199 per month) and Teams at $15/user/mo. OpenCode (172,198 stars) is terminal-native with 75-plus providers and a far larger community. Choose Kilo Code for in-IDE BYOK with no markup and structured workflow; choose OpenCode for a CLI workflow and the largest provider list. See OpenCode vs Kilo Code.
Aider vs OpenCode
OpenCode (172,198 stars, MIT) pushes code daily and supports 75-plus providers. Aider (45,945 stars, Apache-2.0) is the git-native pioneer, but its last repo push was May 22, 2026 and its model guidance has not been refreshed for 2026 frontier models. Use OpenCode for active development and the widest model choice; use Aider if you specifically want its auto-commit-per-edit git workflow in the terminal. Full comparison: OpenCode vs Aider.
The model backend matters as much as the agent
Most of these agents are BYOK: OpenCode, Cline, Aider, Kilo Code, Goose, and Gemini CLI all let you point at any OpenAI-compatible endpoint. The model and the inference provider behind it set both your cost and your output quality, independent of the agent.
If you run DeepSeek or other open-weight models, where you serve them matters. Morph Open Source Models serve DeepSeek with 16-bit (bf16) activations and no fp8 or int8 quantization. Most serverless providers quantize activations to fp8 to cut cost, which degrades output; keeping full 16-bit activations means responses match the reference weights. That makes Morph the best place to run DeepSeek when output fidelity matters.
For coding agents specifically, Morph runs codegen-tuned speculative decoding (draft and ngram tuned on code) plus custom low-level inference kernels built for code generation, which makes it the fastest and highest-quality option for codegen rather than a general-purpose menu. Verified price: morph-dsv4flash (DeepSeek V4 Flash) is $0.139 per 1M input tokens and $0.278 per 1M output tokens. See pricing.
| Morph DeepSeek V4 Flash | Typical serverless fp8 host | |
|---|---|---|
| Activation precision | 16-bit (bf16), no quantization | fp8 activations (quality loss) |
| Input price / 1M tokens | $0.139 | varies |
| Output price / 1M tokens | $0.278 | varies |
| Codegen tuning | Code-tuned spec decode + custom kernels | General-purpose |
Frequently Asked Questions
What is the best AI coding agent in 2026?
On the public Terminal-Bench 2.1 leaderboard, Codex CLI with GPT-5.5 is #1 at 83.4%, Claude Code with Opus 4.8 is #2 at 78.9%, and Gemini CLI with Gemini 3.1 Pro is at 70.7%. For open source, OpenCode (172,198 stars, MIT) is the most-starred agent. Pick Codex CLI for the benchmark ceiling, Claude Code for reasoning depth (69.2% SWE-bench Pro), and OpenCode or Cline for a free, model-agnostic agent.
Cline vs OpenCode: which is better?
OpenCode has 172,198 stars (MIT) and 75-plus providers; Cline has 62,996 stars (Apache-2.0) and runs as a VS Code, JetBrains, Cursor, and Windsurf extension plus a CLI with Plan and Act approval modes. Choose OpenCode for a terminal agent with the widest provider list; choose Cline for an in-IDE agent with step-by-step approval. See OpenCode vs Cline.
Kilo Code vs OpenCode: which is better?
Both are MIT and free. Kilo Code (19,968 stars) is an IDE extension with BYOK and a $0/mo gateway at exact provider rates, plus Kilo Pass at $19/$49/$199 per month. OpenCode (172,198 stars) is terminal-native with 75-plus providers and a larger community. Pick Kilo Code for in-IDE BYOK with no markup, OpenCode for a CLI workflow.
Aider vs OpenCode: which is better?
OpenCode (172,198 stars) ships daily and supports 75-plus providers. Aider (45,945 stars) is the git-native pioneer but its last repo push was May 22, 2026 and its model guidance is not refreshed for 2026 frontier models. Use OpenCode for active development and provider choice; use Aider for its auto-commit-per-edit git workflow.
How much does an AI coding agent cost?
Open source agents (OpenCode, Cline, Aider, Kilo Code, Gemini CLI) are free as tools; you pay only for model tokens. Gemini CLI allows 1,000 free requests/day. Copilot Pro is $10/mo, Cursor Pro and Claude Code Pro start at $20/mo (Claude $17/mo billed annually), Codex needs ChatGPT Plus at $20/mo. To cut token cost on open-weight models, run DeepSeek V4 Flash on Morph at $0.139/1M input and $0.278/1M output.
Can I use my Claude subscription with a third-party agent?
No. Per OpenCode's docs, Anthropic explicitly prohibits using Claude Pro or Max subscriptions with third-party tools like OpenCode. ChatGPT Plus, GitHub Copilot, and GitLab Duo subscriptions can be used as backends in those tools. Claude Code itself requires a Pro, Max, Team, Enterprise, or Console (API) account.
SWE-bench vs Terminal-Bench: what is the difference?
Terminal-Bench 2.1 tests an agent driving a terminal end to end (Codex CLI + GPT-5.5 leads at 83.4%). SWE-bench Pro and Verified test fixing real GitHub issues (Opus 4.8 leads SWE-bench Pro at 69.2%). Read scores as the agent-plus-model pair, because the same model scores differently in different agents. See context engineering for why scaffolding changes outcomes.
Run your coding agent on faster, full-precision models
Any BYOK agent can point at Morph. Serve DeepSeek and open-weight models at 16-bit precision with codegen-tuned inference, and pair it with WarpGrep semantic search at $0 for 100k requests.