Best AI Coding Agent (2026): Ranked by Terminal-Bench, Price, and Source

On the public Terminal-Bench 2.1 leaderboard, Codex CLI with GPT-5.5 is #1 at 83.4% and Claude Code with Opus 4.8 is the top usable Claude pairing at 78.9%. Claude Fable 5 scores higher than Opus 4.8 (83.1% inside Claude Code) but was suspended on June 12, 2026 under a U.S. export-control order, so it cannot be run. For open source, opencode (180,312 stars, MIT) is the most-starred agent, ahead of Claude Code (134,868), Gemini CLI (105,641), and OpenAI Codex (94,277). The full ranked table, with default models, prices, and scores verified June 28, 2026, is below.

Best AI coding agent by goal

Verified June 28, 2026. Terminal-Bench 2.1 scores from tbench.ai; prices and GitHub stars from vendor pages and the GitHub API.

Highest benchmark: Codex CLI + GPT-5.5 (83.4% Terminal-Bench 2.1, #1)
Deepest reasoning: Claude Code + Opus 4.8 (78.9% Terminal-Bench 2.1, 69.2% SWE-bench Pro)
Most-starred open source: opencode (MIT; 180,312 stars, any provider)

Free, model-agnostic: opencode, Cline, Aider, Kilo Code, Zed (free tier). Free Google option: Antigravity CLI (Gemini CLI's free serving ended June 18, 2026). Best IDE flow: Cursor (Pro $20/mo). Cheapest paid default: GitHub Copilot Pro ($10/mo).

Terminal-Bench 2.1 leaderboard (the agent benchmark that matters)

Terminal-Bench measures an agent driving a real terminal to complete development tasks: editing files, running commands, fixing failures. It tests the agent and model together, which is the right unit, because the same model scores differently inside different agents. Scores below are from the public tbench.ai leaderboard as of June 28, 2026, filtered to models you can actually run.

Terminal-Bench 2.1 (agent + model, usable pairings)

Percentage of terminal development tasks completed. Higher is better. Suspended models excluded.

Agent + model	Score	Leaderboard note
Codex CLI + GPT-5.5	83.4%	#1 overall
Claude Code + Opus 4.8	78.9%	#4 overall, top usable Claude
Terminus 2 + GPT-5.5	78.2%
Terminus 2 + Opus 4.8	74.6%
Gemini CLI + Gemini 3.1 Pro	70.7%
Claude Code + Opus 4.7	69.7%	#10

Codex CLI with GPT-5.5 leads at 83.4%. Claude Fable 5 ranks #2 (Claude Code, 83.1%) and #3 (Terminus 2, 80.4%) on the raw leaderboard but was suspended June 12, 2026 and is excluded here. Source: tbench.ai, June 28, 2026.
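The filtering described above can be sketched in a few lines of Python. The scores are the ones quoted in this section, so the list is a partial copy of the leaderboard, not the full table. The `Entry` type, the `SUSPENDED` set, and the `usable` helper are illustrative names, not anything tbench.ai publishes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    agent: str
    model: str
    score: float  # percent of Terminal-Bench 2.1 tasks completed


# Scores quoted in this article (tbench.ai, June 28, 2026); partial list.
RAW = [
    Entry("Codex CLI", "GPT-5.5", 83.4),
    Entry("Claude Code", "Claude Fable 5", 83.1),
    Entry("Terminus 2", "Claude Fable 5", 80.4),
    Entry("Claude Code", "Opus 4.8", 78.9),
    Entry("Terminus 2", "GPT-5.5", 78.2),
    Entry("Terminus 2", "Opus 4.8", 74.6),
    Entry("Gemini CLI", "Gemini 3.1 Pro", 70.7),
    Entry("Claude Code", "Opus 4.7", 69.7),
]

# Models that cannot currently be run, per this article.
SUSPENDED = {"Claude Fable 5"}


def usable(entries):
    """Drop pairings that use a suspended model, then sort by score."""
    kept = [e for e in entries if e.model not in SUSPENDED]
    return sorted(kept, key=lambda e: e.score, reverse=True)


ranked = usable(RAW)
for position, e in enumerate(ranked, start=1):
    print(f"{position}. {e.agent} + {e.model}: {e.score:.1f}%")
```

Keeping the agent and the model as separate fields makes the point of this section visible in the data: GPT-5.5 appears twice with different scores depending on the agent that drives it.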

AI coding agents compared (June 28, 2026)

Rank	Agent	Default model	Best public score	Entry price	License / stars
1	Codex CLI	GPT-5.5	83.4% TB 2.1 (#1)	Free $0; Plus $20/mo	Apache-2.0, 94,277
2	Claude Code	Opus 4.8	78.9% TB 2.1; 69.2% SWE-Pro	Pro $20/mo ($17 annual)	Proprietary, 134,868
3	Gemini CLI / Antigravity	Gemini 3.1 Pro	70.7% TB 2.1	Free (Antigravity CLI)	Apache-2.0, 105,641
4	Cursor	frontier + Composer	n/a (BYOK / IDE)	Hobby $0; Pro $20/mo	Proprietary IDE
5	GitHub Copilot	Haiku 4.5 / GPT-5 mini (free)	n/a (IDE + CLI)	Free $0; Pro $10/mo	Proprietary
6	Windsurf (Devin Desktop)	SWE 1.6 + OSS models	n/a (BYOK / IDE)	Free $0; Pro $20/mo	Proprietary (Cognition)
7	opencode (BYOK)	any provider; opencode Zen	n/a (BYOK)	Free	MIT, 180,312
8	Cline (BYOK)	any provider	n/a (BYOK)	Free	Apache-2.0, 63,998
9	Aider (BYOK)	any provider	88.0% (Aider polyglot)	Free	Apache-2.0, 46,808
10	Kilo Code (BYOK)	Auto Model (500+)	n/a (BYOK)	Free	MIT, 25,038
11	Zed	BYOK	n/a (editor)	Free $0; Pro $10/mo	OSS Rust, 86,147
12	Amp	GPT-5.5 modes + Oracle	n/a (PAYG)	PAYG, $5 min	Sourcegraph

Ranks 1 to 3 are scored on the public Terminal-Bench 2.1 leaderboard (tbench.ai) as fixed agent-plus-model pairs. Agents below rank 3 are model-agnostic (BYOK), IDE-bundled, or pay-as-you-go and are not submitted as single pairs, so "n/a" means no public score for that pair, not a low score. Aider's 88.0% is gpt-5 (high) on Aider's own polyglot leaderboard. SWE-bench Pro reports Claude Opus 4.8 at 69.2%.

Two leaderboards disagree on the top model, and that is fine because they test different things. Terminal-Bench rewards driving a terminal end to end. SWE-bench Pro rewards fixing real GitHub issues. On the self-reported SWE-bench Pro aggregate at llm-stats.com, Claude Opus 4.8 scores 69.2% (up 4.9 points from Opus 4.7's 64.3%), ahead of GPT-5.5 (58.6%) and Gemini 3.1 Pro (54.2%). Read benchmarks as the agent-plus-model pair, not the model alone.

Codex CLI + GPT-5.5, Terminal-Bench 2.1: 83.4%
Claude Code + Opus 4.8, Terminal-Bench 2.1: 78.9%
Opus 4.8, SWE-bench Pro: 69.2%
opencode GitHub stars (MIT): 180,312

The frontier models behind the agents

Every BYOK agent inherits the model you give it. As of June 28, 2026, Claude Opus 4.8 leads the available models on SWE-bench Pro at 69.2% and scores 88.6% on SWE-bench Verified, 0.1 points behind GPT-5.5's 88.7%. GPT-5.5 is the Codex and Amp default. DeepSeek V4, GLM-5.2, Qwen 3.7, MiniMax M3, and Kimi K2.6 are the open-weight options you can self-host or buy by the token. Two frontier models are not available: Claude Fable 5 and Mythos 5 were suspended on June 12, 2026 under a U.S. export-control directive, and GPT-5.6 is a limited preview restricted to about 20 government-approved companies.

Frontier coding models (SWE-bench, vendor self-reported via llm-stats)

Model	Status	SWE-bench Verified	SWE-bench Pro	Price in / out per 1M
Claude Opus 4.8	GA	88.6%	69.2%	$5 / $25
GLM-5.2	GA (open weights)	not published	62.1%	$1.40 / $4.40
Qwen3.7 Max	GA	80.4%	60.6%	$1.25 / $3.75
MiniMax M3	GA (open weights)	80.5%	59.0%	$0.60 / $2.40
GPT-5.5	GA	88.7%	58.6%	$5 / $30
Kimi K2.6	GA (open weights)	80.2%	58.6%	$0.95 / $4.00
DeepSeek V4 Pro	GA (open weights)	80.6%	55.4%	$0.44 / $0.87
Gemini 3.1 Pro	Preview	80.6%	54.2%	$2 / $12
Claude Fable 5	Suspended	95.0%	80.0%	disabled
GPT-5.6 (Sol/Terra/Luna)	Limited preview	not published	not published	gated
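To turn the per-token prices in this table into a session cost, a minimal sketch follows. The prices are copied from the table; the workload of 2M input tokens and 300k output tokens is a made-up example, and `session_cost` is a hypothetical helper name.

```python
# (input $, output $) per 1M tokens, copied from the table above.
PRICES = {
    "Claude Opus 4.8": (5.00, 25.00),
    "GLM-5.2": (1.40, 4.40),
    "Qwen3.7 Max": (1.25, 3.75),
    "MiniMax M3": (0.60, 2.40),
    "GPT-5.5": (5.00, 30.00),
    "Kimi K2.6": (0.95, 4.00),
    "DeepSeek V4 Pro": (0.44, 0.87),
    "Gemini 3.1 Pro": (2.00, 12.00),
}


def session_cost(model, input_tokens, output_tokens):
    """Dollar cost of one workload at the listed per-1M-token prices."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Hypothetical workload, chosen only for illustration.
INPUT_TOKENS, OUTPUT_TOKENS = 2_000_000, 300_000

costs = {m: session_cost(m, INPUT_TOKENS, OUTPUT_TOKENS) for m in PRICES}
for model, cost in sorted(costs.items(), key=lambda kv: kv[1]):
    print(f"{model:<16} ${cost:6.2f}")
```

Agentic sessions tend to be input-heavy because the agent rereads files and tool output, so the input price often matters more than the output price suggests. Adjust the token counts to your own logs before drawing conclusions.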

How to read these benchmark numbers

SWE-bench Verified is OpenAI's 500-problem human-validated subset; SWE-bench Pro is Scale AI's contamination-resistant 1,865-task set across 41 repositories. The numbers above are vendor self-reported (on the llm-stats leaderboard, all 102 SWE-bench Verified entries are self-reported and 0 are independently verified). Scale's standardized public SWE-bench Pro leaderboard runs much lower and is not directly comparable: GPT-5.4 (xHigh) leads it at 59.1%, with Claude Opus 4.6 at 51.9% and Gemini 3.1 Pro at 46.1%. Treat the self-reported figures as vendor claims, not refereed results.

Pricing, side by side

Open source agents are free as tools; you pay for model tokens. Subscription agents bundle model access into a plan with usage windows or credits. Prices verified from vendor pages on June 28, 2026.

AI coding agent pricing (June 2026)

Agent	License / source	Entry price	How you pay for models
Claude Code	Proprietary (134,868 stars)	Pro $20/mo ($17/mo annual); Max from $100/mo	Bundled. 5-hour rolling window plus weekly cap shared across claude.ai and Claude Code
OpenAI Codex CLI	Apache-2.0, 94,277 stars	Free $0; Go $8/mo; Plus $20/mo; Pro from $100/mo	Bundled per 5-hour window, or BYO OpenAI API key at per-token rates
Cursor	Proprietary IDE	Hobby $0; Pro $20/mo	Pro includes $20 of API-rate usage; Pro+ $60/mo (includes $70), Ultra $200/mo (includes $400), Teams $40/user/mo
GitHub Copilot	Proprietary	Free $0; Pro $10/mo	Pro includes $15 of monthly credits; Pro+ $39/mo (includes $70); Max $100/mo (includes $200); on-demand billing beyond that
Windsurf (Devin Desktop)	Proprietary (Cognition)	Free $0; Pro $20/mo	Pro includes SWE 1.6 and OSS models; Max $200/mo; Teams $80/mo + $40/dev seat
opencode	MIT, 180,312 stars	Free	BYOK any provider; ChatGPT Plus / Copilot / GitLab Duo usable as backends; opencode Zen hosted
Cline	Apache-2.0, 63,998 stars	Free	BYOK any provider, or local via Ollama / LM Studio; no markup
Aider	Apache-2.0, 46,808 stars	Free	BYOK per run, e.g. anthropic / deepseek / openai-compatible
Kilo Code	MIT, 25,038 stars	Free	Kilo Gateway at exact provider rates, 0% markup; or BYOK / local (Ollama, LM Studio)
Gemini CLI / Antigravity	Apache-2.0, 105,641 stars	Free (Antigravity CLI)	Gemini CLI free serving ended June 18, 2026; Antigravity CLI is free to everyone; or BYO API key
Zed	OSS Rust, 86,147 stars	Free $0; Pro $10/mo	Free 2,000 edit predictions/mo; Pro unlimited + $5 tokens; BYOK unlimited; Business $30/seat
Amp	Sourcegraph	PAYG, $5 minimum	Pay-as-you-go credits, no markup; Enterprise +50% and a $1,000 one-time purchase

Default models per agent (June 28, 2026)

Claude Code runs Opus 4.8 (the effort parameter defaults to high). Codex CLI recommends GPT-5.5; there is no dedicated gpt-5.5-codex, and the older GPT-5.3-Codex is deprecated. GitHub Copilot Free serves Haiku 4.5 and GPT-5 mini, with Opus on Pro+. Cursor mixes frontier models (Claude, GPT, Gemini) with its in-house Composer. Gemini CLI and Antigravity run Gemini 3 Pro and Gemini 3 Flash, auto-routed, with Gemini 3.1 Pro as the top scorer. Cline, Aider, opencode, Kilo Code, and Zed are model-agnostic and run whatever key you supply.

Claude Code

Best for reasoning depth on hard problems, in the terminal.

Anthropic's terminal-native agent, also available in VS Code and JetBrains, a desktop app, and the web. With Opus 4.8 (the effort parameter defaults to high) it scores 78.9% on Terminal-Bench 2.1, and Opus 4.8 leads SWE-bench Pro at 69.2%. The repo anthropics/claude-code has 134,868 stars but is proprietary (the repo is for issues and docs, with no open-source license). It supports MCP, sub-agents, background and cloud sessions, CLAUDE.md memory, hooks, and skills.

Terminal-Bench 2.1 (Opus 4.8): 78.9%
SWE-bench Pro (Opus 4.8): 69.2%
Pro to Max, per month: $20 to $100+
GitHub stars (proprietary): 134,868

Install

Install Claude Code

# Native install (recommended)
curl -fsSL https://claude.ai/install.sh | bash      # macOS / Linux / WSL
# Windows PowerShell:
#   irm https://claude.ai/install.ps1 | iex

# Alternatives
brew install --cask claude-code
npm install -g @anthropic-ai/claude-code           # Node 18+

# Add an MCP server
claude mcp add --transport http notion https://mcp.notion.com/mcp

Pricing and limits

Claude Pro is $20/mo monthly or $17/mo billed annually and includes Claude Code; Max starts from $100/mo, with Max 20x at $200/mo. Usage runs on a 5-hour rolling session window plus a weekly cap, shared across claude.ai, Claude Desktop, and Claude Code on the same subscription. The free Claude.ai plan does not include Claude Code. It also runs via Amazon Bedrock, Google Vertex AI, and Microsoft Foundry, and the terminal CLI and VS Code extension support third-party model providers.

Long sessions stay coherent with built-in auto-compaction. Compare directly at Claude Code vs Codex and Claude Code vs Cursor.

OpenAI Codex CLI

Best benchmark ceiling. #1 on Terminal-Bench 2.1.

OpenAI's open-source agent (openai/codex, 94,277 stars, Apache-2.0). With GPT-5.5 it tops Terminal-Bench 2.1 at 83.4%. The recommended model is GPT-5.5; there is no dedicated gpt-5.5-codex, and the older GPT-5.3-Codex is deprecated. Surfaces include the CLI, an IDE extension, the Codex Web cloud agent at chatgpt.com/codex, a desktop app, and iOS, with automatic code review in the cloud.

Terminal-Bench 2.1 (GPT-5.5): 83.4%
Free to ChatGPT Plus, per month: $0 to $20
Default / recommended model: GPT-5.5
GitHub stars (Apache-2.0): 94,277

Install

Install Codex CLI

curl -fsSL https://chatgpt.com/codex/install.sh | sh   # macOS / Linux
npm install -g @openai/codex
brew install --cask codex

codex            # run, then "Sign in with ChatGPT"
# /model         # switch model (GPT-5.5 default, GPT-5.4, GPT-5.4 mini)

Pricing and limits

Codex ships with the ChatGPT plan ladder: Free $0, Go $8/mo, Plus $20/mo, Pro from $100/mo (5x and 20x), Business $20/user/mo billed annually, and Enterprise custom. Usage is metered in messages per shared 5-hour window, for example 15 to 80 GPT-5.5 messages on Plus. You can also auth with an OpenAI API key and pay per-token rates, with no cloud features in API-key mode.

Cursor

Best IDE flow, with a separate lower-cost agent pool.

Cursor is a VS Code fork built around an agent loop, made by Anysphere, which also acquired Continue.dev. Individual plans: Pro $20/mo includes $20 of API-rate usage; Pro+ $60/mo includes $70; Ultra $200/mo includes $400; Teams is $40/user/mo. Cursor's in-house Composer line draws from a separate, more generous Auto and Composer pool designed for everyday agentic coding at lower cost than frontier API models. The Hobby tier is free with limited Agent requests and Tab completions, no card required. Paid plans add frontier models, MCP servers, cloud agents, and Bugbot reviews on usage-based billing.

Hobby tier: $0
Pro, per month ($20 usage): $20
Pro+, per month ($70 usage): $60
Ultra, per month ($400 usage): $200

See Cursor alternatives and Cursor vs Windsurf vs Copilot.
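The Cursor plan figures above reduce to simple arithmetic. The sketch below compares included API-rate usage per dollar and picks the lowest-priced plan that covers a given monthly spend. It assumes usage is billed only against the included amount and ignores the separate Auto and Composer pool and any on-demand overage; the helper names are hypothetical.

```python
# plan: (monthly price $, included API-rate usage $), from the section above.
PLANS = {"Pro": (20, 20), "Pro+": (60, 70), "Ultra": (200, 400)}


def usage_per_dollar(plan):
    """Included API-rate usage bought by each subscription dollar."""
    price, included = PLANS[plan]
    return included / price


def cheapest_plan_covering(monthly_usage):
    """Lowest-priced plan whose included usage covers the spend, or None."""
    covering = [
        (price, name)
        for name, (price, included) in PLANS.items()
        if included >= monthly_usage
    ]
    return min(covering)[1] if covering else None


for name in PLANS:
    print(f"{name}: ${usage_per_dollar(name):.2f} of usage per $1")
print(cheapest_plan_covering(50))
```

On these numbers Pro is a one-to-one pass-through, while Ultra includes twice its price in usage, so the higher tiers only pay off if you would actually consume that much frontier-model usage.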

GitHub Copilot

Cheapest paid default. Works in every major IDE plus a CLI.

Free $0 gives 2,000 code completions and 50 chat requests per month, with access to Haiku 4.5 and GPT-5 mini. Pro $10/user/mo adds unlimited completions and $15 of monthly model credits plus access to third-party agents (Claude Code and Codex). Pro+ $39/user/mo includes $70 of credits and premium models including Opus; Max $100/user/mo includes $200. Credits are consumed by token usage at published per-model rates: Claude Opus 4.5 through 4.8 bill at $5 in / $25 out per 1M tokens, Sonnet 4 through 4.6 at $3/$15, GPT-5.5 at $5/$30, Gemini 3.1 Pro at $2/$12. Basic code completions never consume credits and stay unlimited on paid plans.
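To see how far a credit allowance goes at these rates, here is a rough sketch. The rates are the ones quoted above; the request size of 20,000 input and 2,000 output tokens is an assumption for illustration, and `requests_per_budget` is a hypothetical helper, not part of any Copilot tooling.

```python
# $ per 1M tokens (input, output), as quoted above.
RATES = {
    "Claude Opus 4.8": (5.0, 25.0),
    "Sonnet 4.6": (3.0, 15.0),
    "GPT-5.5": (5.0, 30.0),
    "Gemini 3.1 Pro": (2.0, 12.0),
}


def requests_per_budget(model, budget_usd, input_tokens, output_tokens):
    """Whole requests of the given size that fit in a dollar budget."""
    rate_in, rate_out = RATES[model]
    # tokens * ($ per 1M tokens) gives the request cost in millionths of a dollar.
    micro_usd = input_tokens * rate_in + output_tokens * rate_out
    return int(budget_usd * 1_000_000 // micro_usd)


# Assumed request size: 20k input tokens, 2k output tokens.
for model in RATES:
    count = requests_per_budget(model, 15, 20_000, 2_000)
    print(f"{model:<16} {count} requests on $15 of credits")
```

Real agent turns vary widely in size, so treat the output as an order-of-magnitude estimate of how quickly premium models drain a monthly allowance.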

Install the CLI

Install GitHub Copilot CLI

npm install -g @github/copilot     # Node 22+
brew install copilot-cli
# supports MCP servers and a /model switch

Compare at Copilot vs Claude Code and Cline vs Copilot.

Windsurf (now Devin Desktop)

The Windsurf editor, folded into Cognition's Devin.

Cognition, the maker of Devin, folded Windsurf into Devin Desktop; windsurf.com/pricing now redirects to devin.ai/pricing. The former free Windsurf editor is the Devin Free tier ($0/mo, unlimited Tab completions and inline edits, a light agent quota and limited model availability). Devin Pro is $20/mo with full model availability, free use of SWE 1.6 and leading open-source models, and Devin Cloud agents. Devin Max is $200/mo with much higher quotas, and Devin Teams is $80/mo plus $40/mo per full dev seat.

Devin Free (former Windsurf): $0/mo
Devin Pro: $20/mo
Devin Max: $200/mo
Default agent model: SWE 1.6

opencode

The most-starred open source coding agent. Any provider, BYOK.

anomalyco/opencode (moved from sst/opencode) has 180,312 stars under MIT, ahead of Claude Code (134,868), Gemini CLI (105,641), and OpenAI Codex (94,277). Terminal-native, with a desktop app and IDE extension, it is model-agnostic: configure any LLM provider with your own keys, plus local models through Ollama and LM Studio. opencode Zen is the team's curated, tested model list for agentic coding. It supports MCP servers, LSP servers, and sub-agents.

GitHub stars (MIT): 180,312
LLM provider: any, BYOK
Tool cost (BYOK): free
Also included: MCP + LSP, sub-agents, TUI

Install and add a custom provider

Install opencode

curl -fsSL https://opencode.ai/install | bash
npm install -g opencode-ai
brew install anomalyco/tap/opencode

Custom OpenAI-compatible provider (JSON config)

{
  "provider": {
    "myprovider": {
      "npm": "@ai-sdk/openai-compatible",
      "options": { "baseURL": "https://api.myprovider.com/v1" },
      "models": { "my-model": {} }
    }
  }
}
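If you manage several endpoints, the same structure can be generated rather than hand-edited. The sketch below only mirrors the JSON shown above; `provider_config` is a hypothetical helper, and the provider id, base URL, and model id are the placeholder values from that example. Where the resulting JSON should be saved is not covered here, so check opencode's docs for the config location.

```python
import json


def provider_config(provider_id, base_url, model_ids):
    """Build the provider block shown above as a Python dict."""
    return {
        "provider": {
            provider_id: {
                "npm": "@ai-sdk/openai-compatible",
                "options": {"baseURL": base_url},
                # Each model maps to an empty object, as in the example above.
                "models": {model_id: {} for model_id in model_ids},
            }
        }
    }


config = provider_config(
    "myprovider",
    "https://api.myprovider.com/v1",
    ["my-model"],
)
print(json.dumps(config, indent=2))
```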

Subscription backends in opencode

Per opencode's docs, ChatGPT Plus, GitHub Copilot, and GitLab Duo subscriptions are usable as model backends, while Anthropic prohibits using Claude Pro or Max subscriptions with third-party tools like opencode.

Cline

In-IDE open source agent with Plan and Act approval modes.

cline/cline: 63,998 stars, Apache-2.0, free, with usage-based inference only and no subscription. It is model-agnostic via your own API key across Anthropic, OpenAI, Google, OpenRouter, AWS Bedrock, GCP Vertex, Groq, Cerebras, and DeepSeek, or local models via Ollama and LM Studio. It runs as a VS Code and IDE extension plus an SDK and CLI, with explicit Plan and Act modes that require approval before each change.

See Cline alternatives, Cline vs Cursor, and the head-to-head below.

Aider

Git-native terminal pair programming. Auto-commit per edit.

Aider-AI/aider: 46,808 stars, Apache-2.0, free, BYOK and model-agnostic. Aider is the terminal pair-programming pioneer and works git-first: every edit becomes a commit. Aider publishes the polyglot leaderboard (225 Exercism exercises across C++, Go, Java, JavaScript, Python, Rust), where gpt-5 (high) leads at 88.0%, followed by o3-pro (high) at 84.9% and gemini-2.5-pro at 83.1%. Its last repo push was May 22, 2026, a slower cadence than opencode or Cline, and the leaderboard has not been refreshed with the newest 2026 frontier models.

Install and run Aider

python -m pip install aider-install && aider-install
# or one-liner:
curl -LsSf https://aider.chat/install.sh | sh

aider --model sonnet --api-key anthropic=<key>
aider --model deepseek --api-key deepseek=<key>

Kilo Code

In-IDE BYOK with a 0% markup gateway and a model router.

Kilo-Org/kilocode: 25,038 stars, MIT (the domain kilocode.ai now redirects to kilo.ai). The extension is free and open source for VS Code, JetBrains, and the CLI; AI usage is billed separately. The Kilo Gateway is pay-as-you-go at exact provider rates with 0% markup, routing across 500-plus models from 60-plus providers with an Auto Model selector (Frontier, Balanced, and Free tiers). BYOK works for Anthropic, OpenAI, Google, Azure, and Bedrock keys, and local models run via Ollama or LM Studio. See Kilo Code vs Claude Code.

GitHub stars (MIT): 25,038
Kilo Gateway markup: 0%
Models: 500+, from 60+ providers
Extension cost: free

Gemini CLI and Google Antigravity

Gemini CLI's free serving ended; Antigravity CLI replaced it.

google-gemini/gemini-cli: 105,641 stars, Apache-2.0. On June 18, 2026, Gemini CLI and the Gemini Code Assist IDE extensions stopped serving requests for free, Google AI Pro, and Ultra users. Antigravity CLI, which is available to everyone, replaced them. Enterprise access through Code Assist Standard and Enterprise is unchanged. The former free quota was 60 requests per minute and 1,000 per day with a personal Google account.

Gemini CLI and Antigravity run Gemini 3 models with auto-routing (Gemini 3 Pro and Gemini 3 Flash), a 1M token context, MCP, Google Search grounding, and shell and file tools. With Gemini 3.1 Pro the pairing scores 70.7% on Terminal-Bench 2.1.

Install Gemini CLI

npx @google/gemini-cli
npm install -g @google/gemini-cli
brew install gemini-cli
# MCP servers configured in ~/.gemini/settings.json

Compare at Gemini CLI vs Claude Code, Gemini CLI vs Codex, and Antigravity vs Claude Code.

Zed and Amp

Zed

zed-industries/zed: 86,147 stars, a Rust-based multiplayer editor with agentic AI. The Personal (Free) plan gives 2,000 accepted edit predictions per month and supports external agents (Claude Agent, Codex CLI). Pro is $10/mo with unlimited edit predictions plus $5 of included tokens; Business is $30/seat/mo. BYOK is unlimited across Anthropic, OpenAI, Google AI, Ollama, OpenRouter, and Bedrock.

Amp

Sourcegraph's coding agent: a CLI plus IDE integrations for VS Code, JetBrains, Neovim, and Zed. Pricing is pay-as-you-go credits with no subscription, a $5 minimum, and no markup for individuals or teams; Enterprise costs 50% more and requires a one-time $1,000 purchase. Its modes are deep (GPT-5.5 extended thinking), smart, and rush, with an Oracle tool (GPT-5.5 high-reasoning) for a second opinion. Amp spawns parallel sub-agents and supports MCP. See Cursor alternatives for where these fit.

Roo Code and Continue.dev

Roo Code (24,294 stars, Apache-2.0) is a free, BYOK Cline fork for VS Code; roocode.com now redirects to roomote.dev, the team's separate hosted cloud product (from $99/mo). Continue.dev (34,559 stars, Apache-2.0) was acquired by Cursor, and its open source extension remains available.

Cline vs opencode

	Cline	opencode
GitHub stars / license	63,998 / Apache-2.0	180,312 / MIT
Surface	VS Code and IDE extension + CLI + SDK	Terminal-native + desktop + IDE extension
Model providers	Any provider, BYOK, local Ollama / LM Studio	Any provider via AI SDK; opencode Zen; local Ollama / LM Studio
Control model	Plan and Act modes, approval before each change	Plan-first, curated opencode Zen model list
Pick it if	You want the agent inside your IDE with step approval	You want a CLI agent and the largest community

Both are free and BYOK. opencode wins on community size; Cline wins if you want the agent embedded in VS Code or another IDE with explicit per-change approval. Full breakdown: opencode vs Cline.

Kilo Code vs opencode

Both are MIT-licensed and free. Kilo Code is a VS Code, JetBrains, and CLI extension with a 0% markup gateway and an Auto Model router; opencode is the terminal-native agent with the largest community and any-provider BYOK.

Kilo Code vs opencode

	Kilo Code	opencode
GitHub stars / license	25,038 / MIT	180,312 / MIT
Surface	VS Code, JetBrains, CLI	Terminal-native + desktop + IDE extension
Model providers	500+ models, 60+ providers; Auto Model router; BYOK or local	Any provider via AI SDK; opencode Zen; local Ollama / LM Studio
Pricing	Free extension; Kilo Gateway at provider rates, 0% markup	Free; BYOK any provider
Pick it if	You want in-IDE BYOK with no markup and a router	You want a CLI agent and the largest community

Kilo Code routes across 500-plus models with no gateway markup; opencode has a far larger community (180,312 vs 25,038 stars). Full breakdown: opencode vs Kilo Code.

Aider vs opencode

opencode pushes code daily and is model-agnostic across any provider. Aider is the git-native pioneer, but its last repo push was May 22, 2026 and its polyglot leaderboard has not been refreshed for 2026 frontier models.

Aider vs opencode

	Aider	opencode
GitHub stars / license	46,808 / Apache-2.0	180,312 / MIT
Surface	Terminal, git-native	Terminal-native + desktop + IDE extension
Model providers	BYOK per run (anthropic / deepseek / openai-compatible)	Any provider via AI SDK; local Ollama / LM Studio
Pricing	Free; pay for model tokens	Free; pay for model tokens
Last repo push	May 22, 2026	Daily
Pick it if	You want auto-commit-per-edit git workflow	You want active development and the widest model choice

Full comparison: opencode vs Aider.

The model backend matters as much as the agent

Most of these agents are BYOK: opencode, Cline, Aider, Kilo Code, Zed, and Gemini CLI all let you point at any OpenAI-compatible endpoint. The model and the inference provider behind it set both your cost and your output quality, independent of the agent.

If you run DeepSeek, GLM, Qwen, or MiniMax, where you serve them matters. Morph Open Source Models serve these with 16-bit (bf16) activations and no fp8 or int8 quantization. Most serverless providers quantize activations to fp8 to cut cost; keeping full 16-bit means responses match the reference weights rather than an 8-bit approximation.
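To get a feel for the gap between 16-bit and 8-bit formats, the sketch below rounds a few numbers to the mantissa width of bf16 (7 explicit bits) and of FP8 E4M3 (3 explicit bits) and prints the relative error. It is a simplification: it models mantissa rounding only and ignores exponent range, saturation, and the per-tensor scaling that real inference kernels apply, so it says nothing about any particular provider's output quality.

```python
import math

BF16_BITS = 7      # explicit mantissa bits in bfloat16
FP8_E4M3_BITS = 3  # explicit mantissa bits in FP8 E4M3


def round_to_mantissa_bits(x, explicit_bits):
    """Round x to the nearest value with the given mantissa width."""
    if x == 0.0:
        return 0.0
    # x == mantissa * 2**exponent with 0.5 <= |mantissa| < 1
    mantissa, exponent = math.frexp(x)
    # One implicit leading bit plus the explicit bits.
    steps = 2 ** (explicit_bits + 1)
    return math.ldexp(round(mantissa * steps) / steps, exponent)


def relative_error(x, explicit_bits):
    """Relative rounding error of x at the given mantissa width."""
    return abs(round_to_mantissa_bits(x, explicit_bits) - x) / abs(x)


samples = [0.1, 0.3337, 1.2345, 3.14159, 17.77]
for x in samples:
    bf16_err = relative_error(x, BF16_BITS)
    fp8_err = relative_error(x, FP8_E4M3_BITS)
    print(f"{x:>8}: bf16 error {bf16_err:.5f}, fp8 error {fp8_err:.5f}")
```

With 8 significant bits the worst-case relative error is 2^-8 (about 0.4%), against 2^-4 (about 6%) with 4 significant bits, which is the per-value gap that the quantization argument rests on.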

For codegen, Morph runs codegen-tuned speculative decoding plus custom low-level inference kernels. morph-v3-fast applies edits at ~10,500 tokens per second; morph-dsv4flash (DeepSeek V4 Flash) is $0.139 per 1M input tokens and $0.278 per 1M output, and morph-glm52-744b (GLM-5.2) is $1.1 per 1M input and $4.1 per 1M output, both at 16-bit bf16 against the fp8 activations typical of serverless hosts. See pricing.

Morph also runs a model router that classifies each request in ~430ms at $0.001 per request and sends it to the cheapest model that passes, which lowers both cost and latency without you pinning a model per call. Pairing any BYOK agent with WarpGrep ($0 for 100k requests, $1 per 1M Pro) lifts retrieval quality independent of the model.
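The "cheapest model that passes" idea can be sketched generically. This is not Morph's router: the catalogue, prices, difficulty tiers, and the keyword-based `classify` function are all placeholders standing in for a trained classifier and a real model list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    usd_per_1m_output: float
    max_difficulty: int  # hardest request class this model is trusted with


# Hypothetical catalogue; names, prices, and tiers are placeholders.
CATALOGUE = [
    Model("small", 0.3, 1),
    Model("medium", 4.0, 2),
    Model("large", 25.0, 3),
]


def classify(prompt: str) -> int:
    """Stand-in classifier: 1 = easy, 2 = moderate, 3 = hard."""
    text = prompt.lower()
    if any(word in text for word in ("refactor", "architecture", "migrate")):
        return 3
    if len(text.split()) > 30 or "bug" in text:
        return 2
    return 1


def route(prompt: str) -> Model:
    """Pick the cheapest model whose tier covers the request's difficulty."""
    difficulty = classify(prompt)
    passing = [m for m in CATALOGUE if m.max_difficulty >= difficulty]
    return min(passing, key=lambda m: m.usd_per_1m_output)


for prompt in ("rename this variable", "fix the bug in parser", "refactor the auth module"):
    print(f"{prompt!r} -> {route(prompt).name}")
```

The design choice worth noting is that the router filters on capability first and only then minimizes price, so a cheap model is never chosen for a request class it is not trusted with.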

BYOK backend for open-weight models

	Morph (DeepSeek V4 Flash / GLM-5.2)	Typical serverless fp8 host
Activation precision	16-bit (bf16), no quantization	fp8 activations (quality loss)
DeepSeek V4 Flash / 1M tokens	$0.139 in / $0.278 out	varies
GLM-5.2 / 1M tokens	$1.1 in / $4.1 out	varies
Codegen tuning	Code-tuned spec decode + custom kernels	General-purpose

Sources

Primary sources behind the numbers on this page, verified June 28, 2026:

Terminal-Bench 2.1 leaderboard: tbench.ai
SWE-bench Verified and Pro (vendor self-reported aggregate): llm-stats.com Verified, llm-stats.com Pro
SWE-bench Pro public leaderboard (standardized): Scale Labs
GitHub stars: GitHub API for opencode, Cline, Aider, Kilo Code, Codex, Claude Code
Vendor pricing: Claude, Codex, Cursor, Copilot, Devin, Kilo, Zed, Amp
2026 shifts: Gemini CLI to Antigravity CLI, Anthropic Fable and Mythos suspension, Continue.dev acquired by Cursor

Frequently Asked Questions

What is the best AI coding agent in 2026?

On the public Terminal-Bench 2.1 leaderboard, Codex CLI with GPT-5.5 is #1 at 83.4%, Claude Code with Opus 4.8 is the top usable Claude pairing at 78.9%, and Gemini CLI with Gemini 3.1 Pro is at 70.7%. Claude Fable 5 scores 83.1% inside Claude Code but was suspended June 12, 2026 and cannot be run. For open source, opencode (180,312 stars, MIT) is the most-starred agent. Pick Codex CLI for the benchmark ceiling, Claude Code for reasoning depth (69.2% SWE-bench Pro), and opencode or Cline for a free, model-agnostic agent.

Are AI coding agents actually good?

Yes. By 2026 the top agents complete a majority of real tasks. Codex CLI with GPT-5.5 completes 83.4% of Terminal-Bench 2.1 terminal tasks, and Claude Opus 4.8 fixes 69.2% of SWE-bench Pro GitHub issues; two years earlier those numbers were under 40%. They still fail on long, underspecified, or unfamiliar tasks, so they work best with review on each change rather than fully unattended.

Cline vs opencode: which is better?

opencode has 180,312 stars (MIT) and is model-agnostic across any provider; Cline has 63,998 stars (Apache-2.0) and runs as a VS Code and IDE extension plus a CLI with Plan and Act approval modes. Choose opencode for a terminal agent and the largest community; choose Cline for an in-IDE agent with step-by-step approval. See opencode vs Cline.

Kilo Code vs opencode: which is better?

Both are MIT and free. Kilo Code (25,038 stars) is an IDE and CLI extension with BYOK or local models and a 0% markup gateway routing across 500-plus models. opencode (180,312 stars) is terminal-native with any provider and a larger community. Pick Kilo Code for in-IDE BYOK with no markup and a model router, opencode for a CLI workflow.

Aider vs opencode: which is better?

opencode (180,312 stars) ships daily and is model-agnostic across any provider. Aider (46,808 stars) is the git-native pioneer, but its last repo push was May 22, 2026 and its polyglot leaderboard has not been refreshed for 2026 frontier models. Use opencode for active development and provider choice; use Aider for its auto-commit-per-edit git workflow.

How much does an AI coding agent cost?

Open source agents (opencode, Cline, Aider, Kilo Code, Zed) are free as tools; you pay only for model tokens. GitHub Copilot Free gives 2,000 completions and 50 chat requests per month. Copilot Pro is $10/mo, Cursor Pro and Claude Code Pro are $20/mo ($17/mo annual for Claude), and Codex ships from Free $0 to ChatGPT Plus at $20/mo. To cut token cost on open-weight models, run DeepSeek V4 Flash on Morph at $0.139/1M input and $0.278/1M output.

Is Windsurf still a separate coding agent?

No. Cognition folded Windsurf into Devin Desktop, and windsurf.com/pricing now redirects to devin.ai/pricing. The former free Windsurf editor is the Devin Free tier ($0/mo); Devin Pro is $20/mo with SWE 1.6 and open-source models, and Devin Max is $200/mo. Continue.dev was acquired by Cursor, with its open source codebase still available.

SWE-bench vs Terminal-Bench: what is the difference?

Terminal-Bench 2.1 tests an agent driving a terminal end to end (Codex CLI + GPT-5.5 leads at 83.4%). SWE-bench Pro and Verified test fixing real GitHub issues (Opus 4.8 leads the self-reported SWE-bench Pro aggregate at 69.2%). Read scores as the agent-plus-model pair, because the same model scores differently in different agents. See context engineering for why scaffolding changes outcomes.

Run your coding agent on faster, full-precision models

Any BYOK agent can point at Morph. Serve DeepSeek, GLM-5.2, Qwen, and MiniMax at 16-bit precision with codegen-tuned inference, and pair it with WarpGrep semantic search at $0 for 100k requests.
