Subagents are specialized AI workers that handle specific tasks inside a larger coding agent. Every major tool — Claude Code, Cursor, VS Code Copilot, Windsurf, and Codex — now ships them. This guide covers how each implementation works, when to use subagents, and how to build your own.
What Are Subagents?
A subagent is an isolated AI worker that runs inside a parent agent. The parent agent delegates a specific task — "find the authentication middleware," "review this diff for security issues," "run the test suite and summarize failures" — to a subagent. The subagent works in its own context window with its own tools, finishes the job, and returns only the relevant result.
The key properties of subagents:
- Context isolation. The subagent's work doesn't pollute the parent's context window. If a search subagent reads 50 files to find 3 relevant ones, only those 3 files go back to the parent.
- Tool restriction. Subagents can be limited to specific tools. A search subagent gets grep, glob, and read. A review subagent gets read-only access. No accidental file writes.
- Parallel execution. Multiple subagents can run simultaneously. One searches the codebase while another runs tests while a third reviews documentation.
- Custom system prompts. Each subagent gets its own instructions. A "security reviewer" subagent has different priorities than a "performance optimizer" subagent.
Before subagents, coding agents had one context window for everything. The agent would search for files, read them, think about the edit, make the edit, search for tests, run them, read the output, fix failures — all in one long conversation. By the time it got to the fix, it had burned most of its context on search results it no longer needed.
Subagents fix this by letting the agent delegate the noisy parts. The parent stays clean and focused. The subagents do the dirty work.
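The properties above can be sketched as a small data model. This is an illustration only: `SubagentSpec`, `ToolCall`, and `runSubagent` are hypothetical names, not part of any of the tools covered in this guide. The model is replaced by a scripted list of calls so the isolation and allowlist logic can be read on its own.

```typescript
// Hypothetical shapes for illustration; no real agent framework defines these.
type ToolName = "grep" | "glob" | "read" | "write" | "bash";

interface SubagentSpec {
  name: string;
  systemPrompt: string;
  allowedTools: ReadonlySet<ToolName>;
}

interface ToolCall {
  tool: ToolName;
  input: string;
}

interface SubagentResult {
  summary: string; // the only thing the parent ever sees
  toolCallsMade: number;
}

// Runs a scripted list of tool calls inside a private transcript.
// Calls to tools outside the allowlist are rejected before they run.
function runSubagent(
  spec: SubagentSpec,
  plannedCalls: ToolCall[],
  execute: (call: ToolCall) => string,
  summarize: (transcript: string[]) => string,
): SubagentResult {
  const transcript: string[] = [spec.systemPrompt]; // private context window
  let made = 0;
  for (const call of plannedCalls) {
    if (!spec.allowedTools.has(call.tool)) {
      transcript.push(`DENIED ${call.tool}: not in allowlist`);
      continue;
    }
    transcript.push(execute(call));
    made++;
  }
  // The transcript is discarded here; only the summary crosses the boundary.
  return { summary: summarize(transcript), toolCallsMade: made };
}

const searchAgent: SubagentSpec = {
  name: "search",
  systemPrompt: "Find files relevant to the query.",
  allowedTools: new Set<ToolName>(["grep", "glob", "read"]),
};

const delegated = runSubagent(
  searchAgent,
  [
    { tool: "grep", input: "authMiddleware" },
    { tool: "write", input: "src/auth.ts" }, // rejected: not in the allowlist
  ],
  (call) => `${call.tool}(${call.input}) -> ok`,
  (t) => `${t.filter((l) => l.endsWith("-> ok")).length} successful call(s)`,
);
```

The parent receives `delegated.summary` and nothing else, which is the point of context isolation: the denied write and the raw grep output never reach it.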
Why Subagents Exist: The Context Problem
Context retrieval eats up 60%+ of agent time in most coding workflows. Cognition published this data from their Windsurf and Devin telemetry. The pattern is the same everywhere: the agent spends most of its time finding the right code, not writing it.
The old approaches don't scale:
- RAG embeddings miss multi-hop queries. "Find the auth middleware" works. "Trace how user permissions flow through the API layer" fails. Stale embeddings give wrong answers on actively changing codebases.
- Sequential agentic search is accurate but slow. Each tool call is a network roundtrip. Ten sequential search calls means ten inference passes. At 2-3 seconds per pass, that's 20 to 30 seconds of waiting before a single edit happens.
- Stuffing everything into context hits token limits fast on any real codebase. A 200k token window sounds large until you try to fit a React app with 500 files.
Subagents solve this with parallel agentic search. Give the search subagent multiple tool calls per turn. Let it explore different parts of the codebase simultaneously. Limit total turns to force efficient exploration. Return only the relevant files to the parent. The parent's context stays clean while the subagent handles the noisy exploration work.
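The latency argument can be checked with a few lines of arithmetic. This is a rough sketch under stated assumptions: each inference pass costs a fixed number of seconds, all calls issued in one turn share a single pass, and tool execution time is ignored. The batch size of 8 is borrowed from the SWE-grep figures later in this guide; `searchLatencySeconds` is a hypothetical helper.

```typescript
// Estimated wall-clock seconds for a search phase.
// calls: total tool calls needed; callsPerTurn: how many run in one inference pass.
function searchLatencySeconds(
  calls: number,
  callsPerTurn: number,
  secondsPerPass: number,
): number {
  const turns = Math.ceil(calls / callsPerTurn);
  return turns * secondsPerPass;
}

// Ten calls, one per pass, at 2 and 3 seconds per pass.
const sequentialLow = searchLatencySeconds(10, 1, 2);
const sequentialHigh = searchLatencySeconds(10, 1, 3);

// The same ten calls batched 8 per turn need only two passes.
const batchedHigh = searchLatencySeconds(10, 8, 3);
```

Under these assumptions the sequential case lands at 20 to 30 seconds, while the batched case needs two passes, or 6 seconds at the slower rate.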
Claude Code Subagents
Claude Code ships three built-in subagents and supports unlimited custom subagents. The built-in ones cover the most common delegation patterns.
Built-in Subagents
Explore subagent. Handles read-only codebase search. Uses Haiku for speed. Limited to Glob, Grep, Read, and read-only Bash commands. When Claude needs to understand a codebase without making changes, it delegates to Explore. This is the most frequently triggered subagent — it fires on questions like "how does the auth system work?" or "find all usages of the UserService class."
Plan subagent. Runs during plan mode. It researches your codebase before presenting a plan. Uses Sonnet for better reasoning. The Plan subagent reads files, traces dependencies, and builds a mental model of the relevant code before proposing changes. It then presents the plan to you for approval before any modifications happen.
General-purpose subagent. Handles complex multi-step tasks that need both exploration and modification. Uses Sonnet. Has access to all tools including file writes, bash execution, and MCP servers. The parent agent delegates entire subtasks — "add input validation to all API routes" — and the general-purpose subagent handles research, implementation, and verification.
Custom Subagents in Claude Code
You can define custom subagents by creating Markdown files in .claude/agents/ (project scope) or ~/.claude/agents/ (global scope). Each file uses YAML frontmatter for configuration and Markdown for the system prompt.
Example: Security Reviewer Subagent
# .claude/agents/security-reviewer.md
---
name: security-reviewer
description: Reviews code changes for OWASP Top 10 vulnerabilities
tools: Read, Grep, Glob
model: sonnet
---
You are a senior security engineer. Review code for:
- SQL injection (parameterized queries only)
- XSS (escape all user input in templates)
- SSRF (validate and allowlist URLs)
- Auth bypass (check middleware on every route)
- Secrets in code (no hardcoded keys/tokens)
Return a structured report: file, line, severity, finding, fix.
Claude Code automatically discovers subagent files and invokes them based on the description field. If a user asks "review this PR for security issues," Claude matches that to the security-reviewer subagent and delegates.
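To make the file layout concrete, here is a minimal sketch of reading such a file into a typed object. It is not Claude Code's loader: `parseAgentFile` and `AgentDefinition` are hypothetical, the parser handles only flat `key: value` frontmatter, and treating `name` and `description` as required is an assumption of this sketch.

```typescript
interface AgentDefinition {
  name: string;
  description: string;
  tools: string[];
  model?: string;
  systemPrompt: string;
}

// Parses a file with flat `key: value` frontmatter between two `---` lines.
// Not a general YAML parser: no nesting, quoting, or multi-line values.
function parseAgentFile(source: string): AgentDefinition {
  const lines = source.split(/\r?\n/);
  const start = lines.findIndex((l) => l.trim() === "---");
  const end = lines.findIndex((l, i) => i > start && l.trim() === "---");
  if (start === -1 || end === -1) {
    throw new Error("missing frontmatter delimiters");
  }
  const fields = new Map<string, string>();
  for (const line of lines.slice(start + 1, end)) {
    const colon = line.indexOf(":");
    if (colon === -1) continue;
    fields.set(line.slice(0, colon).trim(), line.slice(colon + 1).trim());
  }
  const agentName = fields.get("name");
  const description = fields.get("description");
  if (!agentName || !description) {
    throw new Error("name and description are required");
  }
  return {
    name: agentName,
    description,
    // Comma-separated tool list; an absent field becomes an empty list.
    tools: (fields.get("tools") ?? "")
      .split(",")
      .map((t) => t.trim())
      .filter((t) => t !== ""),
    model: fields.get("model"),
    // Everything after the closing delimiter is the system prompt.
    systemPrompt: lines.slice(end + 1).join("\n").trim(),
  };
}

const reviewer = parseAgentFile(
  [
    "---",
    "name: security-reviewer",
    "description: Reviews code changes for OWASP Top 10 vulnerabilities",
    "tools: Read, Grep, Glob",
    "model: sonnet",
    "---",
    "You are a senior security engineer.",
  ].join("\n"),
);
```

Seeing the file as data makes the split clear: the frontmatter is configuration, and the Markdown body is the prompt the subagent runs under.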
Agent Teams (Multi-Agent)
Claude Code also supports agent teams — multiple Claude instances working in parallel on different parts of a task. Unlike subagents (which are spawned and managed by a parent), agent teams are peer agents coordinated by a team lead. The team lead decomposes the work, assigns tasks, and reconciles outputs.
Agent teams are ideal for large tasks: "migrate this Express app to Fastify" can be split across a routes agent, a middleware agent, a testing agent, and a documentation agent all working simultaneously.
Cursor Subagents
Cursor launched subagents in late 2025. The implementation focuses on automatic parallelization — Cursor's agent decomposes complex goals and assigns parts to subagents that run simultaneously.
Cursor includes default subagents for:
- Codebase research. Searches files, reads code, and builds context for the main agent.
- Terminal commands. Runs build commands, tests, and other CLI operations in isolation.
- Parallel work streams. Splits multi-file edits across workers that operate simultaneously.
Custom subagents in Cursor are defined through SKILL.md files, similar to Claude Code's .claude/agents/ approach. You specify the subagent's name, description, tool access, and instructions.
Benchmarks show Cursor subagents increase task completion speed by up to 40% on multi-file operations. The biggest gains come from parallelizing codebase search — instead of the agent sequentially reading files, a search subagent fans out across the codebase while the main agent prepares to edit.
VS Code Copilot Subagents
VS Code added subagents to GitHub Copilot in the January 2026 release (v1.109). The implementation is notable for its multi-provider support — you can run Claude, Codex, and Copilot agents alongside each other, delegating different tasks to different models.
Key features of VS Code Copilot subagents:
- Parallel execution. Subagents run simultaneously when tasks are independent. Previously, VS Code ran them sequentially.
- Tool constraints. You can restrict which tools a subagent can access. Critical for safety — a review subagent shouldn't be able to write files.
- Multi-provider orchestration. Run Claude as a local agent for fast interactive help, delegate to a cloud Codex agent for longer-running tasks, and use Copilot for inline completions — all in the same session.
- Custom agents. Define agents via agent.json files with tool access, model selection, and system prompts.
VS Code treats subagents as a first-class orchestration pattern. The coordinator agent decomposes work, spawns subagents, and synthesizes their results. This mirrors the architecture of production multi-agent systems in frameworks like LangChain and CrewAI.
Windsurf SWE-grep: RL-Trained Code Search Subagent
Cognition built SWE-grep and SWE-grep-mini specifically for code search. Unlike general-purpose subagents that use a standard model with tool access, SWE-grep is trained with reinforcement learning on the code search task itself.
Architecture
SWE-grep runs a maximum of 4 turns. Each turn can make up to 8 parallel tool calls. Tools include grep, glob, read, and list_directory. The model learned through RL to use those 32 possible operations efficiently: early turns explore broadly, later turns drill into specific files, and the final turn returns file paths and line ranges.
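The turn and call budget can be sketched as a bounded loop. This is not Cognition's implementation: `boundedSearch`, `TurnDecision`, and the stubbed `decide` and `runTool` callbacks are hypothetical, the file path in the example is made up, and the sketch assumes a turn is either a batch of calls or the final answer.

```typescript
const MAX_TURNS = 4;
const MAX_CALLS_PER_TURN = 8;

interface SearchCall {
  tool: "grep" | "glob" | "read" | "list_directory";
  arg: string;
}

interface Finding {
  file: string;
  startLine: number;
  endLine: number;
}

// What the model decides each turn: more calls, or a final answer.
type TurnDecision =
  | { kind: "calls"; calls: SearchCall[] }
  | { kind: "finish"; findings: Finding[] };

// `decide` stands in for the model; `runTool` stands in for real tool execution.
async function boundedSearch(
  decide: (turn: number, observations: string[]) => TurnDecision,
  runTool: (call: SearchCall) => Promise<string>,
): Promise<{ findings: Finding[]; turnsUsed: number; callsMade: number }> {
  let observations: string[] = [];
  let callsMade = 0;
  for (let turn = 1; turn <= MAX_TURNS; turn++) {
    const decision = decide(turn, observations);
    if (decision.kind === "finish") {
      return { findings: decision.findings, turnsUsed: turn, callsMade };
    }
    // Enforce the per-turn cap, then dispatch the whole batch at once.
    const batch = decision.calls.slice(0, MAX_CALLS_PER_TURN);
    observations = await Promise.all(batch.map((call) => runTool(call)));
    callsMade += batch.length;
  }
  // Budget exhausted without an answer: return nothing rather than guess.
  return { findings: [], turnsUsed: MAX_TURNS, callsMade };
}

// Stub policy: ask for 10 greps on turn 1 (2 get dropped by the cap), then finish.
const searchRun = boundedSearch(
  (turn): TurnDecision =>
    turn === 1
      ? {
          kind: "calls",
          calls: Array.from({ length: 10 }, (_, i) => ({
            tool: "grep" as const,
            arg: `term${i}`,
          })),
        }
      : {
          kind: "finish",
          findings: [{ file: "src/auth/middleware.ts", startLine: 10, endLine: 42 }],
        },
  async (call) => `${call.tool}:${call.arg}`,
);
```

The hard caps are what make the search predictable: the worst case is a known number of inference passes, regardless of how large the codebase is.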
The reward function uses weighted F1 over file and line retrieval, with precision weighted higher than recall. This is intentional — returning too many irrelevant files (context pollution) hurts the parent agent more than missing a borderline-relevant file.
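One standard way to weight precision above recall is the F-beta score, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), with beta below 1. The sketch below scores file-level retrieval only; the value beta = 0.5 and the helper name `fBeta` are illustrative assumptions, since this guide does not state the exact reward.

```typescript
// F-beta over sets of retrieved vs. relevant items (for example, file paths).
// beta < 1 weights precision above recall; beta = 1 is ordinary F1.
function fBeta(retrieved: string[], relevant: string[], beta: number): number {
  const relevantSet = new Set(relevant);
  const uniqueRetrieved = Array.from(new Set(retrieved));
  const hits = uniqueRetrieved.filter((item) => relevantSet.has(item)).length;
  if (hits === 0) return 0;
  const precision = hits / uniqueRetrieved.length;
  const recall = hits / relevantSet.size;
  const b2 = beta * beta;
  return ((1 + b2) * precision * recall) / (b2 * precision + recall);
}

const relevantFiles = ["a.ts", "b.ts", "c.ts", "d.ts"];

// Precise but incomplete: 2 of 2 retrieved are relevant (P = 1, R = 0.5).
const preciseScore = fBeta(["a.ts", "b.ts"], relevantFiles, 0.5);

// Complete but noisy: all 4 relevant files plus 4 irrelevant ones (P = 0.5, R = 1).
const noisyScore = fBeta(
  ["a.ts", "b.ts", "c.ts", "d.ts", "e.ts", "f.ts", "g.ts", "h.ts"],
  relevantFiles,
  0.5,
);
```

With beta = 0.5 the precise result scores about 0.83 and the noisy one about 0.56. Plain F1 would rate the two identically, which is why a precision-weighted reward pushes the searcher away from context pollution.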
Performance
SWE-grep-mini was distilled from the full model and runs at 2,800+ tokens/second on Cerebras hardware. On Cognition's CodeSearch benchmark, SWE-grep matches frontier model retrieval quality while being an order of magnitude faster.
Availability
SWE-grep is available only inside Windsurf, where it ships as the "Fast Context" feature.
Codex Subagents
OpenAI's Codex supports subagent workflows through the Agents SDK. The Codex CLI exposes itself as an MCP server, which means you can orchestrate multiple Codex agents through a coordinator.
The Codex App provides a command center for multi-agent coding — built-in worktrees and cloud environments where agents work in parallel across projects. The typical pattern is a project manager agent that decomposes requirements, then delegates to frontend, backend, testing, and review agents.
Codex's subagent support is still maturing. The UI doesn't clearly indicate when subagents are spawned (users see a generic "Thinking" indicator), and the community has filed feature requests for better subagent monitoring, coordination, and collaboration patterns.
Building Custom Subagents
The most common custom subagents across all tools fall into these categories:
Code Reviewer
Reads changed files and checks for bugs, security issues, performance problems, and style violations. Read-only tools only. Returns a structured report.
Code Reviewer Subagent (Claude Code)
# .claude/agents/code-reviewer.md
---
name: code-reviewer
description: Reviews code changes for bugs, security, and quality
tools: Read, Grep, Glob, Bash(read-only)
model: sonnet
---
Review code changes for:
1. Logic bugs and edge cases
2. Security vulnerabilities (OWASP Top 10)
3. Performance regressions
4. Missing error handling
5. Test coverage gaps
Format: file:line | severity | finding | suggested fix
Test Generator
Reads implementation files, understands the API surface, generates comprehensive test cases including edge cases and error paths.
Documentation Writer
Reads code and existing docs, generates or updates README sections, API docs, and inline comments. Particularly useful as a subagent because documentation tasks generate a lot of context (reading code + existing docs + style guides) that you don't want polluting the main agent's window.
Migration Assistant
Specialized for framework or library migrations. Understands both the old and new API surfaces. Reads old code, maps to new patterns, and applies changes. Works well as a subagent in agent teams where each agent handles a different part of the migration (routes, middleware, models, tests).
Key Design Principles
- Clear description. The description determines when the subagent gets invoked. Be specific: "Reviews Python code for PEP 8 compliance and type safety" beats "Code reviewer."
- Minimal tool access. Give subagents only the tools they need. A reviewer doesn't need write access. A test runner doesn't need grep.
- Structured output. Define the output format in the system prompt. The parent agent needs to parse the result — make it predictable.
- Model selection. Use Haiku for fast, simple tasks (search, lint). Use Sonnet for tasks requiring reasoning (review, planning). Only use Opus for tasks that need the deepest understanding.
WarpGrep: API-First Code Search Subagent
We built WarpGrep at Morph because we wanted SWE-grep-level code search without being locked to a single IDE. WarpGrep is an API you can plug into any agent framework — Claude Code, Cursor, LangChain, CrewAI, AutoGen, or raw API calls.
How It Works
Same core architecture as SWE-grep. 4 turns maximum. 8 parallel tool calls per turn. Tools: grep, read, list_directory, and finish. The agent thinks first, outputs tool calls in parallel, results come back, and it iterates. The final turn calls finish with file paths and line ranges.
To prevent context explosion, tools enforce output limits: grep and list_directory cap at 200 lines, read caps at 800 lines. This forces the agent to be precise about what it reads rather than dumping entire files.
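A sketch of how such caps might be enforced follows. The line limits come from the paragraph above; everything else is an assumption, including the names `capToolOutput` and `OUTPUT_LINE_CAPS` and the choice to keep the head of the output and append a truncation notice.

```typescript
// Line caps taken from the text above; the names here are hypothetical.
const OUTPUT_LINE_CAPS = { grep: 200, list_directory: 200, read: 800 } as const;
type CappedTool = keyof typeof OUTPUT_LINE_CAPS;

function capToolOutput(tool: CappedTool, output: string): string {
  const cap = OUTPUT_LINE_CAPS[tool];
  const lines = output.split("\n");
  if (lines.length <= cap) return output;
  const omitted = lines.length - cap;
  // Keep the head and say how much was cut so the agent can narrow its query.
  return lines
    .slice(0, cap)
    .concat(`[truncated: ${omitted} more lines]`)
    .join("\n");
}

// 250 grep matches get cut to 200 plus a one-line notice.
const bigGrep = Array.from({ length: 250 }, (_, i) => `src/file${i}.ts:1:match`).join("\n");
const cappedGrep = capToolOutput("grep", bigGrep);

// An 800-line read is exactly at the cap and passes through unchanged.
const fullRead = Array.from({ length: 800 }, (_, i) => `line ${i + 1}`).join("\n");
const cappedRead = capToolOutput("read", fullRead);
```

Reporting the number of omitted lines is a design choice in this sketch: it gives the agent a signal that its pattern was too broad.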
Three Ways to Use WarpGrep
1. MCP Server (Claude Code / Cursor)
// .mcp.json or claude_desktop_config.json
{
  "mcpServers": {
    "warpgrep": {
      "command": "npx",
      "args": ["-y", "@anthropic/morph-mcp-server"],
      "env": {
        "MORPH_API_KEY": "your-key"
      }
    }
  }
}
2. TypeScript SDK
import { MorphClient } from '@morphllm/morphsdk';
const morph = new MorphClient({
  apiKey: process.env.MORPH_API_KEY
});
const result = await morph.warpGrep.execute({
  query: 'Find authentication middleware and trace permission checks',
  repoRoot: '.'
});
for (const ctx of result.contexts) {
  console.log(`${ctx.file}:${ctx.startLine}-${ctx.endLine}`);
  console.log(ctx.content);
}
3. Direct API
curl -X POST https://api.getmorph.ai/v1/warpgrep \
  -H "Authorization: Bearer $MORPH_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Find all API routes that handle file uploads",
    "repo_root": "."
  }'
WarpGrep costs $0.40 per 1M tokens. No embeddings required. No index setup. No maintenance. It works on any codebase from the first query: no ingestion step, no stale indexes, no reindexing when code changes.
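For budgeting, the per-token price translates into a simple estimate. The sketch assumes the stated price applies uniformly to every token billed; the workload figures (500 searches a day at 20,000 tokens each) and the helper `estimateCostUsd` are hypothetical.

```typescript
const PRICE_PER_MILLION_TOKENS_USD = 0.4; // from the pricing stated above

function estimateCostUsd(tokens: number): number {
  return (tokens / 1e6) * PRICE_PER_MILLION_TOKENS_USD;
}

// Hypothetical workload: 500 searches a day at 20,000 tokens each (10M tokens).
const dailyTokens = 500 * 20000;
const dailyCostUsd = estimateCostUsd(dailyTokens);
```

Under those assumptions the daily total is 10M tokens, which comes to about $4.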
Subagent Comparison Table
How each tool's subagent implementation compares:
| Feature | Claude Code | Cursor | VS Code Copilot | Windsurf | WarpGrep |
|---|---|---|---|---|---|
| Built-in subagents | Explore, Plan, General | Research, Terminal, Parallel | Configurable per provider | SWE-grep (Fast Context) | Code search (API) |
| Custom subagents | Yes (.claude/agents/) | Yes (SKILL.md) | Yes (agent.json) | No | N/A (API-first) |
| Parallel execution | Yes | Yes | Yes (since Jan 2026) | 8 calls/turn | 8 calls/turn |
| Model selection | Haiku/Sonnet/Opus | Provider default | Multi-provider | Custom RL model | Morph model |
| RL-trained search | No | No | No | Yes | Yes |
| Agent teams | Yes | No | Multi-agent orchestration | No | Via SDK integration |
| API access | No | No | No | No | Yes |
| Tool restriction | Per subagent | Per subagent | Per subagent | Fixed toolset | Fixed toolset |
| Works in any IDE | No (Claude Code only) | No (Cursor only) | No (VS Code only) | No (Windsurf only) | Yes (any framework) |
Subagents vs Skills vs MCP Servers
These three concepts often get confused. Here's how they differ:
Subagents
Isolated workers with their own context window. They run separately from the main conversation. Best for: tasks that generate a lot of intermediate context (search, review, testing) where you only want the final result back.
Skills
Reusable content — prompts, instructions, context — that loads into the current conversation. Skills share the parent's context window. Best for: repeatable workflows like "/commit" or "/review-pr" where you want the instructions available directly to the main agent. Read more in our Claude Code skills guide.
MCP Servers
External tool providers that expose functionality via the Model Context Protocol. MCP servers give agents access to tools (file systems, databases, APIs, browsers) without the agent needing to know the implementation details. Best for: connecting agents to external systems. Read more in our MCP servers guide.
| Property | Subagents | Skills | MCP Servers |
|---|---|---|---|
| Context | Own context window | Shares parent context | Tool results into parent context |
| Purpose | Delegate work | Add knowledge/instructions | Provide tools |
| Runs separately | Yes | No | Yes (external process) |
| Can modify files | If given tool access | Through parent agent | If tools allow it |
| Example | Security reviewer agent | /commit skill | WarpGrep MCP server |
Subagent Best Practices
1. Delegate the Noisy Parts
The best use of subagents is offloading tasks that generate lots of intermediate context you don't need. Code search, test execution, documentation reading — these create pages of output where only a few lines matter. Let a subagent filter to the signal.
2. Keep Subagent Scope Narrow
A subagent that does "everything related to testing" is just another general-purpose agent. Make it specific: "runs the test suite and returns failing test names with error messages" or "generates unit tests for exported functions in a given file."
3. Use the Right Model per Task
Not every subagent needs your most powerful model. Search subagents work well with Haiku — they're doing pattern matching and file reading, not deep reasoning. Save Sonnet and Opus for review and planning subagents that need to understand code semantics.
4. Parallelize When Possible
The biggest performance gain from subagents comes from parallel execution. If your task has independent subtasks (search + lint + test), run them simultaneously. Don't run them sequentially just because that's what a single-agent workflow would do.
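A minimal sketch of the fan-out pattern is shown below, using plain promises in place of real subagents. `Subtask` and `runInParallel` are hypothetical names, and the three stub tasks stand in for search, lint, and test subagents.

```typescript
type Subtask = { name: string; run: () => Promise<string> };

// Starts every subtask at once and collects results keyed by task name.
// Each task catches its own error so one failure does not discard the rest.
async function runInParallel(tasks: Subtask[]): Promise<Record<string, string>> {
  const entries = await Promise.all(
    tasks.map(async (t): Promise<[string, string]> => {
      try {
        return [t.name, await t.run()];
      } catch (err) {
        return [t.name, `failed: ${String(err)}`];
      }
    }),
  );
  const out: Record<string, string> = {};
  for (const [taskName, value] of entries) {
    out[taskName] = value;
  }
  return out;
}

const parallelRun = runInParallel([
  { name: "search", run: async () => "3 relevant files" },
  { name: "lint", run: async () => "0 warnings" },
  {
    name: "test",
    run: async () => {
      throw new Error("2 tests failed");
    },
  },
]);
```

Total wall-clock time is roughly that of the slowest subtask rather than the sum of all three. Catching errors per task matters here: with a bare `Promise.all`, the failing test run would reject the whole batch and the search and lint results would be lost.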
5. Define Structured Output
Subagent results flow back to the parent agent, which needs to parse them. Define a clear output format in the system prompt: JSON, markdown tables, or structured reports. Avoid free-form prose — it wastes parent tokens on formatting.
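As an illustration of why a fixed format pays off, here is a sketch of a parent-side parser for the `file:line | severity | finding | suggested fix` rows used in the code reviewer example earlier. `parseReviewReport` and `ReviewFinding` are hypothetical names, and the sample report rows are made up.

```typescript
interface ReviewFinding {
  file: string;
  line: number;
  severity: string;
  finding: string;
  fix: string;
}

// Parses one `file:line | severity | finding | suggested fix` row per line.
// Rows that do not match are collected separately instead of being dropped.
function parseReviewReport(report: string): {
  findings: ReviewFinding[];
  rejected: string[];
} {
  const findings: ReviewFinding[] = [];
  const rejected: string[] = [];
  for (const raw of report.split("\n")) {
    const row = raw.trim();
    if (row === "") continue;
    const parts = row.split("|").map((p) => p.trim());
    // The first column must end in `:<digits>` to count as a location.
    const loc = parts.length === 4 ? /^(.+):(\d+)$/.exec(parts[0]) : null;
    if (!loc) {
      rejected.push(row);
      continue;
    }
    findings.push({
      file: loc[1],
      line: Number(loc[2]),
      severity: parts[1],
      finding: parts[2],
      fix: parts[3],
    });
  }
  return { findings, rejected };
}

const parsedReport = parseReviewReport(
  [
    "src/db/users.ts:42 | high | SQL built by string concatenation | use a parameterized query",
    "Overall the code looks fine.",
  ].join("\n"),
);
```

The `rejected` list gives the parent a cheap check on whether the subagent followed its output contract; a non-empty list is a cue to tighten the system prompt.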
6. Test Subagent Descriptions
The description field determines when a subagent gets automatically invoked. Test different phrasings to make sure the parent agent routes correctly. A description that's too broad gets invoked unnecessarily; one that's too narrow never fires.
Frequently Asked Questions
What are subagents in AI coding?
Subagents are specialized AI workers that operate inside a parent coding agent. Each subagent gets its own context window, tool access, and system prompt. Instead of one agent doing everything — searching, planning, coding, testing — subagents split the work across isolated workers. Claude Code, Cursor, VS Code Copilot, Windsurf, and Codex all support subagents.
How do Claude Code subagents work?
Claude Code ships three built-in subagents: Explore (read-only codebase search using Haiku), Plan (research-then-plan using Sonnet), and a general-purpose subagent (multi-step tasks with full tool access using Sonnet). Custom subagents are Markdown files with YAML frontmatter placed in .claude/agents/ (project scope) or ~/.claude/agents/ (global scope). Claude automatically discovers and invokes them based on the description field.
What is the difference between subagents and skills?
Skills are reusable content (prompts, instructions, context) that load into the current conversation and share the parent's context window. Subagents are isolated workers that run in their own context window. Skills add knowledge to the agent; subagents delegate work to a separate agent. Use skills for repeatable workflows. Use subagents for tasks that generate lots of intermediate context.
Do Cursor and VS Code support subagents?
Yes. Cursor launched subagents in late 2025 with built-in support for codebase research, terminal commands, and parallel work streams. VS Code Copilot added subagents in the January 2026 release (v1.109), supporting parallel execution, tool constraints, and multi-provider orchestration (Claude + Codex + Copilot in the same session).
What is SWE-grep?
SWE-grep is a retrieval subagent built by Cognition for Windsurf. It uses reinforcement learning to optimize parallel code search — 4 turns max, 8 parallel tool calls per turn. SWE-grep-mini runs at 2,800+ tokens/second on Cerebras. It matches frontier model retrieval quality while being 10-20x faster, but is only available inside Windsurf as the "Fast Context" feature.
Add a Code Search Subagent to Any Framework
WarpGrep gives you SWE-grep-level parallel code search as an API. 4 turns, 8 parallel calls, works with Claude Code, Cursor, LangChain, or raw API calls. No embeddings, no index setup.