In February 2026, every major coding tool shipped multi-agent support in the same two-week window. The tools arrived fast. The practices for using them well did not. This guide covers what actually works: Claude Code Agent Teams in depth, the paradigm shift in how you prompt multi-agent systems, and when orchestration helps versus when it just burns tokens.
Why Orchestration Matters Now
February 2026 was the inflection point. Claude Code launched Agent Teams. Codex CLI added parallel agents through the Agents SDK. Windsurf shipped five parallel Cascade agents via git worktrees. Grok Build went to eight simultaneous agents. Cline CLI 2.0 brought parallel terminal agents to the open-source ecosystem. Devin added multiple parallel sessions in sandboxed environments.
This was not coincidence. The underlying models got good enough to maintain coherent behavior across long autonomous sessions. And git worktrees provided the file isolation that makes parallel editing safe. Those two prerequisites landed at roughly the same time, and every vendor raced to ship.
The result: multi-agent is no longer experimental. It is a core feature of every serious coding tool. But most developers are still using these features the way they used single-agent tools, and that is where things break down.
The February 2026 multi-agent wave
Claude Code Agent Teams, Codex CLI parallel agents, Windsurf Wave 13 (5 agents), Grok Build (8 agents), Cline CLI 2.0, and Devin parallel sessions all shipped within two weeks of each other. Multi-agent coordination went from experimental to table stakes in a single release cycle.
Claude Code Agent Teams Deep Dive
Agent Teams are the most complete multi-agent implementation in any coding tool today. They ship as an experimental feature behind a flag, but the architecture is production-grade. Enable them by setting CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 in your settings.json or environment.
settings.json
{
"env": {
"CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
}
}
Architecture
A team has four components: a team lead (the main Claude Code session that creates and coordinates), teammates (separate Claude Code instances with their own context windows), a shared task list (JSON files on disk that all agents read and write), and a mailbox (messaging system for inter-agent communication).
Each teammate is a full Claude Code session. It loads the same project context as a regular session: CLAUDE.md, MCP servers, and skills. But it does not inherit the lead's conversation history. Whatever context a teammate needs, the lead must provide in the spawn prompt.
| Component | What It Does | Where It Lives |
|---|---|---|
| Team Lead | Creates team, spawns teammates, assigns tasks, synthesizes results | Main Claude Code session |
| Teammates | Independent Claude Code instances, each with own context window | Separate processes (in-process or tmux panes) |
| Task List | Shared work items with status, ownership, and dependency tracking | ~/.claude/tasks/{team-name}/ as JSON files |
| Mailbox | Direct messages and broadcasts between agents | In-memory message delivery system |
Core Tools
Three tools power agent team coordination:
TaskCreate
Creates tasks stored as JSON with unique IDs, descriptions, status (pending/in-progress/complete), ownership, and dependency graphs via blocks/blocked-by relationships. Tasks auto-unblock when dependencies complete.
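The dependency mechanics can be sketched in a few lines of Python. This is a toy model of the blocks/blocked-by behavior described above, not Claude Code's actual implementation; the `Task` class and `complete` function are illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    id: str
    description: str
    status: str = "pending"  # pending / in-progress / complete
    blocked_by: set = field(default_factory=set)

def complete(tasks: dict, task_id: str) -> list:
    """Mark a task complete and auto-unblock its dependents."""
    tasks[task_id].status = "complete"
    unblocked = []
    for t in tasks.values():
        t.blocked_by.discard(task_id)
        if not t.blocked_by and t.status == "pending":
            unblocked.append(t.id)
    return unblocked

# task-003 (tests) blocks on task-002 (implementation)
tasks = {
    "task-002": Task("task-002", "refactor auth middleware"),
    "task-003": Task("task-003", "write integration tests",
                     blocked_by={"task-002"}),
}
print(complete(tasks, "task-002"))  # → ['task-003']
```

The key property is that no agent has to poll: completing a task immediately surfaces which dependent tasks are now claimable.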
SendMessage
Enables direct messages between specific agents and broadcast messages to all teammates. Messages are delivered automatically. The lead does not need to poll for updates.
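Conceptually, the mailbox is push-based delivery into per-agent queues. A minimal in-memory sketch of the idea (not Claude Code's internals; the class and agent names are illustrative):

```python
class Mailbox:
    """Push-based delivery: messages land in each agent's queue
    without the recipient having to poll a shared channel."""
    def __init__(self, agents):
        self.queues = {a: [] for a in agents}

    def send(self, sender, recipient, body):
        """Direct message to one agent."""
        self.queues[recipient].append((sender, body))

    def broadcast(self, sender, body):
        """Message every agent except the sender."""
        for agent in self.queues:
            if agent != sender:
                self.queues[agent].append((sender, body))

    def drain(self, agent):
        """Deliver and clear an agent's pending messages."""
        msgs, self.queues[agent] = self.queues[agent], []
        return msgs

box = Mailbox(["lead", "reviewer", "implementer"])
box.send("reviewer", "implementer", "token refresh looks vulnerable")
box.broadcast("lead", "plan approved")
print(box.drain("implementer"))
# → [('reviewer', 'token refresh looks vulnerable'), ('lead', 'plan approved')]
```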
Worktree Isolation
Each teammate gets its own git worktree. Parallel file edits never collide. When work is done, changes merge back through normal git workflows.
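The isolation step itself is plain git. A lead process could provision one worktree per teammate roughly like this — a hedged sketch using the standard `git worktree add` command; the `agent/<name>` branch-naming scheme and the `provision_worktrees` helper are illustrative, not part of Claude Code:

```python
import subprocess
from pathlib import Path

def provision_worktrees(repo: Path, teammates: list[str]) -> dict[str, Path]:
    """Create an isolated worktree + branch per teammate so
    parallel edits never touch the same working directory."""
    paths = {}
    for name in teammates:
        path = repo.parent / f"{repo.name}-{name}"
        # `git worktree add -b <branch> <path>` checks out a new
        # branch into a separate directory sharing the same repo.
        subprocess.run(
            ["git", "-C", str(repo), "worktree", "add",
             "-b", f"agent/{name}", str(path)],
            check=True,
        )
        paths[name] = path
    return paths

# e.g. provision_worktrees(Path("myrepo"), ["security", "impl", "tests"]);
# later, merge each agent/<name> branch back through normal git review.
```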
Starting a Team
Tell Claude what you need and describe the team structure in natural language. Be prescriptive about roles. If you leave it open-ended, Claude might spawn eight teammates when three would do.
Good team spawn prompt
Create an agent team to refactor the authentication module:
- Security reviewer: audit src/auth/ for vulnerabilities in token
handling, session management, and input validation. Write findings
to REVIEW.md Section 1.
- Implementation agent: refactor the auth middleware based on the
security review. Own all files in src/auth/ and src/middleware/.
- Test agent: write integration tests for the refactored auth flow.
Own all files in tests/auth/. Wait for the implementation to finish
before running tests.
Use Sonnet for each teammate. Require plan approval before
any teammate makes changes.
Display Modes
Two modes are available: in-process (all teammates run inside your terminal; cycle through them with Shift+Down) and split panes (each teammate gets its own pane, which requires tmux or iTerm2). The default auto-detects based on your environment.
Subagents vs. Agent Teams
Both parallelize work but operate differently. The choice comes down to whether your workers need to talk to each other.
| | Subagents | Agent Teams |
|---|---|---|
| Communication | Report results back to parent only | Teammates message each other directly |
| Context | Own window; results return to caller | Own window; fully independent |
| Coordination | Main agent manages all work | Shared task list with self-coordination |
| Best for | Focused tasks where only the result matters | Complex work requiring discussion and collaboration |
| Token cost | Lower: results summarized back | Higher: each teammate is a separate instance |
The Prompting Paradigm Shift
This is the part most developers get wrong. When you move from single-agent to multi-agent, the skill is no longer prompting the model. It is prompting agents on how to communicate with each other. You are writing prompts that tell agents how to coordinate, not how to code.
Single-agent prompting: "Refactor the auth module to use JWT tokens." You tell the model what to do. It does it.
Multi-agent prompting: "Security reviewer writes findings to Section 1 of REVIEW.md. Implementation agent reads Section 1 before starting work. Test agent blocks on implementation completing task-003." You define how agents relate to each other. The coding happens as a side effect of good coordination.
Define Roles Explicitly
Each teammate is a separate Claude instance that costs real money. Don't let Claude over-spawn. Specify exactly who does what. 'Security reviewer audits token handling. Implementation agent refactors middleware. Test agent writes integration tests.'
Assign File Ownership
Two agents editing the same file leads to overwrites. Break work so each agent owns a different set of files. 'Security reviewer writes to REVIEW.md Section 1. Financial analyst writes to Section 2. Competitive analyst writes to Section 3.'
Front-Load Context in Spawn Prompts
Teammates do NOT inherit the lead's conversation history. Whatever context they need must go in the spawn prompt or CLAUDE.md. Include the tech stack, relevant file paths, constraints, and expected output format.
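Because a spawn prompt is the teammate's entire starting context, it can help to assemble it mechanically so nothing gets dropped. A sketch of that idea — the `build_spawn_prompt` helper and its field names are illustrative, not a Claude Code API:

```python
def build_spawn_prompt(role, stack, owned_files, constraints, output):
    """Front-load everything the teammate cannot discover on its own:
    role, tech stack, file ownership, constraints, output format."""
    return "\n".join([
        f"Role: {role}",
        f"Tech stack: {stack}",
        f"You own these files (edit nothing else): {', '.join(owned_files)}",
        "Constraints:",
        *[f"- {c}" for c in constraints],
        f"Expected output: {output}",
    ])

prompt = build_spawn_prompt(
    role="Security reviewer",
    stack="Node 22, Express 5, jose for JWT",  # hypothetical project
    owned_files=["REVIEW.md"],
    constraints=["audit src/auth/ only", "no code changes"],
    output="findings in REVIEW.md Section 1",
)
```

The template forces you to answer the questions a teammate would otherwise have to rediscover from a cold start.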
Use Delegate Mode
Delegate mode prevents the team lead from doing analysis or implementation directly. The lead can only coordinate: create tasks, send messages, review results. This forces proper task decomposition instead of the lead doing everything itself.
The organizational chart, not the instruction manual
Your prompt is no longer "do X." It is "Agent A investigates X. Agent B investigates Y. Agent A sends findings to Agent B before Agent B starts implementation. Agent C reviews both outputs and flags contradictions." You are designing an organization, not writing an instruction.
Context Engineering Replaces Prompt Engineering
With multi-agent systems, the challenge is not getting agents to write code. It is ensuring each agent sees the right information at the right time. A teammate that starts work without knowing about a constraint discovered by another teammate will produce code that has to be thrown away.
The successful pattern: make CLAUDE.md the source of truth for project context that every teammate reads automatically. Put agent-specific context in spawn prompts. Use SendMessage for real-time coordination. Use the shared task list for status and dependency tracking.
Practical Patterns
Pattern 1: Parallel Code Review
A single reviewer gravitates toward one type of issue at a time. Split review criteria into independent domains so security, performance, and test coverage all get thorough attention simultaneously.
Parallel review prompt
Create an agent team to review PR #142. Spawn three reviewers:
- Security reviewer: focus on auth, injection, data exposure
- Performance reviewer: check N+1 queries, memory leaks, bundle size
- Test coverage reviewer: validate edge cases, error paths, mocking
Have them each review independently and report findings.
I want all three perspectives before I merge.
Pattern 2: Competing Hypotheses for Debugging
Sequential debugging suffers from anchoring: once one theory is explored, subsequent investigation is biased toward it. Multiple independent investigators actively trying to disprove each other converge on the actual root cause faster.
Adversarial debugging prompt
Users report the app exits after one message instead
of staying connected. Spawn 5 teammates to investigate
different hypotheses:
- WebSocket timeout theory
- Auth token expiration theory
- Memory pressure / OOM theory
- Race condition in message handler theory
- Client-side reconnection bug theory
Have them talk to each other to try to disprove each
other's theories, like a scientific debate. Update
findings.md with whatever consensus emerges.
Pattern 3: Cross-Layer Feature Implementation
Changes that span frontend, backend, and tests work well with agent teams because each layer can be owned by a different agent with clear boundaries.
Cross-layer implementation prompt
Create an agent team for the new billing dashboard:
- Backend agent: owns src/api/billing/ and src/db/migrations/.
Create the API endpoints and database schema first.
- Frontend agent: owns src/components/billing/ and
src/app/dashboard/billing/. Blocks on backend completing
the API schema task.
- Test agent: owns tests/billing/. Blocks on both backend
and frontend completing their implementation tasks.
Each agent works in its own worktree. 3 teammates total.
Pattern 4: Research and Synthesis
Agent teams excel at multi-angle research where each teammate explores a different dimension of a problem, then findings are synthesized by the lead.
Research team prompt
I'm designing a CLI tool for tracking TODO comments.
Create an agent team to explore from different angles:
- UX researcher: investigate existing tools (todo-tree,
fixme, grep patterns), survey developer forums for
pain points, propose the ideal interface.
- Architect: design the technical approach. Consider
AST parsing vs regex, incremental vs full scan,
language server protocol integration.
- Devil's advocate: challenge both proposals. Find edge
cases, scaling issues, and reasons this tool would fail.
Synthesize a final recommendation after all three report.
Browser Limitation: CLI Required
Claude Code's browser-based version (the web interface at claude.ai/code and Claude in Chrome) only runs the main worker. Agent Teams, with their spawned teammates, shared task lists, and inter-agent messaging, require the CLI.
This catches developers who start with the web interface and try to scale up to multi-agent workflows. If you see the Agent Teams feature referenced in documentation but cannot access it, this is why. You need the terminal-based Claude Code CLI, not the browser version.
The split-pane display mode (where each teammate gets its own terminal pane) additionally requires tmux or iTerm2. The in-process mode works in any terminal. VS Code's integrated terminal, Windows Terminal, and Ghostty do not support split panes.
| Feature | Browser (claude.ai/code) | CLI (terminal) |
|---|---|---|
| Single agent coding | Yes | Yes |
| Agent Teams | No | Yes |
| Subagent spawning | Limited | Full support |
| Git worktree isolation | No | Yes |
| Split pane display | No | Yes (tmux/iTerm2) |
| MCP server access | Limited | Full support |
| Custom hooks | No | Yes |
How Other Tools Approach Orchestration
Claude Code Agent Teams are not the only option. Each major tool has a different take on multi-agent coordination.
Cursor Background Agents
Cloud-based agents that work on tasks while you continue coding in the foreground. Cursor 2.0 added a subagent system for parallel task processing. Agents run in the cloud, not on your machine, which avoids local resource contention but adds latency.
Codex CLI Parallel Agents
Uses the OpenAI Agents SDK and MCP for orchestration. Each agent runs in its own git worktree. The Codex App Server provides a unified protocol across CLI, VS Code, and web surfaces. Deterministic, reviewable workflows with full traces.
Devin Parallel Sessions
Multiple autonomous Devins run in parallel, each in a fully sandboxed cloud environment with its own IDE, browser, and terminal. Fire-and-forget model. Later revisions added multi-agent dispatch where one agent assigns tasks to others.
Google Antigravity Manager View
A control center for orchestrating multiple agents across workspaces. Editor view for hands-on coding, Manager view for parallel agent coordination. Supports multiple models including Gemini 3.1 Pro and Claude Opus 4.6.
Windsurf Wave 13
Five parallel Cascade agents through git worktrees. Side-by-side panes with a dedicated terminal profile for each agent. Arena Mode runs two agents on the same prompt for blind comparison.
Grok Build
Eight parallel agents working simultaneously. The most aggressive parallelism in any shipping product as of March 2026.
| Tool | Max Parallel Agents | Coordination Model | Requires |
|---|---|---|---|
| Claude Code Agent Teams | No hard limit (3-5 recommended) | Shared task list + direct messaging | CLI + experimental flag |
| Cursor Background Agents | Multiple (cloud-based) | Subagent system, cloud handoff | Cursor Pro+ or Ultra |
| Codex CLI | Multiple (via Agents SDK) | MCP + git worktrees | OpenAI API key |
| Devin | Multiple parallel sessions | Autonomous sandboxed environments | Devin subscription |
| Windsurf | 5 Cascade agents | Git worktrees, dedicated panes | Windsurf Pro |
| Grok Build | 8 agents | Parallel task execution | Grok subscription |
When Orchestration Hurts
Multi-agent orchestration is not always better. The overhead is real, and there are clear cases where a single agent outperforms a team.
Sequential Dependencies
If step 2 requires the output of step 1, and step 3 requires step 2, parallelism buys you nothing. The agents just wait for each other. A single agent with a clear plan is faster and cheaper.
Same-File Edits
Worktrees keep parallel edits from overwriting each other directly, but two agents changing the same file still produces merge conflicts when their branches come back together. If the work cannot be decomposed into separate file ownership, use a single agent.
Simple Tasks
A bug fix in one file does not need three agents. Coordination overhead exceeds the benefit. Token costs scale linearly with each teammate.
Tight Budgets
Each teammate has its own context window and consumes tokens independently. A 5-agent team costs roughly 5x the tokens of a single agent. For routine tasks, a single session is more cost-effective.
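The linear scaling is worth making concrete. A back-of-the-envelope estimate, with illustrative token counts and pricing (not real rates):

```python
def team_cost(agents: int, tokens_per_agent: int, usd_per_mtok: float) -> float:
    """Each teammate burns its own context independently,
    so cost grows linearly with team size."""
    return agents * tokens_per_agent * usd_per_mtok / 1_000_000

# Hypothetical session: ~400k tokens per agent at $3 per million tokens.
single = team_cost(1, 400_000, 3.0)
team = team_cost(5, 400_000, 3.0)
print(single, team)  # → 1.2 6.0
```

If the five-agent version does not finish meaningfully faster or better than the single session, the extra spend bought nothing.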
The three-agent rule
Start with 3 teammates for most workflows. Having 5-6 tasks per teammate keeps everyone productive without excessive context switching. If you have 15 independent tasks, 3 teammates is a good starting point. Scale up only when the work genuinely benefits from more parallelism. Three focused teammates often outperform five scattered ones.
Decision Framework
Use this to decide whether your task benefits from orchestration, and if so, which approach fits.
| Your Situation | Best Approach | Why |
|---|---|---|
| Bug fix in one file | Single agent | No coordination overhead, fastest resolution |
| Refactor across 3+ modules | Agent Teams (3 teammates) | Each agent owns a module, works in parallel |
| Quick research question | Subagent | Fast, focused, reports back to main session |
| PR review from multiple angles | Agent Teams (3 reviewers) | Security, performance, tests reviewed simultaneously |
| Debugging unknown root cause | Agent Teams (competing hypotheses) | Multiple investigators prevent anchoring bias |
| High-volume simple edits | Single fast agent (Codex) | Speed matters more than depth |
| Architecture decision | Agent Teams (research team) | Multiple perspectives synthesized by lead |
| Sequential migration steps | Single agent | Dependencies between steps prevent parallel gains |
| Cross-layer feature (API + UI + tests) | Agent Teams (3 teammates) | Clear file ownership boundaries per layer |
Orchestration Patterns by Tool
The successful architecture emerging across tools follows three roles:
Planners
Continuously explore the codebase and create tasks. In Claude Code, this is the team lead. In Codex, the orchestrator agent. The planner decomposes work and manages dependencies.
Workers
Execute assigned tasks without coordinating with each other and push changes when done. In Claude Code, these are teammates claiming tasks from the shared list. Each worker owns specific files.
Judges
Determine whether to continue at each cycle end. In Claude Code, this can be implemented with TeammateIdle and TaskCompleted hooks that enforce quality gates before marking work as done.
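A quality gate like this might be wired up through settings.json hooks. The fragment below is an assumption-laden sketch: it presumes the TeammateIdle and TaskCompleted events mentioned above follow the same command-hook shape as Claude Code's other hooks, and the `check-quality.sh` script is hypothetical:

```json
{
  "hooks": {
    "TaskCompleted": [
      {
        "hooks": [
          { "type": "command", "command": "./scripts/check-quality.sh" }
        ]
      }
    ]
  }
}
```

If the script exits non-zero, the gate fails and the task should not be marked done.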
Frequently Asked Questions
What is AI agent orchestration in coding?
AI agent orchestration is the coordination of multiple AI agents working together on a shared coding task. Instead of one agent handling everything, specialized agents work in parallel on different parts of a codebase, communicating through shared task lists and direct messaging. Claude Code Agent Teams, Codex CLI parallel agents, and Cursor background agents are the leading implementations in 2026.
How do Claude Code Agent Teams work?
Agent Teams consist of a lead session that coordinates work, spawns teammates, and synthesizes results. Teammates work independently in their own context windows and communicate through TaskCreate (shared task lists stored as JSON) and SendMessage (direct inter-agent messaging). Each teammate gets its own git worktree for parallel file editing. Enable them with CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 in settings.json.
What is the difference between subagents and agent teams?
Subagents report results back to the parent and cannot talk to each other. Every insight routes through one bottleneck. Agent teams remove that limitation: teammates share a task list, claim work independently, and communicate directly. Use subagents for quick focused tasks. Use teams when agents need to share findings, challenge assumptions, and coordinate on their own.
Do Claude Code Agent Teams work in the browser?
No. The browser-based Claude Code at claude.ai/code only runs the main worker. Agent Teams require the CLI. This is a common limitation that catches developers who try to run teams from the web interface. The CLI is available via npm install -g @anthropic-ai/claude-code.
When should I NOT use multi-agent orchestration?
Skip orchestration for sequential tasks with step-by-step dependencies, when multiple agents need to edit the same files, when the task is simple enough for one agent, or when token cost matters. Each teammate has its own context window, so costs scale linearly with team size. Three focused teammates often outperform five scattered ones.
How does prompting change with multi-agent systems?
Single-agent prompting tells a model what to do. Multi-agent prompting tells agents how to communicate with each other. You define roles, file ownership boundaries, communication protocols, and escalation paths. Your prompt becomes an organizational chart, not an instruction manual. Teammates do not inherit the lead's conversation history, so all critical context must go in the spawn prompt or CLAUDE.md.
Ship Code Faster with Better Infrastructure
Multi-agent orchestration generates more code changes in parallel, which means more edits to apply reliably. Morph's Fast Apply model merges LLM edits deterministically at 10,500+ tokens per second. The reliability layer your agent pipeline needs.