GitHub Copilot Fleets: Parallel Subagents for Copilot CLI (2026)

GitHub Copilot Fleets dispatches parallel subagents from the /fleet command in Copilot CLI, using a per-session SQLite database for dependency-aware task tracking. We cover the architecture, the /research command, pricing, and how Fleets compares to Codex worktrees and Claude Code Agent Teams.

March 9, 2026 · 1 min read

What Is Copilot Fleets

Copilot CLI version 0.0.382 introduced Fleet mode, turning sequential agent handoffs into concurrent execution. The /fleet command takes your implementation plan, decomposes it into subtasks, and dispatches subagents to work in parallel. Each subagent gets its own context window, isolated from the main agent and from each other.

Evan Boyle, a GitHub engineer, described the core mechanism: a SQLite database per session that the agent uses to model dependency-aware tasks and TODOs. The orchestrator writes tasks to this database, tracks which are blocked on others, and only dispatches subagents for tasks whose dependencies have completed.

Copilot CLI reached general availability on February 25, 2026. Fleet mode is available to all paid Copilot subscribers: Pro, Pro+, Business, and Enterprise tiers.

- 3x speedup on parallelizable tasks
- 5+ concurrent subagents observed
- 6 models supported
- $10/mo starting price (Pro)

How Fleets Works: Architecture

The Orchestrator Pattern

When you run /fleet, the main Copilot agent becomes an orchestrator. It analyzes your prompt, determines which parts can be divided into independent subtasks, assesses dependencies between them, and assigns parallelizable work to subagents. Work that is inherently sequential stays with the orchestrator or runs in sequence.

SQLite Task Tracking

Each fleet session creates a SQLite database that stores the full task graph: what needs to be done, what depends on what, and what is currently in progress. This is what separates Fleets from naive parallelism. If task C depends on tasks A and B, the orchestrator waits for both A and B to finish before dispatching C. The database also stores TODO items that subagents can update as they work.
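The dependency gate can be sketched in a few lines of SQL. The schema below is hypothetical (GitHub has not published the actual tables); it only illustrates the mechanism described above, where a pending task becomes dispatchable once every task it depends on is done:

```python
import sqlite3

# Hypothetical schema: a tasks table plus a dependency edge table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tasks (
    id     TEXT PRIMARY KEY,
    title  TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending'  -- pending | running | done
);
CREATE TABLE deps (
    task_id    TEXT NOT NULL REFERENCES tasks(id),
    depends_on TEXT NOT NULL REFERENCES tasks(id)
);
""")

conn.executemany("INSERT INTO tasks (id, title) VALUES (?, ?)", [
    ("A", "Update auth module"),
    ("B", "Update billing module"),
    ("C", "Integration tests over A and B"),
])
conn.executemany("INSERT INTO deps (task_id, depends_on) VALUES (?, ?)", [
    ("C", "A"), ("C", "B"),
])

def ready_tasks(conn):
    """Pending tasks whose dependencies are all done -- safe to dispatch."""
    rows = conn.execute("""
        SELECT t.id FROM tasks t
        WHERE t.status = 'pending'
          AND NOT EXISTS (
            SELECT 1 FROM deps d
            JOIN tasks dep ON dep.id = d.depends_on
            WHERE d.task_id = t.id AND dep.status != 'done'
          )
        ORDER BY t.id
    """).fetchall()
    return [r[0] for r in rows]

print(ready_tasks(conn))   # A and B are dispatchable; C is blocked
conn.execute("UPDATE tasks SET status = 'done' WHERE id IN ('A', 'B')")
print(ready_tasks(conn))   # C unblocks once both A and B finish
```

The orchestrator's loop amounts to polling a query like `ready_tasks` and dispatching a subagent for each row it returns.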

Context Isolation

Each subagent runs with its own context window, separate from the main agent and other subagents. This prevents context contamination: a subagent working on authentication code does not get distracted by the testing code another subagent is writing. The tradeoff is that subagents cannot directly communicate with each other. All coordination flows through the orchestrator.
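As a rough sketch of that isolation model (the function names and context contents here are invented for illustration, not Copilot internals): each subagent works from a private context seeded by the orchestrator, returns only its result, and the orchestrator alone sees everything:

```python
from concurrent.futures import ThreadPoolExecutor

def run_subagent(task, shared_brief):
    # Private context: the orchestrator's brief plus this subagent's own
    # working notes. Nothing here is visible to sibling subagents.
    context = [shared_brief, f"task: {task}"]
    context.append(f"result for {task}")
    return context[-1]   # only the final result is reported back

def orchestrate(tasks):
    brief = "repo summary prepared by the orchestrator"
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        results = list(pool.map(lambda t: run_subagent(t, brief), tasks))
    # All coordination flows through the orchestrator: it alone collects
    # every subagent's result and decides what to dispatch next.
    return dict(zip(tasks, results))

print(orchestrate(["auth", "billing", "tests"]))
```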

Built-in Agent Types

Copilot CLI ships four specialized agents that the orchestrator can delegate to automatically:

Explore

Fast codebase analysis. Reads files, searches for patterns, and maps project structure without modifying anything.

Task

General-purpose subtask execution. Inherits parent permissions and can run builds, tests, and shell commands.

Plan

Implementation planning with structured output. Breaks high-level goals into step-by-step execution plans.

Code Review

Automated review of changes with high-signal feedback. Runs after subagents complete their work to catch issues.

Custom Agents in Fleet

You can specify custom agents for subagent work using the @CUSTOM-AGENT-NAME syntax. Custom agents are defined as Markdown files with YAML frontmatter and can be scoped to a user (~/.copilot/agents/), a repository (.github/agents/), or an organization. When a subagent uses a custom agent that specifies a particular model, that model applies to the subagent. By default, subagents use a low-cost model to conserve premium requests.
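A custom agent file might look like the sketch below. The exact frontmatter keys (`name`, `description`, `model`) are assumptions based on the format described above; check the official docs for the supported fields:

```markdown
---
name: security-reviewer
description: Reviews diffs for injection and authorization issues
model: claude-haiku-4.5
---

You are a security reviewer. Examine each change for unvalidated input,
missing authorization checks, and secrets committed to the repository.
Report findings with file and line references.
```

Saved as `.github/agents/security-reviewer.md`, it would be referenced in a fleet prompt as `@security-reviewer`, and the `model` it names would apply to any subagent running it.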

Monitoring Fleet Progress

The /tasks command shows all running and completed subtasks in the current session. You can view summaries, kill processes, or remove finished entries. The /usage command tracks token consumption across all subagents in the session.

The /research Command

Separate from /fleet, the /research command activates a specialized agent for deep investigation. It gathers information from your local codebase, GitHub repositories (public or private, if authenticated), and the web. The output is a comprehensive Markdown report with citations, plus a brief summary in the CLI. You can save the report as a GitHub gist.

The response format adapts to the question type: how-to questions get step-by-step guides, conceptual questions get architectural overviews, and technical deep-dives get detailed analysis with code references. Unlike the main session, the research agent uses a fixed model that cannot be changed with /model.

The cross-repository capability is notable. When logged in, the research agent can fetch files from any accessible repository in your organization, search across repositories, and combine that with web results. For large codebase onboarding or architectural decisions, this is faster than manually scanning dozens of files.

Fleets vs Codex Worktrees

Both Fleets and OpenAI Codex run parallel agents, but they use fundamentally different isolation strategies.

| Aspect | Copilot Fleets | Codex Worktrees |
| --- | --- | --- |
| Isolation model | Context window isolation (same terminal) | Git worktree + cloud sandbox (full OS) |
| Execution location | Local terminal (your machine) | Cloud containers (OpenAI servers) |
| Coordination | SQLite task graph + orchestrator | Independent agents, shared review queue |
| Code stays local | Yes | No (uploaded to cloud) |
| Works offline / laptop closed | No (local process) | Yes (cloud execution) |
| Multi-model support | Yes (6 models, switchable) | No (GPT-5.3-Codex only) |
| Cost per task | Premium requests (multiplier varies) | Messages per 5-hour window |
| Platform | macOS, Linux, Windows | macOS only (Apple Silicon) |
| Starting price | $10/mo (Copilot Pro) | $20/mo (ChatGPT Plus) |

Fleets is the better choice when you want fast local parallelism, model flexibility, and cross-platform support. Codex worktrees win when you need full OS-level isolation, async execution that survives closing your laptop, or the specific GPT-5.3-Codex model for Terminal-Bench-heavy workloads.

Fleets vs Claude Code Agent Teams

The coordination model is where these differ most. Fleets uses a centralized orchestrator: the main agent assigns work, subagents report back, and only the orchestrator sees the full picture. Claude Code Agent Teams use a decentralized model where teammates can message each other directly, share findings, and coordinate without routing everything through a lead.

| Aspect | Copilot Fleets | Claude Code Agent Teams |
| --- | --- | --- |
| Coordination | Centralized orchestrator | Decentralized (bidirectional messaging) |
| Agent communication | Subagents report to orchestrator only | Teammates message each other directly |
| User interaction | Through main agent | Can interact with individual teammates |
| Task tracking | SQLite database per session | Shared task list with dependency tracking |
| Context isolation | Per-subagent context window | Per-teammate context window |
| Models available | 6 models (Claude, GPT, Gemini) | Claude models only (Opus, Sonnet, Haiku) |
| SWE-bench Verified | Varies by model selected | 80.8% (Claude Opus 4.6) |
| Custom agent definitions | Markdown + YAML frontmatter | CLAUDE.md + Hooks + Agent SDK |
| Starting price | $10/mo (Copilot Pro) | $20/mo (Claude Pro) |

Fleets is simpler. The orchestrator handles all coordination, so you write one prompt and let it decompose the work. Agent Teams are more flexible. Teammates can challenge each other, surface conflicting approaches, and self-organize. For a well-defined implementation plan with clear subtasks, Fleets is faster to set up. For open-ended exploration where agents need to share context and iterate, Agent Teams are the better fit.

Pricing and Plan Requirements

Fleets is included with any paid Copilot subscription. The cost depends on your plan and which models your subagents use. Each subagent interaction consumes premium requests independently, so a fleet task uses more requests than a single-agent task.

| Plan | Monthly Cost | Premium Requests | Overage Rate |
| --- | --- | --- | --- |
| Pro | $10/month | 300/month | $0.04/request |
| Pro+ | $39/month | 1,500/month | $0.04/request |
| Business | $19/user/month | Pooled | $0.04/request |
| Enterprise | $39/user/month | Pooled | $0.04/request |

Model Multipliers

Not all models cost the same number of premium requests. The multiplier determines how many requests a single LLM interaction consumes:

| Model | Multiplier | Effective Cost on Pro |
| --- | --- | --- |
| Claude Opus 4.6 | 3x | ~100 interactions/month |
| Claude Sonnet 4.6 | 1x | ~300 interactions/month |
| GPT-5.3-Codex | 1x | ~300 interactions/month |
| GPT-5 mini | Included in base | Unlimited (base sub) |
| Claude Haiku 4.5 | 0.33x | ~900 interactions/month |
| Gemini 3 Pro | Varies | Check /model for current rate |

By default, subagents use a low-cost model to minimize request consumption. On the Pro plan at $10/month, a 5-subagent fleet task using the default model might consume 10-20 premium requests total. Switch subagents to Opus 4.6 and that same task uses 30-60 requests. Monitor usage with /usage during fleet sessions.

When to Use Fleets

Good fit: Decomposable implementation plans

Fleet mode works best when your task has naturally independent subtasks: updating three separate modules, running parallel test suites, or refactoring files that do not import from each other. The orchestrator identifies these and dispatches them concurrently.

Good fit: Large codebases with clear boundaries

When modules are well-separated, as in microservice repos, monorepos with package boundaries, or projects with a clear frontend/backend split, subagents can work in parallel without stepping on each other.

Poor fit: Tightly coupled sequential work

If step 2 depends on the output of step 1, and step 3 depends on step 2, Fleets adds orchestration overhead without parallelism. Use single-agent mode or autopilot instead.

Poor fit: Limited premium request budget

Each subagent consumes requests independently. A 5-subagent task uses 5x the requests of a single agent doing the same work sequentially (though in less wall-clock time). On the 300-request Pro plan, heavy fleet usage can exhaust your monthly quota quickly.

The combination of /fleet with autopilot mode (activated with Shift+Tab) is the most autonomous workflow. You accept the plan, select "Accept plan and build on autopilot + /fleet", and the orchestrator handles everything from task decomposition through execution. For background work, prefix commands with & to delegate to cloud agents and free your terminal immediately.

Frequently Asked Questions

What is GitHub Copilot Fleets?

A feature in Copilot CLI that breaks implementation plans into independent subtasks and dispatches parallel subagents to execute them. Each session uses a SQLite database for dependency-aware task tracking. Available with any paid Copilot subscription since the February 25, 2026 GA launch.

How much does it cost?

Included with Copilot Pro ($10/month, 300 premium requests), Pro+ ($39/month, 1,500 requests), Business ($19/user/month), and Enterprise ($39/user/month). Subagents consume additional premium requests, with model multipliers ranging from 0.33x (Haiku 4.5) to 3x (Opus 4.6). Overages cost $0.04 per request.

How is Fleets different from Codex worktrees?

Fleets uses context-window isolation in your local terminal with a SQLite task tracker. Codex worktrees give each agent a full cloud sandbox with Git worktree isolation. Fleets is faster, cheaper, and cross-platform. Codex worktrees run in the cloud and work while your machine is off.

How is Fleets different from Claude Code Agent Teams?

Fleets uses a centralized orchestrator where subagents report back to the main agent. Claude Code Agent Teams use decentralized bidirectional messaging where teammates coordinate directly with each other. Agent Teams are more flexible for open-ended work. Fleets is simpler for well-defined plans.

What is the /research command?

A separate command that activates a research agent to investigate your codebase, GitHub repositories, and the web. It produces a Markdown report with citations that you can save as a GitHub gist. It uses a fixed model independent of your session's model selection.

Related Pages

Add WarpGrep to Your Copilot Fleet

WarpGrep is an agentic code search tool that works as an MCP server. Connect it to Copilot CLI for faster codebase context in your fleet subagents. Fewer hallucinated paths, more accurate results.
