What Are Gemini CLI Subagents
Coding agents hit a wall when they try to do everything in one context window. A single agent searching a 200K-line codebase, reasoning about architecture, writing code, and running tests burns through its context budget fast. By turn 15, it has forgotten what it found on turn 3.
Gemini CLI addresses this with subagents: isolated agent instances that execute specific tasks and return consolidated summaries. The main agent stays lean. The subagents do the heavy lifting in their own context windows.
This is an experimental feature as of March 2026, but the architecture follows a well-documented pattern. Anthropic reported that its multi-agent research system outperformed a single-agent setup by about 90% on an internal evaluation. Cognition measured that coding agents spend 60% of their time on search. Subagents address both problems: they keep the orchestrator's context clean and let specialists focus on what they do well.
Each subagent gets its own system prompt, tool set, conversation history, and context window. From the main agent's perspective, a subagent looks like a tool: call it, it does work, it returns a summary. The entire multi-turn conversation inside the subagent gets compressed into a single entry in the main agent's history.
Google's documentation describes this as a "strategic orchestrator" pattern. The main agent treats its context window as its most precious resource and delegates heavy lifting to specialists.
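The isolation described above can be sketched as a toy model in Python. This is not Gemini CLI's implementation: `Agent`, `run_subagent`, and the three fake steps are hypothetical names used only for illustration. The sketch shows the shape of the pattern, where the subagent accumulates its own history and the main agent records a single summary entry.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Toy agent: a name plus its own private conversation history."""
    name: str
    history: list[str] = field(default_factory=list)


def run_subagent(name: str, task: str, steps: list[str]) -> str:
    """Run a multi-turn task in an isolated history and return only a summary.

    `steps` stands in for the tool calls and model turns a real subagent
    would make; none of them reach the caller's history.
    """
    sub = Agent(name)
    sub.history.append(f"task: {task}")
    for step in steps:
        sub.history.append(f"turn: {step}")
    return f"{name} finished '{task}' in {len(steps)} turns"


main = Agent("main")
main.history.append("user: how does login work?")
summary = run_subagent(
    "codebase-investigator",
    "trace login flow",
    ["glob src/**", "grep 'login'", "read auth/session.py"],
)
# The whole sub-conversation collapses into one entry for the orchestrator.
main.history.append(f"tool_result: {summary}")
print(len(main.history))  # 2
```

However many turns the subagent takes, the main agent's history grows by exactly one entry, which is the property that keeps the orchestrator's context lean.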
How to Configure Subagents
Enable in settings.json
Add the experimental flag to your Gemini CLI settings:
{
  "experimental": {
    "enableAgents": true
  }
}
YOLO Mode Warning
Subagents run in YOLO mode by default, executing tools without user confirmation. A misconfigured subagent with write_file and run_shell_command can modify or delete files without asking. Restrict tools carefully.
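If settings.json already holds other options, the experimental flag from the previous step should be merged in rather than pasted over the file. A minimal Python sketch follows, assuming the file is plain JSON without comments. The helper name `enable_agents` is hypothetical, and the path you pass is whichever settings.json your installation reads.

```python
import json
from pathlib import Path


def enable_agents(settings_path: Path) -> dict:
    """Set experimental.enableAgents to true, preserving all other settings."""
    settings = {}
    if settings_path.exists():
        settings = json.loads(settings_path.read_text(encoding="utf-8"))
    # setdefault keeps any experimental flags that are already present.
    settings.setdefault("experimental", {})["enableAgents"] = True
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

Calling `enable_agents(Path("path/to/settings.json"))` leaves unrelated keys untouched and only adds or flips the one flag.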
Create Agent Files
Custom agents are Markdown files with YAML frontmatter. Place them in .gemini/agents/ at the project level or ~/.gemini/agents/ at the user level.
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Unique identifier (lowercase, hyphens, underscores) |
| description | string | Yes | Short description for routing decisions |
| kind | string | No | "local" (default) or "remote" |
| tools | array | No | List of tool names the agent can access |
| model | string | No | e.g. "gemini-2.5-pro" (defaults to inherit) |
| temperature | number | No | 0.0 to 2.0 |
| max_turns | number | No | Maximum conversation turns (default: 15) |
| timeout_mins | number | No | Max execution time in minutes (default: 5) |
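The constraints in the table can be checked before an agent file is ever loaded. The following Python sketch validates a frontmatter dictionary that has already been parsed. `validate_frontmatter` is a hypothetical helper, not part of Gemini CLI, and allowing digits in `name` is an assumption the table does not state.

```python
import re

# Assumption: digits are accepted in names; the table lists only
# lowercase letters, hyphens, and underscores.
NAME_RE = re.compile(r"^[a-z0-9_-]+$")


def validate_frontmatter(fm: dict) -> list[str]:
    """Return a list of problems found in an agent's frontmatter dict."""
    errors = []
    name = fm.get("name")
    if not isinstance(name, str) or not name.strip():
        errors.append("name: required non-empty string")
    elif not NAME_RE.match(name):
        errors.append("name: use lowercase letters, hyphens, underscores")
    desc = fm.get("description")
    if not isinstance(desc, str) or not desc.strip():
        errors.append("description: required non-empty string")
    if fm.get("kind", "local") not in ("local", "remote"):
        errors.append('kind: must be "local" or "remote"')
    tools = fm.get("tools")
    if tools is not None and not (
        isinstance(tools, list) and all(isinstance(t, str) for t in tools)
    ):
        errors.append("tools: must be a list of tool names")
    temp = fm.get("temperature")
    if temp is not None and not (
        isinstance(temp, (int, float)) and 0.0 <= temp <= 2.0
    ):
        errors.append("temperature: must be between 0.0 and 2.0")
    for key in ("max_turns", "timeout_mins"):
        value = fm.get(key)
        if value is not None and not (isinstance(value, (int, float)) and value > 0):
            errors.append(f"{key}: must be a positive number")
    return errors
```

An empty list means the frontmatter satisfies every rule in the table; each string otherwise names the offending field.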
Example: Security Auditor
---
name: security-auditor
description: >
  Finds security vulnerabilities in code. Use for auth
  flows, input validation, SQL injection, XSS, and
  dependency audits.
kind: local
tools:
  - read_file
  - grep_search
  - glob
model: gemini-2.5-pro
temperature: 0.2
max_turns: 10
---
You are a security auditor. Analyze code for:
- Authentication and authorization flaws
- Input validation gaps
- SQL injection and XSS vectors
- Hardcoded secrets and credentials
- Dependency vulnerabilities
Report findings with file paths, line numbers,
severity (critical/high/medium/low), and
recommended fixes. Be specific.
Built-in Subagents
Gemini CLI ships with several built-in subagents:
Codebase Investigator
Analyzes dependencies and code structure. Traces feature implementations across frontend, API, and database layers.
CLI Help Agent
Answers questions about Gemini CLI commands and configuration. Useful for self-referential queries about the tool itself.
Generalist Agent
Routes tasks to the most appropriate specialist subagent based on the request content.
Browser Agent
Automates web tasks using browser automation. Experimental, requires separate enablement in settings.json.
Subagent Use Cases
When to Delegate
Subagents pay off when the task is context-heavy and independent. The overhead of spinning up an isolated agent is not worth it for a two-turn question.
| Delegate to Subagent | Keep in Main Agent |
|---|---|
| Batch file analysis (3+ files) | Single-file edits |
| Exploratory codebase research | Direct questions (1-2 turns) |
| Multi-step investigations | Simple reads |
| Documentation generation | Surgical code changes |
| Security audits across modules | Quick lookups |
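The table boils down to two questions: is the task independent, and is it context-heavy? A rough Python heuristic is sketched below. The `Task` fields and the `should_delegate` function are illustrative names, and the thresholds (3 or more files, more than 2 turns) are read directly off the table rather than taken from Gemini CLI's routing logic.

```python
from dataclasses import dataclass


@dataclass
class Task:
    files_touched: int   # how many files the task must read or analyze
    expected_turns: int  # rough guess at the turns needed
    independent: bool    # can it run without the main conversation's state?


def should_delegate(task: Task) -> bool:
    """Delegate context-heavy, independent work; keep quick tasks in the main agent."""
    if not task.independent:
        return False
    # "Batch file analysis (3+ files)" and anything beyond a 1-2 turn question.
    return task.files_touched >= 3 or task.expected_turns > 2
```

A five-file audit expected to take eight turns is delegated, while a single-file edit that finishes in two turns stays in the main agent.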
Code Migrations
Define a migration-analyzer subagent with read-only tools that scans the codebase for deprecated patterns. Define a migration-writer that takes the analyzer's output and generates updated code. The main agent orchestrates the sequence: analyze first, then write.
This separation prevents the writer from wasting context on the analysis phase. It also lets you use different models: a cheaper model for scanning, a stronger model for writing.
Large Refactors
Split the work by concern. A type-checker subagent validates TypeScript types. A test-writer generates tests for changed functions. A docs-updater refreshes documentation. Each runs in its own context without polluting the others.
Parallel Research
When you need to understand a feature that spans multiple subsystems, spin up read-only subagents for each layer. One investigates the frontend components. Another traces the API routes. A third maps the database schema. All run in parallel. The main agent synthesizes their findings.
Concurrency Rule
Never run parallel subagents that mutate the same files. Read-only tasks are safe to parallelize. Write operations on shared resources will create race conditions.
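One way to enforce this rule is to plan batches before launching anything. The Python sketch below groups tasks so that no two tasks in a batch write the same file. `SubagentTask` and `plan_batches` are hypothetical names, and the sketch models write-write conflicts only: a reader that depends on a file another task is rewriting would need extra handling.

```python
from dataclasses import dataclass, field


@dataclass
class SubagentTask:
    name: str
    writes: set[str] = field(default_factory=set)  # files the task may modify


def plan_batches(tasks: list[SubagentTask]) -> list[list[str]]:
    """Group tasks into batches whose write sets never overlap.

    Tasks in one batch could run in parallel; batches run one after another.
    Read-only tasks have an empty write set, so they fit into the first batch.
    """
    batches: list[tuple[list[str], set[str]]] = []
    for task in tasks:
        for names, claimed in batches:
            if claimed.isdisjoint(task.writes):
                names.append(task.name)
                claimed |= task.writes  # in-place update of the batch's claimed files
                break
        else:
            # Every existing batch conflicts, so start a new one.
            batches.append(([task.name], set(task.writes)))
    return [names for names, _ in batches]
```

With two read-only investigators and three writers, two of which touch the same file, the conflicting writer is pushed into a second batch and everything else shares the first.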
Gemini CLI vs Claude Code Subagents
Both Gemini CLI and Claude Code use Markdown-based agent definitions with isolated context windows. The implementations differ in maturity, concurrency limits, and context capacity.
| Feature | Gemini CLI | Claude Code |
|---|---|---|
| Status | Experimental | Stable |
| Config location | .gemini/agents/*.md | .claude/agents/*.md |
| Max concurrent agents | Limited (improving) | Up to 10 |
| Context per agent | 1M+ tokens (Gemini models) | ~200K tokens (Claude models) |
| Dependency tracking | No | Yes (shared task lists) |
| Team coordination | No | Yes (agent teams) |
| Tool restriction | Per-agent tools array | Per-agent tools field |
| Model override | Per-agent model field | Per-agent model field |
| YOLO mode | Default (no confirmation) | Requires explicit flag |
| Remote agents | A2A protocol (experimental) | Not supported |
| MCP server support | Yes | Yes |
Claude Code's multi-agent system is more mature. It supports dependency tracking between agents, team coordination with shared task lists, and runs up to 10 concurrent subagents reliably. If you need production-grade multi-agent orchestration today, Claude Code is further along.
Gemini CLI's advantage is context capacity. Each subagent running on Gemini 2.5 Pro gets a 1M+ token context window, roughly 5x what Claude models offer per agent. For tasks that require processing large amounts of code in a single subagent session, like analyzing an entire module or tracing a feature across a large codebase, Gemini subagents can hold more in memory.
The per-agent model override in Gemini CLI is also useful. You can run cheap models for simple scanning tasks and reserve expensive models for complex reasoning, optimizing cost across a multi-agent workflow.
Making Subagents More Effective
Every subagent works within a fixed budget. With the defaults, that is a 5-minute timeout and 15 turns. How it spends that budget determines whether it succeeds or times out with partial results.
The Search Bottleneck
Cognition measured that coding agents spend 60% of their time on search. If that share carries over to turns, a subagent with 15 turns spends about 9 of them looking for code and only 6 reasoning about it. Default tools like grep_search and glob match keywords and file patterns. They miss semantic connections: a search for "authentication" won't find a file that handles auth through middleware patterns without using that word.
WarpGrep is an RL-trained semantic codebase search MCP server. It achieves 0.73 F1 in 3.8 steps on codebase search benchmarks. Connect it to Gemini CLI's MCP configuration and your subagents get search that understands meaning, not string matching. A search subagent with WarpGrep finds relevant code in 2-3 tool calls instead of 6-8 with grep.
The Editing Bottleneck
Subagents that write code hit a second bottleneck: edit speed. A migration subagent processing 50 files needs each edit to be fast and precise. Fast Apply handles code modifications at 10,500 tokens per second with diff-level precision. Instead of rewriting entire files, it applies targeted changes.
The combination cuts both bottlenecks. A subagent that spends 3 turns searching instead of 9 has 12 turns left for reasoning instead of 6, twice the budget, and targeted edits at 10,500 tok/s save further time compared with rewriting whole files. That is the difference between a subagent that times out and one that finishes the job.
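The turn arithmetic in this section can be checked with a short Python calculation. It assumes, as a simplification, that the 60% search share applies to turns rather than wall-clock time; `reasoning_turns` is an illustrative name.

```python
def reasoning_turns(max_turns: int, search_turns: int) -> int:
    """Turns left for reasoning once search has taken its share."""
    return max_turns - search_turns


max_turns = 15                             # default turn limit from the field table
baseline_search = round(max_turns * 0.60)  # 60% search share applied to turns
baseline = reasoning_turns(max_turns, baseline_search)
improved = reasoning_turns(max_turns, 3)   # 3 search turns, as in the text
print(baseline, improved, improved / baseline)  # 6 12 2.0
```

Under these assumptions, cutting search from 9 turns to 3 doubles the turns available for reasoning.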
Limitations and Gotchas
YOLO Mode Is the Default
Subagents execute tools without user confirmation. A misconfigured agent with write_file access can modify or delete code silently. Always restrict the tools array to the minimum needed.
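A quick audit of agent definitions can catch over-broad tool grants before they run unattended. The Python sketch below works on frontmatter that has already been parsed into dictionaries. `audit_agent` is a hypothetical helper, and the risky set contains only the two tools this article names. How Gemini CLI treats a missing tools array is not covered here, so the sketch simply flags it for review.

```python
# The two tools the warnings in this article call out by name.
RISKY_TOOLS = {"write_file", "run_shell_command"}


def audit_agent(agent: dict) -> list[str]:
    """Return warnings for tool grants that deserve a second look."""
    warnings = []
    tools = agent.get("tools")
    if tools is None:
        warnings.append(f"{agent['name']}: no tools array; list tools explicitly")
    else:
        for tool in sorted(RISKY_TOOLS & set(tools)):
            warnings.append(f"{agent['name']}: grants {tool} without confirmation")
    return warnings
```

A read-only agent produces no warnings, while an agent holding both risky tools produces one warning per tool.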
Permission Inheritance Is Broken
Tool permissions granted to the main agent do not flow to subagents. You must explicitly list allowed tools in each subagent's YAML frontmatter. This is still being refined.
Parallel Execution Is Limited
Native parallel subagent execution is still experimental. Read-only tasks can run concurrently, but the implementation has known issues. Third-party tools like Maestro-Gemini work around this by spawning separate CLI processes.
Summary Bloat
Subagents return summaries to the main agent. If a subagent's system prompt doesn't enforce concise output, large summaries can bloat the orchestrator's context. Always instruct subagents to return structured, brief results.
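Prompt instructions are the first line of defense. If you also drive subagents from your own script, a hard cap on the returned text is a cheap second one. The Python sketch below assumes such a wrapper exists; `compact_summary` and its limits are hypothetical and not a Gemini CLI feature.

```python
def compact_summary(findings: list[str], max_items: int = 5, max_chars: int = 120) -> str:
    """Render findings as a short bulleted summary with hard size limits."""
    lines = []
    for finding in findings[:max_items]:
        text = " ".join(finding.split())  # collapse newlines and repeated spaces
        if len(text) > max_chars:
            text = text[: max_chars - 3] + "..."
        lines.append(f"- {text}")
    omitted = len(findings) - max_items
    if omitted > 0:
        lines.append(f"- ({omitted} more findings omitted)")
    return "\n".join(lines)
```

The output never exceeds `max_items` findings plus one overflow line, so the orchestrator's context cost stays bounded regardless of how verbose the subagent was.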
Tuning Tips
- Descriptions are routing: The main agent uses the description field to decide when to delegate. Vague descriptions cause missed delegations. Be specific about expertise and use cases.
- Tune timeouts per agent: A security auditor scanning a large codebase needs timeout_mins: 10 and max_turns: 25. A documentation formatter needs timeout_mins: 2 and max_turns: 5. The defaults (5 min, 15 turns) are rarely optimal.
- Use model overrides: Run cheap models (Gemini Flash) for scanning and analysis. Reserve expensive models (Gemini Pro) for code generation. This can cut costs 3-5x on multi-agent workflows.
- Read-only agents first: Give investigation agents only read_file, grep_search, and glob. Add write tools only to agents that need them.
Frequently Asked Questions
Are Gemini CLI subagents stable for production use?
No. Subagents are experimental as of March 2026. The core functionality works for development workflows, but the API may change, permission inheritance is incomplete, and parallel execution has known bugs. Do not build production pipelines on this feature yet.
How do I enable subagents?
Add {"experimental": {"enableAgents": true}} to your settings.json. Then create Markdown files with YAML frontmatter in .gemini/agents/ (project-level) or ~/.gemini/agents/ (user-level).
Can subagents run in parallel?
Yes, with restrictions. Multiple subagents can run in the same turn if their tasks are independent and read-only. Never run parallel subagents that mutate the same files. Native parallel execution is still being improved; third-party orchestrators like Maestro-Gemini spawn separate CLI processes as a workaround.
How do Gemini CLI subagents compare to Claude Code subagents?
Both use Markdown-based configuration with isolated context windows. Claude Code supports up to 10 concurrent subagents with dependency tracking and team coordination. Gemini CLI subagents benefit from 1M+ token context windows per agent but lack mature parallel execution and cross-agent coordination. Claude Code is more production-ready; Gemini CLI offers more context capacity per agent.
Can I use MCP servers with Gemini CLI subagents?
Yes. MCP servers configured in your Gemini CLI settings are available to subagents. This is how you connect tools like WarpGrep for semantic search. Include the MCP tool names in the subagent's tools array to grant access.