Gemini CLI Subagents: Configuration, Use Cases, and How to Make Them Effective

Gemini CLI subagents run specialized tasks in isolated contexts, keeping the main agent's context window lean. This guide covers configuration, use cases, a comparison with Claude Code subagents, and ways to make subagents faster with better search and editing tools.

March 13, 2026

What Are Gemini CLI Subagents

Coding agents hit a wall when they try to do everything in one context window. A single agent searching a 200K-line codebase, reasoning about architecture, writing code, and running tests burns through its context budget fast. By turn 15, it has forgotten what it found on turn 3.

Gemini CLI addresses this with subagents: isolated agent instances that execute specific tasks and return consolidated summaries. The main agent stays lean. The subagents do the heavy lifting in their own context windows.

This is an experimental feature as of March 2026, but the architecture follows a well-documented pattern. Anthropic reported that its multi-agent research system outperformed a single-agent baseline by about 90% on an internal evaluation. Cognition measured that coding agents spend 60% of their time on search. Subagents address both problems: they keep the orchestrator's context clean and let specialists focus on what they do well.

  • 60%: agent time spent on search (Cognition)
  • 90%: multi-agent improvement (Anthropic)
  • 1M+: token context per Gemini subagent

Each subagent gets its own system prompt, tool set, conversation history, and context window. From the main agent's perspective, a subagent looks like a tool: call it, it does work, it returns a summary. The entire multi-turn conversation inside the subagent gets compressed into a single entry in the main agent's history.
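The tool-like behavior described above can be pictured with a small Python sketch. This is a conceptual model only, not Gemini CLI's implementation: `Agent`, `run_subagent`, `fake_turns`, and the agent name are hypothetical, and the simulated turns stand in for real model and tool exchanges.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Toy stand-in for an agent: a name plus a message history."""
    name: str
    history: list[str] = field(default_factory=list)


def run_subagent(main: Agent, sub_name: str, task: str, fake_turns: list[str]) -> str:
    """Simulate a multi-turn subagent and record one entry in `main`.

    `fake_turns` must be non-empty; its last item is treated as the
    subagent's final summary.
    """
    sub = Agent(name=sub_name)
    sub.history.append(f"task: {task}")
    sub.history.extend(fake_turns)  # all intermediate work stays in the subagent
    summary = fake_turns[-1]
    # Only the summary crosses back into the orchestrator's history.
    main.history.append(f"[{sub_name}] {summary}")
    return summary


main = Agent(name="main")
run_subagent(
    main,
    "codebase-investigator",  # hypothetical agent name
    "find where sessions are created",
    ["grep 'session'", "read auth/session.py", "Sessions are created in auth/session.py"],
)
print(len(main.history))  # 1
```

However many turns the subagent takes, the main history grows by exactly one entry, which is the property that keeps the orchestrator's context lean.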

Google's documentation describes this as a "strategic orchestrator" pattern. The main agent treats its context window as its most precious resource and delegates heavy lifting to specialists.

How to Configure Subagents

Enable in settings.json

Add the experimental flag to your Gemini CLI settings:

{
  "experimental": {
    "enableAgents": true
  }
}
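If you already have a settings file with other keys, you may prefer to merge the flag in rather than paste over the file. The following Python sketch shows one way to do that. It assumes the file is plain JSON without comments; the helper names and the `theme` key in the example are made up for illustration.

```python
import json
from pathlib import Path


def enable_agents(settings: dict) -> dict:
    """Return a copy of `settings` with experimental.enableAgents set to true."""
    updated = dict(settings)
    experimental = dict(updated.get("experimental", {}))
    experimental["enableAgents"] = True
    updated["experimental"] = experimental
    return updated


def enable_agents_in_file(path: Path) -> None:
    """Read a settings file (if present), set the flag, and write it back."""
    current = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(enable_agents(current), indent=2) + "\n")


# Example on an in-memory dict; "theme" is a made-up pre-existing setting.
print(enable_agents({"theme": "dark"}))
```

Copying the nested `experimental` object before editing it means any other experimental flags you have set are preserved and the input dict is left unchanged.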

YOLO Mode Warning

Subagents run in YOLO mode by default, executing tools without user confirmation. A misconfigured subagent with write_file and run_shell_command can modify or delete files without asking. Restrict tools carefully.

Create Agent Files

Custom agents are Markdown files with YAML frontmatter. Place them in .gemini/agents/ at the project level or ~/.gemini/agents/ at the user level.

| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Unique identifier (lowercase, hyphens, underscores) |
| description | string | Yes | Short description for routing decisions |
| kind | string | No | "local" (default) or "remote" |
| tools | array | No | List of tool names the agent can access |
| model | string | No | e.g. "gemini-2.5-pro" (defaults to inherit) |
| temperature | number | No | 0.0 to 2.0 |
| max_turns | number | No | Maximum conversation turns (default: 15) |
| timeout_mins | number | No | Max execution time in minutes (default: 5) |
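The constraints in the field list above can be checked before you save an agent file. The sketch below is a hypothetical validator, not part of Gemini CLI: it works on an already-parsed frontmatter dict, and the exact name pattern (including allowing digits) is an assumption.

```python
import re

ALLOWED_KINDS = ("local", "remote")
NAME_PATTERN = re.compile(r"^[a-z0-9_-]+$")  # assumption: digits are allowed too


def validate_frontmatter(meta: dict) -> list[str]:
    """Return a list of problems found in an agent's frontmatter dict."""
    problems = []
    for key in ("name", "description"):
        if not isinstance(meta.get(key), str) or not meta[key].strip():
            problems.append(f"{key}: required non-empty string")
    name = meta.get("name")
    if isinstance(name, str) and name.strip() and not NAME_PATTERN.match(name):
        problems.append("name: use lowercase letters, hyphens, underscores")
    if meta.get("kind", "local") not in ALLOWED_KINDS:
        problems.append('kind: must be "local" or "remote"')
    tools = meta.get("tools", [])
    if not isinstance(tools, list) or not all(isinstance(t, str) for t in tools):
        problems.append("tools: must be a list of tool names")
    temperature = meta.get("temperature")
    if temperature is not None and not (
        isinstance(temperature, (int, float)) and 0.0 <= temperature <= 2.0
    ):
        problems.append("temperature: must be between 0.0 and 2.0")
    for key in ("max_turns", "timeout_mins"):
        value = meta.get(key)
        if value is not None and (not isinstance(value, (int, float)) or value <= 0):
            problems.append(f"{key}: must be a positive number")
    return problems


print(validate_frontmatter({"name": "Bad Name", "description": "demo"}))
```

An empty list means the dict satisfies the documented constraints; it does not guarantee that Gemini CLI will accept the file.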

Example: Security Auditor

---
name: security-auditor
description: >
  Finds security vulnerabilities in code. Use for auth
  flows, input validation, SQL injection, XSS, and
  dependency audits.
kind: local
tools:
  - read_file
  - grep_search
  - glob
model: gemini-2.5-pro
temperature: 0.2
max_turns: 10
---

You are a security auditor. Analyze code for:
- Authentication and authorization flaws
- Input validation gaps
- SQL injection and XSS vectors
- Hardcoded secrets and credentials
- Dependency vulnerabilities

Report findings with file paths, line numbers,
severity (critical/high/medium/low), and
recommended fixes. Be specific.
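If you manage several agents, generating these files from a script can keep them consistent. The Python sketch below renders a definition like the one above and writes it to an agents directory. The helper names are hypothetical, and naming the file after the agent's `name` is an assumption rather than a documented requirement.

```python
import json
from pathlib import Path


def render_agent_file(meta: dict, prompt: str) -> str:
    """Render frontmatter plus system prompt as agent-file text.

    Strings are emitted with json.dumps, which yields double-quoted
    scalars that are also valid YAML; lists become block sequences.
    """
    lines = ["---"]
    for key, value in meta.items():
        if isinstance(value, list):
            lines.append(f"{key}:")
            lines.extend(f"  - {item}" for item in value)
        elif isinstance(value, str):
            lines.append(f"{key}: {json.dumps(value)}")
        else:
            lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines) + "\n\n" + prompt.strip() + "\n"


def write_agent_file(agents_dir: Path, meta: dict, prompt: str) -> Path:
    """Write the rendered definition to <agents_dir>/<name>.md."""
    agents_dir.mkdir(parents=True, exist_ok=True)
    target = agents_dir / f"{meta['name']}.md"
    target.write_text(render_agent_file(meta, prompt))
    return target


print(render_agent_file(
    {"name": "security-auditor", "tools": ["read_file", "grep_search", "glob"], "max_turns": 10},
    "You are a security auditor.",
))
```

For a project-level agent you would call `write_agent_file(Path(".gemini/agents"), meta, prompt)`.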

Built-in Subagents

Gemini CLI ships with several built-in subagents:

Codebase Investigator

Analyzes dependencies and code structure. Traces feature implementations across frontend, API, and database layers.

CLI Help Agent

Answers questions about Gemini CLI commands and configuration. Useful for self-referential queries about the tool itself.

Generalist Agent

Routes tasks to the most appropriate specialist subagent based on the request content.

Browser Agent

Automates web tasks using browser automation. Experimental, requires separate enablement in settings.json.

Subagent Use Cases

When to Delegate

Subagents pay off when the task is context-heavy and independent. The overhead of spinning up an isolated agent is not worth it for a two-turn question.

| Delegate to Subagent | Keep in Main Agent |
|---|---|
| Batch file analysis (3+ files) | Single-file edits |
| Exploratory codebase research | Direct questions (1-2 turns) |
| Multi-step investigations | Simple reads |
| Documentation generation | Surgical code changes |
| Security audits across modules | Quick lookups |
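The comparison above can be condensed into a rough rule of thumb. The Python sketch below is one hypothetical encoding: the "3+ files" and "1-2 turns" thresholds come from the comparison, while the `Task` fields and the way the rules are combined are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    files_touched: int   # how many files the task must read or change
    expected_turns: int  # rough guess at the turns needed
    is_edit: bool        # True for code changes, False for read-only work


def should_delegate(task: Task) -> bool:
    """Rough heuristic: delegate context-heavy, multi-file work."""
    if task.is_edit and task.files_touched <= 1:
        return False  # surgical single-file edit
    if task.expected_turns <= 2:
        return False  # direct question or quick lookup
    return task.files_touched >= 3  # batch analysis or broad research


print(should_delegate(Task(files_touched=5, expected_turns=8, is_edit=False)))  # True
```

In practice the main agent makes this decision from the agent descriptions, so treat the function as a way to reason about when delegation pays off, not as something Gemini CLI runs.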

Code Migrations

Define a migration-analyzer subagent with read-only tools that scans the codebase for deprecated patterns. Define a migration-writer that takes the analyzer's output and generates updated code. The main agent orchestrates the sequence: analyze first, then write.

This separation prevents the writer from wasting context on the analysis phase. It also lets you use different models: a cheaper model for scanning, a stronger model for writing.
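The analyze-then-write sequence can be sketched in a few lines of Python. This is a conceptual model with stub functions in place of real subagents; `run_migration`, the stubs, and the file and function names in the strings are all hypothetical.

```python
from typing import Callable

# A "subagent" here is just a function from a task string to a summary string.
Subagent = Callable[[str], str]


def run_migration(analyzer: Subagent, writer: Subagent, goal: str) -> str:
    """Run the analyzer first, then hand only its summary to the writer."""
    findings = analyzer(f"List deprecated patterns relevant to: {goal}")
    return writer(f"Goal: {goal}\nFindings:\n{findings}")


# Stub subagents so the sketch runs without any model calls.
def fake_analyzer(task: str) -> str:
    return "- utils/date.py uses old_format()"


def fake_writer(task: str) -> str:
    return f"patched 1 file based on:\n{task}"


print(run_migration(fake_analyzer, fake_writer, "replace old_format()"))
```

The writer only ever sees the analyzer's summary, never the files the analyzer read along the way, which is what keeps its context free for generating code.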

Large Refactors

Split the work by concern. A type-checker subagent validates TypeScript types. A test-writer generates tests for changed functions. A docs-updater refreshes documentation. Each runs in its own context without polluting the others.

Parallel Research

When you need to understand a feature that spans multiple subsystems, spin up read-only subagents for each layer. One investigates the frontend components. Another traces the API routes. A third maps the database schema. All run in parallel. The main agent synthesizes their findings.

Concurrency Rule

Never run parallel subagents that mutate the same files. Read-only tasks are safe to parallelize. Concurrent writes to shared files can race and silently overwrite each other's changes.
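One way to apply this rule is to check planned write sets for overlap before launching anything in parallel. The Python sketch below is a hypothetical pre-flight check; the agent names and file paths are made up, and Gemini CLI does not expose write plans in this form.

```python
from itertools import combinations


def write_conflicts(plans: dict[str, set[str]]) -> list[tuple[str, str, set[str]]]:
    """Return pairs of agents whose planned write sets overlap.

    `plans` maps an agent name to the set of file paths it intends to
    modify; read-only agents map to an empty set.
    """
    conflicts = []
    for a, b in combinations(sorted(plans), 2):
        shared = plans[a] & plans[b]
        if shared:
            conflicts.append((a, b, shared))
    return conflicts


plans = {
    "frontend-reader": set(),  # read-only, always safe
    "api-reader": set(),       # read-only, always safe
    "test-writer": {"src/api.py"},
    "docs-updater": {"src/api.py", "README.md"},
}
print(write_conflicts(plans))
```

Here the two readers can run together freely, while `docs-updater` and `test-writer` both plan to touch `src/api.py` and should be run one after the other.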

Gemini CLI vs Claude Code Subagents

Both Gemini CLI and Claude Code use Markdown-based agent definitions with isolated context windows. The implementations differ in maturity, concurrency limits, and context capacity.

| Feature | Gemini CLI | Claude Code |
|---|---|---|
| Status | Experimental | Stable |
| Config location | .gemini/agents/*.md | .claude/agents/*.md |
| Max concurrent agents | Limited (improving) | Up to 10 |
| Context per agent | 1M+ tokens (Gemini models) | ~200K tokens (Claude models) |
| Dependency tracking | No | Yes (shared task lists) |
| Team coordination | No | Yes (agent teams) |
| Tool restriction | Per-agent tools array | Per-agent allowed_tools |
| Model override | Per-agent model field | Not per-agent |
| YOLO mode | Default (no confirmation) | Requires explicit flag |
| Remote agents | A2A protocol (experimental) | Not supported |
| MCP server support | Yes | Yes |

Claude Code's multi-agent system is more mature. It supports dependency tracking between agents, team coordination with shared task lists, and runs up to 10 concurrent subagents reliably. If you need production-grade multi-agent orchestration today, Claude Code is further along.

Gemini CLI's advantage is context capacity. Each subagent running on Gemini 2.5 Pro gets a 1M+ token context window, roughly 5x what Claude models offer per agent. For tasks that require processing large amounts of code in a single subagent session, like analyzing an entire module or tracing a feature across a large codebase, Gemini subagents can hold more in context at once.

The per-agent model override in Gemini CLI is also useful. You can run cheap models for simple scanning tasks and reserve expensive models for complex reasoning, optimizing cost across a multi-agent workflow.

Making Subagents More Effective

A subagent works under a fixed budget: by default, a 5-minute timeout and 15 turns. How it spends that budget determines whether it finishes the task or times out with partial results.

The Search Bottleneck

Cognition measured that coding agents spend 60% of their time on search. If that share carries over to turns, a subagent with a 15-turn budget spends about 9 turns looking for code and only 6 reasoning about it. Default tools like grep_search and glob rely on keyword and pattern matching, so they miss semantic connections: a search for "authentication" won't find a file that handles auth through middleware patterns without using that word.

WarpGrep is an RL-trained semantic codebase search MCP server. It achieves 0.73 F1 in 3.8 steps on codebase search benchmarks. Connect it to Gemini CLI's MCP configuration and your subagents get search that understands meaning, not string matching. A search subagent with WarpGrep finds relevant code in 2-3 tool calls instead of 6-8 with grep.

The Editing Bottleneck

Subagents that write code hit a second bottleneck: edit speed. A migration subagent processing 50 files needs each edit to be fast and precise. Fast Apply handles code modifications at 10,500 tokens per second with diff-level precision. Instead of rewriting entire files, it applies targeted changes.

WarpGrep for Subagent Search

RL-trained semantic search via MCP. 0.73 F1 in 3.8 steps. Subagents find code by meaning, not keywords. Works with Gemini CLI's MCP server configuration.

Fast Apply for Subagent Edits

10,500 tok/s code editing API. Subagents apply precise, targeted modifications instead of rewriting whole files. Reduces edit time and error rate.

The combination cuts both bottlenecks. A subagent that spends 3 of its 15 turns searching instead of 9 has 12 turns left for reasoning instead of 6, about twice as many. Applying targeted edits at 10,500 tok/s instead of rewriting whole files also saves wall-clock time against the timeout. That can be the difference between a subagent that times out and one that finishes the job.
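The turn arithmetic behind this claim is easy to check. The short Python sketch below uses the figures from this section (a 15-turn budget, 9 search turns with keyword search, 3 with faster search); the function name is made up, and the model ignores everything except search turns.

```python
def reasoning_turns(max_turns: int, search_turns: int) -> int:
    """Turns left for reasoning after search, never below zero."""
    return max(max_turns - search_turns, 0)


baseline = reasoning_turns(max_turns=15, search_turns=9)  # keyword search
improved = reasoning_turns(max_turns=15, search_turns=3)  # faster search
print(baseline, improved, improved / baseline)  # 6 12 2.0
```

With these numbers the reasoning budget doubles. The gain shrinks for agents that were already spending few turns on search.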

Limitations and Gotchas

YOLO Mode Is the Default

Subagents execute tools without user confirmation. A misconfigured agent with write_file access can modify or delete code silently. Always restrict the tools array to the minimum needed.
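A quick audit of each agent's tools list can catch accidental write access. The Python sketch below is a hypothetical check: the two tool names come from this article, and you would extend the set with any other mutating tools in your own setup.

```python
# Tool names taken from this article; extend the set for your own setup.
MUTATING_TOOLS = {"write_file", "run_shell_command"}


def risky_tools(tools: list[str]) -> list[str]:
    """Return the tools in `tools` that can change files or run commands."""
    return sorted(t for t in tools if t in MUTATING_TOOLS)


print(risky_tools(["read_file", "grep_search", "glob"]))  # []
print(risky_tools(["read_file", "write_file", "run_shell_command"]))
```

An empty result for investigation agents is a reasonable invariant to enforce in code review or CI.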

Permission Inheritance Is Broken

Tool permissions granted to the main agent do not flow to subagents. You must explicitly list allowed tools in each subagent's YAML frontmatter. This is still being refined.

Parallel Execution Is Limited

Native parallel subagent execution is still experimental. Read-only tasks can run concurrently, but the implementation has known issues. Third-party tools like Maestro-Gemini work around this by spawning separate CLI processes.

Summary Bloat

Subagents return summaries to the main agent. If a subagent's system prompt doesn't enforce concise output, large summaries can bloat the orchestrator's context. Always instruct subagents to return structured, brief results.
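One way to enforce brevity is to append the same output rules to every subagent's system prompt. The Python sketch below does that with a small helper. The helper name and the exact wording of the rules are assumptions; adjust them to the findings your agents report.

```python
def with_output_contract(system_prompt: str, max_bullets: int = 10) -> str:
    """Append a brief, structured output contract to a system prompt."""
    contract = (
        "\n\nOutput format:\n"
        f"- Return at most {max_bullets} bullet points.\n"
        "- Each bullet: file path, line number, one-sentence finding.\n"
        "- Do not include raw file contents or tool logs.\n"
    )
    return system_prompt.rstrip() + contract


print(with_output_contract("You are a security auditor.", max_bullets=5))
```

The returned text can be used as the body of the agent's Markdown file, below the frontmatter.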

Tuning Tips

  • Descriptions are routing: The main agent uses the description field to decide when to delegate. Vague descriptions cause missed delegations. Be specific about expertise and use cases.
  • Tune timeouts per agent: A security auditor scanning a large codebase may need something like timeout_mins: 10 and max_turns: 25. A documentation formatter may need only timeout_mins: 2 and max_turns: 5. The defaults (5 min, 15 turns) are rarely optimal.
  • Use model overrides: Run cheap models (Gemini Flash) for scanning and analysis. Reserve expensive models (Gemini Pro) for code generation. This can cut costs 3-5x on multi-agent workflows.
  • Read-only agents first: Give investigation agents only read_file, grep_search, and glob. Add write tools only to agents that need them.
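The per-agent budget tip can be kept in one place as a table of overrides merged over the defaults. The Python sketch below shows the idea. The defaults are the ones stated in this article; the agent names and the `resolve_budget` helper are hypothetical, and the resulting values would still need to be written into each agent's frontmatter.

```python
# Defaults as stated in this article.
DEFAULTS = {"timeout_mins": 5, "max_turns": 15}

# Per-agent overrides; names and values mirror the examples in the tips.
OVERRIDES = {
    "security-auditor": {"timeout_mins": 10, "max_turns": 25},
    "docs-formatter": {"timeout_mins": 2, "max_turns": 5},
}


def resolve_budget(agent: str) -> dict:
    """Merge an agent's overrides over the defaults."""
    return {**DEFAULTS, **OVERRIDES.get(agent, {})}


print(resolve_budget("security-auditor"))
print(resolve_budget("some-other-agent"))
```

Agents without an entry fall back to the defaults, so only the outliers need explicit values.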

Frequently Asked Questions

Are Gemini CLI subagents stable for production use?

No. Subagents are experimental as of March 2026. The core functionality works for development workflows, but the API may change, permission inheritance is incomplete, and parallel execution has known bugs. Do not build production pipelines on this feature yet.

How do I enable subagents?

Add {"experimental": {"enableAgents": true}} to your settings.json. Then create Markdown files with YAML frontmatter in .gemini/agents/ (project-level) or ~/.gemini/agents/ (user-level).

Can subagents run in parallel?

Yes, with restrictions. Multiple subagents can run in the same turn if their tasks are independent and read-only. Never run parallel subagents that mutate the same files. Native parallel execution is still being improved; third-party orchestrators like Maestro-Gemini spawn separate CLI processes as a workaround.

How do Gemini CLI subagents compare to Claude Code subagents?

Both use Markdown-based configuration with isolated context windows. Claude Code supports up to 10 concurrent subagents with dependency tracking and team coordination. Gemini CLI subagents benefit from 1M+ token context windows per agent but lack mature parallel execution and cross-agent coordination. Claude Code is more production-ready; Gemini CLI offers more context capacity per agent.

Can I use MCP servers with Gemini CLI subagents?

Yes. MCP servers configured in your Gemini CLI settings are available to subagents. This is how you connect tools like WarpGrep for semantic search. Include the MCP tool names in the subagent's tools array to grant access.

Better Tools for Better Subagents

WarpGrep gives your Gemini CLI subagents semantic codebase search via MCP. Fast Apply gives them 10,500 tok/s code editing. Cut the search and edit bottlenecks so your agents spend their budget on reasoning.