Subagents (experimental)

Subagents are specialized agents that operate within your main Gemini CLI session. They are designed to handle specific, complex tasks—like deep codebase analysis, documentation lookup, or domain-specific reasoning—without cluttering the main agent's context or toolset.

Note: Subagents are currently an experimental feature.

To use custom subagents, you must explicitly enable them in your settings.json:

{
  "experimental": { "enableAgents": true }
}

Warning: Subagents currently operate in "YOLO mode", meaning they may execute tools without individual user confirmation for each step. Proceed with caution when defining agents with powerful tools like run_shell_command or write_file.

What are subagents?

Subagents are "specialists" that the main Gemini agent can hire for a specific job.

  • Focused context: Each subagent has its own system prompt and persona.
  • Specialized tools: Subagents can have a restricted or specialized set of tools.
  • Independent context window: Interactions with a subagent happen in a separate context loop, which saves tokens in your main conversation history.

Subagents are exposed to the main agent as a tool of the same name. When the main agent calls the tool, it delegates the task to the subagent. Once the subagent completes its task, it reports back to the main agent with its findings.
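The delegation flow described above can be pictured with a small toy model. This is only a sketch: the `Subagent` and `MainAgent` classes and the `investigate` function are hypothetical names invented for illustration and do not reflect Gemini CLI's internal implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Subagent:
    """Hypothetical stand-in for a subagent (not Gemini CLI's real class)."""
    name: str
    system_prompt: str
    run: Callable[[str], str]  # takes a task, returns the final findings


@dataclass
class MainAgent:
    """Toy main agent that exposes each subagent as a tool of the same name."""
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    history: List[str] = field(default_factory=list)

    def register(self, agent: Subagent) -> None:
        # The tool name is the subagent's name.
        self.tools[agent.name] = agent.run

    def call_tool(self, name: str, task: str) -> str:
        findings = self.tools[name](task)
        # Only the final report enters the main history, not the
        # subagent's intermediate steps.
        self.history.append(f"{name}: {findings}")
        return findings


def investigate(task: str) -> str:
    # The subagent's working context lives here, separate from the main history.
    private_context = [f"task: {task}", "read file A", "read file B"]
    return f"summary after {len(private_context)} internal steps"
```

Registering `Subagent("codebase_investigator", "...", investigate)` and calling `call_tool("codebase_investigator", ...)` adds a single entry to the main agent's history, however many steps the subagent took internally.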

Built-in subagents

Gemini CLI comes with the following built-in subagents:

Codebase Investigator

  • Name: codebase_investigator
  • Purpose: Analyze the codebase, reverse engineer, and understand complex dependencies.
  • When to use: "How does the authentication system work?", "Map out the dependencies of the AgentRegistry class."
  • Configuration: Enabled by default. You can configure it in settings.json. Example (forcing a specific model):
    {
      "experimental": {
        "codebaseInvestigatorSettings": {
          "enabled": true,
          "maxNumTurns": 20,
          "model": "gemini-2.5-pro"
        }
      }
    }

CLI Help Agent

  • Name: cli_help
  • Purpose: Get expert knowledge about Gemini CLI itself, its commands, configuration, and documentation.
  • When to use: "How do I configure a proxy?", "What does the /rewind command do?"
  • Configuration: Enabled by default.

Generalist Agent

  • Name: generalist_agent
  • Purpose: Route tasks to the appropriate specialized subagent.
  • When to use: Implicitly used by the main agent for routing. Not directly invoked by the user.
  • Configuration: Enabled by default. No specific configuration options.

Browser Agent (experimental)

  • Name: browser_agent
  • Purpose: Automate web browser tasks — navigating websites, filling forms, clicking buttons, and extracting information from web pages — using the accessibility tree.
  • When to use: "Go to example.com and fill out the contact form," "Extract the pricing table from this page," "Click the login button and enter my credentials."

Note: This is a preview feature currently under active development.

Prerequisites

The browser agent requires:

  • Chrome version 144 or later (any recent stable release will work).
  • Node.js with npx available (used to launch the chrome-devtools-mcp server).

Enabling the browser agent

The browser agent is disabled by default. Enable it in your settings.json:

{
  "agents": {
    "overrides": {
      "browser_agent": {
        "enabled": true
      }
    }
  }
}

Session modes

The sessionMode setting controls how Chrome is launched and managed. Set it under agents.browser:

{
  "agents": {
    "overrides": {
      "browser_agent": {
        "enabled": true
      }
    },
    "browser": {
      "sessionMode": "persistent"
    }
  }
}

The available modes are:

| Mode | Description |
| --- | --- |
| persistent | (Default) Launches Chrome with a persistent profile stored at ~/.gemini/cli-browser-profile/. Cookies, history, and settings are preserved between sessions. |
| isolated | Launches Chrome with a temporary profile that is deleted after each session. Use this for clean-state automation. |
| existing | Attaches to an already-running Chrome instance. You must enable remote debugging first by navigating to chrome://inspect/#remote-debugging in Chrome. No new browser process is launched. |
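If you generate settings programmatically, a helper along these lines can build the browser agent fragment and merge it into settings you already have. This is a minimal sketch: `browser_agent_settings` and `merge_settings` are hypothetical helper names, and the code works on plain dictionaries only. It does not cover where settings.json lives or how Gemini CLI itself loads it.

```python
from typing import Any, Dict

# The three modes listed in the table above.
VALID_SESSION_MODES = {"persistent", "isolated", "existing"}


def browser_agent_settings(session_mode: str = "persistent") -> Dict[str, Any]:
    """Build the settings fragment that enables the browser agent."""
    if session_mode not in VALID_SESSION_MODES:
        raise ValueError(f"unknown sessionMode: {session_mode!r}")
    return {
        "agents": {
            "overrides": {"browser_agent": {"enabled": True}},
            "browser": {"sessionMode": session_mode},
        }
    }


def merge_settings(base: Dict[str, Any], extra: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively merge `extra` into a copy of `base` (hypothetical helper)."""
    merged = dict(base)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = value
    return merged
```

Merging recursively keeps unrelated keys, such as an existing `headless` value under `agents.browser`, instead of overwriting the whole `agents` object.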

Configuration reference

All browser-specific settings go under agents.browser in your settings.json.

| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| sessionMode | string | "persistent" | How Chrome is managed: "persistent", "isolated", or "existing". |
| headless | boolean | false | Run Chrome in headless mode (no visible window). |
| profilePath | string | | Custom path to a browser profile directory. |
| visualModel | string | | Model override for the visual agent (for example, "gemini-2.5-computer-use-preview-10-2025"). |

Security

The browser agent enforces the following security restrictions:

  • Blocked URL patterns: file://, javascript:, data:text/html, chrome://extensions, and chrome://settings/passwords are always blocked.
  • Sensitive action confirmation: Actions like form filling, file uploads, and form submissions require user confirmation through the standard policy engine.
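To make the blocked URL patterns concrete, the check can be pictured as a simple prefix match. This is an illustrative sketch only: the documentation lists the patterns but not how they are matched, so the case-insensitive prefix comparison and the `is_blocked_url` name are assumptions.

```python
# Patterns taken from the list above; the matching strategy is an assumption.
BLOCKED_URL_PREFIXES = (
    "file://",
    "javascript:",
    "data:text/html",
    "chrome://extensions",
    "chrome://settings/passwords",
)


def is_blocked_url(url: str) -> bool:
    """Return True if the URL starts with one of the blocked patterns."""
    normalized = url.strip().lower()
    # str.startswith accepts a tuple and checks each prefix in turn.
    return normalized.startswith(BLOCKED_URL_PREFIXES)
```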

Visual agent

By default, the browser agent interacts with pages through the accessibility tree using element uid values. For tasks that require visual identification (for example, "click the yellow button" or "find the red error message"), you can enable the visual agent by setting a visualModel:

{
  "agents": {
    "overrides": {
      "browser_agent": {
        "enabled": true
      }
    },
    "browser": {
      "visualModel": "gemini-2.5-computer-use-preview-10-2025"
    }
  }
}

When enabled, the agent gains access to the analyze_screenshot tool, which captures a screenshot and sends it to the vision model for analysis. The model returns coordinates and element descriptions that the browser agent uses with the click_at tool for precise, coordinate-based interactions.

Note: The visual agent requires API key or Vertex AI authentication. It is not available when using Google Login.

Creating custom subagents

You can create your own subagents to automate specific workflows or enforce specific personas. To use custom subagents, you must enable them in your settings.json:

{
  "experimental": {
    "enableAgents": true
  }
}

Agent definition files

Custom agents are defined as Markdown files (.md) with YAML frontmatter. You can place them in:

  1. Project-level: .gemini/agents/*.md (Shared with your team)
  2. User-level: ~/.gemini/agents/*.md (Personal agents)
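To see which definition files exist in the two locations, a short script like the following can list them. It is a sketch: `find_agent_files` is a hypothetical helper, and it says nothing about which location Gemini CLI prefers when both define an agent with the same name.

```python
from pathlib import Path
from typing import List


def find_agent_files(project_root: Path, home: Path) -> List[Path]:
    """List *.md agent definitions, project-level first, then user-level."""
    found: List[Path] = []
    for base in (project_root / ".gemini" / "agents", home / ".gemini" / "agents"):
        if base.is_dir():
            # Sort so the result is stable across runs.
            found.extend(sorted(base.glob("*.md")))
    return found
```

For example, `find_agent_files(Path.cwd(), Path.home())` lists the definitions for the current project followed by your personal ones.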

File format

The file MUST start with YAML frontmatter enclosed in triple-dashes ---. The body of the markdown file becomes the agent's System Prompt.

Example: .gemini/agents/security-auditor.md

---
name: security-auditor
description: Specialized in finding security vulnerabilities in code.
kind: local
tools:
  - read_file
  - grep_search
model: gemini-2.5-pro
temperature: 0.2
max_turns: 10
---

You are a ruthless Security Auditor. Your job is to analyze code for potential
vulnerabilities.

Focus on:

1.  SQL Injection
2.  XSS (Cross-Site Scripting)
3.  Hardcoded credentials
4.  Unsafe file operations

When you find a vulnerability, explain it clearly and suggest a fix. Do not fix
it yourself; just report it.
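The split between frontmatter and system prompt can be illustrated with a few lines of standard-library Python. This is a sketch of the file layout, not Gemini CLI's parser: `split_frontmatter` is a hypothetical name, and the YAML itself is returned as raw text rather than parsed.

```python
from typing import Tuple


def split_frontmatter(text: str) -> Tuple[str, str]:
    """Return (frontmatter, body) for a file that starts with '---'."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("agent file must start with '---'")
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            frontmatter = "\n".join(lines[1:index])
            # Everything after the closing '---' is the system prompt.
            body = "\n".join(lines[index + 1:]).strip()
            return frontmatter, body
    raise ValueError("frontmatter is missing its closing '---'")
```

Run against the example above, the first element holds the lines from `name:` through `max_turns:`, and the second holds the prompt text starting at "You are a ruthless Security Auditor."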

Configuration schema

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Unique identifier (slug) used as the tool name for the agent. Only lowercase letters, numbers, hyphens, and underscores. |
| description | string | Yes | Short description of what the agent does. This is visible to the main agent to help it decide when to call this subagent. |
| kind | string | No | local (default) or remote. |
| tools | array | No | List of tool names this agent can use. If omitted, it may have access to a default set. |
| model | string | No | Specific model to use (e.g., gemini-2.5-pro). Defaults to inherit (uses the main session model). |
| temperature | number | No | Model temperature (0.0 - 2.0). |
| max_turns | number | No | Maximum number of conversation turns allowed for this agent before it must return. Defaults to 15. |
| timeout_mins | number | No | Maximum execution time in minutes. Defaults to 5. |
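The constraints in the schema can be checked before you commit an agent file. The following is a sketch under assumptions: `validate_agent_config` and `with_defaults` are hypothetical helpers that mirror the table above, and Gemini CLI's own validation may be stricter or report errors differently.

```python
import re
from typing import Any, Dict, List

# Defaults as documented in the schema table.
DEFAULTS = {"kind": "local", "model": "inherit", "max_turns": 15, "timeout_mins": 5}


def validate_agent_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the config looks valid."""
    errors: List[str] = []
    name = config.get("name")
    if not isinstance(name, str) or not re.fullmatch(r"[a-z0-9_-]+", name):
        errors.append("name: required; lowercase letters, numbers, hyphens, underscores")
    description = config.get("description")
    if not isinstance(description, str) or not description.strip():
        errors.append("description: required")
    if config.get("kind", "local") not in ("local", "remote"):
        errors.append("kind: must be 'local' or 'remote'")
    temperature = config.get("temperature")
    if temperature is not None:
        if not isinstance(temperature, (int, float)) or not 0.0 <= temperature <= 2.0:
            errors.append("temperature: must be a number between 0.0 and 2.0")
    return errors


def with_defaults(config: Dict[str, Any]) -> Dict[str, Any]:
    """Fill in documented defaults for optional fields that are missing."""
    return {**DEFAULTS, **config}
```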

Optimizing your subagent

The main agent's system prompt encourages it to use an expert subagent when one is available. It decides whether an agent is a relevant expert based on the agent's description. You can improve the reliability with which an agent is used by updating the description to more clearly indicate:

  • Its area of expertise.
  • When it should be used.
  • Some example scenarios.

For example, a subagent with the following description should be called fairly consistently for Git operations.

Git expert agent which should be used for all local and remote git operations. For example:

  • Making commits
  • Searching for regressions with bisect
  • Interacting with source control and issues providers such as GitHub.

To tune your subagent further, select the model you want to optimize for with /model, then give it a specific prompt along with your subagent's description and ask why it would not call the subagent for that prompt.

Remote subagents (Agent2Agent) (experimental)

Gemini CLI can also delegate tasks to remote subagents using the Agent-to-Agent (A2A) protocol.

Note: Remote subagents are currently an experimental feature.

See the Remote Subagents documentation for detailed configuration and usage instructions.

Extension subagents

Extensions can bundle and distribute subagents. See the Extensions documentation for details on how to package agents within an extension.