docs/src/oss/deepagents/harness.mdx at 3eadc5d2fa9329ef61c09ef0c8a24aae76299523 · langchain-ai/docs

title	Harness capabilities
sidebarTitle	Overview

An agent harness is a combination of several different capabilities that make building long-running agents easier:

Planning capabilities
Virtual filesystem
Filesystem permissions
Task delegation (subagents)
Context and token management
Code execution
Human-in-the-loop

Alongside these capabilities, Deep Agents use Skills and Memory for additional context and instructions.

Planning capabilities

The harness provides a write_todos tool that agents can use to maintain a structured task list.

Features:

Track multiple tasks with statuses ('pending', 'in_progress', 'completed')
Persisted in agent state
Helps agent organize complex multi-step work
Useful for long-running tasks and planning

Virtual filesystem access

The harness provides a configurable virtual filesystem which can be backed by different pluggable backends. The backends support the following file system operations:

Tool	Description
`ls`	List files in a directory with metadata (size, modified time)
`read_file`	Read file contents with line numbers, supports offset/limit for large files. Also supports returning multimodal content blocks for non-text files (images, video, audio, and documents). See supported extensions below.
`write_file`	Create new files
`edit_file`	Perform exact string replacements in files (with global replace mode)
`glob`	Find files matching patterns (e.g., `*/.py`)
`grep`	Search file contents with multiple output modes (files only, content with context, or counts)
`execute`	Run shell commands in the environment (available with sandbox backends only)

Type	Extensions
Image	`.png`, `.jpg`, `.jpeg`, `.gif`, `.webp`, `.heic`, `.heif`
Video	`.mp4`, `.mpeg`, `.mov`, `.avi`, `.flv`, `.mpg`, `.webm`, `.wmv`, `.3gpp`
Audio	`.wav`, `.mp3`, `.aiff`, `.aac`, `.ogg`, `.flac`
File	`.pdf`, `.ppt`, `.pptx`

The virtual filesystem is used by several other harness capabilities such as skills, memory, code execution, and context management. You can also use the file system when building custom tools and middleware for Deep Agents.

For more information, see backends.

Filesystem permissions

The harness supports declarative permission rules that control which files and directories the agent can read or write. Permissions apply to the built-in filesystem tools listed above and are evaluated in declaration order with first-match-wins semantics.

How it works:

Pass a list of rules to permissions= when creating the agent
Each rule specifies operations ("read", "write"), paths (glob patterns), and mode ("allow" or "deny")
The first matching rule wins. If no rule matches, the operation is allowed.

Why it's useful:

Restrict agents to specific directories (e.g., /workspace/)
Protect sensitive files (e.g., .env, credentials)
Give subagents narrower access than the parent agent

Permissions do not apply to sandbox backends, which support arbitrary command execution via the execute tool. For custom validation logic, use backend policy hooks.

For the full rule structure, examples, and subagent inheritance, see Permissions.

Task delegation (subagents)

The harness allows the main agent to create ephemeral "subagents" for isolated multi-step tasks.

Why it's useful:

Context isolation - Subagent's work doesn't clutter main agent's context
Parallel execution - Multiple subagents can run concurrently
Specialization - Subagents can have different tools/configurations
Token efficiency - Large subtask context is compressed into a single result

How it works:

Main agent has a task tool
When invoked, it creates a fresh agent instance with its own context
Subagent executes autonomously until completion
Returns a single final report to the main agent
Subagents are stateless (can't send multiple messages back)

Default subagent:

"general-purpose" subagent automatically available
Has filesystem tools by default
Can be customized with additional tools/middleware

Custom subagents:

Define specialized subagents with specific tools
Example: code-reviewer, web-researcher, test-runner
Configure via subagents parameter

Context management

The harness manages context so deep agents can handle long-running tasks within token limits while retaining the information they need.

How it works:

Input context — System prompt, memory, skills, and tool prompts shape what the agent knows at startup
Compression — Built-in offloading and summarization keep context within window limits as tasks progress
Isolation — Subagents quarantine heavy work and return only results (see Task delegation)
Long-term memory — Persistent storage across threads via the virtual filesystem

Why it's useful:

Enables multi-step tasks that exceed a single context window
Keeps the most relevant information in scope without manual trimming
Reduces token usage through automatic summarization and offloading

For configuration details, see Context engineering.

Code execution

When you use a sandbox backend, the harness exposes an execute tool that lets the agent run shell commands in an isolated environment. This enables the agent to install dependencies, run scripts, and execute code as part of its task.

How it works:

Sandbox backends implement the SandboxBackendProtocolV2 — when detected, the harness adds the execute tool to the agent's available tools
Without a sandbox backend, the agent only has filesystem tools (read_file, write_file, etc.) and cannot run commands
The execute tool returns combined stdout/stderr, exit code, and truncates large outputs (saving to a file for the agent to read incrementally)

Why it's useful:

Security — Code runs in isolation, protecting your host system from the agent's operations
Clean environments — Use specific dependencies or OS configurations without local setup
Reproducibility — Consistent execution environments across teams

For setup, providers, and file transfer APIs, see Sandboxes.

Human-in-the-loop

The harness can pause agent execution at specified tool calls to allow human approval or modification. This feature is opt-in via the interrupt_on parameter.

Configuration:

Pass interrupt_on to create_deep_agent with a mapping of tool names to interrupt configurations
Example: interrupt_on={"edit_file": True} pauses before every edit
You can provide approval messages or modify tool inputs when prompted

Why it's useful:

Safety gates for destructive operations
User verification before expensive API calls
Interactive debugging and guidance

Skills

The harness supports skills that provide specialized workflows and domain knowledge to your deep agent.

How it works:

Skills follow the Agent Skills standard
Each skill is a directory containing a SKILL.md file with instructions and metadata
Skills can include additional scripts, reference docs, templates, and other resources
Skills use progressive disclosure—they are only loaded when the agent determines they're useful for the current task
Agent reads frontmatter from each SKILL.md file at startup, then reviews full skill content when needed

Why it's useful:

Reduces token usage by only loading relevant skills when needed
Bundles capabilities together into larger actions with additional context
Provides specialized expertise without cluttering the system prompt
Enables modular, reusable agent capabilities

For more information, see Skills.

Memory

The harness supports persistent memory files that provide extra context to your deep agent across conversations. These files often contain general coding style, preferences, conventions, and guidelines that help the agent understand how to work with your codebase and follow your preferences.

How it works:

Uses AGENTS.md files to provide persistent context
Memory files are always loaded (unlike skills, which use progressive disclosure)
Pass one or more file paths to the memory parameter when creating your agent
Files are stored in the agent's backend (StateBackend, StoreBackend, or FilesystemBackend)
The agent can update memory based on your interactions, feedback, and identified patterns

Why it's useful:

Provides persistent context that doesn't need to be re-specified each conversation
Useful for storing user preferences, project guidelines, or domain knowledge
Always available to the agent, ensuring consistent behavior

For configuration details and examples, see Memory.

Harness profiles

`HarnessProfile` is a public beta API and may be updated in future releases.

Harness profiles let you shape how the harness behaves once a model is selected so the agent is guided toward the optimal behavior for your application.

Use harness profiles to augment the runtime experience of an agent, for example by:

Appending to the base deepagents system prompt (system_prompt_suffix), or replacing it outright (base_system_prompt)
Overriding individual tool descriptions (tool_description_overrides)
Excluding specific harness-level tools (excluded_tools)
Excluding specific middleware classes entirely (excluded_middleware)
Adding extra middleware for specific models or providers (extra_middleware)
Disabling, renaming, or re-prompting the general-purpose subagent (general_purpose_subagent)

Register a profile under a provider name like "openai" for provider-wide defaults, or under a fully qualified provider:model key like "openai:gpt-5.4" for per-model overrides. Registrations are additive: re-registering under an existing key merges on top of the prior registration (unioning excluded_tools and excluded_middleware, merging middleware by type, merging general_purpose_subagent field-wise, and preserving any fields the new registration leaves unset).

from deepagents import (
    GeneralPurposeSubagentProfile,
    HarnessProfile,
    register_harness_profile,
)
from deepagents.middleware.summarization import SummarizationMiddleware

# Illustrative example showing some of the capabilities of harness profiles:
# Applied by `create_deep_agent` when the selected model resolves to
# `openai:gpt-5.4`. Appends a system-prompt suffix, hides the `execute` tool,
# drops conversation summarization, and skips the auto-added `general-purpose`
# subagent (which also drops the `task` tool when no other subagents are
# configured).
register_harness_profile(
    "openai:gpt-5.4",
    HarnessProfile(
        system_prompt_suffix="Respond in under 100 words.",
        excluded_tools={"execute"},
        excluded_middleware=frozenset({SummarizationMiddleware}),
        general_purpose_subagent=GeneralPurposeSubagentProfile(enabled=False),
    ),
)

`excluded_middleware` cannot remove scaffolding that deep agents rely on. Listing `FilesystemMiddleware`, `SubAgentMiddleware`, or the internal permission middleware raises `ValueError` when `create_deep_agent` resolves the profile.

When you pass a preconfigured chat model instance (@[BaseChatModel] subclass) instead of a provider:model string, the harness synthesizes the canonical provider:identifier key from the instance and looks it up in this order: provider:identifier -> identifier-only (only when the identifier already contains :) -> provider-only fallback.

Harness profiles are complementary to Provider profiles: provider profiles shape how the model is built, while harness profiles shape how the harness works once that model is in use.

Distributable harness profiles can register themselves via `importlib.metadata` entry points instead of requiring callers to run `register_harness_profile` by hand. Declare an entry point in the distribution's own `pyproject.toml` under the `deepagents.harness_profiles` group:

```toml
[project.entry-points."deepagents.harness_profiles"]
gemini = "my_pkg.profiles:register"
```

The target resolves to a zero-arg callable that performs the registrations when `deepagents.profiles` is imported:

```python
from deepagents import HarnessProfile, register_harness_profile

def register() -> None:
    register_harness_profile(
        "google_genai",
        HarnessProfile(system_prompt_suffix="Batch independent tool calls in parallel."),
    )
```

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Planning capabilities

Virtual filesystem access

Filesystem permissions

Task delegation (subagents)

Context management

Code execution

Human-in-the-loop

Skills

Memory

Harness profiles

FilesExpand file tree

harness.mdx

Latest commit

History

harness.mdx

File metadata and controls

Planning capabilities

Virtual filesystem access

Filesystem permissions

Task delegation (subagents)

Context management

Code execution

Human-in-the-loop

Skills

Memory

Harness profiles