Global development guidelines for the Deep Agents monorepo

This document provides context to understand the Deep Agents Python project and assist with development.

Project architecture and context

Monorepo structure

This is a Python monorepo with multiple independently versioned packages that use uv.

deepagents/
├── libs/
│   ├── deepagents/  # SDK
│   ├── cli/         # CLI tool
│   ├── acp/         # Agent Client Protocol support
│   ├── harbor/      # Evaluation/benchmark framework
│   └── partners/    # Integration packages
│       ├── daytona/
│       └── ...
├── .github/         # CI/CD workflows and templates
└── README.md        # Information about Deep Agents

Development tools & commands

uv – Fast Python package installer and resolver (replaces pip/poetry)
make – Task runner for common development commands. Feel free to look at the Makefile for available commands and usage patterns.
ruff – Fast Python linter and formatter
ty – Static type checking
Do NOT use Sphinx-style double backtick formatting (``code``). Use single backticks (`code`) for inline code references in docstrings and comments.

Suppressing ruff lint rules

Prefer inline # noqa: RULE over [tool.ruff.lint.per-file-ignores] for individual exceptions. A per-file-ignores entry silences a rule for the entire file: if you add it for one violation, all future violations of that rule in the same file are silently ignored. Inline # noqa is precise to the line, self-documenting, and keeps the safety net intact for the rest of the file.

Reserve per-file-ignores for categorical policy that applies to a whole class of files (e.g., "tests/**" = ["D1", "S101"] — tests don't need docstrings, assert is expected). These are not exceptions; they are different rules for a different context.

# GOOD – categorical policy in pyproject.toml
[tool.ruff.lint.per-file-ignores]
"tests/**" = ["D1", "S101"]

# BAD – single-line exception buried in pyproject.toml
"deepagents_cli/agent.py" = ["PLR2004"]

# GOOD – precise, self-documenting inline suppression
timeout = 30  # noqa: PLR2004  # default HTTP timeout, not arbitrary

pytest – Testing framework

This monorepo uses uv for dependency management. Local development uses editable installs, configured under [tool.uv.sources].

Each package in libs/ has its own pyproject.toml and uv.lock.

# Run unit tests (no network)
make test

# Run specific test file
uv run --group test pytest tests/unit_tests/test_specific.py

# Lint code
make lint

# Format code
make format

Key config files

pyproject.toml: Main workspace configuration with dependency groups
uv.lock: Locked dependencies for reproducible builds
Makefile: Development tasks

Commit standards

Suggest PR titles that follow Conventional Commits format. Refer to .github/workflows/pr_lint for allowed types and scopes. Note that all commit/PR titles should be in lowercase with the exception of proper nouns/named entities. All PR titles should include a scope with no exceptions. For example:

feat(sdk): add new chat completion feature
fix(cli): resolve type hinting issue
chore(harbor): update infrastructure dependencies
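
The title format above can be checked mechanically. The following is a minimal sketch, not the repository's actual lint: the allowed types and scopes are hypothetical placeholders (the real lists live in the PR lint workflow), and the lowercase rule is not enforced because proper nouns are exempt.

```python
import re

# Hypothetical allowlists for illustration only; the real ones are defined
# in the PR lint workflow.
ALLOWED_TYPES = {"feat", "fix", "chore"}
ALLOWED_SCOPES = {"sdk", "cli", "harbor"}

# Matches `type(scope): subject`; the scope is mandatory.
TITLE_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)\((?P<scope>[a-z0-9-]+)\): (?P<subject>\S.*)$"
)


def is_valid_title(title: str) -> bool:
    """Check that a PR title has the form `type(scope): subject`.

    Args:
        title: The PR or commit title to check.

    Returns:
        `True` if the title has an allowed type and scope, `False` otherwise.
    """
    match = TITLE_PATTERN.match(title)
    if match is None:
        return False
    return match["type"] in ALLOWED_TYPES and match["scope"] in ALLOWED_SCOPES
```

A title without a scope, such as `feat: add thing`, fails the pattern and is rejected.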

Pull request guidelines

Always add a disclaimer to the PR description mentioning how AI agents are involved with the contribution.
Describe the "why" of the changes and why the proposed solution is the right one. Limit prose.
Highlight areas of the proposed changes that require careful review.

Core development principles

Maintain stable public interfaces

CRITICAL: Always attempt to preserve function signatures, argument positions, and names for exported/public methods. Do not make breaking changes.

You should warn the developer about any function signature changes, regardless of whether they look breaking or not.

Before making ANY changes to public APIs:

Check if the function/class is exported in __init__.py
Look for existing usage patterns in tests and examples
Use keyword-only arguments for new parameters: *, new_param: str = "default"
Mark experimental features clearly with docstring warnings (using MkDocs Material admonitions, like !!! warning)

Ask: "Would this change break someone's code if they used it last week?"
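
As an illustration of the keyword-only rule, here is a minimal sketch of extending a public function without breaking existing callers. The function `render_greeting` and its `punctuation` parameter are hypothetical and exist only for this example.

```python
def render_greeting(name: str, *, punctuation: str = "!") -> str:
    """Build a greeting for a user.

    Args:
        name: Name to greet.
        punctuation: Trailing punctuation, added later as a keyword-only option.

    Returns:
        The greeting text.
    """
    return f"Hello, {name}{punctuation}"


# Call sites written before `punctuation` existed keep working unchanged.
legacy = render_greeting("Ada")

# New callers opt in explicitly by keyword.
updated = render_greeting("Ada", punctuation=".")
```

Because the new parameter sits after `*`, it cannot be passed positionally, so later additions never shift the meaning of existing positional arguments.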

Code quality standards

All Python code MUST include type hints and return types.

def filter_unknown_users(users: list[str], known_users: set[str]) -> list[str]:
    """Single line description of the function.

    Any additional context about the function can go here.

    Args:
        users: List of user identifiers to filter.
        known_users: Set of known/valid user identifiers.

    Returns:
        List of users that are not in the `known_users` set.
    """
    return [user for user in users if user not in known_users]

Use descriptive, self-explanatory variable names
Follow existing patterns in the codebase you're modifying
Attempt to break up complex functions (>20 lines) into smaller, focused functions where it makes sense
Avoid using the Any type
Prefer single-word variable names where possible
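
One common way to avoid `Any` is a type variable, which lets the type checker track the relationship between inputs and outputs. The sketch below is illustrative only; `first_or_default` is a hypothetical helper, not part of the codebase.

```python
from collections.abc import Iterable
from typing import TypeVar

T = TypeVar("T")


def first_or_default(items: Iterable[T], default: T) -> T:
    """Return the first item, or a fallback when there is none.

    Args:
        items: Values to take the first element from.
        default: Value returned when `items` is empty.

    Returns:
        The first element of `items`, or `default`.
    """
    for item in items:
        return item
    return default
```

With `Any` in the signature the checker would accept any misuse of the result; with `T` it knows that a list of `int` yields an `int`.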

Testing requirements

Every new feature or bugfix MUST be covered by unit tests.

Unit tests: tests/unit_tests/ (no network calls allowed)
Integration tests: tests/integration_tests/ (network calls permitted)
We use pytest as the testing framework; if in doubt, check other existing tests for examples.
Do NOT add @pytest.mark.asyncio to async tests — every package sets asyncio_mode = "auto" in pyproject.toml, so pytest-asyncio discovers them automatically.
The testing file structure should mirror the source code structure.
Avoid mocks as much as possible
Test actual implementation, do not duplicate logic into tests

Ensure the following:

The test suite fails if your new logic is broken
Edge cases and error conditions are tested
Tests are deterministic (no flaky tests)
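
The checklist above can be illustrated with a small sketch. The `chunk` function is a hypothetical stand-in for real source code. The tests call it directly (no mocks, no duplicated logic) and cover a normal case, edge cases, and an error condition. In a real test file, `pytest.raises` would be the usual way to write the last test; a plain `try`/`except` is used here only to keep the sketch free of dependencies.

```python
def chunk(items: list[int], size: int) -> list[list[int]]:
    """Split a list into consecutive chunks.

    Args:
        items: Values to split.
        size: Maximum length of each chunk.

    Returns:
        The chunks in order; the last one may be shorter than `size`.

    Raises:
        ValueError: If `size` is not positive.
    """
    if size <= 0:
        msg = f"size must be positive, got {size}"
        raise ValueError(msg)
    return [items[start : start + size] for start in range(0, len(items), size)]


def test_chunk_splits_evenly() -> None:
    assert chunk([1, 2, 3, 4], 2) == [[1, 2], [3, 4]]


def test_chunk_keeps_remainder() -> None:
    assert chunk([1, 2, 3], 2) == [[1, 2], [3]]


def test_chunk_empty_input() -> None:
    assert chunk([], 3) == []


def test_chunk_rejects_non_positive_size() -> None:
    try:
        chunk([1], 0)
    except ValueError as error:
        assert "positive" in str(error)
    else:
        msg = "expected ValueError"
        raise AssertionError(msg)
```

Each test would fail if the slicing or the validation in `chunk` were broken, and none depends on time, randomness, or the network.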

Security and risk assessment

No eval(), exec(), or pickle on user-controlled input
Use proper exception handling (no bare except:) and a msg variable for error messages
Remove unreachable/commented-out code before committing
Check for race conditions and resource leaks (file handles, sockets, threads)
Ensure proper resource cleanup (file handles, connections)
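
Several of these points can be shown together in one short sketch. The `load_settings` function is hypothetical. It parses untrusted input with `json` rather than `pickle` or `eval()`, catches a specific exception, builds the error text in a `msg` variable, and relies on a context manager for cleanup.

```python
import json
from pathlib import Path


def load_settings(path: Path) -> dict[str, object]:
    """Read settings from a JSON file.

    Args:
        path: Location of the settings file.

    Returns:
        The parsed settings.

    Raises:
        ValueError: If the file is not valid JSON or does not hold an object.
    """
    # The context manager closes the file handle even if parsing fails.
    with path.open(encoding="utf-8") as handle:
        try:
            data = json.load(handle)
        except json.JSONDecodeError as error:
            msg = f"Invalid JSON in {path}"
            raise ValueError(msg) from error
    if not isinstance(data, dict):
        msg = f"Expected a JSON object in {path}"
        raise ValueError(msg)
    return data
```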

Documentation standards

Use Google-style docstrings with Args section for all public functions.

def send_email(to: str, msg: str, *, priority: str = "normal") -> bool:
    """Send an email to a recipient with specified priority.

    Any additional context about the function can go here.

    Args:
        to: The email address of the recipient.
        msg: The message body to send.
        priority: Email priority level.

    Returns:
        `True` if email was sent successfully, `False` otherwise.

    Raises:
        InvalidEmailError: If the email address format is invalid.
        SMTPConnectionError: If unable to connect to email server.
    """

Types go in function signatures, NOT in docstrings
If a default is present, DO NOT repeat it in the docstring unless there is post-processing or it is set conditionally.
Focus on "why" rather than "what" in descriptions
Document all parameters, return values, and exceptions
Keep descriptions concise but clear
Ensure American English spelling (e.g., "behavior", not "behaviour")
Do NOT use Sphinx-style double backtick formatting (``code``). Use single backticks (`code`) for inline code references in docstrings and comments.

Package-specific guidance

Deep Agents CLI (`libs/cli/`)

deepagents-cli uses Textual for its terminal UI framework.

Key Textual resources:

Guide: https://textual.textualize.io/guide/
Widget gallery: https://textual.textualize.io/widget_gallery/
CSS reference: https://textual.textualize.io/styles/
API reference: https://textual.textualize.io/api/

Textual patterns used in this codebase:

Workers (@work decorator) for async operations - see Workers guide
Message passing for widget communication - see Events guide
Reactive attributes for state management - see Reactivity guide

SDK dependency pin:

The CLI pins an exact deepagents==X.Y.Z version in libs/cli/pyproject.toml. When developing CLI features that depend on new SDK functionality, bump this pin as part of the same PR. A CI check verifies the pin matches the current SDK version at release time (unless bypassed with dangerous-skip-sdk-pin-check).

Startup performance:

The CLI must stay fast to launch. Never import heavy packages (e.g., deepagents, LangChain, LangGraph) at module level or in the argument-parsing path. These imports pull in large dependency trees and add seconds to every invocation, including trivial commands like deepagents -v.

Keep top-level imports in main.py and other entry-point modules minimal.
Defer heavy imports to the point where they are actually needed (inside functions/methods).
To read another package's version without importing it, use importlib.metadata.version("package-name").

CLI help screen:

The deepagents --help screen is hand-maintained in ui.show_help(), separate from the argparse definitions in main.parse_args(). When adding a new CLI flag, update both files. A drift-detection test (test_args.TestHelpScreenDrift) fails if a flag is registered in argparse but missing from the help screen.
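
The drift check described above can be approximated with a short sketch. It is not the code in `test_args.TestHelpScreenDrift`; the parser, its flags, and the help text are hypothetical stand-ins. It also reads the parser's private `_actions` attribute for brevity, which a production test might prefer to avoid.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a stand-in parser; the flags here are hypothetical."""
    parser = argparse.ArgumentParser(prog="example", add_help=False)
    parser.add_argument("-v", "--version", action="store_true")
    parser.add_argument("--model")
    return parser


# Stand-in for a hand-maintained help screen.
HELP_TEXT = """\
Usage: example [options]
  -v, --version   Show the version
  --model NAME    Choose a model
"""


def missing_from_help(parser: argparse.ArgumentParser, help_text: str) -> list[str]:
    """List long flags registered on the parser but absent from the help text.

    Args:
        parser: Parser whose registered flags are inspected.
        help_text: The hand-maintained help screen content.

    Returns:
        Long flags that do not appear in `help_text`.
    """
    flags = [
        option
        for action in parser._actions  # private attribute, used for brevity
        for option in action.option_strings
        if option.startswith("--")
    ]
    return [flag for flag in flags if flag not in help_text]
```

The substring check is intentionally simple: a flag that is a prefix of another flag's name would be reported as present, so a stricter test might match whole tokens instead.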

Building chat/streaming interfaces:

Blog post: Anatomy of a Textual User Interface - demonstrates building an AI chat interface with streaming responses

Testing Textual apps:

Use textual.pilot for async UI testing - see Testing guide
Snapshot testing available for visual regression - see repo notes/snapshot_testing.md

Additional resources

Documentation: https://docs.langchain.com/oss/python/deepagents/overview and source at https://github.com/langchain-ai/docs or ../docs/. Prefer the local install and use file search tools for best results. If needed, use the docs MCP server as defined in .mcp.json for programmatic access.
Contributing Guide: Contributing Guide
CLI Release Process: See .github/RELEASING.md for the full CLI release workflow (release-please, version bumping, troubleshooting failed releases, and label management).
