feat(langchain): SecretMiddleware for tool-call credential detection#37192

Draft
Bagatur (baskaryan) wants to merge 2 commits into langchain-ai:master from baskaryan:feat/secret-middleware

Conversation

@baskaryan
Collaborator

Description

Adds a tool-call middleware (SecretMiddleware) that detects known credential patterns in tool-call arguments and either blocks the call or redacts the matched substring before the tool runs.

Motivation

Agents that read attacker-controllable input — system prompts, retrieved documents, tool-result content from upstream tools — can be steered (via prompt injection) into emitting tool calls that embed credentials they read from elsewhere, exfiltrating them through the legitimate tool surface. The egress allowlist on the agent's environment can't help here because the destination host is exactly the one you have to allow for legitimate use.

A chokepoint in front of tool execution that inspects args closes that gap. It's the natural complement to PIIMiddleware (which scans messages for PII) — different threat surface (tool args), different patterns (credentials), but the same shape of solution.

API

from langchain.agents import create_agent
from langchain.agents.middleware import SecretMiddleware

# Default: block any tool call carrying a known secret.
agent = create_agent("openai:gpt-5", tools=[...], middleware=[SecretMiddleware()])

# Or redact in place and let the tool run on rewritten args.
agent = create_agent(
    "openai:gpt-5",
    tools=[...],
    middleware=[SecretMiddleware(strategy="redact")],
)

Other constructor args:

  • tools=[...] — limit to specific tools (mirrors ToolRetryMiddleware).
  • secret_types=[...] — limit to specific built-in detectors. See BUILTIN_SECRET_TYPES for the full set: github_classic_token, github_fine_grained_pat, langsmith_key, anthropic_key, openai_project_key, openai_legacy_key, aws_access_key_id, jwt.
  • custom_detectors={"name": finder} — register additional detectors. Each finder takes a string and yields (start, end) offset pairs into that string.

Public surface: SecretMiddleware, SecretMatch, find_secrets, BUILTIN_SECRET_TYPES.
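
To illustrate the finder contract described for custom_detectors, here is a minimal sketch of a finder that takes a string and yields (start, end) pairs. The "acme_" token format and the name find_acme_tokens are hypothetical and do not come from this PR.

```python
import re
from typing import Iterator

# Hypothetical internal token format (an assumption for illustration):
# "acme_" followed by exactly 32 lowercase hex characters.
_ACME_TOKEN = re.compile(r"acme_[0-9a-f]{32}")


def find_acme_tokens(text: str) -> Iterator[tuple[int, int]]:
    """Yield (start, end) offsets for each hypothetical Acme token in text."""
    for match in _ACME_TOKEN.finditer(text):
        yield match.start(), match.end()


sample = "curl -H 'X-Token: acme_" + "0f" * 16 + "' https://example.com"
spans = list(find_acme_tokens(sample))
```

Going by the constructor description above, a finder like this would presumably be registered as custom_detectors={"acme_token": find_acme_tokens}.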

Design notes

No \b word-boundary anchors on the patterns. Each detector relies on prefix + length + alphabet to be tight on its own. Boundaries would block matches when secrets are concatenated with alphanumeric characters in JSON the agent built up by interpolation — exactly the case attackers exploit. The negative corpus in test_find_secrets_negative_corpus (commit SHAs, UUIDs, ULIDs, {"lc": 1, ...} manifest snippets, prose like "the docs mention sk- as a prefix") confirms FP rate stays at zero without them.
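
The effect of dropping \b can be sketched with plain re. The pattern below assumes a classic GitHub token is "ghp_" plus 36 alphanumeric characters; it is a stand-in for illustration, not necessarily the pattern this PR ships.

```python
import re

# Stand-in pattern (assumption): "ghp_" followed by 36 alphanumeric characters.
anchored = re.compile(r"\bghp_[A-Za-z0-9]{36}\b")
unanchored = re.compile(r"ghp_[A-Za-z0-9]{36}")

token = "ghp_" + "A1b2" * 9  # synthetic token, 36 characters after the prefix

# The token is glued to alphanumeric characters on both sides, as can happen
# when an agent builds a JSON string by interpolation.
payload = '{"q": "id' + token + 'x"}'

anchored_hit = anchored.search(payload)      # no word boundary before "ghp_"
unanchored_hit = unanchored.search(payload)  # prefix + length + alphabet still match

# Prose that merely mentions a prefix matches neither pattern.
prose_hit = unanchored.search("the docs mention sk- and ghp_ as prefixes")
```

In this sketch the anchored pattern misses the embedded token, while the unanchored one recovers it exactly and still ignores the prose mention.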

Block strategy never echoes the matched substring back. Returning the match in the ToolMessage content would re-publish the secret into the agent's context window, defeating the purpose. The rejection mentions only the type (e.g. github_classic_token) and the tool name.
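
As a rough sketch of that rule, a rejection string can be built from the detector type names and the tool name alone, so the matched text never enters it. The function name and wording below are hypothetical, not the message this PR emits.

```python
def rejection_message(tool_name: str, secret_types: list[str]) -> str:
    """Build an error message that names the secret types but never the secret."""
    # Deduplicate and sort so the message is stable across runs.
    kinds = ", ".join(sorted(set(secret_types)))
    return (
        f"Tool call to '{tool_name}' was blocked: "
        f"arguments contained a credential of type {kinds}."
    )


secret = "ghp_" + "a" * 36  # synthetic value, used only to check it is not echoed
message = rejection_message("http_get", ["github_classic_token"])
```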

Redact uses request.override to produce a new ToolCallRequest with rewritten args, leaving the original immutable.
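
The string-level part of redaction, including overlapping spans, might look like the sketch below. It assumes matches arrive as (secret_type, start, end) tuples, that overlapping spans are merged under the earliest span's type, and that the placeholder uppercases the type name; none of these details are confirmed by this PR.

```python
def redact(text: str, matches: list[tuple[str, int, int]]) -> str:
    """Replace each (secret_type, start, end) span with a placeholder."""
    # Merge overlapping spans, keeping the type of the earliest-starting one
    # (an assumption about how overlaps are resolved).
    merged: list[tuple[str, int, int]] = []
    for secret_type, start, end in sorted(matches, key=lambda m: (m[1], -m[2])):
        if merged and start < merged[-1][2]:
            prev_type, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_type, prev_start, max(prev_end, end))
        else:
            merged.append((secret_type, start, end))

    # Rewrite right to left so offsets of earlier spans stay valid.
    for secret_type, start, end in reversed(merged):
        text = text[:start] + f"[REDACTED_{secret_type.upper()}]" + text[end:]
    return text


original = "a=SECRETKEY;b=ok"
redacted = redact(original, [("jwt", 2, 8), ("langsmith_key", 5, 11)])
```

The function returns a new string and leaves its input untouched, in the same spirit as producing a new request rather than mutating the original.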

Tests

51 tests covering: each built-in pattern detected, negative corpus, nested dict walking with path tracking, offset reporting, block strategy, redact strategy (incl. nested + overlapping spans), tools filter, secret_types filter, custom detectors with and without built-ins, async wrap, missing args dict, and end-to-end create_agent integration for both strategies.
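
For readers unfamiliar with "nested dict walking with path tracking", here is a minimal sketch of the idea: recurse through the args, remember the key or index path to each string, and report offsets within that string. The helper names and the stand-in "ghp_" pattern are assumptions, not the PR's code.

```python
import re
from typing import Any, Iterator

# Stand-in detector (assumption): "ghp_" followed by 36 alphanumeric characters.
_PATTERN = re.compile(r"ghp_[A-Za-z0-9]{36}")


def walk_strings(value: Any, path: tuple = ()) -> Iterator[tuple[tuple, str]]:
    """Yield (path, string) for every string nested inside dicts and lists."""
    if isinstance(value, str):
        yield path, value
    elif isinstance(value, dict):
        for key, child in value.items():
            yield from walk_strings(child, path + (key,))
    elif isinstance(value, (list, tuple)):
        for index, child in enumerate(value):
            yield from walk_strings(child, path + (index,))


def find_in_args(args: dict) -> list[tuple[tuple, int, int]]:
    """Return (path, start, end) for each match found anywhere in args."""
    return [
        (path, m.start(), m.end())
        for path, text in walk_strings(args)
        for m in _PATTERN.finditer(text)
    ]


args = {
    "url": "https://example.com",
    "headers": {"Authorization": "token ghp_" + "a" * 36},
    "tags": ["x"],
}
hits = find_in_args(args)
```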

51 passed in 3.60s

Lint clean (uv run --group lint ruff check), formatted (uv run --group lint ruff format).

…ection

Introduces a tool-call middleware that detects known credential
patterns (GitHub tokens classic + fine-grained, LangSmith keys,
Anthropic keys, OpenAI legacy + project keys, AWS Access Key IDs,
JWTs) in tool-call arguments and either blocks or redacts them.

Use case: an agent reading attacker-controllable input — system
prompts, retrieved documents, tool-result content — can be steered
into emitting tool calls that embed credentials it has read from
elsewhere, exfiltrating them through the legitimate tool surface.
The egress allowlist on the agent's environment can't help here
because the destination host is exactly the one you have to allow
for legitimate use. A chokepoint in front of tool execution that
inspects args closes that gap.

Mirrors the conventions of the existing PIIMiddleware where they
apply (strategy literal, custom detector hook, public detector
function for direct use), but focuses on tool-call args rather than
message content. Two strategies:

- "block" (default): return ToolMessage(status="error") instead of
  executing the tool. The matched substring is intentionally not
  echoed back, which would re-publish the secret into the agent's
  context.
- "redact": substitute matches with [REDACTED_<SECRET_TYPE>] in the
  args and execute the tool with the rewritten args.

Each detector pattern anchors on a fixed, high-entropy prefix +
length + alphabet. The tests' negative corpus (commit SHAs, UUIDs,
ULIDs, base64 manifests, "sk-" in prose) confirms the FP rate stays
at zero without word-boundary anchors, which is important because
boundaries would block matches when secrets are concatenated with
alphanumeric characters in JSON the agent built up by interpolation.

Public surface: SecretMiddleware, SecretMatch, find_secrets,
BUILTIN_SECRET_TYPES. Direct hook tests + create_agent end-to-end
tests + sync/async coverage = 51 passing tests.
@github-actions github-actions Bot added labels on May 5, 2026: feature (For PRs that implement a new feature; NOT A FEATURE REQUEST), internal, langchain (`langchain` package issues & PRs), size: L (500-999 LOC)
@baskaryan Bagatur (baskaryan) marked this pull request as draft May 5, 2026 15:36
@codspeed-hq
codspeed-hq Bot commented May 5, 2026

Merging this PR will not alter performance

✅ 2 untouched benchmarks
⏩ 13 skipped benchmarks [1]


Comparing baskaryan:feat/secret-middleware (e9f83fc) with master (5a9b1ec) [2]

Footnotes

  1. 13 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

  2. No successful run was found on master (9bd730e) during the generation of this report, so 5a9b1ec was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

`ToolCallRequest.override(tool_call=...)` is typed to accept
`ToolCall` (a TypedDict from langchain_core.messages). The runtime
shape is identical to the dict-spread we were producing, so adding a
cast satisfies the type checker without any behavioural change.
@mdrxy Mason Daugherty (mdrxy) changed the title from "feat(langchain_v1): add SecretMiddleware for tool-call credential detection" to "feat(langchain): SecretMiddleware for tool-call credential detection" May 5, 2026
Collaborator

@eyurtsev Eugene Yurtsev (eyurtsev) left a comment

Have you seen the PIIMiddleware interface?
