Skip to content

feat: implement intervention primitive in python with cancellation support#2693

Merged
mehtarac merged 9 commits into
strands-agents:mainfrom
mehtarac:interventions_python
Jun 11, 2026
Merged

feat: implement intervention primitive in python with cancellation support#2693
mehtarac merged 9 commits into
strands-agents:mainfrom
mehtarac:interventions_python

Conversation

@mehtarac

@mehtarac mehtarac commented Jun 9, 2026

Copy link
Copy Markdown
Member

Description

Adds the intervention primitive — a composable control layer for agents that enables authorization, guardrails, steering, and operational controls to share a common interface with ordered evaluation, short-circuiting, and typed actions.

This implements Python parity with the TypeScript SDK's intervention primitive (strands-agents/sdk-typescript#883).

Resolves #2667

Public API Changes

Agent(interventions=...)

Agent(interventions: list[InterventionHandler] | None = None)

Handlers are evaluated in registration order at each lifecycle event. Cheapest handlers (authorization, guardrails) should be listed first; expensive ones (LLM steering) last.

Public Exports

# Top-level
from strands import InterventionHandler

# Action types and handler interface (for type annotations and construction)
from strands.interventions import (
    InterventionHandler,
    InterventionAction,
    OnError,
    Proceed,
    Deny,
    Guide,
    Confirm,
    Transform,
)

InterventionHandler

class InterventionHandler(ABC):
    name: str  # abstract property — unique identifier
    on_error: OnError = "throw"  # error policy

    # Lifecycle methods — override the ones you care about (sync or async)
    def before_invocation(self, event: BeforeInvocationEvent, **kwargs) -> Proceed | Deny | Guide | Transform
    def before_tool_call(self, event: BeforeToolCallEvent, **kwargs) -> Proceed | Deny | Guide | Confirm | Transform
    def after_tool_call(self, event: AfterToolCallEvent, **kwargs) -> Proceed | Transform
    def before_model_call(self, event: BeforeModelCallEvent, **kwargs) -> Proceed | Deny | Guide | Transform
    def after_model_call(self, event: AfterModelCallEvent, **kwargs) -> Proceed | Guide | Transform

All lifecycle methods default to Proceed(). Override only the ones you need — the framework detects overrides at the class level and only registers hooks for those. Handlers can be sync (def) or async (async def).

Action Types

Actions are frozen dataclasses constructed directly:

Proceed(reason: str | None = None)
Deny(reason: str = "")
Guide(feedback: str = "", reason: str | None = None)
Confirm(prompt: str = "", reason: str | None = None, response: Any = None, evaluate: Callable = default_evaluate)
Transform(apply: Callable[[LifecycleEvent], None] = ..., reason: str | None = None)
Action Description
Proceed Allow the operation to continue
Deny Block the operation (sets event.cancel)
Guide Steer with feedback
Confirm Pause for human approval (before_tool_call only)
Transform Modify event content in-place

Action-to-Event Compatibility Matrix

Action before_invocation before_tool_call before_model_call after_tool_call after_model_call
Proceed
Deny cancel cancel cancel
Guide cancel+ cancel+ inject inject + retry
Confirm confirm
Transform apply apply apply apply apply
  • = no-op (warns at runtime)
  • cancel = sets event.cancel/cancel_tool, short-circuits (remaining handlers skipped)
  • cancel+ = sets cancel with accumulated feedback from all guiding handlers
  • confirm = uses preemptive response or interrupt, checks with evaluate, sets cancel if denied
  • inject = appends accumulated feedback as a user message so the model sees it on this call
  • inject + retry = appends accumulated feedback and retries so the model sees guidance
  • apply = calls action.apply(event) for in-place mutation, later handlers see the change

OnError Policy

Value Behavior
"throw" Rethrow (default). Invocation fails.
"proceed" Skip handler, continue to next (fail-open).
"deny" Apply Deny (fail-closed).

Hook Ordering

  • Before* events: interventions run at INTERVENTION_INPUT (90) — after plugins (0)
  • After* events: interventions run at INTERVENTION_OUTPUT (-90) — before plugins (0)

Flow: plugins → intervention → [operation] → intervention → plugins

What's NOT exported (internal)

  • InterventionRegistry — internal dispatch mechanism
  • Audit log — not included. Will be added when consumption patterns are clear.

Infrastructure Changes (cancellation support)

This PR also adds general-purpose cancellation support to two hook events. These fields are usable by any hook or plugin, not just interventions — but are required for the intervention primitive's Deny action to work.

BeforeInvocationEvent.cancel

cancel: bool | str = False

When set by a hook callback, the agent loop:

  1. Creates an assistant message with the cancel text
  2. Fires MessageAddedEvent
  3. Yields EventLoopStopEvent with stop_reason="end_turn"
  4. Fires AfterInvocationEvent
  5. Returns without entering the event loop

BeforeModelCallEvent.cancel

cancel: bool | str = False

When set by a hook callback, the event loop:

  1. Creates a synthetic assistant message with the cancel text
  2. Fires AfterModelCallEvent (allows retry via event.retry = True)
  3. Ends the model invoke span
  4. Yields ModelStopReason event
  5. Breaks out of the model retry loop (or continues if retry requested)

HookOrder constants

HookOrder.INTERVENTION_INPUT = 90    # After plugins on Before* events
HookOrder.INTERVENTION_OUTPUT = -90  # Before plugins on After* events

Example Usage

from strands import Agent, InterventionHandler, tool
from strands.interventions import Deny, Guide, Proceed


@tool
def send_email(to: str, body: str) -> str:
    """Send an email to a recipient."""
    return f"Email sent to {to}"


@tool
def query_database(query: str) -> str:
    """Run a database query."""
    return f"Results for: {query}"


ALLOWED_TOOLS = {
    "analyst": ["query_database"],
    "admin": ["query_database", "send_email"],
}


class RoleAuth(InterventionHandler):
    name = "role-auth"

    def before_tool_call(self, event):
        role = event.invocation_state.get("role", "")
        tool_name = event.tool_use["name"]
        if tool_name not in ALLOWED_TOOLS.get(role, []):
            return Deny(reason=f"Role '{role}' is not authorized for tool '{tool_name}'")
        return Proceed()


class PoliteGuard(InterventionHandler):
    name = "polite-guard"

    def after_model_call(self, event):
        if event.stop_response and event.stop_response.message:
            text = "".join(
                block.get("text", "")
                for block in event.stop_response.message.get("content", [])
                if "text" in block
            )
            if any(word in text.lower() for word in ["stupid", "idiot", "dumb"]):
                return Guide(feedback="Rephrase your response to be professional and respectful.")
        return Proceed()


agent = Agent(
    tools=[query_database, send_email],
    interventions=[RoleAuth(), PoliteGuard()],  # cheapest first
)

# Analyst can query but not send email
result = agent("Send an email to bob@example.com saying hello", invocation_state={"role": "analyst"})

# Admin can do both
result = agent("Query the database for recent orders", invocation_state={"role": "admin"})

Related Issues

Resolves #2667

Documentation PR

N/A

Type of Change

New feature

Testing

How have you tested the change?

  • I ran hatch run prepare
  • 74 unit tests covering all action types, short-circuiting, guide accumulation, dispatch ordering, override detection, all three onError modes, sync and async handlers, duplicate name rejection, conflict resolution, native interrupt propagation, confirm (pause/approve/deny/preemptive/custom evaluate/falsy response), transform on all event types, unsupported action warnings, cancel-path integration (deny→end_turn, retry re-entry, plain hook cancel), and full hook integration

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@mehtarac mehtarac temporarily deployed to manual-approval June 9, 2026 14:34 — with GitHub Actions Inactive
@mehtarac mehtarac added the api/needs-review Makes changes to the public API surface label Jun 9, 2026
@github-actions github-actions Bot added size/xl and removed size/xl labels Jun 9, 2026
@mehtarac mehtarac marked this pull request as ready for review June 9, 2026 14:53
Comment thread strands-py/src/strands/interventions/registry.py
Comment thread strands-py/src/strands/interventions/registry.py
Comment thread strands-py/src/strands/interventions/registry.py
Comment thread strands-py/src/strands/interventions/registry.py
Comment thread strands-py/src/strands/event_loop/event_loop.py Outdated
Comment thread strands-py/src/strands/event_loop/event_loop.py
Comment thread strands-py/src/strands/interventions/actions.py
Comment thread strands-py/src/strands/interventions/actions.py Outdated
@github-actions

github-actions Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Assessment: Comment

Clean, well-structured intervention primitive that integrates elegantly with the existing hook system. The API is intuitive, the test coverage is comprehensive (67 tests covering all action types and edge cases), and the TypeScript parity goal is well-served.

Review Themes
  • Observability gap: Error handling in _handle_error silently swallows errors for "proceed" and "deny" modes without the logging promised in the documentation/docstrings.
  • Safety: Guide-triggered model retries have no cap in the unbounded while True model call loop. Worth documenting convergence requirements or adding a framework-level limit.
  • Side effects: Direct agent.messages mutation in Guide handlers bypasses session management pipeline (_append_messages).
  • Type safety: _apply_* methods use the broad LifecycleEvent union while accessing narrow event-specific attributes, which will fail under strict mypy.
  • Code duplication: tracer.end_model_invoke_span called twice in the cancel path in event_loop.py.

The architecture (registry bridging handlers to hooks, Guide accumulation, Deny short-circuiting, Confirm interrupt integration) is well-thought-out.

Comment thread strands-py/src/strands/interventions/__init__.py
@agent-of-mkmeral

Copy link
Copy Markdown
Contributor

Re-review @ 905da50 — all 4 blockers resolved ✅ (one new regression from the factory removal)

Checked out the new head and re-verified everything locally. All four blocking items from the previous summary are fixed:

# Blocker Status Verification
1 ruff format on registry.py:216 ✅ fixed ruff format --check clean; CI Python / Lint passes
2 Cancel-path tests ✅ added new test_cancel_paths.py (7 tests) covers exactly the requested scenarios: Deny→end_turn + DENIED: text, model not invoked, plain-hook cancel=True→default text, cancel+retry→loop re-entry, before_invocation deny ×3. 74/74 intervention tests pass
3 Exports ✅ resolved Proceed/Deny/Guide/Confirm/Transform/InterventionAction/OnError all importable from strands.interventions (verified). Module docstring example updated to match
4 **kwargs: Any on lifecycle methods ✅ added all 5 methods

Also noted: the factory functions and InterventionActions were removed entirely in favor of direct dataclass construction (Deny(reason="...")), per @mkmeral's open thread. The PR description was updated to match. Direct dataclasses are arguably more Pythonic and the namespacing concern is moot since Deny/Proceed are self-documenting names — but note this supersedes the resolution of @lizradway's namespacing thread (which was resolved via InterventionActions), so flagging for her re-confirmation.

🔴 New regression: removing confirm() reintroduced the exact hazard it guarded against

The factory's stated purpose (thread) was coalescing evaluate=Nonedefault_evaluate. With the factory gone, that guard is gone:

Confirm(prompt="ok?", response=True, evaluate=None)
# → TypeError: 'NoneType' object is not callable   (at dispatch time, deep in the registry)

Verified on this head. Since evaluate: Callable[[Any], bool] = field(default=default_evaluate) only protects omission, not explicit None, please add a __post_init__ guard to the frozen dataclass:

def __post_init__(self) -> None:
    if self.evaluate is None:
        object.__setattr__(self, "evaluate", default_evaluate)

(or type it Callable[[Any], bool] | None and resolve at use site — either way, explicit None shouldn't detonate at dispatch time in an approval primitive).

Still open from earlier reviews (maintainer calls, not blockers)
  • Bare callable in interventions=[...] still dies with AttributeError: 'function' object has no attribute 'name' at registration (verified) — a clear TypeError would help, since hooks=[...] does accept callables.
  • Instance-assigned handler methods still silently unenforced (_is_overridden checks the class only — verified returns False; TS detects this). Docs-only stance unchanged.
  • Phantom 0-token spanmodel_invoke_span still starts before the cancel check, so cancelled calls emit a 0-usage span; TS emits nothing.

CI status

All code gates are green (Lint, unit tests on py3.10–3.14 × linux/windows). The remaining failures are process checks: check-api-review-label (PR carries api/needs-review without api/review-complete — awaiting API-review sign-off), label-size, and check-access-and-checkout.

Verdict: with the Confirm(evaluate=None) guard added, this is mergeable from my side — pending the API-review label sign-off.

@github-actions

Copy link
Copy Markdown
Contributor

Assessment: Comment (approve-leaning)

Re-reviewed against the new HEAD 726255b after re-syncing. The regression flagged on the previous head — Confirm(evaluate=None) detonating at dispatch — is fixed: evaluate is now typed Callable[[Any], bool] | None (actions.py:101) and the registry guards the use-site with action.evaluate if action.evaluate is not None else default_evaluate (registry.py:136). I reproduced the old crash and confirmed it no longer occurs. All four earlier blockers (format gate, cancel-path tests, top-level exports, **kwargs) remain resolved.

Verification @ 726255b
  • ruff format --check: 15 files already formatted ✅
  • ruff check: all checks passed ✅
  • mypy (interventions + event_loop): clean ✅
  • pytest: 237 passed (interventions + event_loop + agent) ✅
Still open — minor, already have inline threads (replied)
  • Doc drift (actions.py:1): docstring still says "and factory functions"; module is dataclasses-only now.
  • Per-field test assertions (test_registry.py:302, :542): could collapse to a single full-object equality.

Both are quick wins, neither blocks merge.

Maintainer-call DevEx items (verified still present, not blocking)
  • Bare callable in interventions=[fn] dies with AttributeError: 'function' object has no attribute 'name' at registration. Since hooks=[...] accepts callables, users will reasonably try it — a clear TypeError would help.
  • Instance-assigned methods silently unenforced: _is_overridden checks the class only, so h.before_tool_call = lambda ... returns False and never fires. For a security primitive, a one-line registration warning would turn silent non-enforcement into a loud signal. Currently docs-only.

From a code standpoint this is in good shape — the only merge gate I see is the external api/needs-reviewapi/review-complete label sign-off.

@mkmeral mkmeral added the api/review-complete An API Bar-raiser reviewed and accepted the APIs label Jun 11, 2026
@github-actions

Copy link
Copy Markdown
Contributor

Assessment: Approve-leaning

Re-reviewed against the new HEAD c150d11 ("fix docstrings"). Both minor nits from the previous round are now resolved in code:

  • Doc driftactions.py:1 now reads "Intervention action types." (factory-function reference removed).
  • Per-field test assertionstest_registry.py:301 and :538 now use a single full-object equality (agent.messages == [{...}]).
Verification @ c150d11
  • ruff format --check: 15 files already formatted ✅
  • ruff check: all checks passed ✅
  • mypy (interventions + event_loop): clean ✅
  • pytest: 237 passed (interventions + event_loop + agent) ✅

The two remaining items are explicit maintainer calls, not blockers, and are unchanged: bare callable in interventions=[fn] still raises a cryptic AttributeError (vs. a clear TypeError), and instance-assigned lifecycle methods are still silently unenforced (_is_overridden checks the class only). Both are documented/known and fine to defer.

Code-wise this is in good shape — the only outstanding merge gate is the external api/needs-reviewapi/review-complete label sign-off. Nice, tight iteration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api/needs-review Makes changes to the public API surface api/review-complete An API Bar-raiser reviewed and accepted the APIs area-interventions Related to interventions enhancement New feature or request python Pull requests that update python code size/xl

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Interventions Primitive - Python Parity

6 participants