[BREAKING] Python: Add context mode to AgentExecutor#4668
Conversation
python/packages/core/agent_framework/_workflows/_agent_executor.py
Python Test Coverage Report
Python Unit Test Overview
Question: checkpoint serialization can crash with a custom ContextMode.
on_checkpoint_save stores self._context_mode directly in the state dict. The checkpoint encoding system (_checkpoint_encoding.py) uses pickle.dumps() for non-JSON-native types. However, Python's standard pickle cannot serialize nested functions or closures.
ContextMode.last_n() creates a closure (_last_n_messages) as the messages_filter attribute. Any attempt to checkpoint an executor using ContextMode.last_n() — or any user-provided lambda/closure as messages_filter — will crash:
PicklingError: Can't pickle local object 'ContextMode.last_n.<locals>._last_n_messages'
Per _runner.py:302-307, this propagates as a WorkflowCheckpointException and fails the entire checkpoint save, not just the context_mode field.
Verified locally:

```python
import pickle

class Foo:
    def __init__(self, fn):
        self.fn = fn

    @classmethod
    def with_closure(cls, n):
        def _inner(x):
            return x[-n:]
        return cls(_inner)

pickle.dumps(Foo.with_closure(3))
# -> PicklingError: Can't pickle local object
```

Some ideas: either (a) don't include context_mode in checkpoint state and reconstruct it from constructor args on restore, or (b) implement a custom `__getstate__`/`__setstate__` on ContextMode that serializes the filter mode plus its parameters (e.g., store `("last_n", 5)`) rather than the closure itself.
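Option (b) could look roughly like the sketch below. The `ContextMode` shape here is a hypothetical stand-in for illustration, not the actual agent_framework class:

```python
import pickle

# Hypothetical stand-in for ContextMode; attribute and factory names are
# assumptions, not the real agent_framework API.
class ContextMode:
    def __init__(self, filter_mode="full", n=None, messages_filter=None):
        self.filter_mode = filter_mode
        self._n = n
        self.messages_filter = messages_filter

    @classmethod
    def last_n(cls, n):
        def _last_n_messages(messages):
            return messages[-n:]
        return cls("last_n", n=n, messages_filter=_last_n_messages)

    def __getstate__(self):
        # Persist only the mode and its parameters, never the closure.
        return {"filter_mode": self.filter_mode, "n": self._n}

    def __setstate__(self, state):
        if state["filter_mode"] == "last_n":
            # Rebuild the closure from the stored parameter.
            self.__dict__.update(ContextMode.last_n(state["n"]).__dict__)
        else:
            self.__init__(state["filter_mode"])

restored = pickle.loads(pickle.dumps(ContextMode.last_n(5)))
print(restored.messages_filter(list(range(10))))  # [5, 6, 7, 8, 9]
```

Note that a user-provided custom closure still cannot round-trip this way; a custom filter would have to be reattached after restore.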
Minor secondary note: the on_checkpoint_restore try/except around self._context_mode = context_mode_payload is dead code — a simple assignment can never raise, so the except branch is unreachable.
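A sketch of that cleanup, combined with type validation on restore; the names below are assumptions, not the actual module:

```python
import logging

logger = logging.getLogger(__name__)

# Minimal stand-in for the real ContextMode; only what the sketch needs.
class ContextMode:
    @classmethod
    def default(cls):
        return cls()

def restore_context_mode(payload):
    # A plain assignment cannot raise, so no try/except is needed here.
    # Validate the payload type instead, falling back to the default
    # mode with a warning if a checkpoint carries something unexpected.
    if isinstance(payload, ContextMode):
        return payload
    logger.warning(
        "Ignoring checkpoint context_mode of unexpected type %r", type(payload)
    )
    return ContextMode.default()
```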
Good catch! I think we can probably drop the |
moonbox3
left a comment
Automated Code Review
Reviewers: 4 | Confidence: 91%
✗ Correctness
The diff introduces a ContextMode class for controlling conversation context in AgentExecutor, makes full_conversation required on AgentExecutorResponse, and updates all handlers to use the new context mode. There are two acknowledged-but-unfixed bugs (checkpoint pickle crash with closures, dead try/except) and a silent default-behavior change in from_response that switches from cache-replacement to cache-append semantics, which could cause duplicate messages in existing workflows.
✗ Security Reliability
This diff introduces ContextMode for managing conversation context in AgentExecutor. There are two blocking reliability issues: (1) storing self._context_mode directly in checkpoint state will crash with PicklingError when ContextMode.last_n() is used because it captures a closure, and (2) the from_response handler changed from cache-replace to cache-extend semantics under the default ContextMode, which silently breaks existing workflows. Additionally, the on_checkpoint_restore has a dead try/except that cannot execute, and the restored context_mode payload is assigned without any type validation, which undermines checkpoint integrity.
✗ Test Coverage
The diff introduces a substantial new ContextMode class with three filter modes, changes the default cache behavior in all four message handlers (extend instead of replace), adds checkpoint serialization for context_mode, and makes full_conversation non-optional — yet there are zero new tests for any of this functionality. All test changes in the diff are purely mechanical (adding the now-required full_conversation argument to existing constructors). There are no unit tests for ContextMode construction/validation, no integration tests verifying handler behavior under different context modes, no checkpoint round-trip tests for context_mode (which is known to crash with closures from last_n), and no tests covering the behavioral change from cache-replace to cache-extend.
✗ Design Approach
The diff introduces a `ContextMode` class for controlling conversation context in `AgentExecutor`, makes `full_conversation` non-optional on `AgentExecutorResponse`, and updates all call sites. Two design problems remain unfixed from prior review: (1) `last_n()` embeds an unpicklable closure that will crash checkpoint save; (2) the try/except in `on_checkpoint_restore` wraps an assignment that cannot raise, making the except branch dead code. Two new design problems appear: (3) the default behavior of `from_response` silently changed from cache-replace to cache-extend, which will duplicate messages for callers relying on the old replace semantics; and (4) the `last_agent` filter mode is handled only in `from_response`; it is silently ignored in `from_str`, `from_message`, and `from_messages`, making the abstraction inconsistently applied and misleading to users.
Flagged Issues
- on_checkpoint_save stores self._context_mode directly, but ContextMode.last_n() captures a closure (_last_n_messages) that cannot be pickled — any checkpoint save with last_n mode will crash with PicklingError, killing the entire checkpoint operation. Either serialize only the picklable constructor arguments (filter_mode, retain_cache, and for last_n the value of n) and reconstruct on restore, or implement getstate/setstate on ContextMode.
- from_response default behavior silently changed from self._cache = list(source_messages) (replace) to self._cache.extend(...) (append). Under the default ContextMode (retain_cache=True), existing cache content — including staged messages from AgentExecutorRequest(should_respond=False) — is now preserved instead of replaced, which is a behavioral regression for all current callers that relied on from_response resetting the cache.
- last_agent filter_mode is silently a no-op in from_str, from_message, and from_messages. These handlers extend the cache unconditionally with all incoming messages and never branch on filter_mode=='last_agent'. Users who set ContextMode.last_agent() will get different and undocumented behavior depending on which handler fires.
- No tests exist for the ContextMode class (default(), last_agent(), last_n(n), ValueError on custom without filter) or for checkpoint round-trip serialization of context_mode — a round-trip test with ContextMode.last_n() would immediately surface the pickle crash.
- No tests cover handler behavior under different context modes (from_response with full/last_agent/custom, from_str/from_message/from_messages with retain_cache=False) or validate the changed cache extend-vs-replace semantics.
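The replace-vs-extend regression flagged above can be reproduced with a minimal model of the handler; `Executor` and `from_response` here are illustrative stand-ins, not the real classes:

```python
# Minimal model of the from_response cache handling under the old
# (replace) and new (extend) semantics.
class Executor:
    def __init__(self, extend_semantics):
        self._cache = []
        self._extend = extend_semantics

    def from_response(self, source_messages):
        if self._extend:
            self._cache.extend(source_messages)   # new default behavior
        else:
            self._cache = list(source_messages)   # old replace behavior

old_style, new_style = Executor(False), Executor(True)
for ex in (old_style, new_style):
    ex.from_response(["m1", "m2"])
    ex.from_response(["m1", "m2", "m3"])  # full conversation is resent

print(old_style._cache)  # ['m1', 'm2', 'm3']
print(new_style._cache)  # ['m1', 'm2', 'm1', 'm2', 'm3'] <- duplicates
```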
Suggestions
- Remove the dead try/except in on_checkpoint_restore — self._context_mode = context_mode_payload is a plain assignment that cannot raise. The except branch is unreachable dead code. Replace with a straightforward conditional assignment.
- Add type validation when restoring context_mode from checkpoint state (e.g., isinstance check for ContextMode) to prevent corrupted or tampered checkpoint data from assigning an arbitrary object. Fall back to ContextMode.default() with a warning on type mismatch.
- Add a fallback else branch in from_response's filter_mode chain that logs a warning and falls back to full conversation, so that an unrecognized filter_mode doesn't silently drop all messages.
- Warn or raise when filter_mode='last_agent' or filter_mode='custom' is used with from_str/from_message/from_messages, since these handlers silently ignore filtering — the mode behaves identically to 'full' with no indication to the user.
- Centralize filtering logic in a single helper method so every handler (from_response, from_str, from_message, from_messages) applies filter_mode and retain_cache uniformly, rather than each handler reimplementing the logic inconsistently.
- Add a test that calls from_response twice in succession to verify and document the extend-vs-replace semantics under default ContextMode.
- Add unit tests for ContextMode covering all factories (default, last_agent, last_n including edge cases n=0 and n > len) and the ValueError on custom mode without a filter function.
- Add a checkpoint round-trip test for each ContextMode variant, including ContextMode.last_n() to document and catch the pickle failure.
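The centralization suggestion could be sketched as one helper that every handler calls. The filter-mode names follow the review; the signature and the `last_agent` approximation are assumptions, not the real implementation:

```python
def apply_context_mode(cache, incoming, filter_mode, retain_cache, custom_filter=None):
    """Select messages per filter_mode, then merge with or replace the cache.

    Hypothetical helper for illustration only.
    """
    if filter_mode == "last_agent":
        # Approximated here as "last message only" for illustration.
        selected = list(incoming[-1:])
    elif filter_mode == "custom" and custom_filter is not None:
        selected = list(custom_filter(incoming))
    else:
        # 'full' and any unrecognized mode fall back to the full input
        # instead of silently dropping messages.
        selected = list(incoming)
    return cache + selected if retain_cache else selected
```

Routing every handler through one helper like this would make `filter_mode` and `retain_cache` behave identically regardless of which message type arrives.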
Automated review by moonbox3's agents
Pull request overview
This PR introduces a configurable “context mode” for chaining agent executions in Python workflows/orchestrations, including a sequential-builder option to pass only the prior agent’s output forward. It also makes AgentExecutorResponse.full_conversation required (breaking change) and updates downstream code/tests accordingly.
Changes:
- Add `context_mode`/`context_filter` to `AgentExecutor` and make `AgentExecutorResponse.full_conversation` required.
- Add `chain_only_agent_responses` option to `SequentialBuilder`, plumbing it into agent executors.
- Update orchestrations/core/azurefunctions tests and add a new sample demonstrating the sequential chaining behavior.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| python/packages/core/agent_framework/_workflows/_agent_executor.py | Adds context_mode support and makes full_conversation required on AgentExecutorResponse. |
| python/packages/orchestrations/agent_framework_orchestrations/_sequential.py | Introduces chain_only_agent_responses and forwards it via executor context_mode. |
| python/packages/orchestrations/agent_framework_orchestrations/_orchestration_request_info.py | Allows request-info wrapper executor to pass context_mode through to its internal AgentExecutor. |
| python/packages/orchestrations/agent_framework_orchestrations/_concurrent.py | Adjusts aggregation logic to rely on required full_conversation. |
| python/packages/orchestrations/tests/test_sequential.py | Adds coverage for chain_only_agent_responses behavior. |
| python/packages/orchestrations/tests/test_orchestration_request_info.py | Updates tests for required full_conversation. |
| python/packages/core/tests/workflow/test_runner.py | Updates test construction for required full_conversation. |
| python/packages/core/tests/workflow/test_agent_executor.py | Expands tests for context mode behavior and adjusts workflows used in tests. |
| python/packages/azurefunctions/tests/test_workflow.py | Updates tests for required full_conversation. |
| python/packages/azurefunctions/tests/test_func_utils.py | Updates serialization roundtrip test for required full_conversation. |
| python/samples/03-workflows/orchestrations/sequential_chain_only_agent_responses.py | New sample demonstrating sequential chaining with “only prior agent response” behavior. |
Motivation and Context
Closes #3688
Description
context_modeandcontext_filtertoAgentExecutor. These two control knobs are applied on the agent executor when it receives anAgentExecutorResponsethat carries an agent response and the full conversation.sequentialto allow customers to configure if they want to chain the full conversation or not.Contribution Checklist