Skip to content

Feat/new sales agent#7136

Open
Siddharth2624 wants to merge 6 commits into
aden-hive:mainfrom
Siddharth2624:feat/new-sales-agent
Open

Feat/new sales agent#7136
Siddharth2624 wants to merge 6 commits into
aden-hive:mainfrom
Siddharth2624:feat/new-sales-agent

Conversation

@Siddharth2624
Copy link
Copy Markdown
Contributor

@Siddharth2624 Siddharth2624 commented Apr 25, 2026

Fixes #6836

  • Add set_output tool support for event_loop nodes
  • Fix input_keys not being injected into system prompts
  • Add trigger payload injection into NodeContext.input_data
  • Ensure storage directories exist before write operations

Summary

Fixed critical bugs preventing the sales_ops_agent from running correctly:

  1. Missing set_output tool - event_loop nodes couldn't write outputs
  2. input_keys not visible to LLM - current_date wasn't accessible in prompts
  3. Race condition in run persistence - confusing error during shutdown

Changes Made

Agent Template (sales_ops_agent)

  • tools.py: Ensure data directory exists before writing demo data
  • nodes/init.py: Tightened prompts with clear decision trees
  • mcp_servers.json: Enable unverified tools (Salesforce support)

Core Framework

Why core framework changes?

The bugs were in framework-level plumbing, not the agent template itself:

  1. AgentLoop (agent_loop.py)
    - Missing build_set_output_tool import caused crash
    - Added _build_set_output_tool() method to construct the tool
    - Added set_output tool to tools list for nodes with output_keys
    - Replaced error handler with proper implementation
  2. Input Data Flow (context.py, prompting.py)
    - Changed derive_input_data_from_buffer default from False to True
    - INPUT DATA section now injected at top of system prompt (right after identity)
    - This ensures current_date and other input_keys are visible to LLM
  3. Trigger Payloads (cursor_persistence.py)
    - Added ctx parameter to drain_trigger_queue()
    - Trigger payload keys now injected into NodeContext.input_data
    - Existing keys are not overridden (explicit input wins)
  4. Storage (concurrent.py, io.py, stream_runtime.py)
    - Created runs/ and summaries/ directories at ConcurrentStorage init
    - Suppressed false-positive "Failed to save run" error when file exists
    - Simplified atomic_write to avoid race conditions

Testing

  • ✅ All existing tests pass
  • ✅ Agent validates successfully
  • ✅ Ruff linting passes
  • ✅ Manual test: agent completes successfully with proper outputs

Impact

  • Backwards compatible - Existing agents continue to work
  • Default behavior change - derive_input_data_from_buffer=True means input_keys now read from
    buffer by default (correct behavior)
  • Tool access - event_loop nodes now have set_output when they have output_keys defined

Summary by CodeRabbit

  • New Features

    • Conditional set-output handling for nodes with declared outputs
    • "Minimal" tool-access policy to limit node tools
    • INPUT DATA block added to system prompts for richer context
    • New Sales Ops Agent template with CLI, demo data, tools, flowchart, and README
  • Bug Fixes

    • Suppress noisy errors when run files already exist
    • Ensure directories exist for atomic writes
    • Prevent stale trigger-injected input from persisting across iterations
  • Tests

    • Added tests for trigger-queue context injection
  • Chores

    • Updated .gitignore to exclude generated artifacts

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 25, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Injects a synthetic set_output tool and handler; makes trigger draining context-aware by merging trigger payloads into AgentContext.input_data and clearing stale keys; surfaces input data in prompts; adds a Sales Ops Agent template (graph, tools, CLI, demo data); hardens storage/atomic I/O; and adds tests.

Changes

Cohort / File(s) Summary
Agent loop & synthetic tools
core/framework/agent_loop/agent_loop.py, core/framework/agent_loop/internals/synthetic_tools.py
Injects a synthetic set_output tool for nodes with output_keys; switches handling from rejection to validation via handle_set_output, returns ToolResult, async-stores validated key/value into the turn OutputAccumulator, records set keys, and adds _build_set_output_tool. Accepts output_keys as `list[str]
Trigger draining & cursor persistence + tests
core/framework/agent_loop/internals/cursor_persistence.py, core/framework/agent_loop/internals/tests/test_cursor_persistence.py
drain_trigger_queue now accepts optional `ctx: AgentContext
Prompting & orchestrator context
core/framework/agent_loop/prompting.py, core/framework/orchestrator/prompting.py, core/framework/orchestrator/context.py
Adds input_data field to PromptSpec/NodePromptSpec and populates it from ctx.input_data; build_system_prompt now renders an INPUT DATA: block (omitting None values). Adds "minimal" tool_access_policy limiting tools to set_output and escalate. build_node_context now defaults derive_input_data_from_buffer=True.
Storage & I/O robustness
core/framework/storage/concurrent.py, core/framework/utils/io.py, core/framework/host/stream_runtime.py
Ensure storage root and runs//summaries/ dirs exist on init/flush and before atomic writes. Treat run-file write races (file already exists) as benign (debug) in CancelledError and other exception paths; atomic_write now creates parent dir before writing. Stream runtime suppresses noisy error when run file already exists.
Sales Ops Agent template — core, config & exports
examples/templates/sales_ops_agent/agent.py, examples/templates/sales_ops_agent/config.py, examples/templates/sales_ops_agent/__init__.py
Adds SalesOpsAgent class (start/stop/trigger_and_wait/run/info/validate), default_agent instance, AgentMetadata, runtime config, package exports, and __version__.
Sales Ops Agent template — CLI & docs
examples/templates/sales_ops_agent/__main__.py, examples/templates/sales_ops_agent/README.md
Adds Click CLI (run, tui, info, validate, shell) with logging setup and comprehensive README documenting pipeline, triggers, tools, demo data and usage.
Sales Ops Agent template — graph, nodes & JSON
examples/templates/sales_ops_agent/nodes/__init__.py, examples/templates/sales_ops_agent/agent.json, examples/templates/sales_ops_agent/flowchart.json
Adds five-node graph (trigger→monitor→analyze→rebalance→log), node specs, flow/graph JSON and flowchart templates, edges, entry/terminal nodes.
Sales Ops Agent template — tools & demo data
examples/templates/sales_ops_agent/tools.py, examples/templates/sales_ops_agent/demo_data.json
New tool registry and executor (load_data, append_data, load_demo_sales_data, demo_log_action) with safe path handling and demo JSON data.
Sales Ops Agent template — configs
examples/templates/sales_ops_agent/mcp_servers.json, examples/templates/sales_ops_agent/triggers.json
Add MCP server config for hive_tools and a monthly cron trigger (0 9 1 * *).
Misc
.gitignore
Add graphify-out to ignore list.

Sequence Diagram(s)

sequenceDiagram
    participant LLM as LLM
    participant AgentLoop as AgentLoop
    participant Handler as set_output_Handler
    participant Output as OutputAccumulator
    participant Ctx as AgentContext

    LLM->>AgentLoop: call synthetic tool "set_output" with payload
    AgentLoop->>Handler: handle_set_output(tool_input, output_keys)
    Handler->>Handler: validate key ∈ output_keys
    alt valid
        Handler->>AgentLoop: return ToolResult(success)
        AgentLoop->>Output: async store(key, value)
        AgentLoop->>Ctx: record outputs_set_this_turn
        AgentLoop->>LLM: surface safe preview of stored value
    else invalid
        Handler-->>AgentLoop: return ToolResult(error)
        AgentLoop->>LLM: surface error
    end
Loading
sequenceDiagram
    participant Queue as TriggerQueue
    participant Drain as drain_trigger_queue
    participant Ctx as AgentContext
    participant Conv as Conversation

    Queue->>Drain: dequeue TriggerEvent(s)
    alt ctx provided
        Drain->>Ctx: inject payload keys into ctx.input_data when absent
        Drain->>Ctx: remove previously injected stale keys not in new payloads
    end
    Drain->>Conv: add_user_message(is_trigger=true, content=batched payload)
    Conv-->>Drain: message added
    Drain-->>Queue: return count
Loading
sequenceDiagram
    participant Trigger as Trigger Node
    participant Monitor as Monitor Node
    participant Analyze as Analyze Node
    participant Rebalance as Rebalance Node
    participant Log as Log Node
    participant CRM as CRM

    Trigger->>Trigger: check date is 1st
    alt first of month
        Trigger->>Monitor: emit is_first_of_month, month_year
        Monitor->>Analyze: deliver sales_data, unassigned_pool
        Analyze->>Rebalance: deliver rep_analysis, rebalance_candidates
        Rebalance->>Log: deliver rebalance_actions
        Log->>CRM: persist reassignment logs
        Log->>AgentLoop: set_output(summary_report)
    else not first
        Trigger-->>Trigger: no-op
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐇 I hopped through changes, quick and spry,

set_output tucked outputs safe and spry.
Triggers whispered payloads into ctx,
Prompts now show inputs up next —
Sales rebalance hums beneath the sky.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title 'Feat/new sales agent' clearly and concisely summarizes the main change—the addition of a new Sales Ops Agent template.
Linked Issues check ✅ Passed The PR implementation fully addresses all primary objectives from issue #6836: monthly trigger check, monitor stage with data collection, analyze stage with metric computation and rebalancing flagging, rebalance stage with account reassignment and constraint handling, log stage with CRM updates, and all key requirements including set_output/input_keys support, MCP-only tools, and edge case handling.
Out of Scope Changes check ✅ Passed All code changes are directly scoped to either the new Sales Ops Agent implementation or framework enhancements required to support it, including set_output tool support, input data injection, and storage reliability improvements.
Docstring Coverage ✅ Passed Docstring coverage is 88.14% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 13

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
core/framework/orchestrator/context.py (1)

158-179: ⚠️ Potential issue | 🟡 Minor

Docstring contradicts the new minimal branch.

The function docstring at lines 165-166 still says “Framework-default tools (_ALWAYS_AVAILABLE_TOOLS) are always included regardless of policy.” That is no longer true: the minimal branch at lines 178-179 returns only _MINIMAL_TOOLS and skips the always_tools merge entirely (which is in fact the intent — minimal nodes should not get read_file/write_file/etc., per the comment at lines 40-41). Please update the docstring so callers don't rely on the old guarantee.

📝 Suggested wording
     - ``"none"``     -- only framework-default tools (read_file, set_output, etc.).
     - ``"minimal"``  -- only essential tools (set_output, escalate). No file I/O.

-    Framework-default tools (``_ALWAYS_AVAILABLE_TOOLS``) are always included
-    regardless of policy — agents need file I/O and output/escalate to function.
+    Framework-default tools (``_ALWAYS_AVAILABLE_TOOLS``) are included for
+    every policy except ``"minimal"``, which intentionally restricts the
+    surface to ``_MINIMAL_TOOLS`` (no file I/O).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/framework/orchestrator/context.py` around lines 158 - 179, Update the
docstring to remove the statement that framework-default tools
(`_ALWAYS_AVAILABLE_TOOLS`) are always included regardless of policy and clearly
document the three policies: "explicit" (tools in `node_spec.tools` plus
`_ALWAYS_AVAILABLE_TOOLS`), "none" (only `_ALWAYS_AVAILABLE_TOOLS`), and
"minimal" (only `_MINIMAL_TOOLS`, excluding `_ALWAYS_AVAILABLE_TOOLS` such as
file I/O). Mention that `override_tools` still merges with
`_ALWAYS_AVAILABLE_TOOLS` (deduped by name) and that the `minimal` policy
short-circuits to return only `_MINIMAL_TOOLS`.
core/framework/agent_loop/agent_loop.py (1)

859-867: ⚠️ Potential issue | 🟠 Major

Preserve set_output during dynamic tool refresh.

This branch only keeps ask_user and escalate, so any node with output_keys loses the synthetic set_output tool after the first refresh. Worker report_to_parent gets dropped for the same reason.

Suggested fix
-                _synthetic_names = {
-                    "ask_user",
-                    "escalate",
-                }
+                _synthetic_names = {
+                    "ask_user",
+                    "escalate",
+                    "set_output",
+                    "report_to_parent",
+                }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/framework/agent_loop/agent_loop.py` around lines 859 - 867, The dynamic
tools refresh in the agent loop currently only preserves tools named "ask_user"
and "escalate", causing synthetic tools like "set_output" and "report_to_parent"
(and any tool that carries node output semantics) to be dropped; update the
branch in agent_loop.py where ctx.dynamic_tools_provider is used (the block that
builds _synthetic_names and `synthetic = [t for t in tools if t.name in
_synthetic_names]`) to also preserve the "set_output" and "report_to_parent"
tools and any tool that has non-empty output_keys: either add "set_output" and
"report_to_parent" to the _synthetic_names set and/or extend the synthetic
selection to include tools with t.output_keys (or equivalent attribute) so those
synthetic/output-carrying tools are appended after refreshing via
ctx.dynamic_tools_provider().
🧹 Nitpick comments (8)
core/framework/storage/concurrent.py (2)

78-82: Filesystem side effects in __init__.

Performing mkdir during construction means any test, introspection, or dependency-injection wiring that instantiates ConcurrentStorage (e.g., with a temp/placeholder path) will now create directories on disk eagerly. If start() is the documented lifecycle entry point, consider moving directory creation there (or into a dedicated ensure_layout() method) so construction stays side-effect-free. Not a blocker, just worth noting given how __init__ is typically used in DI/testing contexts.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/framework/storage/concurrent.py` around lines 78 - 82, The constructor
of ConcurrentStorage currently creates directories (self.base_path.mkdir and
subdirectory mkdir calls) causing filesystem side effects during instantiation;
move that logic out of ConcurrentStorage.__init__ into the lifecycle method
ConcurrentStorage.start() (or create a new private method like
ConcurrentStorage.ensure_layout() that start() calls) so the class remains
side-effect-free at construction; update start() to call ensure_layout()/perform
the mkdirs and remove the mkdir calls from __init__, keeping behavior identical
otherwise.

203-206: Redundant base_path.mkdir call.

__init__ already creates base_path, runs/, and summaries/ (lines 79–81), and runs_dir.mkdir(parents=True, exist_ok=True) on line 206 will recreate any missing ancestors anyway. The added self.base_path.mkdir(...) on line 204 is redundant and can be dropped.

♻️ Proposed cleanup
         self._validate_key(run.id)
-        # Ensure base_path exists before creating runs/ subdirectory
-        self.base_path.mkdir(parents=True, exist_ok=True)
         runs_dir = self.base_path / "runs"
         runs_dir.mkdir(parents=True, exist_ok=True)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/framework/storage/concurrent.py` around lines 203 - 206, The extra call
to self.base_path.mkdir(parents=True, exist_ok=True) is redundant because
__init__ already creates base_path, runs, and summaries and
runs_dir.mkdir(parents=True, exist_ok=True) will create any missing ancestors;
remove the redundant self.base_path.mkdir(...) invocation from the block so only
runs_dir = self.base_path / "runs" followed by runs_dir.mkdir(parents=True,
exist_ok=True) remains, referencing the base_path attribute and runs_dir
variable in the current method.
core/framework/utils/io.py (1)

10-12: Defensive parent-dir creation may mask caller-side path mistakes.

Auto-creating path.parent is convenient but means any caller passing a typo'd/unexpected path will silently get a new directory tree instead of a clear FileNotFoundError. Given that ConcurrentStorage.__init__ and _save_run_sync already ensure the relevant directories exist, the additional safety here is largely redundant for the in-tree call sites. Consider either documenting this behavior in the docstring or limiting it to a single layer (exist_ok=True without parents=True) so deeply-misrouted writes still fail loudly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/framework/utils/io.py` around lines 10 - 12, The defensive creation of
deep parent directories via path.parent.mkdir(parents=True, exist_ok=True) can
mask caller mistakes; replace it with a single-level creation (remove
parents=True so it becomes path.parent.mkdir(exist_ok=True)) so deeply-misrouted
writes raise, and update the docstring for the containing function in
core/framework/utils/io.py to state that missing ancestor directories are not
auto-created and callers should rely on ConcurrentStorage.__init__ and
_save_run_sync to prepare directories; run tests to ensure callers still pass
existing parent paths.
core/framework/orchestrator/prompting.py (1)

208-217: Optional: extract the INPUT DATA renderer to avoid drift with agent_loop/prompting.py.

Identical (modulo formatting) logic now lives in both core/framework/agent_loop/prompting.py (lines 88-97) and here. They're two separate modules with their own PromptSpec/NodePromptSpec, so a small shared helper like render_input_data_block(input_data: dict) -> str | None in a neutral module would prevent future drift (e.g., if you change separators, ordering, or how None values are rendered, both call sites must be updated). Not a blocker.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/framework/orchestrator/prompting.py` around lines 208 - 217, Duplicate
logic building the "INPUT DATA" block exists in prompting.py (uses
spec.input_data and parts.append) and agent_loop/prompting.py; extract that
logic into a shared helper e.g., render_input_data_block(input_data: dict) ->
str | None in a neutral module (e.g., core.framework.prompting_utils) and
replace the inline loop in both locations to call this helper, ensuring it
returns None when no data and preserves current behavior (skips None values,
formats "INPUT DATA:" then "  - key: value" lines, and only appends when
non-empty).
examples/templates/sales_ops_agent/__main__.py (2)

227-227: Use asyncio.get_running_loop() inside a coroutine.

asyncio.get_event_loop() is deprecated when there is no current loop and emits a DeprecationWarning since Python 3.10; from inside a running coroutine you should use get_running_loop(). Better still, use asyncio.to_thread for blocking input.

🔧 Proposed fix
-                    command = await asyncio.get_event_loop().run_in_executor(None, input, "sales-ops> ")
+                    command = await asyncio.to_thread(input, "sales-ops> ")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/templates/sales_ops_agent/__main__.py` at line 227, The code uses
asyncio.get_event_loop().run_in_executor(None, input, "sales-ops> ") inside a
coroutine which can emit DeprecationWarnings; change this to either use
asyncio.get_running_loop().run_in_executor(None, input, "sales-ops> ") or,
preferably, replace the whole call with asyncio.to_thread(input, "sales-ops> ")
to run the blocking input call off the event loop — update the assignment to
command accordingly where the current run_in_executor call appears.

118-171: Refactor: reuse SalesOpsAgent._setup instead of re-implementing it.

run_with_tui re-creates storage path, tool registry, MCP loading, LLM, graph, and AgentHost — all of which SalesOpsAgent._setup/start() already do. Keeping two copies invites drift (the mock-mode behavior already drifted, see prior comment; the EntryPointSpec here also uses id="start"/isolation_level="isolated" while _setup registers id="default"/isolation_level="shared").

Consider exposing the runtime via agent.start(mock_mode=mock) and registering the TUI entry point through agent._agent_runtime.register_entry_point(...) (or adding a small helper on SalesOpsAgent). This keeps mock-mode handling, MCP skipping, checkpoint config, and tool discovery in one place.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/templates/sales_ops_agent/__main__.py` around lines 118 - 171, The
run_with_tui function duplicates SalesOpsAgent setup (storage, ToolRegistry,
MCP/tools load, LLM, graph, and AgentHost); refactor to call the agent's
existing setup/start helper instead: call agent.start(mock_mode=mock) or
agent._setup(...) to create and initialize the runtime, then register the TUI
entry point on the existing runtime via
agent._agent_runtime.register_entry_point(...) (or add a small helper on
SalesOpsAgent to register a TUI entry point) so mock-mode handling, MCP
skipping, checkpoint config, tool discovery, and entry_point metadata
(id/isolation_level) remain consistent and avoid drift.
examples/templates/sales_ops_agent/agent.py (2)

293-300: run() start/stop per invocation is fine for one-shot CLI; flag for repeated use.

Each call to run() performs full start() (incl. MCP server bootstrap) and stop(). For the __main__.py run command this is acceptable, but if anything programmatic ever calls default_agent.run(...) in a loop (tests, batch backfills) it will pay the MCP/checkpoint setup cost every iteration and may also hit the "cleanup timeouts" alluded to in _setup. Worth a docstring note that this is intended for single-shot use; for repeated execution, callers should start() once and use trigger_and_wait directly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/templates/sales_ops_agent/agent.py` around lines 293 - 300, The
run() method currently calls start() and stop() on every invocation which
bootstraps the MCP server and checkpointing (see run, start, stop,
trigger_and_wait, and _setup), causing heavy overhead and potential cleanup
timeouts when used repeatedly; update the run() docstring to clearly state it's
intended for single-shot CLI use and add guidance to callers to call start()
once and then use trigger_and_wait("default", ...) for repeated executions (or
provide an optional parameter like reuse_session or no_start to bypass
start/stop for programmatic loops) so repeated programmatic calls avoid
reinitializing MCP and checkpoint setup.

321-348: Consider delegating to GraphSpec.validate() to avoid drift.

The framework already validates edge endpoints, entry/terminal nodes, and entry-point references inside GraphSpec (per core/framework/orchestrator/edge.py). Re-implementing a subset here means: (a) any new validation rule added in the framework won’t apply to this agent until duplicated, and (b) easy-to-miss checks like duplicate node IDs and orphan/unreachable nodes are silently skipped.

Consider building the graph and delegating, e.g.:

def validate(self):
    graph = self._build_graph()
    result = graph.validate()  # or whatever the framework exposes
    return {"valid": not result.errors, "errors": result.errors, "warnings": result.warnings}

Falling back to the local checks only if the framework doesn’t expose a public validate() from GraphSpec.

Does GraphSpec (core/framework/orchestrator/edge.py) expose a public validate() method that returns errors/warnings? The relevant snippet only shows the class shape, not its validate implementation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/templates/sales_ops_agent/agent.py` around lines 321 - 348, The
validate method in agent.py re-implements graph validation logic locally causing
drift; change it to build the framework GraphSpec and delegate validation to its
public validate() (use self._build_graph() to construct the graph and call
graph.validate()), then map the returned result into {"valid": ..., "errors":
..., "warnings": ...}; if GraphSpec does not expose a public validate(), fall
back to the existing local checks but add any missing checks (duplicate node
IDs, orphan/unreachable nodes) so behavior matches framework validation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/framework/agent_loop/internals/cursor_persistence.py`:
- Around line 248-261: The injected trigger payload is persisting across loop
iterations because ctx.input_data is long-lived; in drain_trigger_queue you
should avoid mutating ctx.input_data with sticky keys. Either (A) remove
previously injected trigger keys before injecting this batch by tracking keys
injected in the prior drain and deleting them from ctx.input_data, or (B) stop
inlining keys altogether and place the whole batch under a dedicated namespace
(e.g., ctx.input_data["_trigger_payload"] or similar) and populate that
namespace with the current triggers payloads; update references to use that
namespace and ensure you only read from ctx.input_data for explicit user inputs,
not stale trigger fields. Ensure changes target the injection logic that
iterates over triggers and keys in cursor_persistence.drain_trigger_queue (ctx,
triggers, ctx.input_data).

In `@core/framework/agent_loop/internals/tests/test_cursor_persistence.py`:
- Around line 47-72: The test for drain_trigger_queue only asserts that
conflicting keys are not overwritten but misses verifying that non-conflicting
payload keys are merged; update the TriggerEvent payload to include at least one
new key (e.g., "new_key") and after calling drain_trigger_queue(assert count ==
1) also assert that ctx.input_data contains the new key with the payload value
while still preserving "current_date" unchanged. Reference: drain_trigger_queue,
TriggerEvent, and ctx.input_data to locate and modify the test.

In `@core/framework/agent_loop/prompting.py`:
- Line 25: The declared default for input_data in PromptSpec is an empty tuple
(input_data: dict[str, Any] = ()), causing type/behavior bugs; change it to use
a proper empty dict factory like field(default_factory=dict) (matching
NodePromptSpec) and remove the "# type: ignore[assignment]" so the type system
catches issues; update the PromptSpec definition in
core/framework/agent_loop/prompting.py (the input_data attribute) accordingly so
callers can safely call spec.input_data.items().

In `@core/framework/host/stream_runtime.py`:
- Around line 177-182: The except block in _save_run references a non-existent
private attribute self._storage._base_path which raises AttributeError; update
the logic to use the public attribute self._storage.base_path (or call a storage
API) when checking run_path existence (look for _save_run and usage of
self._storage in stream_runtime.py and ConcurrentStorage in
core/framework/storage/concurrent.py), and preferably move/signal this
suppression behavior into ConcurrentStorage (e.g., consolidate into _flush_batch
or a new helper) so callers like stream_runtime do not rely on internal storage
attributes.

In `@core/framework/storage/concurrent.py`:
- Around line 429-435: The current exception handler in _save_run_locked uses a
file-exists check (self.base_path / "runs" / f"{item.id}.json").exists() to
treat failures as harmless shutdown races, which can mask real write failures;
change the logic in the except block to only suppress logging for explicit
shutdown/cancellation exceptions (e.g., asyncio.CancelledError and known
loop-closed RuntimeError variants) and otherwise log the error as an actual
failure, and update the comment to explain that only cancellation/loop-close is
treated as a benign race; reference the symbols item_type, _save_run_locked, and
self.base_path / "runs" / f"{item.id}.json" to locate the handler to modify.

In `@examples/templates/sales_ops_agent/__main__.py`:
- Around line 220-267: The agent is started on a different (now-closed) event
loop; remove the top-level asyncio.run(agent.start()) and instead call await
agent.start() inside the _interactive_shell coroutine before entering the input
loop so SalesOpsAgent runtime (agent.start), its async primitives and background
tasks are created on the same loop used by asyncio.run(_interactive_shell());
keep the existing await agent.stop() in the finally block to shut down the same
runtime, and ensure all calls to agent.trigger_and_wait and agent.stop run on
that single event loop.
- Around line 133-161: The current TUI mock-mode sets llm=None and always loads
MCP, which diverges from agent.py::_setup; instead, when mock is True
instantiate MockLLMProvider (use MockLLMProvider() in place of LiteLLMProvider)
and pass that into AgentHost, and skip/unload the MCP configuration loading when
mock is True (mirror the conditional behavior around MCP in agent.py::_setup) so
AgentHost/graph nodes receive a proper mock LLM and MCP is not loaded in mock
mode.

In `@examples/templates/sales_ops_agent/mcp_servers.json`:
- Around line 11-15: Update the template README to document that the
mcp_servers.json template sets the environment flag
INCLUDE_UNVERIFIED_TOOLS=true (used by tools/mcp_server.py) and explain the
security posture and rationale: note that this enables unverified/community tool
integrations, the associated risks (expanded tool surface and potential unvetted
code), and the reason it was enabled for this template (e.g., "Required for
Salesforce support"); also recommend when to change or remove the flag for
production use and link to any mitigation steps or review guidance.

In `@examples/templates/sales_ops_agent/nodes/__init__.py`:
- Around line 17-18: The trigger_node currently emits is_first_of_month but
monitor_node never consumes or checks it, so the graph continues
unconditionally; update monitor_node to accept is_first_of_month (add
"is_first_of_month" to monitor_node.input_keys) and modify monitor_node's logic
(in its handler/function) to gate downstream actions by returning/continuing
only when is_first_of_month is truthy, and ensure the graph wiring connects
trigger_node's output to monitor_node's input so the emitted value is passed
through.
- Around line 66-72: The demo branch sets outputs via load_demo_sales_data and
set_output but downstream stages still hardcode filenames, so update the data
flow to propagate the actual filenames: ensure load_demo_sales_data is called
(with its retry + escalate behavior) and set_output("sales_data",
sales_reps_file) and set_output("unassigned_pool", unassigned_accounts_file) (or
otherwise set the values to the exact filenames you get from
load_demo_sales_data) and make downstream nodes read those values via input_keys
rather than hardcoded "sales_data.jsonl"/"unassigned_pool.jsonl"; locate
references to set_output, load_demo_sales_data, input_keys and the downstream
prompt/reader logic and change the hardcoded names to use the propagated keys.

In `@examples/templates/sales_ops_agent/README.md`:
- Around line 23-30: The README instructs to "cd
examples/templates/sales_ops_agent" then run "python -m sales_ops_agent …",
which fails because Python will look for a nested sales_ops_agent package; fix
by either removing the cd and instructing users to run the module from the
parent directory (e.g., run the shown "python -m sales_ops_agent run --crm-type
demo" from examples/templates or the repository root where sales_ops_agent is
importable) or change the cd to the parent folder (cd examples/templates) before
running the "python -m sales_ops_agent ..." commands so the module can be found.

In `@examples/templates/sales_ops_agent/tools.py`:
- Around line 125-126: The code constructs file_path by joining user-supplied
filename to data_dir (data_dir = _get_data_dir(); file_path = Path(data_dir) /
filename) which allows path traversal; update load_data, append_data, and
load_demo_sales_data to resolve the candidate path (e.g., use
Path(...).resolve()) and verify it is inside the session workspace by checking
that the resolved path is relative to the resolved data_dir (e.g., using
resolved_file_path.relative_to(resolved_data_dir) or equivalent); if the check
fails, reject the request by raising an exception (ValueError/PermissionError)
and do not perform any file read/write. Ensure the same protection is applied to
the other occurrences noted (lines ~163-166 and 195-199) that build file paths
from model-controlled input.
- Around line 22-38: The load_data Tool currently always returns the first 10
records but its parameters/schema do not expose any pagination inputs, so add
pagination inputs (e.g., "offset" or "cursor" and optionally "limit") to the
Tool parameters and update the handler for load_data to accept those inputs,
read the file slice starting at the given offset/cursor, return the page of
records and a next cursor/next_offset (or null) plus the existing has_more flag;
update the Tool schema properties (e.g., "filename", "offset"/"cursor", "limit")
and the response structure (e.g., include "next_offset" or "next_cursor") so
callers can request subsequent pages—apply the same change to the other
occurrence referenced (lines 139-145) to keep pagination consistent.

---

Outside diff comments:
In `@core/framework/agent_loop/agent_loop.py`:
- Around line 859-867: The dynamic tools refresh in the agent loop currently
only preserves tools named "ask_user" and "escalate", causing synthetic tools
like "set_output" and "report_to_parent" (and any tool that carries node output
semantics) to be dropped; update the branch in agent_loop.py where
ctx.dynamic_tools_provider is used (the block that builds _synthetic_names and
`synthetic = [t for t in tools if t.name in _synthetic_names]`) to also preserve
the "set_output" and "report_to_parent" tools and any tool that has non-empty
output_keys: either add "set_output" and "report_to_parent" to the
_synthetic_names set and/or extend the synthetic selection to include tools with
t.output_keys (or equivalent attribute) so those synthetic/output-carrying tools
are appended after refreshing via ctx.dynamic_tools_provider().

In `@core/framework/orchestrator/context.py`:
- Around line 158-179: Update the docstring to remove the statement that
framework-default tools (`_ALWAYS_AVAILABLE_TOOLS`) are always included
regardless of policy and clearly document the three policies: "explicit" (tools
in `node_spec.tools` plus `_ALWAYS_AVAILABLE_TOOLS`), "none" (only
`_ALWAYS_AVAILABLE_TOOLS`), and "minimal" (only `_MINIMAL_TOOLS`, excluding
`_ALWAYS_AVAILABLE_TOOLS` such as file I/O). Mention that `override_tools` still
merges with `_ALWAYS_AVAILABLE_TOOLS` (deduped by name) and that the `minimal`
policy short-circuits to return only `_MINIMAL_TOOLS`.

---

Nitpick comments:
In `@core/framework/orchestrator/prompting.py`:
- Around line 208-217: Duplicate logic building the "INPUT DATA" block exists in
prompting.py (uses spec.input_data and parts.append) and
agent_loop/prompting.py; extract that logic into a shared helper e.g.,
render_input_data_block(input_data: dict) -> str | None in a neutral module
(e.g., core.framework.prompting_utils) and replace the inline loop in both
locations to call this helper, ensuring it returns None when no data and
preserves current behavior (skips None values, formats "INPUT DATA:" then "  -
key: value" lines, and only appends when non-empty).

In `@core/framework/storage/concurrent.py`:
- Around line 78-82: The constructor of ConcurrentStorage currently creates
directories (self.base_path.mkdir and subdirectory mkdir calls) causing
filesystem side effects during instantiation; move that logic out of
ConcurrentStorage.__init__ into the lifecycle method ConcurrentStorage.start()
(or create a new private method like ConcurrentStorage.ensure_layout() that
start() calls) so the class remains side-effect-free at construction; update
start() to call ensure_layout()/perform the mkdirs and remove the mkdir calls
from __init__, keeping behavior identical otherwise.
- Around line 203-206: The extra call to self.base_path.mkdir(parents=True,
exist_ok=True) is redundant because __init__ already creates base_path, runs,
and summaries and runs_dir.mkdir(parents=True, exist_ok=True) will create any
missing ancestors; remove the redundant self.base_path.mkdir(...) invocation
from the block so only runs_dir = self.base_path / "runs" followed by
runs_dir.mkdir(parents=True, exist_ok=True) remains, referencing the base_path
attribute and runs_dir variable in the current method.

In `@core/framework/utils/io.py`:
- Around line 10-12: The defensive creation of deep parent directories via
path.parent.mkdir(parents=True, exist_ok=True) can mask caller mistakes; replace
it with a single-level creation (remove parents=True so it becomes
path.parent.mkdir(exist_ok=True)) so deeply-misrouted writes raise, and update
the docstring for the containing function in core/framework/utils/io.py to state
that missing ancestor directories are not auto-created and callers should rely
on ConcurrentStorage.__init__ and _save_run_sync to prepare directories; run
tests to ensure callers still pass existing parent paths.

In `@examples/templates/sales_ops_agent/__main__.py`:
- Line 227: The code uses asyncio.get_event_loop().run_in_executor(None, input,
"sales-ops> ") inside a coroutine which can emit DeprecationWarnings; change
this to either use asyncio.get_running_loop().run_in_executor(None, input,
"sales-ops> ") or, preferably, replace the whole call with
asyncio.to_thread(input, "sales-ops> ") to run the blocking input call off the
event loop — update the assignment to command accordingly where the current
run_in_executor call appears.
- Around line 118-171: The run_with_tui function duplicates SalesOpsAgent setup
(storage, ToolRegistry, MCP/tools load, LLM, graph, and AgentHost); refactor to
call the agent's existing setup/start helper instead: call
agent.start(mock_mode=mock) or agent._setup(...) to create and initialize the
runtime, then register the TUI entry point on the existing runtime via
agent._agent_runtime.register_entry_point(...) (or add a small helper on
SalesOpsAgent to register a TUI entry point) so mock-mode handling, MCP
skipping, checkpoint config, tool discovery, and entry_point metadata
(id/isolation_level) remain consistent and avoid drift.

In `@examples/templates/sales_ops_agent/agent.py`:
- Around line 293-300: The run() method currently calls start() and stop() on
every invocation which bootstraps the MCP server and checkpointing (see run,
start, stop, trigger_and_wait, and _setup), causing heavy overhead and potential
cleanup timeouts when used repeatedly; update the run() docstring to clearly
state it's intended for single-shot CLI use and add guidance to callers to call
start() once and then use trigger_and_wait("default", ...) for repeated
executions (or provide an optional parameter like reuse_session or no_start to
bypass start/stop for programmatic loops) so repeated programmatic calls avoid
reinitializing MCP and checkpoint setup.
- Around line 321-348: The validate method in agent.py re-implements graph
validation logic locally causing drift; change it to build the framework
GraphSpec and delegate validation to its public validate() (use
self._build_graph() to construct the graph and call graph.validate()), then map
the returned result into {"valid": ..., "errors": ..., "warnings": ...}; if
GraphSpec does not expose a public validate(), fall back to the existing local
checks but add any missing checks (duplicate node IDs, orphan/unreachable nodes)
so behavior matches framework validation.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 60e3499e-a134-4054-b952-a02c7a98ff7e

📥 Commits

Reviewing files that changed from the base of the PR and between 445c960 and fa03c93.

📒 Files selected for processing (23)
  • .gitignore
  • core/framework/agent_loop/agent_loop.py
  • core/framework/agent_loop/internals/cursor_persistence.py
  • core/framework/agent_loop/internals/tests/__init__.py
  • core/framework/agent_loop/internals/tests/test_cursor_persistence.py
  • core/framework/agent_loop/prompting.py
  • core/framework/host/stream_runtime.py
  • core/framework/orchestrator/context.py
  • core/framework/orchestrator/prompting.py
  • core/framework/storage/concurrent.py
  • core/framework/utils/io.py
  • examples/templates/sales_ops_agent/README.md
  • examples/templates/sales_ops_agent/__init__.py
  • examples/templates/sales_ops_agent/__main__.py
  • examples/templates/sales_ops_agent/agent.json
  • examples/templates/sales_ops_agent/agent.py
  • examples/templates/sales_ops_agent/config.py
  • examples/templates/sales_ops_agent/demo_data.json
  • examples/templates/sales_ops_agent/flowchart.json
  • examples/templates/sales_ops_agent/mcp_servers.json
  • examples/templates/sales_ops_agent/nodes/__init__.py
  • examples/templates/sales_ops_agent/tools.py
  • examples/templates/sales_ops_agent/triggers.json

Comment thread core/framework/agent_loop/internals/cursor_persistence.py
Comment thread core/framework/agent_loop/prompting.py Outdated
Comment thread core/framework/host/stream_runtime.py
Comment thread core/framework/storage/concurrent.py
Comment thread examples/templates/sales_ops_agent/nodes/__init__.py
Comment thread examples/templates/sales_ops_agent/nodes/__init__.py
Comment thread examples/templates/sales_ops_agent/README.md
Comment thread examples/templates/sales_ops_agent/tools.py
Comment thread examples/templates/sales_ops_agent/tools.py Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/framework/agent_loop/agent_loop.py`:
- Around line 3000-3015: The debug log can raise TypeError when value is
non-subscriptable; fix by computing a safe preview before logging and move the
log after side-effects: after await accumulator.set(key, value) append
outputs_set_this_turn and assign results_by_id[tc.tool_use_id] = result, then
compute preview = value[:100] if isinstance(value, str) else
(json.dumps(value)[:100] if value is not None else None) (import json if needed)
or simply preview = str(value)[:100]; finally call logger.debug("[%s]
set_output: %s = %s", node_id, key, preview) so slicing won’t run on non-strings
and the accumulator/results remain in sync if logging fails.
- Around line 670-673: The type mismatch arises because
ctx.agent_spec.output_keys is tuple[str, ...] but _build_set_output_tool and
handle_set_output (and the equivalent build_set_output_tool/handle_set_output in
synthetic_tools.py) are annotated as list[str] | None; update their parameter
annotations to accept both list and tuple (e.g., list[str] | tuple[str, ...] |
None or more general Sequence[str] | None) and adjust any internal typing uses
accordingly so callers passing ctx.agent_spec.output_keys no longer produce type
errors for _build_set_output_tool and handle_set_output.
- Line 105: Update the type annotation for _drain_trigger_queue to accept
AgentContext | None instead of NodeContext | None and remove the now-unused
import of NodeContext; specifically, change the function signature of
_drain_trigger_queue to use AgentContext (matching _execute_impl which passes
ctx: AgentContext and consistent with _drain_injection_queue) and delete the
import line that brings in NodeContext from framework.orchestrator.node.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 371e4930-2c54-4eeb-b2a3-5b8ffd16cfa6

📥 Commits

Reviewing files that changed from the base of the PR and between fa03c93 and 04ee994.

📒 Files selected for processing (1)
  • core/framework/agent_loop/agent_loop.py

Comment thread core/framework/agent_loop/agent_loop.py Outdated
Comment thread core/framework/agent_loop/agent_loop.py
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🧹 Nitpick comments (4)
core/framework/host/stream_runtime.py (1)

177-182: Prior _base_path typo addressed; consider centralizing the suppression in storage.

The attribute is now correctly self._storage.base_path (public), so the previously flagged AttributeError is resolved and the suppression logic is functionally correct.

That said, this duplicates the same shutdown-race check already in ConcurrentStorage._flush_batch (per core/framework/storage/concurrent.py:422-445). Since save_run() exceptions only propagate here on the immediate/shutdown path, you could consolidate by exposing a small helper on ConcurrentStorage (e.g., run_exists(run_id) or a save_run variant that handles the race internally) so callers don't need to know the on-disk layout (runs/{id}.json). Not blocking — this is a maintainability nudge to keep filesystem layout knowledge inside the storage class.

♻️ Optional refactor sketch

In core/framework/storage/concurrent.py:

def run_exists(self, run_id: str) -> bool:
    return (self.base_path / "runs" / f"{run_id}.json").exists()

Then here:

         except Exception as e:
             # Suppress error if file actually exists (shutdown race condition)
-            run_path = self._storage.base_path / "runs" / f"{run.id}.json"
-            if run_path.exists():
+            if self._storage.run_exists(run.id):
                 logger.debug(f"Run {run.id} saved despite exception (shutdown race): {e}")
             else:
                 logger.error(f"Failed to save run {run.id}: {e}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/framework/host/stream_runtime.py` around lines 177 - 182, The
shutdown-race suppression logic in stream_runtime.save_run (using
self._storage.base_path / "runs" / f"{run.id}.json) duplicates functionality in
ConcurrentStorage._flush_batch; add a small helper on the storage class (e.g.,
ConcurrentStorage.run_exists or a storage.save_run variant that handles the
race) and call that from stream_runtime instead of inspecting base_path
directly—move the path knowledge and exists-check into ConcurrentStorage
(referencing ConcurrentStorage._flush_batch for the existing pattern and
stream_runtime.save_run which currently does the direct base_path check).
examples/templates/sales_ops_agent/README.md (1)

219-235: Clarify that Salesforce/HubSpot tools come from MCP, not from tools.py.

The "Tools Required" section lists salesforce_*/hubspot_* alongside load_data/append_data, which can read like they're part of this template's tools.py. They actually arrive via mcp_servers.json (the very reason for INCLUDE_UNVERIFIED_TOOLS=true mentioned in the security note). A one-line callout — e.g. "Provided by the Hive Tools MCP server; loaded via mcp_servers.json" — would prevent confusion when readers grep tools.py and don't find them. Also worth listing load_demo_sales_data and demo_log_action here for completeness.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/templates/sales_ops_agent/README.md` around lines 219 - 235, The
Tools Required list is ambiguous about the source of the salesforce_* and
hubspot_* tools; update the README’s "Tools Required" block to add a one-line
callout that those tools are provided by the Hive Tools MCP server and are
loaded via mcp_servers.json (i.e., not defined in this template's tools.py), and
also add the missing demo utility tools load_demo_sales_data and demo_log_action
to the Data Management section so readers know those are available too.
examples/templates/sales_ops_agent/__main__.py (1)

107-111: Send the install hint to stderr.

The tui failure path mixes the user-facing error into stdout; downstream callers parsing --quiet JSON output (or piping info --json) get a cleaner contract if errors go to stderr.

📝 Proposed fix
-    try:
-        from framework.tui.app import AdenTUI
-    except ImportError:
-        click.echo("TUI requires the 'textual' package. Install with: pip install textual")
-        sys.exit(1)
+    try:
+        from framework.tui.app import AdenTUI
+    except ImportError:
+        click.echo("TUI requires the 'textual' package. Install with: pip install textual", err=True)
+        sys.exit(1)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/templates/sales_ops_agent/__main__.py` around lines 107 - 111, The
current ImportError handler for importing AdenTUI prints the install hint to
stdout; change it to write to stderr so errors don't contaminate stdout
JSON/quiet output—locate the try/except that imports AdenTUI in __main__.py and
replace the click.echo call in the except block to emit to stderr (e.g., use
click.echo with err=True or write to sys.stderr) and keep the same exit(1)
behavior.
examples/templates/sales_ops_agent/tools.py (1)

145-148: Lost exception cause when re-raising.

Raising a new ValueError inside an except ValueError clause without from e (or from None) makes debugging path-traversal failures harder by hiding the original relative_to failure context.

📝 Proposed fix
-    try:
-        resolved_path.relative_to(data_dir_path)
-    except ValueError:
-        raise ValueError(
-            f"Path traversal detected: '{filename}' resolves outside the data directory. "
-            f"Only files within {data_dir_path} are accessible."
-        )
+    try:
+        resolved_path.relative_to(data_dir_path)
+    except ValueError as e:
+        raise ValueError(
+            f"Path traversal detected: '{filename}' resolves outside the data directory. "
+            f"Only files within {data_dir_path} are accessible."
+        ) from e
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/templates/sales_ops_agent/tools.py` around lines 145 - 148, The
re-raise of ValueError when detecting path traversal currently loses the
original exception context; update the except block that handles the relative_to
failure so the new ValueError is raised "from e" (i.e., preserve the original
exception) — locate the block that references filename and data_dir_path and
change the raise to include the original exception (use except ValueError as e
and raise ValueError(... ) from e) so the original relative_to traceback is
retained for debugging.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/framework/agent_loop/agent_loop.py`:
- Around line 669-672: The initial add of set_output via
self._build_set_output_tool(ctx.agent_spec.output_keys) is lost when the
ctx.dynamic_tools_provider branch rebuilds the tools list, so update the
dynamic-tools refresh logic to preserve the set_output tool: when you rebuild
tools in the ctx.dynamic_tools_provider branch (where ask_user and escalate are
currently retained), detect an existing set_output tool (or recreate it via
_build_set_output_tool with ctx.agent_spec.output_keys) and include it in the
rebuilt tools list alongside ask_user and escalate so nodes keep the ability to
emit required outputs mid-run.
- Around line 2999-3017: The set_output branch currently records a successful
ToolResult but isn't marked synthetic, which causes downstream logic that
filters "real tools" (used by doom-loop and natural-stop checks) to treat
set_output as MCP work; update the set_output handling in agent_loop.py so the
ToolResult stored in results_by_id is explicitly marked synthetic (e.g.,
add/assign a synthetic flag or set a field like synthetic=True on the ToolResult
or an equivalent marker on the tool use entry) so downstream filters that
inspect ToolResult (or results_by_id) will exclude set_output as a real tool
everywhere; ensure the synthetic marker is set for tc.tool_name == "set_output"
before any downstream logging or side-effects (references: the set_output
branch, ToolResult creation, results_by_id, outputs_set_this_turn).

In `@core/framework/agent_loop/prompting.py`:
- Around line 91-97: The code appends raw spec.input_data values into the system
prompt, allowing untrusted multiline/instruction-like text to break out of the
bullet and affect the prompt; change the loop that builds input_lines (the block
using spec.input_data, input_lines, and parts) to serialize each value as data
rather than interpolating raw text — for example, replace f"  - {key}: {value}"
with a quoted/escaped representation (JSON-encode or use repr-style escaping
that preserves newlines as \n and wraps the value in quotes) so values are
treated as data tokens and cannot inject prompt instructions or newlines.

In `@examples/templates/sales_ops_agent/__main__.py`:
- Around line 149-165: The code incorrectly passes an unsupported entry_points
kwarg to AgentHost.__init__ and also creates an entry-point mismatch
(id="start", isolation_level="isolated") versus the agent's registered entry
point (id="default", isolation_level="shared"); replace the manual AgentHost
construction by invoking agent.start(mock_mode=mock) to perform the correct
setup, then hand agent._agent_runtime into AdenTUI (or the TUI runner) and
ensure you call agent.stop() on shutdown to clean up—this removes the invalid
entry_points parameter and guarantees the entry point IDs/isolation levels match
the agent._setup registration.

In `@examples/templates/sales_ops_agent/README.md`:
- Around line 11-13: The three fenced code blocks in README.md (the pipeline
arrow block containing "Trigger → Monitor → Analyze → Rebalance → Log", the
data-flow diagram block starting with "current_date", and the ASCII architecture
box block starting with the box-drawing characters) lack a language hint; update
each opening triple-backtick to include the language tag "text" (i.e.,
"```text") so markdownlint MD040 is satisfied and the ASCII art is rendered with
the proper language hint.

In `@examples/templates/sales_ops_agent/tools.py`:
- Around line 112-117: The RuntimeError message in _get_data_dir currently
contains a grammar/jargon issue; update the raised message to use "an
Orchestrator" and prefer lowercase/internal wording (e.g., "orchestrator") so it
reads something like: "data_dir not set in execution context. Is the tool
running inside an orchestrator?" Keep this change only in the RuntimeError
raised by the _get_data_dir function.
- Around line 220-234: The current append_data implementation reopens and
re-reads the entire file to compute total_records causing O(n²) behavior on
repeated appends; instead remove the second file open/read and stop returning
total_records (or alternatively maintain an external counter), so update the
function (append_data) to write the JSON line to file_path and return
{"success": True, "filename": filename} (remove the count variable and the block
that checks file_path.exists() and iterates lines) and keep the existing
exception handling unchanged.
- Around line 87-92: The tool description for demo_log_action incorrectly states
it writes to demo_rebalance_log.json; update the description string in the Tool
instantiation (symbol: demo_log_action) to reference demo_rebalance_log.jsonl to
match the actual function that appends logs (symbol: _demo_log_action) so LLMs
and schema consumers see the correct file extension. Ensure only the
human-readable description text is changed and remains consistent with the
implementation.
- Around line 308-321: The audit log writes a naive local timestamp using an
inner import; change it to a timezone-aware UTC timestamp and hoist the import:
remove the in-function "from datetime import datetime" and instead import
datetime and timezone at module level (e.g., from datetime import datetime,
timezone), and when adding the timestamp to action_data use
datetime.now(timezone.utc).isoformat() (ensure this happens in the block where
action_data is populated and written to log_file).

---

Nitpick comments:
In `@core/framework/host/stream_runtime.py`:
- Around line 177-182: The shutdown-race suppression logic in
stream_runtime.save_run (using self._storage.base_path / "runs" /
f"{run.id}.json) duplicates functionality in ConcurrentStorage._flush_batch; add
a small helper on the storage class (e.g., ConcurrentStorage.run_exists or a
storage.save_run variant that handles the race) and call that from
stream_runtime instead of inspecting base_path directly—move the path knowledge
and exists-check into ConcurrentStorage (referencing
ConcurrentStorage._flush_batch for the existing pattern and
stream_runtime.save_run which currently does the direct base_path check).

In `@examples/templates/sales_ops_agent/__main__.py`:
- Around line 107-111: The current ImportError handler for importing AdenTUI
prints the install hint to stdout; change it to write to stderr so errors don't
contaminate stdout JSON/quiet output—locate the try/except that imports AdenTUI
in __main__.py and replace the click.echo call in the except block to emit to
stderr (e.g., use click.echo with err=True or write to sys.stderr) and keep the
same exit(1) behavior.

In `@examples/templates/sales_ops_agent/README.md`:
- Around line 219-235: The Tools Required list is ambiguous about the source of
the salesforce_* and hubspot_* tools; update the README’s "Tools Required" block
to add a one-line callout that those tools are provided by the Hive Tools MCP
server and are loaded via mcp_servers.json (i.e., not defined in this template's
tools.py), and also add the missing demo utility tools load_demo_sales_data and
demo_log_action to the Data Management section so readers know those are
available too.

In `@examples/templates/sales_ops_agent/tools.py`:
- Around line 145-148: The re-raise of ValueError when detecting path traversal
currently loses the original exception context; update the except block that
handles the relative_to failure so the new ValueError is raised "from e" (i.e.,
preserve the original exception) — locate the block that references filename and
data_dir_path and change the raise to include the original exception (use except
ValueError as e and raise ValueError(... ) from e) so the original relative_to
traceback is retained for debugging.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6c572166-4818-4e78-9e0c-11f09e59507b

📥 Commits

Reviewing files that changed from the base of the PR and between 04ee994 and 99ff582.

📒 Files selected for processing (11)
  • core/framework/agent_loop/agent_loop.py
  • core/framework/agent_loop/internals/cursor_persistence.py
  • core/framework/agent_loop/internals/synthetic_tools.py
  • core/framework/agent_loop/internals/tests/test_cursor_persistence.py
  • core/framework/agent_loop/prompting.py
  • core/framework/host/stream_runtime.py
  • core/framework/storage/concurrent.py
  • examples/templates/sales_ops_agent/README.md
  • examples/templates/sales_ops_agent/__main__.py
  • examples/templates/sales_ops_agent/nodes/__init__.py
  • examples/templates/sales_ops_agent/tools.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • core/framework/storage/concurrent.py
  • examples/templates/sales_ops_agent/nodes/init.py
  • core/framework/agent_loop/internals/cursor_persistence.py

Comment thread core/framework/agent_loop/agent_loop.py
Comment thread core/framework/agent_loop/agent_loop.py
Comment thread core/framework/agent_loop/prompting.py
Comment thread examples/templates/sales_ops_agent/__main__.py Outdated
Comment thread examples/templates/sales_ops_agent/README.md
Comment thread examples/templates/sales_ops_agent/tools.py
Comment thread examples/templates/sales_ops_agent/tools.py
Comment thread examples/templates/sales_ops_agent/tools.py
Comment thread examples/templates/sales_ops_agent/tools.py
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/framework/agent_loop/prompting.py (1)

67-80: ⚠️ Potential issue | 🟠 Major

Only lift declared inputs into the system prompt.

ctx.input_data is a shared runtime buffer, not a prompt-safe contract. Copying it wholesale into PromptSpec upgrades arbitrary upstream/trigger data to system-message authority and duplicates the same payloads that AgentLoop._build_initial_message() already injects as user content. Please restrict this to ctx.agent_spec.input_keys (or another explicit whitelist) instead of the full buffer.

♻️ Proposed fix
-    # Collect input_data for injection into system prompt
-    input_data = dict(getattr(ctx, "input_data", None) or {})
+    # Only promote declared prompt inputs to system-message scope.
+    raw_input_data = dict(getattr(ctx, "input_data", None) or {})
+    allowed_input_keys = set(getattr(ctx.agent_spec, "input_keys", ()) or ())
+    input_data = {key: raw_input_data[key] for key in allowed_input_keys if key in raw_input_data}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/framework/agent_loop/prompting.py` around lines 67 - 80, The current
code lifts the entire runtime buffer ctx.input_data into the PromptSpec, which
is unsafe; modify the construction of input_data passed to PromptSpec so it only
includes declared/whitelisted keys from ctx.agent_spec.input_keys (or an
explicit list) rather than the full buffer — i.e., read ctx.input_data safely
(default {}), then build a new dict by iterating over ctx.agent_spec.input_keys
and copying only those keys present, and pass that filtered dict into
PromptSpec.input_data; keep AgentLoop._build_initial_message() behavior
unchanged.
🧹 Nitpick comments (2)
examples/templates/sales_ops_agent/__main__.py (2)

158-165: shell lacks a --mock flag, unlike run and tui.

run and tui both expose --mock, but shell always calls agent.start() with the real LLM/MCP path. For a "demo" crm_type (the default here) users will likely want to also exercise mock LLM behavior without setting up API keys. Consider adding a --mock flag and forwarding it as agent.start(mock_mode=mock) for parity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/templates/sales_ops_agent/__main__.py` around lines 158 - 165, Add a
--mock flag to the shell command and forward it to agent.start so shell can run
the demo with mock LLM/MCP like run and tui; modify the click options on the
shell function to include a boolean click.option("--mock", is_flag=True)
parameter and pass that boolean into the call to agent.start as
agent.start(mock_mode=mock) (refer to the shell function and the agent.start
invocation to locate where to add and forward the flag).

178-178: Use asyncio.get_running_loop() instead of asyncio.get_event_loop() in coroutines.

asyncio.get_running_loop() is the recommended approach for code guaranteed to run within a coroutine. It clearly signals intent and will raise an error immediately if called outside an async context, rather than potentially returning an unexpected event loop.

♻️ Proposed change
-                    command = await asyncio.get_event_loop().run_in_executor(None, input, "sales-ops> ")
+                    command = await asyncio.get_running_loop().run_in_executor(None, input, "sales-ops> ")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/templates/sales_ops_agent/__main__.py` at line 178, Replace the use
of asyncio.get_event_loop() with asyncio.get_running_loop() when calling
run_in_executor in the async input loop; locate the line that assigns to command
using asyncio.get_event_loop().run_in_executor and update it to call
asyncio.get_running_loop().run_in_executor so the call enforces being inside a
running coroutine and fails fast outside async contexts.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/framework/agent_loop/agent_loop.py`:
- Around line 3009-3019: The code persists outputs via await
accumulator.set(...) but never notifies listeners; after confirming success
(inside the if not result.is_error block, right after
outputs_set_this_turn.append(key)), call the existing _publish_output_key_set
helper to emit an OUTPUT_KEY_SET event for that key; pass the node identifier
(node_id), the key, and a safe/preview value (e.g., the same preview string
already computed or value truncated) and await the call so listeners receive the
per-key update as soon as set_output succeeds (keep this call colocated with
accumulator.set and before logging).

In `@examples/templates/sales_ops_agent/__main__.py`:
- Around line 106-110: The ImportError handler for the TUI tries to import
AdenTUI but prints the missing-dependency message to stdout; change the
click.echo call in the except block (the try/except around "from
framework.tui.app import AdenTUI") to write to stderr by passing err=True so the
message is printed to stderr and is consistent with other error paths and run's
behavior, then exit as before.

---

Outside diff comments:
In `@core/framework/agent_loop/prompting.py`:
- Around line 67-80: The current code lifts the entire runtime buffer
ctx.input_data into the PromptSpec, which is unsafe; modify the construction of
input_data passed to PromptSpec so it only includes declared/whitelisted keys
from ctx.agent_spec.input_keys (or an explicit list) rather than the full buffer
— i.e., read ctx.input_data safely (default {}), then build a new dict by
iterating over ctx.agent_spec.input_keys and copying only those keys present,
and pass that filtered dict into PromptSpec.input_data; keep
AgentLoop._build_initial_message() behavior unchanged.

---

Nitpick comments:
In `@examples/templates/sales_ops_agent/__main__.py`:
- Around line 158-165: Add a --mock flag to the shell command and forward it to
agent.start so shell can run the demo with mock LLM/MCP like run and tui; modify
the click options on the shell function to include a boolean
click.option("--mock", is_flag=True) parameter and pass that boolean into the
call to agent.start as agent.start(mock_mode=mock) (refer to the shell function
and the agent.start invocation to locate where to add and forward the flag).
- Line 178: Replace the use of asyncio.get_event_loop() with
asyncio.get_running_loop() when calling run_in_executor in the async input loop;
locate the line that assigns to command using
asyncio.get_event_loop().run_in_executor and update it to call
asyncio.get_running_loop().run_in_executor so the call enforces being inside a
running coroutine and fails fast outside async contexts.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 20cab4d4-0c44-4365-b170-09dbc5582fe9

📥 Commits

Reviewing files that changed from the base of the PR and between 99ff582 and b48c773.

📒 Files selected for processing (4)
  • core/framework/agent_loop/agent_loop.py
  • core/framework/agent_loop/prompting.py
  • examples/templates/sales_ops_agent/__main__.py
  • examples/templates/sales_ops_agent/tools.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • examples/templates/sales_ops_agent/tools.py

Comment thread core/framework/agent_loop/agent_loop.py
Comment thread examples/templates/sales_ops_agent/__main__.py
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
core/framework/agent_loop/agent_loop.py (1)

3001-3025: Set_output handling looks correct; minor cleanup opportunity.

All previously flagged concerns are addressed: safe str(value)[:100] preview, log moved after side-effects, _publish_output_key_set emitted, results_by_id populated regardless of error, and set_output is now treated as synthetic in downstream filters.

Two minor refactor opportunities (non-blocking):

  1. The two consecutive if not result.is_error: blocks separated by results_by_id[...] = result rely on Python's lack of block scoping — value/key defined in the first block are referenced in the second. This is correct today but fragile to future edits. Consider consolidating.
  2. The result = ToolResult(...) rebuild at L3004–3008 only exists to attach tool_use_id; consider having handle_set_output accept tool_use_id (or set it on the returned ToolResult) so the rewrap can be dropped here and at the other synthetic call sites.
♻️ Proposed local consolidation
                 if tc.tool_name == "set_output":
                     # Handle set_output: validate and store in accumulator
-                    result = handle_set_output(tc.tool_input, ctx.agent_spec.output_keys)
-                    result = ToolResult(
+                    _raw_result = handle_set_output(tc.tool_input, ctx.agent_spec.output_keys)
+                    result = ToolResult(
                         tool_use_id=tc.tool_use_id,
-                        content=result.content,
-                        is_error=result.is_error,
+                        content=_raw_result.content,
+                        is_error=_raw_result.is_error,
                     )
+                    results_by_id[tc.tool_use_id] = result
                     if not result.is_error:
-                        # Set the output in the accumulator
                         key = tc.tool_input.get("key", "")
                         value = tc.tool_input.get("value", "")
                         await accumulator.set(key, value)
                         outputs_set_this_turn.append(key)
                         await self._publish_output_key_set(
                             stream_id,
                             node_id,
                             key,
                             execution_id,
                         )
-                    results_by_id[tc.tool_use_id] = result
-                    # Log after side-effects - compute safe preview for non-string values
-                    if not result.is_error:
                         preview = str(value)[:100] if value is not None else None
                         logger.debug("[%s] set_output: %s = %s", node_id, key, preview)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/framework/agent_loop/agent_loop.py` around lines 3001 - 3025,
Consolidate the two separate "if not result.is_error:" blocks around the
set_output handling to avoid relying on outer scope variables: after calling
handle_set_output(tc.tool_input, ctx.agent_spec.output_keys) immediately set
key/value from tc.tool_input, perform await accumulator.set(key, value), append
to outputs_set_this_turn, call await self._publish_output_key_set(...), populate
results_by_id[tc.tool_use_id] and then compute the safe preview and logger.debug
in the same block; additionally eliminate the Manual rewrap of ToolResult by
either updating handle_set_output to accept tool_use_id and return a ToolResult
with tool_use_id set, or have handle_set_output return a mutable result on which
you set result.tool_use_id = tc.tool_use_id before using it (so you can remove
the ToolResult(...) reconstruction at L3004–3008).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@core/framework/agent_loop/agent_loop.py`:
- Around line 3001-3025: Consolidate the two separate "if not result.is_error:"
blocks around the set_output handling to avoid relying on outer scope variables:
after calling handle_set_output(tc.tool_input, ctx.agent_spec.output_keys)
immediately set key/value from tc.tool_input, perform await accumulator.set(key,
value), append to outputs_set_this_turn, call await
self._publish_output_key_set(...), populate results_by_id[tc.tool_use_id] and
then compute the safe preview and logger.debug in the same block; additionally
eliminate the Manual rewrap of ToolResult by either updating handle_set_output
to accept tool_use_id and return a ToolResult with tool_use_id set, or have
handle_set_output return a mutable result on which you set result.tool_use_id =
tc.tool_use_id before using it (so you can remove the ToolResult(...)
reconstruction at L3004–3008).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e2635c9c-aba6-4e39-a7c0-a75c77fdd376

📥 Commits

Reviewing files that changed from the base of the PR and between b48c773 and a71923a.

📒 Files selected for processing (1)
  • core/framework/agent_loop/agent_loop.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Objective Sales Strategy Ops Manager Agent

1 participant