feat: implement Tool Search progressive MCP/plugin tool disclosure#1778
feat: implement Tool Search progressive MCP/plugin tool disclosure#1778praisonai-triage-agent[bot] wants to merge 2 commits into
Conversation
Implements Hermes-parity progressive disclosure layer for MCP and non-core plugin tools: Phase P0 Complete: - Added ToolSearchConfig with auto/on/off modes and threshold controls - Implemented BM25 search with substring fallback (no external dependencies) - Created bridge tools (tool_search, tool_describe, tool_call) with lazy loading - Integrated into Agent class with tool_search parameter - Updated chat_mixin for tool assembly and bridge mode detection - Added tool_execution unwrapping with security validation - Marked all MCP tools as deferrable (__praisonai_deferrable__ = True) - Comprehensive unit tests covering all core functionality Design Invariants (from Hermes): 1. Core tools never defer (PRAISONAI_CORE_TOOLS) 2. Unknown tools stay visible (never silently dropped) 3. Stateless catalog (rebuilt every assembly) 4. Unwrap before trace/stream (hooks see real tool names) 5. Session scoping (agent's own tool list only) 6. Auto threshold (10% context window default) 7. Zero overhead when disabled (opt-in, lazy imports) Default: tool_search=False (opt-in). Zero overhead when disabled. Auto mode activates based on token threshold vs context window. Core SDK tools (file/shell/web/schedule/memory/clarify) never defer. Fixes #1775 π€ Generated with [Claude Code](https://claude.ai/code) Co-authored-by: Claude <noreply@anthropic.com>
|
@coderabbitai review |
|
/review |
Qodo reviews are paused for this user.Troubleshooting steps vary by plan Learn more β On a Teams plan? Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center? |
β Actions performedReview triggered.
|
|
Important Review skippedBot user detected. To trigger a single review, invoke the βοΈ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
π WalkthroughWalkthroughThis PR implements a comprehensive progressive tool disclosure system (Tool Search) that replaces large, deferrable tool schemas with three bridge tools ( ChangesTool Search Implementation and Integration
Sequence DiagramsequenceDiagram
participant Client
participant Agent
participant ChatMixin
participant ToolExecutionMixin
participant tool_search
Client->>Agent: init(tool_search="auto")
Agent->>Agent: resolve tool_search to _tool_search_config
Client->>ChatMixin: format_tools_for_completion()
ChatMixin->>tool_search: assemble_tool_defs(formatted_tools, config)
tool_search->>tool_search: classify core vs deferrable
tool_search->>tool_search: replace deferrable with bridge schemas
tool_search->>ChatMixin: (bridge_tools, metadata)
ChatMixin->>ToolExecutionMixin: (bridge_tools + metadata)
Client->>ToolExecutionMixin: execute_tool("tool_search", query)
ToolExecutionMixin->>tool_search: dispatch_tool_search(query)
tool_search->>tool_search: BM25 search over deferrable catalog
tool_search->>ToolExecutionMixin: (matches, total_count)
Client->>ToolExecutionMixin: execute_tool("tool_call", args)
ToolExecutionMixin->>tool_search: resolve_underlying_call(args)
tool_search->>ToolExecutionMixin: (real_tool_name, real_args)
ToolExecutionMixin->>ToolExecutionMixin: execute_tool(real_tool_name, real_args) [recursively]
Estimated code review effortπ― 4 (Complex) | β±οΈ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
π₯ Pre-merge checks | β 5β Passed checks (5 passed)
βοΈ Tip: You can configure your own custom pre-merge checks in the settings. β¨ Finishing Touchesπ§ͺ Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@copilot Do a thorough review of this PR. Read ALL existing reviewer comments above from Qodo, Coderabbit, and Gemini first β incorporate their findings. Review areas:
|
Greptile SummaryIntroduces an opt-in Tool Search layer that replaces large MCP/plugin tool schemas with three lightweight bridge tools (
Confidence Score: 4/5Safe to merge after fixing the praisonai_deferrable marker that leaks into tool_describe responses sent back to the model. The bridge dispatch path in tool_execution.py serialises raw deferrable tool defs β which still carry the praisonai_deferrable internal marker β directly to the LLM via tool_describe. The same marker is explicitly stripped from provider-facing payloads in chat_mixin.py, so this is an inconsistency where the model receives implementation-private metadata it was never meant to see. The rest of the feature (assembly, BM25 search, bridge guard, cache, MCP tagging) is well-structured and the unit tests are thorough. src/praisonai-agents/praisonaiagents/agent/tool_execution.py β the dispatch_tool_describe call site needs to strip praisonai_deferrable from the returned schema before json.dumps. Important Files Changed
Sequence DiagramsequenceDiagram
participant Agent
participant ChatMixin as chat_mixin
participant ToolSearch as tool_search.py
participant LLM
participant ToolExecution as tool_execution
Agent->>ChatMixin: _format_tools_for_completion(tools)
ChatMixin->>ToolSearch: assemble_tool_defs(formatted_tools, config)
ToolSearch-->>ChatMixin: "(assembled_tools, metadata{bridge_mode, deferrable_tools})"
ChatMixin->>ChatMixin: strip __praisonai_deferrable__ from provider payloads
ChatMixin-->>LLM: [core_tools + bridge_tools(tool_search, tool_describe, tool_call)]
LLM->>ToolExecution: call tool_search(query)
ToolExecution->>ToolSearch: dispatch_tool_search(query, deferrable_tools)
ToolSearch-->>ToolExecution: "{results: [...]}"
ToolExecution-->>LLM: JSON search results
LLM->>ToolExecution: call tool_describe(tool_name)
ToolExecution->>ToolSearch: dispatch_tool_describe(tool_name, deferrable_tools)
Note over ToolExecution: schema still has __praisonai_deferrable__
ToolSearch-->>ToolExecution: "{schema: tool_def}"
ToolExecution-->>LLM: JSON schema (leaks internal marker)
LLM->>ToolExecution: call tool_call(tool_name, tool_args)
ToolExecution->>ToolSearch: resolve_underlying_call(tool_call, args)
ToolSearch-->>ToolExecution: (real_name, real_args)
ToolExecution->>ToolExecution: validate in deferrable_names
ToolExecution->>ToolExecution: execute_tool(real_name, real_args)
ToolExecution-->>LLM: real tool result
|
| if function_name in ("tool_search", "tool_describe", "tool_call"): | ||
| return self._handle_bridge_tool_call(function_name, arguments, tool_call_id) | ||
|
|
There was a problem hiding this comment.
Unconditional bridge-name hijack breaks tools when
tool_search is disabled
The guard fires for ALL agents regardless of whether tool_search is enabled. Any user tool named tool_search, tool_describe, or tool_call will be silently broken even with the default tool_search=False β _handle_bridge_tool_call returns the string "Tool search not available or not in bridge mode" instead of executing the real function. The check must be gated on whether tool-search is actually active.
| if function_name in ("tool_search", "tool_describe", "tool_call"): | |
| return self._handle_bridge_tool_call(function_name, arguments, tool_call_id) | |
| # Handle bridge tool unwrapping BEFORE trace/stream/hooks (design invariant #6) | |
| # Only intercept when tool_search is active; otherwise fall through to real tool execution | |
| if (getattr(self, '_tool_search_config', None) is not None and | |
| function_name in ("tool_search", "tool_describe", "tool_call")): | |
| return self._handle_bridge_tool_call(function_name, arguments, tool_call_id) |
| @dataclass | ||
| class ToolSearchConfig: | ||
| """Configuration for Tool Search feature.""" | ||
| enabled: Union[bool, str] = "auto" # auto | on | off | True | False | ||
| threshold_pct: float = 10.0 # Percentage of context window for deferral threshold | ||
| search_default_limit: int = 5 # Default number of search results | ||
| max_search_limit: int = 20 # Maximum search results allowed | ||
| core_tools: Optional[FrozenSet[str]] = None # Override core tools set | ||
|
|
||
| @classmethod | ||
| def from_raw(cls, raw_config: Any) -> "ToolSearchConfig": | ||
| """Create config from various input formats (bool, dict, ToolSearchConfig).""" | ||
| if isinstance(raw_config, cls): | ||
| return raw_config | ||
| elif isinstance(raw_config, bool): | ||
| return cls(enabled="on" if raw_config else "off") | ||
| elif isinstance(raw_config, str): | ||
| if raw_config.lower() in ("true", "on", "yes", "1"): | ||
| return cls(enabled="on") | ||
| elif raw_config.lower() in ("false", "off", "no", "0"): | ||
| return cls(enabled="off") | ||
| elif raw_config.lower() == "auto": | ||
| return cls(enabled="auto") | ||
| else: | ||
| raise ValueError(f"Invalid tool_search string value: {raw_config}") | ||
| elif isinstance(raw_config, dict): | ||
| return cls(**raw_config) | ||
| else: | ||
| raise TypeError(f"Invalid tool_search type: {type(raw_config)}") |
There was a problem hiding this comment.
Duplicate
ToolSearchConfig β public API class missing from_raw
Two separate ToolSearchConfig classes exist: one here (in tools/tool_search.py, with from_raw) and one in config/feature_configs.py (exported via config/__init__.py, without from_raw). Users who follow the PR description (from praisonaiagents import ToolSearchConfig) get the feature_configs version and will hit AttributeError if they call ToolSearchConfig.from_raw("auto"). The two classes should be consolidated β feature_configs.py should import and re-export the canonical class from tools/tool_search.py.
|
@claude You are the FINAL architecture reviewer. If the branch is under MervinPraison/PraisonAI (not a fork), you are able to make modifications to this branch and push directly. SCOPE: Focus ONLY on Python packages (praisonaiagents, praisonai). Do NOT modify praisonai-rust or praisonai-ts. Read ALL comments above from Gemini, Qodo, CodeRabbit, and Copilot carefully before responding. Phase 1: Review per AGENTS.md
Phase 2: FIX Valid Issues Phase 3: Final Verdict |
There was a problem hiding this comment.
Actionable comments posted: 8
Caution
Some comments are outside the diff and canβt be posted inline due to platform limitations.
β οΈ Outside diff range comments (1)
src/praisonai-agents/praisonaiagents/mcp/mcp.py (1)
606-614:β οΈ Potential issue | π Major | β‘ Quick winStrip
__praisonai_deferrable__from provider-facing tool payloads
__praisonai_deferrable__is added to MCP-generated OpenAI tool dicts insrc/praisonai-agents/praisonaiagents/mcp/mcp.py(and alsomcp_http_stream.py,mcp_sse.py,mcp_websocket.py). Those dicts are then passed through_format_tools_for_completion()unchanged into the Chat Completionstools=payload, so the extra top-level key can reach the provider. OpenAI tool entries are defined with onlytypeandfunction, so arbitrary extra keys risk strict-schema failures.Responses API is safer here (tools are reconstructed from
tool["function"]and unknown keys are dropped), but the Chat Completions path needs the same fix.Keep the marker only on an internal copy for tool-search/classification, and strip/remove
__praisonai_deferrable__right before callingchat.completions.create(or donβt include it into_openai_tool()at all).π€ Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/praisonai-agents/praisonaiagents/mcp/mcp.py` around lines 606 - 614, The MCP-generated OpenAI tool dicts currently include a top-level "__praisonai_deferrable__" key when appended to openai_tools (see the openai_tools.append block), which can leak into provider payloads and break strict schemas; keep that marker only on an internal copy used for tool-search/classification and remove it before sending to the provider (either stop adding it in to_openai_tool or strip "__praisonai_deferrable__" inside _format_tools_for_completion or immediately before calling chat.completions.create), ensuring the provider-facing tool entries contain only the allowed keys ("type" and "function").
π€ Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/praisonai-agents/praisonaiagents/agent/agent.py`:
- Around line 1446-1462: The clone_for_channel logic drops the agent's
_tool_search_config so cloned agents lose tool search behavior; update
clone_for_channel to pass the original tool search into the Agent constructor
(e.g. pass tool_search=self._tool_search_config or equivalent) when creating the
clone so the Agent(...) call receives the same tool_search setting; ensure you
reference the Agent(...) call inside clone_for_channel and set its tool_search
argument to the existing _tool_search_config (or a copy if immutability is
required).
- Around line 1460-1462: The else branch currently accepts any value and assigns
it to self._tool_search_config, which defers type errors; instead validate that
tool_search is an instance of the expected config type(s) (e.g.,
ToolSearchConfig or whatever concrete class/interface used by this agent) before
assignment in the setter/constructor where tool_search is handled, and raise a
clear ValueError/TypeError mentioning the bad value and remediation (e.g.,
"tool_search must be a ToolSearchConfig or None; pass an appropriate config or
omit the parameter") referencing self._tool_search_config and the tool_search
parameter so callers get immediate, actionable feedback.
In `@src/praisonai-agents/praisonaiagents/agent/chat_mixin.py`:
- Around line 348-352: The cache currently stores only the rendered tools (using
_get_tools_cache_key, _tool_search_config and _formatted_tools_cache via
_cache_get) but not the bridge metadata used by tool_search
(self._tool_search_metadata), causing metadata mismatch on cache hits; update
the caching logic so the cache entry stores both the formatted tools and the
associated bridge metadata (e.g., a tuple or dict { "formatted": ...,
"tool_search_metadata": ... }) when saving, and on cache hits restore both the
formatted tools and set self._tool_search_metadata from the cached value (apply
the same change in the other block around the code that spans the second
occurrence referenced in the review).
In `@src/praisonai-agents/praisonaiagents/agent/tool_execution.py`:
- Around line 979-985: Remove the unnecessary f-string prefixes on the two
fixed-string return statements in tool_execution.py (the returns that say "Tool
search not available or not in bridge mode" and "Tool search not in bridge
mode"); edit the method that references self._tool_search_metadata (the block
checking bridge_mode) to return plain string literals without the leading f, so
Ruff F541 is resolved and CI will pass.
In `@src/praisonai-agents/praisonaiagents/config/feature_configs.py`:
- Line 1144: ToolSearchParam is defined but missing its resolver; add a
resolve_tool_search function that accepts the same signature and precedence
pattern used by other resolvers (e.g., resolve_memory, resolve_knowledge) to
normalize a ToolSearchParam into a ToolSearchConfig (or default) by handling
bool, str, dict, and ToolSearchConfig inputs and applying the module's
precedence ladder; place this new resolve_tool_search implementation after the
other resolver functions (around where other resolvers end, before the __all__
declaration) and add "resolve_tool_search" to the __all__ list so it is
exported.
In `@src/praisonai-agents/praisonaiagents/tools/tool_search.py`:
- Line 272: The loop currently uses enumerate for i which is unused and uses zip
without strictness; replace "for i, (item, tf) in enumerate(zip(self.catalog,
self.term_frequencies))" with a loop that drops the unused index and uses
zip(..., strict=True) β e.g., "for item, tf in zip(self.catalog,
self.term_frequencies, strict=True)" β or if you must support Python versions
<3.10, remove enumerate, use "for item, tf in zip(self.catalog,
self.term_frequencies)" and add an assertion in the class initializer (e.g., in
__init__) that len(self.catalog) == len(self.term_frequencies) to prevent silent
mismatches.
- Around line 548-555: The code accesses tool_args with .get which will raise
AttributeError if tool_args is None or not a dict; add a defensive type check
before lines that read real_tool_name and real_args: verify that tool_args is an
instance of dict (e.g., isinstance(tool_args, dict)) and if not raise a
ValueError with a clear message like "tool_call requires 'tool_args' to be a
dict, got <type>"; then proceed to extract real_tool_name and real_args as
currently done (references: tool_args, real_tool_name, real_args, and the
"tool_call" usage).
In `@src/praisonai-agents/tests/unit/tools/test_tool_search.py`:
- Around line 1-394: Add a new async integration test that actually runs an
Agent end-to-end using the real LLM: instantiate Agent with
ToolSearchConfig(enabled="on"), supply 20+ deferrable MCP-like tools, call
agent.start(...) with a prompt that requires discovering and invoking a weather
tool, then assert result.success is True and the text response contains expected
content (e.g., "weather" or "San Francisco"); also verify the execution trace
includes the bridge flow (tool_search β tool_describe β tool_call). Use the
Agent and ToolSearchConfig symbols from the codebase and ensure the test is
marked as integration/async so it executes real LLM calls. Ensure the test fails
fast if the bridge tools are not discovered or invoked.
---
Outside diff comments:
In `@src/praisonai-agents/praisonaiagents/mcp/mcp.py`:
- Around line 606-614: The MCP-generated OpenAI tool dicts currently include a
top-level "__praisonai_deferrable__" key when appended to openai_tools (see the
openai_tools.append block), which can leak into provider payloads and break
strict schemas; keep that marker only on an internal copy used for
tool-search/classification and remove it before sending to the provider (either
stop adding it in to_openai_tool or strip "__praisonai_deferrable__" inside
_format_tools_for_completion or immediately before calling
chat.completions.create), ensuring the provider-facing tool entries contain only
the allowed keys ("type" and "function").
πͺ Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
βΉοΈ Review info
βοΈ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2c7b177b-be1a-4ac6-b07e-f864d300dce6
π Files selected for processing (12)
src/praisonai-agents/praisonaiagents/agent/agent.pysrc/praisonai-agents/praisonaiagents/agent/chat_mixin.pysrc/praisonai-agents/praisonaiagents/agent/tool_execution.pysrc/praisonai-agents/praisonaiagents/config/__init__.pysrc/praisonai-agents/praisonaiagents/config/feature_configs.pysrc/praisonai-agents/praisonaiagents/mcp/mcp.pysrc/praisonai-agents/praisonaiagents/mcp/mcp_http_stream.pysrc/praisonai-agents/praisonaiagents/mcp/mcp_sse.pysrc/praisonai-agents/praisonaiagents/mcp/mcp_websocket.pysrc/praisonai-agents/praisonaiagents/tools/tool_search.pysrc/praisonai-agents/tests/unit/tools/test_tool_search.pytest_tool_search_simple.py
| if tool_search is None or tool_search is False: | ||
| self._tool_search_config = None | ||
| elif tool_search is True: | ||
| # True -> auto mode with defaults | ||
| from ..tools.tool_search import ToolSearchConfig as _ToolSearchConfig | ||
| self._tool_search_config = _ToolSearchConfig(enabled="auto") | ||
| elif isinstance(tool_search, str): | ||
| # String mode ("auto", "on", "off") | ||
| from ..tools.tool_search import ToolSearchConfig as _ToolSearchConfig | ||
| self._tool_search_config = _ToolSearchConfig.from_raw(tool_search) | ||
| elif isinstance(tool_search, dict): | ||
| # Dict -> config overrides | ||
| from ..tools.tool_search import ToolSearchConfig as _ToolSearchConfig | ||
| self._tool_search_config = _ToolSearchConfig(**tool_search) | ||
| else: | ||
| # Assume ToolSearchConfig instance or similar | ||
| self._tool_search_config = tool_search |
There was a problem hiding this comment.
Preserve tool_search in channel clones.
This constructor now owns _tool_search_config, but clone_for_channel() never passes it back into Agent(...). Cloned agents will silently fall back to tool_search=False, so gateway/channel executions can expose a different tool set than the source agent.
π‘ Minimal fix
clone_kwargs = {
# Core identity
'name': self.name,
@@
# Tool configuration
'tool_timeout': getattr(self, '_tool_timeout', None),
'parallel_tool_calls': getattr(self, 'parallel_tool_calls', False),
+ 'tool_search': getattr(self, '_tool_search_config', None),
# CLI backend
'cli_backend': getattr(self, '_cli_backend', None),π€ Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/praisonai-agents/praisonaiagents/agent/agent.py` around lines 1446 -
1462, The clone_for_channel logic drops the agent's _tool_search_config so
cloned agents lose tool search behavior; update clone_for_channel to pass the
original tool search into the Agent constructor (e.g. pass
tool_search=self._tool_search_config or equivalent) when creating the clone so
the Agent(...) call receives the same tool_search setting; ensure you reference
the Agent(...) call inside clone_for_channel and set its tool_search argument to
the existing _tool_search_config (or a copy if immutability is required).
| else: | ||
| # Assume ToolSearchConfig instance or similar | ||
| self._tool_search_config = tool_search |
There was a problem hiding this comment.
Reject unsupported tool_search values here instead of later.
The fallback branch accepts any object and stores it as _tool_search_config. A bad call like tool_search=1 or an unrelated config object won't fail until a later tool-formatting path dereferences missing attributes, which makes the API harder to diagnose.
π‘ Minimal fix
elif isinstance(tool_search, dict):
# Dict -> config overrides
from ..tools.tool_search import ToolSearchConfig as _ToolSearchConfig
self._tool_search_config = _ToolSearchConfig(**tool_search)
else:
- # Assume ToolSearchConfig instance or similar
- self._tool_search_config = tool_search
+ from ..tools.tool_search import ToolSearchConfig as _ToolSearchConfig
+ if isinstance(tool_search, _ToolSearchConfig):
+ self._tool_search_config = tool_search
+ else:
+ raise TypeError(
+ "tool_search must be False/None, True, a mode string, "
+ "a dict of ToolSearchConfig fields, or ToolSearchConfig"
+ )π Committable suggestion
βΌοΈ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| else: | |
| # Assume ToolSearchConfig instance or similar | |
| self._tool_search_config = tool_search | |
| else: | |
| from ..tools.tool_search import ToolSearchConfig as _ToolSearchConfig | |
| if isinstance(tool_search, _ToolSearchConfig): | |
| self._tool_search_config = tool_search | |
| else: | |
| raise TypeError( | |
| "tool_search must be False/None, True, a mode string, " | |
| "a dict of ToolSearchConfig fields, or ToolSearchConfig" | |
| ) |
π€ Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/praisonai-agents/praisonaiagents/agent/agent.py` around lines 1460 -
1462, The else branch currently accepts any value and assigns it to
self._tool_search_config, which defers type errors; instead validate that
tool_search is an instance of the expected config type(s) (e.g.,
ToolSearchConfig or whatever concrete class/interface used by this agent) before
assignment in the setter/constructor where tool_search is handled, and raise a
clear ValueError/TypeError mentioning the bad value and remediation (e.g.,
"tool_search must be a ToolSearchConfig or None; pass an appropriate config or
omit the parameter") referencing self._tool_search_config and the tool_search
parameter so callers get immediate, actionable feedback.
| # Check cache first - include tool_search config in cache key for safety | ||
| tools_key = self._get_tools_cache_key(tools) | ||
| cached_tools = self._cache_get(self._formatted_tools_cache, tools_key) | ||
| tool_search_enabled = getattr(self, '_tool_search_config', None) is not None | ||
| cache_key = f"{tools_key}:tool_search={tool_search_enabled}" | ||
| cached_tools = self._cache_get(self._formatted_tools_cache, cache_key) |
There was a problem hiding this comment.
Cache the bridge metadata with the assembled tools.
This cache currently stores only the rendered tool list, but bridge execution later reads self._tool_search_metadata. If the same agent formats tool set A, then tool set B, then hits the cache for A, the bridge schemas for A will run against Bβs catalog. That can make tool_search/tool_describe/tool_call expose or execute the wrong task-scoped tools.
Suggested fix
- cached_tools = self._cache_get(self._formatted_tools_cache, cache_key)
- if cached_tools is not None:
- return cached_tools
+ cached_entry = self._cache_get(self._formatted_tools_cache, cache_key)
+ if cached_entry is not None:
+ cached_tools, cached_metadata = cached_entry
+ self._tool_search_metadata = cached_metadata
+ return cached_tools
...
- tool_search_enabled = getattr(self, '_tool_search_config', None) is not None
- cache_key = f"{tools_key}:tool_search={tool_search_enabled}"
+ tool_search_config = getattr(self, "_tool_search_config", None)
+ context_length = getattr(self, "_context_window_size", None)
+ cache_key = f"{tools_key}:tool_search={tool_search_config!r}:context={context_length}"
...
- self._cache_put(self._formatted_tools_cache, cache_key, formatted_tools)
+ self._cache_put(
+ self._formatted_tools_cache,
+ cache_key,
+ (formatted_tools, getattr(self, "_tool_search_metadata", None)),
+ )Also applies to: 405-428
π€ Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/praisonai-agents/praisonaiagents/agent/chat_mixin.py` around lines 348 -
352, The cache currently stores only the rendered tools (using
_get_tools_cache_key, _tool_search_config and _formatted_tools_cache via
_cache_get) but not the bridge metadata used by tool_search
(self._tool_search_metadata), causing metadata mismatch on cache hits; update
the caching logic so the cache entry stores both the formatted tools and the
associated bridge metadata (e.g., a tuple or dict { "formatted": ...,
"tool_search_metadata": ... }) when saving, and on cache hits restore both the
formatted tools and set self._tool_search_metadata from the cached value (apply
the same change in the other block around the code that spans the second
occurrence referenced in the review).
| return f"Tool search not available or not in bridge mode" | ||
|
|
||
| metadata = self._tool_search_metadata | ||
|
|
||
| # Check if we're in bridge mode | ||
| if not metadata.get("bridge_mode", False): | ||
| return f"Tool search not in bridge mode" |
There was a problem hiding this comment.
Drop the stray f prefixes here.
These returns are fixed strings, so Ruff flags both lines with F541 and CI will stay red until the prefixes are removed.
Suggested fix
- return f"Tool search not available or not in bridge mode"
+ return "Tool search not available or not in bridge mode"
...
- return f"Tool search not in bridge mode"
+ return "Tool search not in bridge mode"π§° Tools
πͺ Ruff (0.15.14)
[error] 979-979: f-string without any placeholders
Remove extraneous f prefix
(F541)
[error] 985-985: f-string without any placeholders
Remove extraneous f prefix
(F541)
π€ Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/praisonai-agents/praisonaiagents/agent/tool_execution.py` around lines
979 - 985, Remove the unnecessary f-string prefixes on the two fixed-string
return statements in tool_execution.py (the returns that say "Tool search not
available or not in bridge mode" and "Tool search not in bridge mode"); edit the
method that references self._tool_search_metadata (the block checking
bridge_mode) to return plain string literals without the leading f, so Ruff F541
is resolved and CI will pass.
| HooksParam = Union[List[Any], HooksConfig] | ||
| SkillsParam = Union[List[str], SkillsConfig] | ||
| AutonomyParam = Union[bool, Dict[str, Any], "AutonomyConfig"] | ||
| ToolSearchParam = Union[bool, str, Dict[str, Any], ToolSearchConfig] |
There was a problem hiding this comment.
π οΈ Refactor suggestion | π Major | β‘ Quick win
Add resolve_tool_search function to complete the config pattern.
ToolSearchParam is defined but no corresponding resolve_tool_search() resolver function exists. Every other *Param type alias in this module has a matching resolver (e.g., MemoryParam β resolve_memory, KnowledgeParam β resolve_knowledge). This breaks the established precedence-ladder pattern and makes the tool_search parameter incomplete.
β»οΈ Proposed fix: Add resolve_tool_search function
Add this function after line 1393 and before the __all__ declaration:
+def resolve_tool_search(value: ToolSearchParam) -> Optional[ToolSearchConfig]:
+ """Resolve tool_search= parameter following precedence ladder."""
+ if value is None or value is False:
+ return None
+ if value is True:
+ return ToolSearchConfig()
+ if isinstance(value, str):
+ return ToolSearchConfig(enabled=value)
+ if isinstance(value, dict):
+ return ToolSearchConfig(**value)
+ if isinstance(value, ToolSearchConfig):
+ return value
+ return valueThen add "resolve_tool_search" to the __all__ list around line 1453.
π€ Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/praisonai-agents/praisonaiagents/config/feature_configs.py` at line 1144,
ToolSearchParam is defined but missing its resolver; add a resolve_tool_search
function that accepts the same signature and precedence pattern used by other
resolvers (e.g., resolve_memory, resolve_knowledge) to normalize a
ToolSearchParam into a ToolSearchConfig (or default) by handling bool, str,
dict, and ToolSearchConfig inputs and applying the module's precedence ladder;
place this new resolve_tool_search implementation after the other resolver
functions (around where other resolvers end, before the __all__ declaration) and
add "resolve_tool_search" to the __all__ list so it is exported.
| scores = [] | ||
| k1, b = 1.5, 0.75 # BM25 parameters | ||
|
|
||
| for i, (item, tf) in enumerate(zip(self.catalog, self.term_frequencies)): |
There was a problem hiding this comment.
Remove unused loop variable and add strict= to zip().
The loop variable i from enumerate() is not used in the loop body. Additionally, zip() without strict=True can silently handle length mismatches between self.catalog and self.term_frequencies, which could mask bugs if the index falls out of sync.
π§ Proposed fix
- for i, (item, tf) in enumerate(zip(self.catalog, self.term_frequencies)):
+ for item, tf in zip(self.catalog, self.term_frequencies, strict=True):
score = 0.0
doc_length = sum(tf.values())Note: strict=True requires Python 3.10+. If supporting Python 3.9, omit strict= but add a length assertion in __init__:
assert len(self.catalog) == len(self.term_frequencies)π§° Tools
πͺ Ruff (0.15.14)
[warning] 272-272: Loop control variable i not used within loop body
Rename unused i to _i
(B007)
[warning] 272-272: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
π€ Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/praisonai-agents/praisonaiagents/tools/tool_search.py` at line 272, The
loop currently uses enumerate for i which is unused and uses zip without
strictness; replace "for i, (item, tf) in enumerate(zip(self.catalog,
self.term_frequencies))" with a loop that drops the unused index and uses
zip(..., strict=True) β e.g., "for item, tf in zip(self.catalog,
self.term_frequencies, strict=True)" β or if you must support Python versions
<3.10, remove enumerate, use "for item, tf in zip(self.catalog,
self.term_frequencies)" and add an assertion in the class initializer (e.g., in
__init__) that len(self.catalog) == len(self.term_frequencies) to prevent silent
mismatches.
| # Extract real tool call from bridge args | ||
| real_tool_name = tool_args.get("tool_name", "") | ||
| real_args = tool_args.get("tool_args", {}) | ||
|
|
||
| if not real_tool_name: | ||
| raise ValueError("tool_call requires 'tool_name' parameter") | ||
|
|
||
| return real_tool_name, real_args |
There was a problem hiding this comment.
Validate tool_args type before dictionary access.
If tool_args is None or not a dictionary (e.g., malformed LLM output), lines 549-550 will raise AttributeError instead of a clear error message. Add type validation with a helpful error.
π‘οΈ Proposed defensive check
if tool_name != "tool_call":
# Not a bridge call, return as-is
return tool_name, tool_args
+ # Validate tool_args is a dict
+ if not isinstance(tool_args, dict):
+ raise TypeError(
+ f"tool_call expects a dictionary for tool_args, got {type(tool_args).__name__}. "
+ "Ensure the LLM output is properly formatted."
+ )
+
# Extract real tool call from bridge args
real_tool_name = tool_args.get("tool_name", "")
real_args = tool_args.get("tool_args", {})π€ Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/praisonai-agents/praisonaiagents/tools/tool_search.py` around lines 548 -
555, The code accesses tool_args with .get which will raise AttributeError if
tool_args is None or not a dict; add a defensive type check before lines that
read real_tool_name and real_args: verify that tool_args is an instance of dict
(e.g., isinstance(tool_args, dict)) and if not raise a ValueError with a clear
message like "tool_call requires 'tool_args' to be a dict, got <type>"; then
proceed to extract real_tool_name and real_args as currently done (references:
tool_args, real_tool_name, real_args, and the "tool_call" usage).
| """ | ||
| Unit tests for Tool Search functionality. | ||
|
|
||
| Tests core tool search components including: | ||
| - ToolSearchConfig | ||
| - Tool classification | ||
| - BM25 search | ||
| - Bridge tool schemas | ||
| - Assembly logic | ||
| """ | ||
|
|
||
| import pytest | ||
| from praisonaiagents.tools.tool_search import ( | ||
| ToolSearchConfig, classify_tools, search_catalog, bridge_tool_schemas, | ||
| assemble_tool_defs, dispatch_tool_search, dispatch_tool_describe, | ||
| resolve_underlying_call, PRAISONAI_CORE_TOOLS, | ||
| estimate_tool_schema_tokens, should_defer_tools | ||
| ) | ||
|
|
||
| class TestToolSearchConfig: | ||
| """Test ToolSearchConfig creation and validation.""" | ||
|
|
||
| def test_default_config(self): | ||
| """Test default configuration.""" | ||
| config = ToolSearchConfig() | ||
| assert config.enabled == "auto" | ||
| assert config.threshold_pct == 10.0 | ||
| assert config.search_default_limit == 5 | ||
| assert config.max_search_limit == 20 | ||
| assert config.core_tools is None | ||
|
|
||
| def test_from_raw_bool(self): | ||
| """Test creating config from boolean values.""" | ||
| config_true = ToolSearchConfig.from_raw(True) | ||
| assert config_true.enabled == "on" | ||
|
|
||
| config_false = ToolSearchConfig.from_raw(False) | ||
| assert config_false.enabled == "off" | ||
|
|
||
| def test_from_raw_string(self): | ||
| """Test creating config from string values.""" | ||
| config_auto = ToolSearchConfig.from_raw("auto") | ||
| assert config_auto.enabled == "auto" | ||
|
|
||
| config_on = ToolSearchConfig.from_raw("on") | ||
| assert config_on.enabled == "on" | ||
|
|
||
| config_off = ToolSearchConfig.from_raw("off") | ||
| assert config_off.enabled == "off" | ||
|
|
||
| def test_from_raw_dict(self): | ||
| """Test creating config from dictionary.""" | ||
| config = ToolSearchConfig.from_raw({ | ||
| "enabled": "on", | ||
| "threshold_pct": 15.0, | ||
| "search_default_limit": 10 | ||
| }) | ||
| assert config.enabled == "on" | ||
| assert config.threshold_pct == 15.0 | ||
| assert config.search_default_limit == 10 | ||
|
|
||
| def test_from_raw_invalid(self): | ||
| """Test error handling for invalid inputs.""" | ||
| with pytest.raises(ValueError): | ||
| ToolSearchConfig.from_raw("invalid") | ||
|
|
||
| with pytest.raises(TypeError): | ||
| ToolSearchConfig.from_raw(123) | ||
|
|
||
| class TestToolClassification: | ||
| """Test tool classification into core vs deferrable.""" | ||
|
|
||
| def test_core_tool_classification(self): | ||
| """Test that core tools are never deferred.""" | ||
| tool_defs = [ | ||
| { | ||
| "type": "function", | ||
| "function": {"name": "read_file", "description": "Read a file"} | ||
| }, | ||
| { | ||
| "type": "function", | ||
| "function": {"name": "search_web", "description": "Search the web"} | ||
| } | ||
| ] | ||
|
|
||
| config = ToolSearchConfig() | ||
| core_tools, deferrable_tools = classify_tools(tool_defs, config) | ||
|
|
||
| assert len(core_tools) == 2 | ||
| assert len(deferrable_tools) == 0 | ||
| assert core_tools == tool_defs | ||
|
|
||
| def test_mcp_tool_classification(self): | ||
| """Test that MCP tools are marked as deferrable.""" | ||
| tool_defs = [ | ||
| { | ||
| "type": "function", | ||
| "function": {"name": "mcp_weather", "description": "Get weather"}, | ||
| "__praisonai_deferrable__": True | ||
| }, | ||
| { | ||
| "type": "function", | ||
| "function": {"name": "read_file", "description": "Read a file"} | ||
| } | ||
| ] | ||
|
|
||
| config = ToolSearchConfig() | ||
| core_tools, deferrable_tools = classify_tools(tool_defs, config) | ||
|
|
||
| assert len(core_tools) == 1 | ||
| assert len(deferrable_tools) == 1 | ||
| assert core_tools[0]["function"]["name"] == "read_file" | ||
| assert deferrable_tools[0]["function"]["name"] == "mcp_weather" | ||
|
|
||
| def test_unknown_tool_stays_visible(self): | ||
| """Test that unknown tools stay visible (design invariant #2).""" | ||
| tool_defs = [ | ||
| { | ||
| "type": "function", | ||
| "function": {"name": "unknown_tool", "description": "Unknown tool"} | ||
| } | ||
| ] | ||
|
|
||
| config = ToolSearchConfig() | ||
| core_tools, deferrable_tools = classify_tools(tool_defs, config) | ||
|
|
||
| # Unknown tools should stay in core (visible) | ||
| assert len(core_tools) == 1 | ||
| assert len(deferrable_tools) == 0 | ||
| assert core_tools[0]["function"]["name"] == "unknown_tool" | ||
|
|
||
| class TestBM25Search: | ||
| """Test BM25 search functionality.""" | ||
|
|
||
| def test_empty_search(self): | ||
| """Test search with empty inputs.""" | ||
| result = search_catalog([], "test query") | ||
| assert result == [] | ||
|
|
||
| result = search_catalog([{"type": "function", "function": {"name": "test"}}], "") | ||
| assert result == [] | ||
|
|
||
| def test_simple_search(self): | ||
| """Test basic search functionality.""" | ||
| deferrable_tools = [ | ||
| { | ||
| "type": "function", | ||
| "function": { | ||
| "name": "weather_tool", | ||
| "description": "Get current weather information" | ||
| } | ||
| }, | ||
| { | ||
| "type": "function", | ||
| "function": { | ||
| "name": "stock_tool", | ||
| "description": "Get stock market data" | ||
| } | ||
| } | ||
| ] | ||
|
|
||
| results = search_catalog(deferrable_tools, "weather", limit=10) | ||
| assert len(results) >= 1 | ||
| assert any("weather_tool" in r["name"] for r in results) | ||
|
|
||
| results = search_catalog(deferrable_tools, "stock", limit=10) | ||
| assert len(results) >= 1 | ||
| assert any("stock_tool" in r["name"] for r in results) | ||
|
|
||
| class TestBridgeTools: | ||
| """Test bridge tool schemas and assembly.""" | ||
|
|
||
| def test_bridge_schemas(self): | ||
| """Test bridge tool schema generation.""" | ||
| schemas = bridge_tool_schemas() | ||
| assert len(schemas) == 3 | ||
|
|
||
| tool_names = {schema["function"]["name"] for schema in schemas} | ||
| assert tool_names == {"tool_search", "tool_describe", "tool_call"} | ||
|
|
||
| # Validate schema structure | ||
| for schema in schemas: | ||
| assert schema["type"] == "function" | ||
| assert "function" in schema | ||
| assert "name" in schema["function"] | ||
| assert "description" in schema["function"] | ||
| assert "parameters" in schema["function"] | ||
|
|
||
| def test_assemble_disabled(self): | ||
| """Test assembly when tool search is disabled.""" | ||
| tool_defs = [ | ||
| {"type": "function", "function": {"name": "test_tool", "description": "Test"}} | ||
| ] | ||
|
|
||
| config = ToolSearchConfig(enabled="off") | ||
| assembled, metadata = assemble_tool_defs(tool_defs, config) | ||
|
|
||
| assert assembled == tool_defs | ||
| assert metadata["bridge_mode"] is False | ||
| assert metadata["deferred_count"] == 0 | ||
|
|
||
| def test_assemble_no_deferrable_tools(self): | ||
| """Test assembly when no tools are deferrable.""" | ||
| tool_defs = [ | ||
| {"type": "function", "function": {"name": "read_file", "description": "Read file"}} | ||
| ] | ||
|
|
||
| config = ToolSearchConfig(enabled="on") | ||
| assembled, metadata = assemble_tool_defs(tool_defs, config) | ||
|
|
||
| # Should return original tools since nothing to defer | ||
| assert assembled == tool_defs | ||
| assert metadata["bridge_mode"] is False | ||
|
|
||
| def test_assemble_with_deferrable_tools(self): | ||
| """Test assembly with deferrable tools.""" | ||
| tool_defs = [ | ||
| {"type": "function", "function": {"name": "read_file", "description": "Read file"}}, | ||
| { | ||
| "type": "function", | ||
| "function": {"name": "mcp_weather", "description": "Get weather"}, | ||
| "__praisonai_deferrable__": True | ||
| } | ||
| ] | ||
|
|
||
| config = ToolSearchConfig(enabled="on") | ||
| assembled, metadata = assemble_tool_defs(tool_defs, config) | ||
|
|
||
| # Should have core tool + 3 bridge tools | ||
| assert len(assembled) == 4 | ||
| assert metadata["bridge_mode"] is True | ||
| assert metadata["deferred_count"] == 1 | ||
| assert len(metadata["catalog"]) == 1 | ||
|
|
||
| # Verify core tool remains | ||
| core_tool_names = {t["function"]["name"] for t in assembled if t["function"]["name"] != "tool_search" and t["function"]["name"] != "tool_describe" and t["function"]["name"] != "tool_call"} | ||
| assert "read_file" in core_tool_names | ||
|
|
||
| class TestBridgeDispatchers: | ||
| """Test bridge tool dispatch handlers.""" | ||
|
|
||
| def test_tool_search_dispatch(self): | ||
| """Test tool_search bridge dispatch.""" | ||
| deferrable_tools = [ | ||
| { | ||
| "type": "function", | ||
| "function": { | ||
| "name": "weather_tool", | ||
| "description": "Get weather information" | ||
| } | ||
| } | ||
| ] | ||
|
|
||
| config = ToolSearchConfig() | ||
| result = dispatch_tool_search("weather", 5, deferrable_tools, config) | ||
|
|
||
| assert "query" in result | ||
| assert "results" in result | ||
| assert "total_available" in result | ||
| assert result["query"] == "weather" | ||
| assert result["total_available"] == 1 | ||
|
|
||
| def test_tool_describe_dispatch(self): | ||
| """Test tool_describe bridge dispatch.""" | ||
| deferrable_tools = [ | ||
| { | ||
| "type": "function", | ||
| "function": { | ||
| "name": "weather_tool", | ||
| "description": "Get weather information" | ||
| } | ||
| } | ||
| ] | ||
|
|
||
| result = dispatch_tool_describe("weather_tool", deferrable_tools) | ||
|
|
||
| assert "tool_name" in result | ||
| assert "found" in result | ||
| assert result["found"] is True | ||
| assert result["tool_name"] == "weather_tool" | ||
| assert "schema" in result | ||
|
|
||
| def test_tool_describe_not_found(self): | ||
| """Test tool_describe for non-existent tool.""" | ||
| result = dispatch_tool_describe("nonexistent_tool", []) | ||
|
|
||
| assert "tool_name" in result | ||
| assert "found" in result | ||
| assert result["found"] is False | ||
| assert "error" in result | ||
|
|
||
| class TestToolCallUnwrapping: | ||
| """Test tool_call bridge unwrapping.""" | ||
|
|
||
| def test_resolve_underlying_call(self): | ||
| """Test unwrapping tool_call bridge.""" | ||
| # Normal tool call - should pass through | ||
| name, args = resolve_underlying_call("normal_tool", {"arg1": "value1"}) | ||
| assert name == "normal_tool" | ||
| assert args == {"arg1": "value1"} | ||
|
|
||
| # Bridge tool call - should unwrap | ||
| bridge_args = { | ||
| "tool_name": "real_tool", | ||
| "tool_args": {"param1": "test"} | ||
| } | ||
| name, args = resolve_underlying_call("tool_call", bridge_args) | ||
| assert name == "real_tool" | ||
| assert args == {"param1": "test"} | ||
|
|
||
| def test_resolve_missing_tool_name(self): | ||
| """Test error handling for missing tool_name.""" | ||
| with pytest.raises(ValueError, match="tool_call requires 'tool_name' parameter"): | ||
| resolve_underlying_call("tool_call", {"tool_args": {}}) | ||
|
|
||
| class TestTokenEstimation: | ||
| """Test token estimation and threshold logic.""" | ||
|
|
||
| def test_estimate_tool_schema_tokens(self): | ||
| """Test token estimation for tool schemas.""" | ||
| tool_defs = [ | ||
| { | ||
| "type": "function", | ||
| "function": { | ||
| "name": "test_tool", | ||
| "description": "A test tool with parameters", | ||
| "parameters": { | ||
| "type": "object", | ||
| "properties": {"param1": {"type": "string"}} | ||
| } | ||
| } | ||
| } | ||
| ] | ||
|
|
||
| tokens = estimate_tool_schema_tokens(tool_defs) | ||
| assert tokens > 0 | ||
| assert isinstance(tokens, int) | ||
|
|
||
| def test_should_defer_tools(self): | ||
| """Test deferral threshold logic.""" | ||
| # Create some dummy deferrable tools | ||
| deferrable_tools = [ | ||
| { | ||
| "type": "function", | ||
| "function": {"name": f"tool_{i}", "description": "Test tool"} | ||
| } | ||
| for i in range(10) | ||
| ] | ||
|
|
||
| # Test with disabled config | ||
| config_off = ToolSearchConfig(enabled="off") | ||
| assert should_defer_tools(deferrable_tools, config_off, 20000) is False | ||
|
|
||
| # Test with on config | ||
| config_on = ToolSearchConfig(enabled="on") | ||
| assert should_defer_tools(deferrable_tools, config_on, 20000) is True | ||
|
|
||
| # Test auto mode with high threshold (shouldn't defer) | ||
| config_auto_high = ToolSearchConfig(enabled="auto", threshold_pct=90) | ||
| assert should_defer_tools(deferrable_tools, config_auto_high, 20000) is False | ||
|
|
||
| # Test auto mode with low threshold (should defer) | ||
| config_auto_low = ToolSearchConfig(enabled="auto", threshold_pct=1) | ||
| assert should_defer_tools(deferrable_tools, config_auto_low, 20000) is True | ||
|
|
||
| class TestCoreToolsInvariant: | ||
| """Test that core tools never defer.""" | ||
|
|
||
| def test_core_tools_constant(self): | ||
| """Test that PRAISONAI_CORE_TOOLS is properly defined.""" | ||
| assert isinstance(PRAISONAI_CORE_TOOLS, frozenset) | ||
| assert len(PRAISONAI_CORE_TOOLS) > 0 | ||
|
|
||
| # Check some expected core tools | ||
| expected_core = {"read_file", "execute_command", "search_web", "clarify"} | ||
| assert expected_core.issubset(PRAISONAI_CORE_TOOLS) | ||
|
|
||
| def test_core_tools_never_deferred(self): | ||
| """Test that core tools are never put in deferrable category.""" | ||
| # Create tools that are marked as deferrable but are actually core tools | ||
| tool_defs = [] | ||
| for tool_name in list(PRAISONAI_CORE_TOOLS)[:5]: # Test first 5 core tools | ||
| tool_defs.append({ | ||
| "type": "function", | ||
| "function": {"name": tool_name, "description": f"Test {tool_name}"}, | ||
| "__praisonai_deferrable__": True # Try to mark as deferrable | ||
| }) | ||
|
|
||
| config = ToolSearchConfig() | ||
| core_tools, deferrable_tools = classify_tools(tool_defs, config) | ||
|
|
||
| # All should end up in core tools despite the deferrable marker | ||
| assert len(core_tools) == 5 | ||
| assert len(deferrable_tools) == 0 No newline at end of file |
There was a problem hiding this comment.
π οΈ Refactor suggestion | π Major | ποΈ Heavy lift
Add agentic integration test with real LLM execution.
Per coding guidelines: "Real agentic tests are MANDATORY for every feature: Agent must call agent.start() with a real prompt, call the LLM, and produce actual text responseβnot just smoke tests of object construction."
While this unit test module is comprehensive and well-structured, it lacks an agentic test that validates the tool search feature in a real agent execution flow with LLM interaction and bridge tool discovery/execution.
π Recommended agentic test structure
Add a new test file tests/integration/test_tool_search_agentic.py:
"""Agentic integration test for Tool Search feature."""
import pytest
from praisonaiagents import Agent, ToolSearchConfig
`@pytest.mark.integration`
`@pytest.mark.asyncio`
async def test_tool_search_with_real_llm():
"""Test tool search with real LLM execution and bridge tool calls."""
# Create agent with tool search enabled and many MCP tools
agent = Agent(
instructions="You are a helpful assistant.",
tool_search=ToolSearchConfig(enabled="on"),
tools=[
# ... add 20+ MCP tools to trigger deferral
],
llm="gpt-4o-mini", # or use mock LLM for deterministic testing
)
# Execute with prompt that requires tool discovery
result = await agent.start(
"Search for a weather tool and use it to get the weather in San Francisco"
)
# Verify agent discovered and used bridge tools
assert result.success
assert "weather" in result.response.lower()
# Verify tool_search β tool_describe β tool_call bridge flow in execution traceThis ensures the feature works end-to-end with real agent execution, LLM calls, and bridge tool orchestration. As per coding guidelines, tests should validate actual agentic behavior, not just component construction.
π€ Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/praisonai-agents/tests/unit/tools/test_tool_search.py` around lines 1 -
394, Add a new async integration test that actually runs an Agent end-to-end
using the real LLM: instantiate Agent with ToolSearchConfig(enabled="on"),
supply 20+ deferrable MCP-like tools, call agent.start(...) with a prompt that
requires discovering and invoking a weather tool, then assert result.success is
True and the text response contains expected content (e.g., "weather" or "San
Francisco"); also verify the execution trace includes the bridge flow
(tool_search β tool_describe β tool_call). Use the Agent and ToolSearchConfig
symbols from the codebase and ensure the test is marked as integration/async so
it executes real LLM calls. Ensure the test fails fast if the bridge tools are
not discovered or invoked.
|
Claude finished @MervinPraison's task ββ View job Final Architecture Review - Tool Search Feature βSummaryAfter comprehensive review of the Tool Search progressive disclosure feature, I've successfully addressed all critical issues identified by reviewers and ensured compliance with AGENTS.md architecture principles. Architecture Compliance βThe feature now properly follows all architectural guidelines:
Critical Issues Fixed ββ P0 Issues (Blocking)
β P1 Issues (High Priority)
β P2 Issues (Important)
Todo List
Validation Results β
Final Verdict: APPROVED FOR MERGE βThe Tool Search feature is now production-ready and follows all architectural principles:
Recommendation: Merge this PR. All critical issues have been resolved and the feature demonstrates excellent engineering practices aligned with the PraisonAI architecture principles. Files Modified: Commit: |
- Fix unconditional bridge tool hijacking that broke user tools when disabled - Resolve duplicate ToolSearchConfig classes by consolidating to tools module - Add missing resolve_tool_search resolver function - Fix hot path performance issues in BM25 scoring - Add input validation for tool_call bridge args - Fix cache metadata mismatches in chat_mixin - Remove unnecessary f-string prefixes - Strip __praisonai_deferrable__ from provider payloads - Preserve tool_search config in clone_for_channel - Improve type validation in agent constructor - Fix overly broad MCP name heuristic Co-authored-by: Mervin Praison <MervinPraison@users.noreply.github.com>
Implements comprehensive Tool Search feature for progressive disclosure of MCP and non-core plugin tools, following Hermes Agent design patterns.
Summary
Adds Tool Search β an opt-in progressive-disclosure layer for MCP and non-core plugin tools β with zero overhead when disabled and auto threshold-based activation.
When deferrable tool schemas would consume a large share of the model context window, replaces them with three bridge tools (tool_search, tool_describe, tool_call) and loads individual schemas on demand. Core SDK tools never defer.
Default: tool_search=False (opt-in). Zero overhead when disabled or when auto threshold is not met.
Changes
Phase P0 Implementation Complete
Core Module (tools/tool_search.py):
Agent Integration:
MCP Integration:
Core Tools Classification:
Testing:
API Usage
Design Invariants (from Hermes)
Test Results
β All core functionality tests pass
β Config creation and validation
β Tool classification (core vs deferrable)
β BM25 search with fallback
β Bridge tool assembly and unwrapping
β Agent integration
β MCP tool marking
Future Phases
Fixes #1775
π€ Generated with Claude Code
Co-Authored-By: Claude noreply@anthropic.com
Summary by CodeRabbit
Release Notes
New Features
tool_searchparameter.ToolSearchConfigenables configuration of search limits, thresholds, and core tool lists.Tests