Summary
Implement deterministic lexical search for upstream MCP tools through catalog/search-tools.
Context
Epic: #908
Spec: https://github.com/andreasronge/ptc_runner/blob/main/Plans/ptc-runner-mcp-catalog-exposure.md (§7.4)
Search should stay deterministic and local: no embeddings, no LLM summaries, and stable ordering for identical catalog state.
Acceptance criteria
- catalog/search-tools searches server names, metadata, tool names, descriptions, input arg keys, and MCP annotations using deterministic lexical scoring.
- Results are ordered by exact/prefix/substring match score with stable {server, tool} tie-breaking.
- Unloaded upstreams with metadata matches return server-level results by default.
- :load true loads candidate catalogs and can return tool-level matches from newly loaded catalogs.
- Tests cover deterministic ordering, invalid options, empty query, unloaded server-level matches, :load true, and result-size truncation.
Implementation outline
Integration points (following existing catalog builtin patterns)
- Analyzer (lib/ptc_runner/lisp/analyze.ex):
  - Add :search-tools to the known catalog symbols whitelist (~line 239)
  - Add catalog_core_ast_tag(:"search-tools") → :catalog_search_tools (~line 1301)
  - Add dispatch_list_form clauses for 1-2 args (query, optional opts) (~line 454)
  - Update error message for unknown catalog functions to include catalog/search-tools (~lines 243-248, 455-459)
- Evaluator (lib/ptc_runner/lisp/eval.ex):
  - Add do_eval({:catalog_search_tools, arg_asts}, eval_ctx) clause (~line 125)
  - Add catalog_op_args(:search_tools, ...) clauses (~line 1397)
- CatalogBuiltins (mcp_server/lib/ptc_runner_mcp/catalog_builtins.ex):
  - Add dispatch(:search_tools, [query | rest], ...) clause
  - Implement do_search_tools/5 with lexical scoring
  - Option parsing/validation: :limit (default 8, max 50), :load (default false, must be boolean)
  - Empty/whitespace query → programmer fault
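The option rules above could be checked along these lines. This is a minimal sketch, not the actual CatalogBuiltins code: the module name SearchOptsSketch, the atom-keyed opts map, and the {:error, {:programmer_fault, msg}} return shape are all assumptions made for illustration.

```elixir
defmodule SearchOptsSketch do
  # Defaults and bounds taken from the issue text: :limit default 8, max 50.
  @default_limit 8
  @max_limit 50

  # Returns {:ok, %{limit: n, load: bool}} or a hypothetical
  # {:error, {:programmer_fault, message}} tuple.
  def parse(query, opts \\ %{}) do
    with :ok <- check_query(query),
         {:ok, limit} <- check_limit(Map.get(opts, :limit, @default_limit)),
         {:ok, load} <- check_load(Map.get(opts, :load, false)) do
      {:ok, %{limit: limit, load: load}}
    end
  end

  # Empty or whitespace-only queries are rejected.
  defp check_query(q) when is_binary(q) do
    if String.trim(q) == "", do: fault("query must not be empty"), else: :ok
  end

  defp check_query(_), do: fault("query must be a string")

  # :limit must be an integer in 1..50.
  defp check_limit(n) when is_integer(n) and n >= 1 and n <= @max_limit, do: {:ok, n}
  defp check_limit(_), do: fault(":limit must be an integer in 1..#{@max_limit}")

  # :load must be a strict boolean; truthy values such as "yes" are rejected.
  defp check_load(b) when is_boolean(b), do: {:ok, b}
  defp check_load(_), do: fault(":load must be a boolean")

  defp fault(msg), do: {:error, {:programmer_fault, msg}}
end
```

Validating everything up front keeps the scoring path free of option checks, and the `with` chain reports the first failing rule only.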
Search algorithm (spec §7.4)
- Tokenize query into lowercase tokens (split on whitespace, underscores, hyphens, camelCase boundaries)
- For each tool across all loaded servers, score against:
  - Server name
  - Operator-provided upstream description and capabilities
  - Tool name
  - Tool description
  - Input schema property names (arg keys)
  - MCP annotations when present
- Scoring tiers:
  - Exact token match → highest score
  - Prefix match → medium score
  - Substring match → lower score
  - Name field boost → small bonus for matches in server/tool names
- Tie-breaking: {server_name, tool_name} ascending (deterministic)
- Return top :limit results
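The algorithm above can be sketched roughly as follows. The module name LexicalSearchSketch, the entry map shape, and the numeric weights are assumptions chosen for illustration; the spec's actual weights and field handling may differ, and upstream metadata and MCP annotations are left out to keep the sketch short.

```elixir
defmodule LexicalSearchSketch do
  # Illustrative weights only (assumed, not taken from the spec).
  @exact 100
  @prefix 60
  @substring 30
  @name_boost 10

  # Split on whitespace, underscores, hyphens, and lower-to-upper camelCase
  # boundaries, then lowercase every token.
  def tokenize(text) do
    text
    |> String.replace(~r/([a-z0-9])([A-Z])/, "\\1 \\2")
    |> String.split(~r/[\s_\-]+/, trim: true)
    |> Enum.map(&String.downcase/1)
  end

  # Hypothetical entry shape:
  # %{server: binary, tool: binary, description: binary, arg_keys: [binary]}
  def score(entry, query_tokens) do
    name_tokens = tokenize(entry.server) ++ tokenize(entry.tool)
    other_tokens = tokenize(entry.description) ++ Enum.flat_map(entry.arg_keys, &tokenize/1)

    Enum.reduce(query_tokens, 0, fn q, acc ->
      name = best_tier(q, name_tokens)
      other = best_tier(q, other_tokens)
      # Small bonus when the query token hits a server or tool name.
      boost = if name > 0, do: @name_boost, else: 0
      acc + max(name, other) + boost
    end)
  end

  # Highest tier that query token q reaches against any field token.
  defp best_tier(q, field_tokens) do
    Enum.reduce(field_tokens, 0, fn t, best ->
      tier =
        cond do
          t == q -> @exact
          String.starts_with?(t, q) -> @prefix
          String.contains?(t, q) -> @substring
          true -> 0
        end

      max(best, tier)
    end)
  end

  # Score descending, then {server, tool} ascending, so identical catalog
  # state always yields identical ordering.
  def search(entries, query, limit) do
    q = tokenize(query)

    entries
    |> Enum.map(&{score(&1, q), &1})
    |> Enum.filter(fn {s, _} -> s > 0 end)
    |> Enum.sort_by(fn {s, e} -> {-s, e.server, e.tool} end)
    |> Enum.take(limit)
    |> Enum.map(fn {_, e} -> e end)
  end
end
```

Putting the negated score and the names into a single sort key makes the tie-break explicit instead of relying on the input order of the entries.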
Server-level matches for unloaded upstreams
When :load false (default):
- Score unloaded upstream metadata (name, description, capabilities) against query
- If score > 0, return a server-level entry:
{:server "name" :tool nil :summary "..." :catalog_loaded false
:next "(catalog/list-tools "name" {:limit 20})"}
When :load true:
- Load candidate upstream catalogs via ensure_started before scoring
- Can return tool-level matches from newly loaded catalogs
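For the default :load false path, building server-level entries might look like the sketch below. The module name ServerLevelSketch, the upstream map shape, the plain substring match, and the use of the upstream description as the :summary are assumptions; the real implementation would reuse the tiered scoring and is not shown here.

```elixir
defmodule ServerLevelSketch do
  # Hypothetical upstream shape:
  # %{name: binary, description: binary, capabilities: [binary], catalog_loaded: boolean}
  def server_level_matches(upstreams, query) do
    q_tokens = query |> String.downcase() |> String.split(~r/\s+/, trim: true)

    upstreams
    # Only upstreams whose catalogs are not loaded yield server-level entries.
    |> Enum.reject(& &1.catalog_loaded)
    |> Enum.filter(&matches?(&1, q_tokens))
    # Sort by name so the output order is deterministic.
    |> Enum.sort_by(& &1.name)
    |> Enum.map(&entry/1)
  end

  # Simplified match: any query token appears in the upstream metadata.
  defp matches?(upstream, q_tokens) do
    haystack =
      [upstream.name, upstream.description | upstream.capabilities]
      |> Enum.join(" ")
      |> String.downcase()

    Enum.any?(q_tokens, &String.contains?(haystack, &1))
  end

  # Mirrors the server-level entry from the issue: no tool, catalog not
  # loaded, and a :next hint pointing at catalog/list-tools.
  defp entry(upstream) do
    %{
      server: upstream.name,
      tool: nil,
      summary: upstream.description,
      catalog_loaded: false,
      next: ~s|(catalog/list-tools "#{upstream.name}" {:limit 20})|
    }
  end
end
```

The :next hint lets a caller drill into the upstream explicitly instead of paying the catalog load cost during search.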
Result size capping
Apply maybe_cap_list_result/2 (already exists) to the scored results, same as catalog/list-tools.
Edge cases
- Empty or whitespace-only query string → programmer fault
- :limit outside 1..50 → programmer fault
- :load with non-boolean value → programmer fault
- No matching results → return [] (empty list, not nil)
- All upstreams unloaded with :load false → return only server-level matches
- Query matches only metadata of unloaded upstream → server-level result
- Tokenization of camelCase, snake_case, kebab-case tool names
- Result size exceeds max_catalog_result_bytes → truncate or world fault per §8.1
Test scenarios
Core Lisp tests (test/catalog_builtins_test.exs)
- Analyzer accepts catalog/search-tools with 1 arg (query only)
- Analyzer accepts catalog/search-tools with 2 args (query + opts)
- Analyzer rejects catalog/search-tools with 0 or 3+ args
- Unknown catalog member error message includes catalog/search-tools
- catalog/search-tools without catalog_exec raises aggregator-mode fault
- World fault from search returns nil
- Search results usable in program logic (map, filter, etc.)
MCP server tests (mcp_server/test/ptc_runner_mcp/catalog_builtins_test.exs)
- Deterministic ordering: same query always produces same result order
- Exact match scores higher than prefix/substring
- Tool name matches rank higher than description-only matches
- Empty query is programmer fault
- Invalid :limit is programmer fault
- Invalid :load (non-boolean) is programmer fault
- Server-level matches for unloaded upstreams (:load false)
- :load true loads catalogs and returns tool-level matches
- Result-size truncation via max_catalog_result_bytes
- Multi-server search returns results from all loaded servers
Blocked by
Blocked by: #911 (closed)
Out of scope
- Semantic/vector retrieval
- LLM-generated summaries
- Fuzzy/edit-distance matching
Automation State
| Field | Value |
|---|---|
| Status | SUCCESS |
| PR | #924 |
| Branch | claude/912-catalog-search-tools |
| Attempts | 1 |
Details: Implemented catalog/search-tools as the 5th catalog builtin following spec §7.4. Full analyzer/evaluator/implementation pipeline with 16 new tests covering scoring, validation, server-level matches, :load true, and result capping.