MCP catalog exposure: deterministic catalog search #912

@andreasronge

Summary

Implement deterministic lexical search for upstream MCP tools through catalog/search-tools.

Context

Epic: #908

Spec: https://github.com/andreasronge/ptc_runner/blob/main/Plans/ptc-runner-mcp-catalog-exposure.md (§7.4)

Search should stay deterministic and local: no embeddings, no LLM summaries, and stable ordering for identical catalog state.

Acceptance criteria

  • catalog/search-tools searches server names, metadata, tool names, descriptions, input arg keys, and MCP annotations using deterministic lexical scoring.
  • Results are ordered by exact/prefix/substring match score with stable {server, tool} tie-breaking.
  • Unloaded upstreams with metadata matches return server-level results by default.
  • :load true loads candidate catalogs and can return tool-level matches from newly loaded catalogs.
  • Tests cover deterministic ordering, invalid options, empty query, unloaded server-level matches, :load true, and result-size truncation.

Implementation outline

Integration points (following existing catalog builtin patterns)

  1. Analyzer (lib/ptc_runner/lisp/analyze.ex):

    • Add :search-tools to the known catalog symbols whitelist (~line 239)
    • Add catalog_core_ast_tag(:"search-tools") → :catalog_search_tools (~line 1301)
    • Add dispatch_list_form clauses for 1-2 args (query, optional opts) (~line 454)
    • Update error message for unknown catalog functions to include catalog/search-tools (~lines 243-248, 455-459)
  2. Evaluator (lib/ptc_runner/lisp/eval.ex):

    • Add do_eval({:catalog_search_tools, arg_asts}, eval_ctx) clause (~line 125)
    • Add catalog_op_args(:search_tools, ...) clauses (~line 1397)
  3. CatalogBuiltins (mcp_server/lib/ptc_runner_mcp/catalog_builtins.ex):

    • Add dispatch(:search_tools, [query | rest], ...) clause
    • Implement do_search_tools/5 with lexical scoring
    • Option parsing/validation: :limit (default 8, max 50), :load (default false, must be boolean)
    • Empty/whitespace query → programmer fault
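
The option rules above can be sketched in a few lines. This is an illustrative Python sketch only, not the Elixir implementation: `ProgrammerFault` and `parse_search_opts` are hypothetical names, and the real builtin may also reject things this sketch ignores (for example unknown option keys).

```python
DEFAULT_LIMIT = 8
MAX_LIMIT = 50


class ProgrammerFault(Exception):
    """Hypothetical stand-in for the runtime's programmer-fault signal."""


def parse_search_opts(query, opts=None):
    """Validate the query and options, returning the effective options."""
    # An empty or whitespace-only query is a programmer fault.
    if not isinstance(query, str) or not query.strip():
        raise ProgrammerFault("query must be a non-empty, non-whitespace string")

    opts = opts or {}

    limit = opts.get("limit", DEFAULT_LIMIT)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int):
        raise ProgrammerFault("limit must be an integer")
    if not 1 <= limit <= MAX_LIMIT:
        raise ProgrammerFault(f"limit must be in 1..{MAX_LIMIT}")

    load = opts.get("load", False)
    if not isinstance(load, bool):
        raise ProgrammerFault("load must be a boolean")

    return {"limit": limit, "load": load}
```

Validating before any scoring or catalog loading keeps a bad `:load` or `:limit` from triggering upstream work.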

Search algorithm (spec §7.4)

  1. Tokenize query into lowercase tokens (split on whitespace, underscores, hyphens, camelCase boundaries)
  2. For each tool across all loaded servers, score against:
    • Server name
    • Operator-provided upstream description and capabilities
    • Tool name
    • Tool description
    • Input schema property names (arg keys)
    • MCP annotations when present
  3. Scoring tiers:
    • Exact token match → highest score
    • Prefix match → medium score
    • Substring match → lower score
    • Name field boost → small bonus for matches in server/tool names
  4. Tie-breaking: {server_name, tool_name} ascending (deterministic)
  5. Return top :limit results
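
The steps above can be sketched roughly as follows. This is an illustrative Python sketch, not the project's Elixir code. The numeric weights, the "best field match per query token, summed" aggregation, and the dictionary keys (`description`, `capabilities`, `arg_keys`, `annotations`) are all assumptions; the issue only fixes the tier order (exact > prefix > substring), the name boost, and the `{server_name, tool_name}` tie-break.

```python
import re

# Assumed weights: only their relative order comes from the issue.
EXACT, PREFIX, SUBSTRING, NAME_BONUS = 100, 50, 10, 5

# Zero-width boundary between a lowercase letter or digit and an uppercase letter.
_CAMEL = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def tokenize(text):
    """Split on whitespace, underscores, hyphens and camelCase boundaries."""
    spaced = _CAMEL.sub(" ", text)
    return [t.lower() for t in re.split(r"[\s_\-]+", spaced) if t]


def match_score(query_token, field_token):
    if query_token == field_token:
        return EXACT
    if field_token.startswith(query_token):
        return PREFIX
    if query_token in field_token:
        return SUBSTRING
    return 0


def score_tool(query_tokens, server, tool):
    # (text, is_name_field) pairs; name fields get the small bonus.
    fields = [
        (server["name"], True),
        (server.get("description", ""), False),
        (" ".join(server.get("capabilities", [])), False),
        (tool["name"], True),
        (tool.get("description", ""), False),
        (" ".join(tool.get("arg_keys", [])), False),
        (" ".join(tool.get("annotations", [])), False),
    ]
    total = 0
    for q in query_tokens:
        best = 0
        for text, is_name in fields:
            for f in tokenize(text):
                s = match_score(q, f)
                if s and is_name:
                    s += NAME_BONUS
                best = max(best, s)
        total += best  # assumed aggregation: best match per query token
    return total


def search_tools(query, servers, limit=8):
    q = tokenize(query)
    scored = []
    for server in servers:
        for tool in server.get("tools", []):
            s = score_tool(q, server, tool)
            if s > 0:
                # Negated score sorts descending; names break ties ascending.
                scored.append((-s, server["name"], tool["name"]))
    scored.sort()
    return [{"server": sv, "tool": t, "score": -s} for s, sv, t in scored[:limit]]
```

Because the sort key is a plain tuple of score and names, identical catalog state always yields the same order, regardless of the order in which servers were registered.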

Server-level matches for unloaded upstreams

When :load false (default):

  • Score unloaded upstream metadata (name, description, capabilities) against query
  • If score > 0, return a server-level entry:
    {:server "name" :tool nil :summary "..." :catalog_loaded false
     :next "(catalog/list-tools \"name\" {:limit 20})"}

When :load true:

  • Load candidate upstream catalogs via ensure_started before scoring
  • Can return tool-level matches from newly loaded catalogs
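
A rough Python sketch of the default (:load false) path, for illustration only. The substring check stands in for the real lexical scoring, using the operator description as `summary` is an assumption, and `server_level_matches` and the upstream dictionary keys are hypothetical names. The :load true path is not modeled here.

```python
def server_level_matches(query, upstreams):
    """Return server-level entries for unloaded upstreams whose metadata matches."""
    tokens = query.lower().split()
    results = []
    # Sort by name so the output order is stable for identical input.
    for up in sorted(upstreams, key=lambda u: u["name"]):
        if up.get("catalog_loaded"):
            continue  # loaded upstreams are scored per tool elsewhere
        metadata = " ".join(
            [up["name"], up.get("description", ""), *up.get("capabilities", [])]
        ).lower()
        # Simplified stand-in for "score > 0".
        if any(t in metadata for t in tokens):
            results.append({
                "server": up["name"],
                "tool": None,
                "summary": up.get("description", ""),  # assumed summary source
                "catalog_loaded": False,
                "next": f'(catalog/list-tools "{up["name"]}" {{:limit 20}})',
            })
    return results
```

The `next` hint gives the calling program a concrete follow-up expression, so it can drill into the upstream without the search itself paying the cost of loading the catalog.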

Result size capping

Apply maybe_cap_list_result/2 (already exists) to the scored results, same as catalog/list-tools.

Edge cases

  • Empty or whitespace-only query string → programmer fault
  • :limit outside 1..50 → programmer fault
  • :load with non-boolean value → programmer fault
  • No matching results → return [] (empty list, not nil)
  • All upstreams unloaded with :load false → return only server-level matches
  • Query matches only metadata of unloaded upstream → server-level result
  • Tokenization of camelCase, snake_case, kebab-case tool names
  • Result size exceeds max_catalog_result_bytes → truncate or world fault per §8.1

Test scenarios

Core Lisp tests (test/catalog_builtins_test.exs)

  • Analyzer accepts catalog/search-tools with 1 arg (query only)
  • Analyzer accepts catalog/search-tools with 2 args (query + opts)
  • Analyzer rejects catalog/search-tools with 0 or 3+ args
  • Unknown catalog member error message includes catalog/search-tools
  • catalog/search-tools without catalog_exec raises aggregator-mode fault
  • World fault from search returns nil
  • Search results usable in program logic (map, filter, etc.)

MCP server tests (mcp_server/test/ptc_runner_mcp/catalog_builtins_test.exs)

  • Deterministic ordering: same query always produces same result order
  • Exact match scores higher than prefix/substring
  • Tool name matches rank higher than description-only matches
  • Empty query is programmer fault
  • Invalid :limit is programmer fault
  • Invalid :load (non-boolean) is programmer fault
  • Server-level matches for unloaded upstreams (:load false)
  • :load true loads catalogs and returns tool-level matches
  • Result-size truncation via max_catalog_result_bytes
  • Multi-server search returns results from all loaded servers

Blocked by

#911 (closed)

Out of scope

  • Semantic/vector retrieval
  • LLM-generated summaries
  • Fuzzy/edit-distance matching

Automation State

Field     Value
Status    SUCCESS
PR        #924
Branch    claude/912-catalog-search-tools
Attempts  1

Details: Implemented catalog/search-tools as the 5th catalog builtin following spec §7.4. Full analyzer/evaluator/implementation pipeline with 16 new tests covering scoring, validation, server-level matches, :load true, and result capping.
