feat: attach graph freshness provenance to MCP tool responses#604

Open
SHudici wants to merge 1 commit into
tirth8205:mainfrom
SHudici:feat/response-provenance

@SHudici SHudici commented Jul 3, 2026

Problem

Tool responses carry no freshness signal. An agent calling query_graph(callers_of, X) gets an answer that may describe a graph built hours ago, on a different branch, at a different commit, and it has no way to tell. The build metadata already exists (last_updated, git_branch, git_head_sha in the metadata table); it is just never surfaced in the responses where agents make decisions.

Fix

Every graph-backed tool response now carries a compact _graph provenance envelope:

"_graph": {
    "updated_at": "2026-07-03T18:22:41",
    "age_seconds": 5121,
    "built_on_branch": "main",
    "built_at_sha": "b72413c9d0aa1e..."
}

Cost: one read-only SQLite open plus a single SELECT of three metadata rows per tool call (run off the event loop for async tools), and about four extra lines of JSON per response.
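To make the per-call cost concrete, here is a minimal sketch of what such a read could look like. It is not the PR's `graph_provenance()`: the function name `read_provenance`, the `key`/`value` column layout of the metadata table, and the `_FIELDS` mapping are assumptions for illustration. `age_seconds` is left out to keep the sketch short.

```python
import sqlite3
from pathlib import Path
from typing import Optional

# Hypothetical mapping from the metadata keys named in this PR to envelope fields.
_FIELDS = {
    "last_updated": "updated_at",
    "git_branch": "built_on_branch",
    "git_head_sha": "built_at_sha",
}


def read_provenance(db_path: Path) -> Optional[dict]:
    """Best-effort read of build metadata; returns None on any failure."""
    try:
        # as_uri() percent-encodes '#', '%' and similar characters in the path,
        # so the SQLite URI parser sees them as part of the filename.
        uri = db_path.resolve().as_uri() + "?mode=ro"
        conn = sqlite3.connect(uri, uri=True)
        try:
            # Assumed schema: metadata(key TEXT, value TEXT).
            rows = dict(
                conn.execute(
                    "SELECT key, value FROM metadata "
                    "WHERE key IN ('last_updated', 'git_branch', 'git_head_sha')"
                ).fetchall()
            )
        finally:
            conn.close()
    except (sqlite3.Error, OSError, ValueError):
        return None  # no graph, unreadable DB, or missing table
    if not rows.get("last_updated"):
        return None  # without a timestamp there is nothing useful to report
    # Branch and SHA are optional: include only the values that are present.
    return {_FIELDS[key]: value for key, value in rows.items() if value}
```

Opening with `mode=ro` means a missing database raises an error instead of being created, which the sketch turns into `None`.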

Testing

13 new tests: metadata read (all fields), URI-hostile repo paths (% and #), future-timestamp clamping, unparseable-timestamp omits age_seconds, optional branch/sha, None for missing last_updated / missing graph DB / invalid repo root, envelope attach, no-provenance pass-through, non-dict pass-through, existing-_graph preservation, and one end-to-end registered-tool response. The #46/#136 async/to_thread guards still pass. Full suite passes.
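The clamping and unparseable-timestamp cases can be illustrated with a small sketch. The helper name `age_seconds_from` and its signature are hypothetical, and the sketch assumes `last_updated` is a naive ISO 8601 string like the one in the sample envelope.

```python
from datetime import datetime
from typing import Optional


def age_seconds_from(updated_at: str, now: datetime) -> Optional[int]:
    """Return the graph age in whole seconds, or None if it cannot be computed."""
    try:
        built = datetime.fromisoformat(updated_at)
        delta = (now - built).total_seconds()
    except (ValueError, TypeError):
        # ValueError: the timestamp is not valid ISO 8601.
        # TypeError: one datetime is timezone-aware and the other is naive.
        return None
    # A build time in the future (clock skew) is clamped to an age of zero.
    return max(0, int(delta))


now = datetime(2026, 7, 3, 19, 48, 2)
print(age_seconds_from("2026-07-03T18:22:41", now))  # 5121
print(age_seconds_from("2026-07-04T00:00:00", now))  # 0
print(age_seconds_from("not-a-date", now))           # None
```

Returning `None` lets the caller drop `age_seconds` from the envelope while still reporting the raw `updated_at` string.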

🤖 Generated with Claude Code

Agents consuming graph tools have no way to tell whether an answer
reflects the current tree: the graph may have been built hours ago,
on a different branch, at a different commit. Every response now
carries a compact `_graph` envelope:

    "_graph": {
        "updated_at": "2026-07-03T18:22:41",
        "age_seconds": 5121,
        "built_on_branch": "main",
        "built_at_sha": "b72413c9d0aa..."
    }

- `graph_provenance()` (tools/_common.py) reads `last_updated`,
  `git_branch`, `git_head_sha` from the graph's metadata table via a
  read-only SQLite connection. The db path is escaped with
  `Path.as_uri()` so URI-significant characters (#, %) in repo paths
  cannot derail the SQLite URI parser. Best-effort by design: any
  failure (no graph, unreadable DB, missing metadata) returns None
  and never fails the tool call.
- `with_provenance()` attaches the envelope to dict responses only,
  skips results that already carry `_graph`, and passes everything
  else through untouched.
- All 27 graph-backed MCP tools in main.py wrap their returns. For
  the five async tools the wrap runs inside the asyncio.to_thread
  worker (the envelope read opens the graph DB, and the event loop
  must never touch disk — tirth8205#46, tirth8205#136). Excluded: get_docs_section_tool,
  list_repos_tool, cross_repo_search_tool (not backed by a single
  repo graph).
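
The attach rules and the async placement described above can be sketched as follows. This is an illustration under assumptions, not the code in tools/_common.py or main.py: the two-argument `with_provenance` signature, the stub helpers `_blocking_query` and `_load_provenance`, and the tool name `callers_of_tool` are all hypothetical.

```python
import asyncio
from typing import Any, Optional


def with_provenance(result: Any, provenance: Optional[dict]) -> Any:
    """Attach a `_graph` envelope to dict results; pass everything else through."""
    if provenance is None:
        return result  # best-effort read failed: leave the response untouched
    if not isinstance(result, dict) or "_graph" in result:
        return result  # non-dict result, or the envelope is already present
    return {**result, "_graph": provenance}  # copy, so the input is not mutated


def _blocking_query(symbol: str) -> dict:
    # Stand-in for a graph query that touches disk.
    return {"callers": [f"caller_of_{symbol}"]}


def _load_provenance() -> Optional[dict]:
    # Stand-in for the metadata read, which also touches disk.
    return {"updated_at": "2026-07-03T18:22:41", "built_on_branch": "main"}


async def callers_of_tool(symbol: str) -> Any:
    def work() -> Any:
        # Both the query and the envelope read run in the worker thread,
        # so the event loop itself never performs disk I/O.
        return with_provenance(_blocking_query(symbol), _load_provenance())

    return await asyncio.to_thread(work)


if __name__ == "__main__":
    print(asyncio.run(callers_of_tool("X")))
```

Wrapping inside `work()` rather than after the `await` is the point of the sketch: wrapping the awaited result would run the envelope read on the event loop.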

13 new tests covering metadata read, URI-hostile paths (% and # in
the repo path), age clamping, unparseable timestamps, optional
fields, all no-op paths, and one end-to-end registered-tool response.