Context
Claude Context is increasingly being used as a cross-client MCP code-search layer: Claude Code, Cursor, Codex, Gemini CLI, Windsurf, VS Code, etc. The README framing — “entire codebase as context” via on-demand hybrid retrieval — is exactly where downstream agent observability starts to get blurry.
When a coding agent answers badly, today I can usually see model/tool spans or token/cost metrics. What is harder to prove is:
- which retrieved code chunks were actually returned to the agent;
- whether two similar queries returned duplicate chunks;
- whether a result was stale relative to the indexed snapshot;
- whether a client/harness later transformed, clipped, or suppressed those chunks;
- how to debug this without logging raw code, prompts, tool arguments, or transcripts.
Proposal
Would you consider exposing an optional, privacy-preserving search/result receipt from search_code responses and/or logs?
A minimal shape could be something like:
{
  "event": "context.search.returned",
  "query_hash": "sha256:...",
  "codebase_path_hash": "sha256:...",
  "index_snapshot_id": "...",
  "result_count": 8,
  "results": [
    {
      "source.path": "src/auth/session.ts",
      "source.range": "L42-L91",
      "source.bytes_hash": "sha256:...",
      "chunk.hash": "sha256:...",
      "rank": 1,
      "score_bucket": "high"
    }
  ],
  "duplicate.dedupe_key": "sha256:..."
}
Important constraint: this should not require raw code text, raw user query, prompts, memory contents, tool args, or transcripts. Paths/ranges/hashes/counts/categories are enough for many debugging and eval workflows.
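To make the constraint concrete, here is a minimal sketch of how such a receipt could be assembled from hashes and metadata alone. It is illustrative, not Claude Context's actual code: the function names, the input keys (path, start_line, end_line, text, file_bytes, score), the score thresholds, and the way the dedupe key is derived are all assumptions of mine.

```python
import hashlib
import json


def sha256_tag(data: bytes) -> str:
    """Return a 'sha256:<hex>' digest string for the given bytes."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def score_bucket(score: float) -> str:
    # Hypothetical thresholds; a real server would tune these to its scorer.
    if score >= 0.75:
        return "high"
    if score >= 0.4:
        return "medium"
    return "low"


def build_search_receipt(query, codebase_path, index_snapshot_id, results):
    """Build a receipt from hashes and metadata only.

    `results` is a ranked list of dicts with hypothetical keys:
    path, start_line, end_line, text, file_bytes, score.
    """
    entries = []
    for rank, r in enumerate(results, start=1):
        entries.append({
            "source.path": r["path"],
            "source.range": f"L{r['start_line']}-L{r['end_line']}",
            "source.bytes_hash": sha256_tag(r["file_bytes"]),
            "chunk.hash": sha256_tag(r["text"].encode("utf-8")),
            "rank": rank,
            "score_bucket": score_bucket(r["score"]),
        })
    # Assumption: the dedupe key hashes the sorted chunk hashes, so two
    # searches that return the same chunk set in any order share a key.
    dedupe_src = "\n".join(sorted(e["chunk.hash"] for e in entries))
    return {
        "event": "context.search.returned",
        "query_hash": sha256_tag(query.encode("utf-8")),
        "codebase_path_hash": sha256_tag(codebase_path.encode("utf-8")),
        "index_snapshot_id": index_snapshot_id,
        "result_count": len(entries),
        "results": entries,
        "duplicate.dedupe_key": sha256_tag(dedupe_src.encode("utf-8")),
    }


def receipt_to_log_line(receipt) -> str:
    """Serialize the receipt as one stable JSON log line."""
    return json.dumps(receipt, sort_keys=True)
```

One caveat on the sketch: a plain SHA-256 of a short query or path can be guessed by brute force, so a keyed hash (for example HMAC with a per-installation secret) may be a better default if receipts leave the machine.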
Why this may fit Claude Context
This is narrower than full agent telemetry. Claude Context does not need to know whether Claude Code/Cursor ultimately placed each chunk in the final model prompt. But it is the best layer to attest:
- “these indexed chunks matched this search request”;
- “this snapshot/version of the codebase was used”;
- “these result identities can be correlated by a harness later.”
Then an external harness/OTel wrapper can emit a separate context.input.loaded event if/when those chunks enter the agent session.
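As a rough illustration of that correlation step, a harness could compare the chunk hashes in a search receipt with the hashes of what it actually loaded. This is a sketch under my own assumptions: the function name and every field of the context.input.loaded event other than the event name itself are hypothetical.

```python
def build_loaded_receipt(search_receipt, loaded_chunk_hashes):
    """Compare returned chunks with what the harness reports as loaded.

    `search_receipt` follows the shape proposed above;
    `loaded_chunk_hashes` is an iterable of 'sha256:...' strings computed
    by the harness over the chunk text that entered the agent session.
    """
    loaded = set(loaded_chunk_hashes)
    returned = [r["chunk.hash"] for r in search_receipt["results"]]
    # Preserve the search ranking order in both output lists.
    kept = [h for h in returned if h in loaded]
    dropped = [h for h in returned if h not in loaded]
    return {
        "event": "context.input.loaded",
        "query_hash": search_receipt["query_hash"],
        "index_snapshot_id": search_receipt["index_snapshot_id"],
        "returned_count": len(returned),
        "loaded_count": len(kept),
        "loaded_chunk_hashes": kept,
        "dropped_chunk_hashes": dropped,
    }
```

Because a clipped or rewritten chunk hashes differently, it would show up as dropped here. That flags that something changed between search and session, but it cannot by itself distinguish clipping from outright suppression.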
Related work / motivation
I have been testing this idea from the observability side, in OpenTelemetry GenAI and Claude Code telemetry wrappers.
The feedback so far pushed the design toward a layered model:
- search/retrieval tools expose result receipts;
- harnesses/agents expose loaded context receipts;
- both stay privacy-first by default.
Would a search_code receipt/log be in scope for Claude Context, or would you rather keep this outside the MCP server as a wrapper around calls?