Expose privacy-preserving search result receipts

## Context

Claude Context is increasingly being used as a cross-client MCP code-search layer: Claude Code, Cursor, Codex, Gemini CLI, Windsurf, VS Code, etc. The README framing — “entire codebase as context” via on-demand hybrid retrieval — is exactly where downstream agent observability starts to get blurry.

When a coding agent answers badly, today I can usually see model/tool spans or token/cost metrics. What is harder to prove is:

- which retrieved code chunks were actually returned to the agent;
- whether two similar queries returned duplicate chunks;
- whether a result was stale relative to the indexed snapshot;
- whether a client/harness later transformed, clipped, or suppressed those chunks;
- how to debug this without logging raw code, prompts, tool arguments, or transcripts.

## Proposal

Would you consider exposing an optional, privacy-preserving **search/result receipt** from `search_code` responses and/or logs?

A minimal shape could be something like:

```json
{
  "event": "context.search.returned",
  "query_hash": "sha256:...",
  "codebase_path_hash": "sha256:...",
  "index_snapshot_id": "...",
  "result_count": 8,
  "results": [
    {
      "source.path": "src/auth/session.ts",
      "source.range": "L42-L91",
      "source.bytes_hash": "sha256:...",
      "chunk.hash": "sha256:...",
      "rank": 1,
      "score_bucket": "high"
    }
  ],
  "duplicate.dedupe_key": "sha256:..."
}
```

Important constraint: this should not require raw code text, raw user query, prompts, memory contents, tool args, or transcripts. Paths/ranges/hashes/counts/categories are enough for many debugging and eval workflows.

## Why this may fit Claude Context

This is narrower than full agent telemetry. Claude Context does not need to know whether Claude Code/Cursor ultimately placed each chunk in the final model prompt. But it is the best layer to attest:

1. “these indexed chunks matched this search request”; 
2. “this snapshot/version of the codebase was used”; 
3. “these result identities can be correlated by a harness later.”

Then an external harness/OTel wrapper can emit a separate `context.input.loaded` event if/when those chunks enter the agent session.

## Related work / motivation

I have been testing this idea from the observability side in OpenTelemetry GenAI and Claude Code telemetry wrappers:

- https://github.com/open-telemetry/semantic-conventions-genai/issues/181
- https://github.com/TechNickAI/claude_telemetry/issues/8
- https://github.com/TechNickAI/claude_telemetry/pull/9

The feedback so far pushed the design toward a layered model:

- search/retrieval tools expose **result receipts**;
- harnesses/agents expose **loaded context receipts**;
- both stay privacy-first by default.

Would a `search_code` receipt/log be in scope for Claude Context, or would you rather keep this outside the MCP server as a wrapper around calls?


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Expose privacy-preserving search result receipts #382

Context

Proposal

Why this may fit Claude Context

Related work / motivation

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Expose privacy-preserving search result receipts #382

Description

Context

Proposal

Why this may fit Claude Context

Related work / motivation

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions