Consider privacy-preserving context input events #8

@caioribeiroclw-pixel

Hi — useful wrapper. One observability gap I keep running into with Claude Code/Cursor-style agent traces is that tool/model spans answer "what ran?", but not always "what context was loaded into the session before the model call?"

For headless/debug workflows, a privacy-preserving context input event would be useful alongside tool calls, tokens, cost, and timing.

A minimal event could avoid raw prompt/context content and record only categorical + hashed evidence, e.g.:

{
  "event.name": "context.input.loaded",
  "gen_ai.conversation.id": "...",
  "context.input.kind": "agent_instructions",
  "context.input.source.path": "AGENTS.md",
  "context.input.source.bytes_hash": "sha256:...",
  "context.input.delivered.hash": "sha256:...",
  "context.input.loaded_by": "native_file_discovery|hook|mcp|generated_fallback",
  "context.input.activation": "session_start|on_demand",
  "context.input.scope": "repo|workspace|session",
  "context.input.duplicate.suppression_policy": "suppress_equal_dedupe_key_within_scope"
}
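
For illustration, here is a minimal sketch of how an event like the one above could be assembled using only the Python standard library. The helper names (`sha256_tag`, `build_context_input_event`) and the sample values are hypothetical; the field names just mirror the example, and none of this is an existing claude_telemetry API.

```python
import hashlib
import json


def sha256_tag(data: bytes) -> str:
    """Return a 'sha256:<hex>' tag so raw content never enters the event."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def build_context_input_event(
    conversation_id: str,
    kind: str,
    source_path: str,
    source_bytes: bytes,
    delivered_bytes: bytes,
    loaded_by: str,
    activation: str,
    scope: str,
) -> dict:
    """Hypothetical helper: build a context.input.loaded event from hashes only."""
    return {
        "event.name": "context.input.loaded",
        "gen_ai.conversation.id": conversation_id,
        "context.input.kind": kind,
        "context.input.source.path": source_path,
        # Hash of the file as found on disk.
        "context.input.source.bytes_hash": sha256_tag(source_bytes),
        # Hash of what was actually handed to the session.
        "context.input.delivered.hash": sha256_tag(delivered_bytes),
        "context.input.loaded_by": loaded_by,
        "context.input.activation": activation,
        "context.input.scope": scope,
    }


# Made-up sample content; the delivered copy is truncated to simulate clipping.
raw = b"# Agent rules\nAlways run the tests.\n"
event = build_context_input_event(
    conversation_id="conv-123",
    kind="agent_instructions",
    source_path="AGENTS.md",
    source_bytes=raw,
    delivered_bytes=raw[:16],
    loaded_by="native_file_discovery",
    activation="session_start",
    scope="repo",
)
print(json.dumps(event, indent=2))
```

Because the two hashes differ in this sample, a reader of the trace can tell the content was altered between disk and delivery without ever seeing the content itself.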

Why I think this matters:

  • Claude Code now has native OTel, and wrappers like this can make headless runs debuggable.
  • Skills/hooks/MCP/context files can all add instructions before a model call.
  • If the same AGENTS.md/skill/rule is discovered through two paths, normal tool-call traces may not show whether it was loaded once, duplicated, transformed, or clipped.
  • Logging raw prompt/context is often too sensitive, but hashes + paths + categorical fields are usually enough to debug "wrong/missing/duplicate context" failures.
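
As a rough sketch of the debugging step described in the last two bullets, the function below classifies each source path as loaded once, duplicated, or transformed/clipped using only hashes and paths. The function name `classify_context_loads`, the status labels, and the placeholder hash values are hypothetical, and it assumes the source and delivered hashes are computed over the same byte representation, which a real pipeline may not guarantee.

```python
from collections import defaultdict


def classify_context_loads(events: list[dict]) -> dict[str, str]:
    """Hypothetical classifier: map each source path to a load status."""
    by_path: dict[str, list[dict]] = defaultdict(list)
    for e in events:
        by_path[e["context.input.source.path"]].append(e)

    status: dict[str, str] = {}
    for path, loads in by_path.items():
        # A mismatch between source and delivered hashes means the content
        # was changed (transformed or clipped) somewhere before delivery.
        altered = any(
            e["context.input.source.bytes_hash"] != e["context.input.delivered.hash"]
            for e in loads
        )
        if altered:
            status[path] = "transformed_or_clipped"
        elif len(loads) > 1:
            status[path] = "duplicated"
        else:
            status[path] = "loaded_once"
    return status


# Placeholder hashes for illustration only; real values would be full digests.
sample_events = [
    {"context.input.source.path": "AGENTS.md",
     "context.input.source.bytes_hash": "sha256:aaa",
     "context.input.delivered.hash": "sha256:aaa",
     "context.input.loaded_by": "native_file_discovery"},
    {"context.input.source.path": "AGENTS.md",
     "context.input.source.bytes_hash": "sha256:aaa",
     "context.input.delivered.hash": "sha256:aaa",
     "context.input.loaded_by": "hook"},
    {"context.input.source.path": "rules/style.md",
     "context.input.source.bytes_hash": "sha256:bbb",
     "context.input.delivered.hash": "sha256:ccc",
     "context.input.loaded_by": "mcp"},
]
load_status = classify_context_loads(sample_events)
print(load_status)
```

In this sample, `AGENTS.md` arrives unchanged through two loaders, while `rules/style.md` is delivered with a different hash than its source.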

I put together a small public fixture that converts a local session JSONL into OTel-like context.input.loaded events without raw prompts, raw context, tool args, memory contents, or transcript bodies:

https://github.com/caioribeiroclw-pixel/pluribus/tree/main/examples/context-input-evidence

Question: would a small, opt-in context-input event like this fit claude_telemetry's scope, or is it better left to a separate pre/post-processor that enriches traces after the run?
