feat(parsers): add agent-trace.dev v1 sidecar parser by joahg · Pull Request #12 · ai4curation/ai-blame

joahg · 2026-05-10T01:15:23Z

What

Adds a third parser, AgentTraceParser, that consumes the open
agent-trace.dev v1 spec (.agent-trace/*.json
sidecars) and turns each files[].conversations[].ranges[] entry into
an EditRecord. Registered in ParserRegistry::new() ahead of the
existing Claude / Codex parsers and auto-discovered via a new
get_agent_trace_dirs() (<cwd>/.agent-trace/, ~/.agent-trace/).
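
As a rough illustration of the discovery step, the sketch below lists the two candidate directories named above. It is not the PR's `get_agent_trace_dirs()`: the function name `candidate_agent_trace_dirs`, its parameters, and the choice to skip any filesystem check are assumptions made for the example.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for `get_agent_trace_dirs()`: returns the
/// candidate sidecar directories without touching the filesystem.
fn candidate_agent_trace_dirs(cwd: &Path, home: Option<&Path>) -> Vec<PathBuf> {
    // Project-local directory: <cwd>/.agent-trace/
    let mut dirs = vec![cwd.join(".agent-trace")];
    if let Some(home) = home {
        // User-level directory: ~/.agent-trace/
        let global = home.join(".agent-trace");
        // Avoid listing the same directory twice when cwd == home.
        if !dirs.contains(&global) {
            dirs.push(global);
        }
    }
    dirs
}
```

A real implementation would presumably also check that each directory exists before handing it to `get_all_trace_dirs()`.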

Why

ai-blame today supports Claude Code and Codex/Copilot via bespoke
parsers per agent. The agent-trace.dev v1 spec is an open,
agent-agnostic JSON sidecar format intended exactly for this
attribution use case, so a single parser unlocks support for every
producer of the spec — first-party emitters or third-party exporters
that convert an agent's native logs — instead of growing the
parser-per-agent matrix forever.

End-to-end smoke test

Ran against 21 real spec records validated against the upstream
v1 schema:

$ ai-blame stats -t .agent-trace
Files with edits (all files): 41
Total successful edits: 126

$ ai-blame report -t .agent-trace
=== Summary ===
File                                               | Edits | First Edit       | Last Edit       
---------------------------------------------------------------------------------------------------
i18n.md                                            |     2 | 2026-04-13 16:28 | 2026-04-13 16:28
SKILL.md                                           |     6 | 2026-04-17 16:54 | 2026-04-17 16:54
verify-nx-required-tasks.test.ts                   |     6 | 2026-05-08 22:30 | 2026-05-08 22:30
…

Limitations (documented in the parser docstring + README)

agent-trace.dev records carry attribution ranges + content_hash
(spec §6.3) rather than raw old_string / new_string patches. As a
result:

| Command | Works on agent-trace records? |
| --- | --- |
| `stats`, `timeline`, `report`, `transcript` | ✅ Fully |
| `blame`, `annotate` | ⚠️ Marks touched lines but cannot do the patch-walk reconstruction the native parsers do |

A future iteration could use content_hash for position-independent
attribution — left as follow-up.

Mapping

Each spec range becomes one EditRecord:

file_path — files[].path (already repo-relative per spec).
timestamp — record's top-level timestamp (per-revision, not per-edit).
model — contributor.model_id (range override → conversation default → unknown).
session_id — conversation.url → related[type=session].url → unknown.
agent_tool / agent_version — record's top-level tool block.
is_create / change_size — heuristic: ranges starting at line 1 are treated as create-shaped; change_size is the line span.
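
The `is_create` / `change_size` heuristic can be sketched as below. The types are simplified and hypothetical, not the real `ai-blame` definitions; the `start_line` / `end_line` field names and the inclusive line count are assumptions.

```rust
/// Simplified, hypothetical edit record; only the fields needed here.
#[derive(Debug, Clone, PartialEq)]
struct EditRecord {
    file_path: String,
    timestamp: String,
    is_create: bool,
    change_size: u32,
}

/// One entry of `files[].conversations[].ranges[]` (field names assumed).
struct Range {
    start_line: u32,
    end_line: u32,
}

/// Turn one spec range into one edit record.
fn range_to_edit(path: &str, record_timestamp: &str, range: &Range) -> EditRecord {
    EditRecord {
        file_path: path.to_string(),
        // Per-revision timestamp: every range in a record shares it.
        timestamp: record_timestamp.to_string(),
        // Heuristic: a range starting at line 1 is treated as create-shaped.
        is_create: range.start_line == 1,
        // Line span, counted inclusively (assumption); saturating
        // arithmetic tolerates malformed ranges where end < start.
        change_size: range
            .end_line
            .saturating_sub(range.start_line)
            .saturating_add(1),
    }
}
```

Under this sketch, a file covered by one range starting at line 1 and a second range further down yields one create-shaped record and one edit-shaped record with the same timestamp.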

Tests

7 new unit tests in parsers::agent_trace::tests covering parse
output, file-pattern filter, session-id fallback to related[],
can_parse acceptance + rejection (incl. refusing to steal .jsonl
files from sibling parsers), and collect_trace_files extension
filtering.
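
For illustration, the extension half of that filtering might look like the sketch below. The helper name `has_sidecar_extension` is hypothetical, and the actual `can_parse` also applies a spec-shaped discriminator that is not reproduced here.

```rust
use std::path::Path;

/// Hypothetical extension check: accept `.json` sidecars and leave
/// `.jsonl` (and everything else) to the sibling parsers.
fn has_sidecar_extension(path: &Path) -> bool {
    // `extension()` yields "jsonl" for `x.jsonl`, so an exact match on
    // "json" is enough to keep the two formats apart.
    path.extension().and_then(|ext| ext.to_str()) == Some("json")
}
```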

$ cargo test --lib
test result: ok. 22 passed; 0 failed; 0 ignored

cargo fmt --check and cargo clippy --lib are both clean.

Adds support for the open, vendor-neutral agent-trace.dev v1 spec (https://agent-trace.dev/) as a third parser alongside the existing Claude Code and Codex/Copilot native log parsers.

Why
---
ai-blame previously had to grow one bespoke parser per agent. The agent-trace.dev v1 spec is an open, agent-agnostic JSON sidecar format designed exactly for this attribution use case, and any producer of the spec (first-party emitter or third-party exporter from native logs) is unlocked at once.

What
----
- New `src/parsers/agent_trace.rs` reads `.agent-trace/*.json` records conforming to the v1 schema and emits one `EditRecord` per `files[].conversations[].ranges[]` entry.
- Registered in `ParserRegistry::new()` first so the unambiguous `.json` + spec-shaped discriminator wins before the `.jsonl` parsers see the file.
- `get_agent_trace_dirs()` discovers `<cwd>/.agent-trace/` and `~/.agent-trace/` automatically and feeds them into `get_all_trace_dirs()`.
- Smoke-tested end-to-end against 21 real spec records: 126 edits across 41 files extracted via `ai-blame stats` / `report`.

Limitations
-----------
agent-trace.dev records carry attribution ranges + `content_hash` (spec §6.3) rather than raw `old_string` / `new_string` patches, so `stats` / `timeline` / `report` / `transcript` work fully but `blame` / `annotate` cannot perform the same patch-walk reconstruction as the native parsers. Documented in the parser module docstring and in the README.

Tests
-----
7 new unit tests in `parsers::agent_trace::tests` covering parse, filter, fallback, can_parse acceptance/rejection, and `collect_trace_files` extension filtering. Full `cargo test` is green (22 passing); `cargo fmt --check` and `cargo clippy` clean.

Amp-Thread-ID: https://ampcode.com/threads/T-019e0ee7-99b6-72e7-a9eb-e71bba013ceb
Co-authored-by: Amp <amp@ampcode.com>

When the spec's top-level `tool` block is absent (some emitters drop the whole block because the spec requires both `name` and `version` when `tool` is present, and they only have a name) we were falling all the way back to `agent_tool=agent-trace` / `model=unknown`, even though the conversation URL or related-session URN clearly identified the agent.

Now we walk a per-conversation derivation chain:

- `agent_tool` ← `tool.name` → conversation URL host (ampcode.com → amp, cursor.{sh,com} → cursor, claude.ai → claude-code, {openai,chatgpt}.com → codex, *.block.xyz → goose) → session URN agent slug (`urn:*:session:<agent>:<id>`) → `agent-trace`.
- `model` ← per-range `contributor.model_id` → conversation `contributor.model_id` → `<contributor.type> (model unspecified)` (e.g. `ai (model unspecified)`) → `unknown`.
- `session_id` ← trailing path/URN segment of conversation `url` → trailing segment of `related[type=session]` → full URL/URN → `unknown`. Trailing-segment matches what the native parsers use, so cross-tool correlation works.

Also derives per-conversation rather than per-record so a single record covering multiple agents (e.g. a hand-off) attributes each conversation correctly.

Verified end-to-end against the same 21 real records: previously every edit displayed as `agent-trace / unknown`; now they correctly attribute as `amp / ai (model unspecified)`.

Tests: 4 new (agent_url_host_mapping, agent_session_urn_extraction, derives_agent_from_amp_url_when_tool_block_absent, record_tool_name_takes_priority_over_url_sniffing); existing tests updated for trailing-segment session_id. cargo test: 26 passing; fmt + clippy clean.

Amp-Thread-ID: https://ampcode.com/threads/T-019e0ee7-99b6-72e7-a9eb-e71bba013ceb
Co-authored-by: Amp <amp@ampcode.com>
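
The `agent_tool` and `session_id` chains from this commit message can be sketched with the standard library alone. This is a minimal sketch, not the parser's code: all function names here are hypothetical, the URL handling is deliberately naive (no `url` crate, no query-string handling), and exact host matching is an assumption.

```rust
/// Naive host extraction: the text between "://" and the next '/', '?' or '#'.
fn url_host(url: &str) -> Option<&str> {
    let rest = url.split_once("://")?.1;
    let authority = rest.split(|c| c == '/' || c == '?' || c == '#').next()?;
    // Drop an optional port.
    let host = authority.split(':').next()?;
    if host.is_empty() { None } else { Some(host) }
}

/// Host table taken from the commit message.
fn agent_from_host(host: &str) -> Option<&'static str> {
    let host = host.to_ascii_lowercase();
    match host.as_str() {
        "ampcode.com" => Some("amp"),
        "cursor.sh" | "cursor.com" => Some("cursor"),
        "claude.ai" => Some("claude-code"),
        "openai.com" | "chatgpt.com" => Some("codex"),
        h if h.ends_with(".block.xyz") => Some("goose"),
        _ => None,
    }
}

/// Extract `<agent>` from `urn:*:session:<agent>:<id>`.
fn agent_from_session_urn(urn: &str) -> Option<&str> {
    let parts: Vec<&str> = urn.split(':').collect();
    if parts.len() >= 5 && parts[0] == "urn" && parts[2] == "session" && !parts[3].is_empty() {
        Some(parts[3])
    } else {
        None
    }
}

/// tool.name → conversation URL host → session URN slug → "agent-trace".
fn derive_agent_tool(
    tool_name: Option<&str>,
    conversation_url: Option<&str>,
    session_urn: Option<&str>,
) -> String {
    tool_name
        .map(str::to_string)
        .or_else(|| {
            conversation_url
                .and_then(|u| url_host(u))
                .and_then(|h| agent_from_host(h))
                .map(str::to_string)
        })
        .or_else(|| {
            session_urn
                .and_then(|u| agent_from_session_urn(u))
                .map(str::to_string)
        })
        .unwrap_or_else(|| "agent-trace".to_string())
}

/// Trailing path/URN segment, falling back to the full identifier.
fn trailing_segment(id: &str) -> &str {
    id.trim_end_matches('/')
        .rsplit(|c| c == '/' || c == ':')
        .next()
        .filter(|s| !s.is_empty())
        .unwrap_or(id)
}
```

Because the chain takes its inputs per conversation, calling `derive_agent_tool` once per `conversations[]` entry is what lets a hand-off record attribute each conversation separately.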

joahg · 2026-05-10T01:37:33Z

🤖 Sent by Joah's AI agent:

End-to-end output against 21 real .agent-trace/*.json v1 sidecars (126 edits across 41 files):

`stats`

$ ai-blame stats -t .agent-trace
Trace directory: ".agent-trace"
Trace files: 0          ← legacy .jsonl-only counter
  Session traces: 0
  Agent traces: 0

Files with edits (all files): 41
Total successful edits: 126

`timeline -n 0`

All 126 entries correctly attributed (representative sample):

=== Timeline of Actions ===
Showing 126 most recent edits

Timestamp            Action     File                                               Model                     Agent
-----------------------------------------------------------------------------------------------------------------------------
2026-05-08 22:30:52  CREATED    .../square-web-platform-stamps/apps/OWNERS.yaml    ai (model unspecified)    amp
2026-05-08 22:30:52  CREATED    .../verify-nx-required-tasks.ts                    ai (model unspecified)    amp
2026-05-08 22:30:52  CREATED    .../square-web-platform-stamps/scripts/ci.sh       ai (model unspecified)    amp
2026-05-08 22:30:52  CREATED    .../package-json-change-detector.spec.ts           ai (model unspecified)    amp
2026-05-08 22:30:52  CREATED    .../platform/infra-project-scaffolder/src/new.ts   ai (model unspecified)    amp
…
2026-04-23 22:49:47  CREATED    .../MenuShow/hooks/useFetchMenuItems.ts            ai (model unspecified)    amp
2026-04-17 16:54:52  CREATED    .agents/skills/saas-experiments/SKILL.md           ai (model unspecified)    amp
2026-04-17 16:54:52  CREATED    .agents/skills/feature-flags/SKILL.md              ai (model unspecified)    amp
2026-04-17 16:54:52  CREATED    .agents/skills/tracking-events/SKILL.md            ai (model unspecified)    amp
2026-04-13 18:27:13  CREATED    libs/trust/shared-ui/stylelint.config.mjs          ai (model unspecified)    amp
2026-04-13 18:27:13  CREATED    .../components/address/{ca,us}/address.module.css  ai (model unspecified)    amp
2026-04-13 17:37:42  CREATED    apps/managerbot/managerbot-e2e/tests/e2e/…         ai (model unspecified)    amp
2026-04-13 17:37:42  CREATED    libs/shared/util-tests/src/market/actions.ts       ai (model unspecified)    amp
2026-04-13 17:10:27  CREATED    MODULES.yaml                                       ai (model unspecified)    amp
2026-04-13 16:28:17  CREATED    .agents/checks/{i18n,testing}.md                   ai (model unspecified)    amp
2026-04-13 14:12:21  CREATED    .ai-usage-marker                                   ai (model unspecified)    amp
2026-04-13 14:10:09  CREATED    libs/shared/util-tests/src/initialize.ts           ai (model unspecified)    amp
2026-04-10 22:02:01  CREATED    libs/shared/types-protos/protogen.config.ts        ai (model unspecified)    amp

Total edits found: 126

`report`

Per-file summary + output plan + sidecar previews:

$ ai-blame report -t .agent-trace
Scanning traces in: ".agent-trace"

=== Summary ===
File                                               | Edits | First Edit       | Last Edit
---------------------------------------------------------------------------------------------
i18n.md                                            |     2 | 2026-04-13 16:28 | 2026-04-13 16:28
testing.md                                         |     2 | 2026-04-13 16:28 | 2026-04-13 16:28
SKILL.md                                           |     6 | 2026-04-17 16:54 | 2026-04-17 16:54
SKILL.md                                           |     4 | 2026-04-17 16:54 | 2026-04-17 16:54
…41 rows total…
initialize.ts                                      |    10 | 2026-04-13 14:10 | 2026-04-13 14:10
actions.ts                                         |     8 | 2026-04-13 17:37 | 2026-04-13 17:37
verify-owners-file.ts                              |     2 | 2026-04-13 14:12 | 2026-04-13 14:12

=== Output Plan ===
File                                               | Policy     | destination
---------------------------------------------------------------------------------------------
i18n.md                                            | sidecar    | .agents/checks/i18n.history.yaml
SKILL.md                                           | sidecar    | .agents/skills/saas-experiments/SKILL.history.yaml
ci.sh                                              | sidecar    | …/scripts/ci.history.yaml
OWNERS.yaml                                        | append     | in-place
…41 rows total…

=== YAML Preview: .agents/checks/i18n.md ===
edit_history:
- timestamp: 2026-04-13T16:28:17.539Z
  model: ai (model unspecified)
  action: CREATED
  agent_tool: amp
- timestamp: 2026-04-13T16:28:17.539Z
  model: ai (model unspecified)
  action: EDITED
  agent_tool: amp

=== YAML Preview: .agents/skills/ci-analytics/SKILL.md ===
edit_history:
- timestamp: 2026-04-17T16:54:52.894Z
  model: ai (model unspecified)
  action: CREATED
  agent_tool: amp
- timestamp: 2026-04-17T16:54:52.894Z
  model: ai (model unspecified)
  action: EDITED
  agent_tool: amp
…
… and 36 more files (use --show-all to see all)

The model column reads ai (model unspecified) because these particular sidecars set contributor.type: "ai" but no model_id. Records that include a model_id (e.g. anthropic/claude-opus-4-5) populate the column literally — covered by the parses_spec_record_into_edits test.
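
The fallback that produces this label can be sketched as follows. The function name `model_label` and its signature are hypothetical; only the ordering of the fallbacks comes from the second commit's description.

```rust
/// per-range model_id → conversation model_id
/// → "<contributor.type> (model unspecified)" → "unknown".
fn model_label(
    range_model: Option<&str>,
    conversation_model: Option<&str>,
    contributor_type: Option<&str>,
) -> String {
    range_model
        .or(conversation_model)
        .map(str::to_string)
        // No model_id anywhere: fall back to the contributor type, if any.
        .or_else(|| contributor_type.map(|t| format!("{} (model unspecified)", t)))
        .unwrap_or_else(|| "unknown".to_string())
}
```

With `contributor.type: "ai"` and no `model_id`, this returns `ai (model unspecified)`, matching the Model column above.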

The Agent: amp column is recovered from the conversation URL host (ampcode.com) even though these records have no top-level tool block — the new derivation chain on the second commit (agent-trace: derive agent + model from any available signal) is what unlocks this.

joahg marked this pull request as ready for review May 10, 2026 01:16

joahg force-pushed the joah/agent-trace-dev-parser branch from 67c80d4 to 63a36fd May 10, 2026 01:19

joahg closed this May 28, 2026

joahg wants to merge 2 commits into ai4curation:main from joahg:joah/agent-trace-dev-parser
