Releases: Siddhant-K-code/agent-trace

v0.36.0 — Langfuse Export and OTLP Behavioral Metrics

17 May 13:22
a5feb00

Export eval scores and behavioral metrics to external observability backends.

What's new

agent-strace export now supports two backends:

Langfuse — push eval scores as Langfuse trace scores:

agent-strace export --scores --backend langfuse \
  --langfuse-host https://cloud.langfuse.com \
  --langfuse-public-key pk-... \
  --langfuse-secret-key sk-...

OTLP gauge metrics — push behavioral metrics as OpenTelemetry gauges:

agent-strace export --metrics --backend otlp \
  --otlp-endpoint http://localhost:4318

Exported metrics: agent.session.tokens, agent.session.duration_ms, agent.session.error_rate, agent.session.tool_calls, agent.session.file_writes, agent.session.cost_usd
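
For readers unfamiliar with OTLP gauges, the following is a minimal sketch of what an OTLP/HTTP JSON metrics body carrying these six metric names could look like. It is not taken from the agent-strace source: the function name `build_gauge_payload`, the `session.id` attribute key, the scope name, and all sample values are assumptions made for illustration.

```python
import json
import time


def build_gauge_payload(session_id, values, now_ns=None):
    """Build an OTLP/HTTP JSON metrics body with one gauge per metric name."""
    # 64-bit integers are carried as strings in the OTLP JSON encoding.
    timestamp = str(time.time_ns() if now_ns is None else now_ns)
    metrics = []
    for name, value in values.items():
        metrics.append({
            "name": name,
            "gauge": {"dataPoints": [{
                "asDouble": float(value),
                "timeUnixNano": timestamp,
                # Hypothetical attribute key, chosen for this example only.
                "attributes": [{"key": "session.id",
                                "value": {"stringValue": session_id}}],
            }]},
        })
    return {"resourceMetrics": [{
        "resource": {"attributes": [{"key": "service.name",
                                     "value": {"stringValue": "agent-strace"}}]},
        "scopeMetrics": [{"scope": {"name": "example-exporter"},
                          "metrics": metrics}],
    }]}


# Made-up sample values for the six metric names listed above.
sample = {
    "agent.session.tokens": 18250,
    "agent.session.duration_ms": 93400,
    "agent.session.error_rate": 0.04,
    "agent.session.tool_calls": 57,
    "agent.session.file_writes": 9,
    "agent.session.cost_usd": 0.37,
}
payload = build_gauge_payload("session-123", sample,
                              now_ns=1_700_000_000_000_000_000)
print(json.dumps(payload, indent=2))
```

By OTLP/HTTP convention, a body like this is POSTed to the collector's /v1/metrics path (for example http://localhost:4318/v1/metrics) with Content-Type: application/json.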

v0.35.0 — Optimize: AGENTS.md Improvement Proposals

17 May 13:22
5e611d4

Automatically propose improvements to your AGENTS.md based on patterns in trace failures.

What's new

agent-strace optimize scans recent sessions for failure patterns and generates concrete AGENTS.md additions to prevent recurrence.

agent-strace optimize
agent-strace optimize --since-days 14 --output proposals.md
agent-strace optimize --llm --base-url http://localhost:11434/v1 --model llama3

  • Heuristic mode (default): no LLM required; detects error-no-change loops, wide blast radius, high retry rates, long stalls
  • LLM mode: sends failure summaries to any OpenAI-compatible endpoint for richer proposals
  • Output: ready-to-paste AGENTS.md snippets with rationale

v0.34.0 — Eval Trend Dashboard

17 May 13:22
983a1af

Track how eval scores change over time with a visual trend dashboard.

What's new

agent-strace dashboard --trend renders an HTML sparkline dashboard showing score trends across sessions for every scorer in your eval config.

agent-strace dashboard --trend
agent-strace dashboard --trend --output trend.html
agent-strace dashboard --annotate SESSION_ID "Fixed retry loop"

  • Per-scorer sparklines with pass/fail coloring
  • Annotations: attach notes to specific sessions (stored in .agent-traces/annotations.json)
  • Terminal summary table alongside the HTML output
  • Works with any eval config — no extra setup

v0.33.0 — Behavioral Drift Detection

17 May 13:22
b97d8c5

Detect when an agent's behavior has shifted between sessions without needing an LLM.

What's new

agent-strace drift computes a behavioral fingerprint for each session across 6 dimensions (tool mix, error rate, retry rate, file blast radius, token rate, and duration), then measures the Jensen-Shannon divergence between two sessions, or between a session and a rolling baseline.

agent-strace drift SESSION_A SESSION_B
agent-strace drift --baseline 7d SESSION_ID

  • Fingerprints are <2 KB JSON, storable alongside traces
  • Drift score 0.0–1.0; configurable alert threshold (--threshold)
  • No LLM required — pure statistical comparison
  • Output: per-dimension breakdown + overall drift score
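
To make the statistic concrete, here is a minimal sketch of Jensen-Shannon divergence applied to one dimension, the tool mix, where each session is a histogram of tool-call counts. With base-2 logarithms the result falls in the 0.0–1.0 range quoted above. The function name `js_divergence` and the sample counts are illustrative assumptions; how agent-strace turns the other five dimensions into distributions, or combines them into an overall score, is not described in these notes.

```python
import math


def js_divergence(p_counts, q_counts):
    """Jensen-Shannon divergence (base-2 logs) between two count histograms."""
    keys = set(p_counts) | set(q_counts)

    def normalize(counts):
        total = sum(counts.get(k, 0.0) for k in keys)
        if total <= 0:
            raise ValueError("histogram must have positive total mass")
        return {k: counts.get(k, 0.0) / total for k in keys}

    p = normalize(p_counts)
    q = normalize(q_counts)
    # Mixture distribution: the midpoint of p and q.
    m = {k: 0.5 * (p[k] + q[k]) for k in keys}

    def kl(a, b):
        # Terms with a[k] == 0 contribute nothing; b[k] > 0 whenever a[k] > 0.
        return sum(a[k] * math.log2(a[k] / b[k]) for k in keys if a[k] > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Made-up tool-call counts for two sessions.
session_a = {"read_file": 40, "write_file": 10, "run_tests": 5}
session_b = {"read_file": 12, "write_file": 30, "shell": 8}
print(round(js_divergence(session_a, session_b), 3))
```

Identical histograms score 0.0 and histograms with no tools in common score 1.0, which is what makes a fixed alert threshold meaningful.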

v0.32.1 — Behavioral Drift, Eval Trend Dashboard, Optimize, Langfuse Export, LLM Judge

17 May 12:15
f404833

Batch release covering the features listed below, shipped between v0.32.0 and v0.33.0. Each feature is also available as a standalone release (v0.33.0–v0.37.0).

  • Behavioral drift detection (agent-strace drift) — Jensen-Shannon divergence across 6 behavioral dimensions
  • Eval trend dashboard (agent-strace dashboard --trend) — sparkline HTML dashboard with annotations
  • Optimize (agent-strace optimize) — propose AGENTS.md improvements from trace failures
  • Langfuse export (agent-strace export --scores --backend langfuse) — push eval scores to Langfuse
  • OTLP behavioral metrics (agent-strace export --metrics --backend otlp) — gauge metrics to any OTLP backend
  • LLM-as-judge scorer — score sessions via any OpenAI-compatible endpoint
  • Dataset auto-sampling (eval dataset auto) — 6 signal filters to build eval datasets from traces
  • Eval CI baseline (eval ci --baseline) — regression gate with GitHub Actions PR comment output

v0.32.0 — Self-Contained HTML Session Replay Viewer

19 Apr 11:45

agent-strace replay --format html generates a single-file HTML viewer for any recorded session. No server, no dependencies — open it in any browser.

agent-strace replay --format html
agent-strace replay --format html --output review.html SESSION_ID

Viewer features:

  • Animated event timeline with configurable playback speed (up to 4×)
  • Scrubber bar for jumping to any point in the session
  • Running cost counter updated as events play
  • Click-to-expand event detail (full JSON payload)
  • Color-coded event types: tool calls, LLM requests, file ops, errors
  • Pause/resume and show-all controls
  • Dark theme, zero external dependencies (no CDN, no fonts)

All event data is embedded as a JSON constant in the HTML file. Useful for sharing sessions with teammates or attaching to PR reviews without requiring them to install anything.
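
The single-file approach can be sketched as follows: serialize the events to JSON, substitute them into an HTML template as a script constant, and let inline JavaScript render them. This is an illustration of the general technique, not the viewer's actual code; the template, the `render_replay_html` name, and the `type`/`summary` event fields are assumptions.

```python
import json

PAGE_TEMPLATE = """<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>Session replay</title></head>
<body>
<ul id="timeline"></ul>
<script>
const EVENTS = __EVENTS__;
const list = document.getElementById("timeline");
for (const ev of EVENTS) {
  const item = document.createElement("li");
  item.textContent = ev.type + ": " + ev.summary;
  list.appendChild(item);
}
</script>
</body>
</html>
"""


def render_replay_html(events):
    """Return a standalone HTML page with the events embedded as a JS constant."""
    data = json.dumps(events)
    # Escape "</" so a payload containing "</script>" cannot end the script tag.
    data = data.replace("</", "<\\/")
    return PAGE_TEMPLATE.replace("__EVENTS__", data)


# Hypothetical event records; the real trace schema is not shown here.
events = [
    {"type": "tool_call", "summary": "read src/app.py"},
    {"type": "error", "summary": "test failed: </script> in output"},
]
html = render_replay_html(events)
print(len(html), "bytes of self-contained HTML")
```

Writing the returned string to a file such as review.html gives a page that opens directly from disk, because nothing in it refers to an external resource.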

v0.31.0 — Agent Standup Report from Session Trace

19 Apr 11:43

agent-strace standup generates a structured standup report from a session trace — no LLM call required.

agent-strace standup
agent-strace standup --session SESSION_ID

Report sections:

What the agent did:

  • Files read and modified
  • Approaches tried, including abandoned ones (detected from retry patterns)
  • New dependencies added (npm install, pip install, etc.)

What it was uncertain about:

  • TODO / FIXME / assumption comments written into files

What to review carefully:

  • Large changes (>100 lines), new dependencies, auth and migration patterns

Stats: tool calls, context resets, retries, errors

Useful for async teams where the agent runs overnight and a human needs a quick brief before picking up the work.

v0.30.0 — On-Call Readiness Report for Agent-Modified Files

19 Apr 11:42

When an agent has been writing code, the human on call may not have read it. agent-strace oncall cross-references agent-modified files from the trace store against git history to surface cognitive gaps before a rotation.

agent-strace oncall --rotation-start 2026-04-25
agent-strace oncall --rotation-start 2026-04-25 --scope "src/payments/**"

For each file the agent has written in the last N days, the report shows:

  • How long ago it was modified
  • How many lines changed (from git log --numstat)
  • Estimated reading time (~200 lines/minute)
  • Total catch-up time before rotation

--scope filters to a file glob. --since-days controls how far back to scan sessions (default: 30).
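
The reading-time arithmetic can be sketched as follows, assuming the line counts come from `git log --numstat` output (tab-separated added, deleted, and path columns, with "-" for binary files). Whether agent-strace counts added plus deleted lines or only added lines is not stated here; this sketch sums both, and the function names and sample data are illustrative.

```python
from collections import defaultdict

LINES_PER_MINUTE = 200  # reading speed quoted in the report description


def changed_lines_by_file(numstat_text):
    """Sum added + deleted lines per path from `git log --numstat` output."""
    totals = defaultdict(int)
    for line in numstat_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # commit headers, blank lines, anything not numstat
        added, deleted, path = parts
        if not (added.isdigit() and deleted.isdigit()):
            continue  # binary files are reported as "-\t-\tpath"
        totals[path] += int(added) + int(deleted)
    return dict(totals)


def reading_minutes(line_count):
    """Estimated minutes needed to read `line_count` changed lines."""
    return line_count / LINES_PER_MINUTE


# Made-up numstat output spanning two commits.
SAMPLE = (
    "120\t30\tsrc/payments/charge.py\n"
    "-\t-\tassets/logo.png\n"
    "\n"
    "40\t10\tsrc/payments/charge.py\n"
    "200\t0\tsrc/payments/refund.py\n"
)
per_file = changed_lines_by_file(SAMPLE)
for path, lines in sorted(per_file.items()):
    print(f"{path}: {lines} lines, ~{reading_minutes(lines):.1f} min")
total_minutes = reading_minutes(sum(per_file.values()))
print(f"total catch-up time: ~{total_minutes:.1f} min")
```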

v0.29.0 — Context Freshness Check Before Starting a Session

19 Apr 11:42

Before handing a task to an agent, it helps to know how stale its last view of the codebase is. agent-strace freshness compares the current state against what the agent last saw, using git diff between the last session timestamp and HEAD.

agent-strace freshness
agent-strace freshness --since 2026-04-01 --scope "src/**"

Report includes:

  • Files changed since the last session (or since --since date)
  • Per-file change type (modified / added / deleted / renamed) and line count
  • Freshness score 0–100 (100 = nothing changed since last session)
  • Estimated catch-up reading time for in-scope files

Scope is auto-detected from CLAUDE.md / AGENTS.md scope sections, or overridden with --scope. No API calls required.

v0.28.0 — A2A Protocol Support with Cross-Agent Trace Correlation

19 Apr 11:42

First-class support for agent-to-agent calls following the Google A2A spec. A2A calls are captured as TOOL_CALL events with event_subtype=a2a_call, so they are backward-compatible with all existing replay and export tooling.

agent-strace a2a-tree
agent-strace a2a-tree SESSION_ID --format json

New capabilities:

  • Detects A2A calls by path, header, and body heuristics
  • Builds the full agent call graph by following sub_session_id links and parent_session_id back-references
  • Renders the call graph as an ASCII tree
  • Exports the graph as OTLP-compatible spans for Jaeger, Tempo, or any OpenTelemetry backend

Child sessions are linked via parent_session_id and parent_event_id in session metadata.
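
As an illustration of how parent_session_id links can be turned into an ASCII tree, here is a minimal sketch. It is not the a2a-tree implementation: the `session_id` key, the `render_tree` function, and the sample sessions are assumptions, and only the parent_session_id back-reference mentioned above is used.

```python
from collections import defaultdict


def render_tree(sessions):
    """Render sessions linked by parent_session_id as an ASCII tree."""
    known_ids = {s["session_id"] for s in sessions}
    children = defaultdict(list)
    roots = []
    for s in sessions:
        parent = s.get("parent_session_id")
        if parent in known_ids:
            children[parent].append(s["session_id"])
        else:
            # No parent, or a parent outside this set: treat as a root.
            roots.append(s["session_id"])

    lines = []

    def walk(node, prefix, is_last, is_root):
        if is_root:
            lines.append(node)
            child_prefix = prefix
        else:
            lines.append(prefix + ("└── " if is_last else "├── ") + node)
            child_prefix = prefix + ("    " if is_last else "│   ")
        kids = children[node]
        for index, kid in enumerate(kids):
            walk(kid, child_prefix, index == len(kids) - 1, False)

    for root in roots:
        walk(root, "", True, True)
    return "\n".join(lines)


# Hypothetical session metadata records.
sessions = [
    {"session_id": "planner"},
    {"session_id": "coder", "parent_session_id": "planner"},
    {"session_id": "tester", "parent_session_id": "coder"},
    {"session_id": "reviewer", "parent_session_id": "planner"},
]
tree = render_tree(sessions)
print(tree)
```

For the sample above this prints planner at the root, with coder (and its child tester) and reviewer beneath it.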