Releases · Siddhant-K-code/agent-trace

17 May 13:22

Siddhant-K-code

v0.36.0

a5feb00

v0.36.0 — Langfuse Export and OTLP Behavioral Metrics

Export eval scores and behavioral metrics to external observability backends.

What's new

agent-strace export now supports two backends:

Langfuse — push eval scores as Langfuse trace scores:

agent-strace export --scores --backend langfuse \
  --langfuse-host https://cloud.langfuse.com \
  --langfuse-public-key pk-... \
  --langfuse-secret-key sk-...

OTLP gauge metrics — push behavioral metrics as OpenTelemetry gauges:

agent-strace export --metrics --backend otlp \
  --otlp-endpoint http://localhost:4318

Exported metrics: agent.session.tokens, agent.session.duration_ms, agent.session.error_rate, agent.session.tool_calls, agent.session.file_writes, agent.session.cost_usd
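The release notes do not show what agent-strace sends over the wire. As a rough sketch only, the following builds an OTLP/HTTP JSON body that carries two of the metrics above as gauges. The `service.name` value, the `session.id` attribute, the scope name, and the sample numbers are assumptions made for illustration.

```python
import json
import time


def gauge(name, value, session_id, now_ns):
    """Build one OTLP gauge metric holding a single data point."""
    return {
        "name": name,
        "gauge": {
            "dataPoints": [
                {
                    "asDouble": float(value),
                    # 64-bit integers are encoded as strings in OTLP JSON.
                    "timeUnixNano": str(now_ns),
                    # "session.id" is a hypothetical attribute name.
                    "attributes": [
                        {"key": "session.id", "value": {"stringValue": session_id}}
                    ],
                }
            ]
        },
    }


def build_payload(session_id, values, now_ns=None):
    """Wrap gauges in the resourceMetrics/scopeMetrics envelope."""
    now_ns = time.time_ns() if now_ns is None else now_ns
    metrics = [gauge(name, value, session_id, now_ns) for name, value in values.items()]
    return {
        "resourceMetrics": [
            {
                "resource": {
                    "attributes": [
                        # Assumed service name, not taken from the release notes.
                        {"key": "service.name", "value": {"stringValue": "agent-strace"}}
                    ]
                },
                "scopeMetrics": [{"scope": {"name": "example"}, "metrics": metrics}],
            }
        ]
    }


# Sample values are made up.
payload = build_payload(
    "session-123",
    {"agent.session.tokens": 18450, "agent.session.error_rate": 0.04},
)
print(json.dumps(payload, indent=2))
```

An OTLP/HTTP collector conventionally accepts a body like this at the `/v1/metrics` path under the configured endpoint, with `Content-Type: application/json`.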

17 May 13:22

Siddhant-K-code

v0.35.0

5e611d4

v0.35.0 — Optimize: AGENTS.md Improvement Proposals

Automatically propose improvements to your AGENTS.md based on patterns in trace failures.

What's new

agent-strace optimize scans recent sessions for failure patterns and generates concrete AGENTS.md additions to prevent recurrence.

agent-strace optimize
agent-strace optimize --since-days 14 --output proposals.md
agent-strace optimize --llm --base-url http://localhost:11434/v1 --model llama3

Heuristic mode (default): no LLM required — detects error-no-change loops, wide blast radius, high retry rates, long stalls
LLM mode: sends failure summaries to any OpenAI-compatible endpoint for richer proposals
Output: ready-to-paste AGENTS.md snippets with rationale

17 May 13:22

Siddhant-K-code

v0.34.0

983a1af

v0.34.0 — Eval Trend Dashboard

Track how eval scores change over time with a visual trend dashboard.

What's new

agent-strace dashboard --trend renders an HTML sparkline dashboard showing score trends across sessions for every scorer in your eval config.

agent-strace dashboard --trend
agent-strace dashboard --trend --output trend.html
agent-strace dashboard --annotate SESSION_ID "Fixed retry loop"

Per-scorer sparklines with pass/fail coloring
Annotations: attach notes to specific sessions (stored in .agent-traces/annotations.json)
Terminal summary table alongside the HTML output
Works with any eval config — no extra setup

17 May 13:22

Siddhant-K-code

v0.33.0

b97d8c5

v0.33.0 — Behavioral Drift Detection

Detect when an agent's behavior has shifted between sessions without needing an LLM.

What's new

agent-strace drift computes a behavioral fingerprint for each session across 6 dimensions — tool mix, error rate, retry rate, file blast radius, token rate, and duration — then measures Jensen-Shannon divergence between two sessions or a session and a rolling baseline.

agent-strace drift SESSION_A SESSION_B
agent-strace drift --baseline 7d SESSION_ID

Fingerprints are <2 KB JSON, storable alongside traces
Drift score 0.0–1.0; configurable alert threshold (--threshold)
No LLM required — pure statistical comparison
Output: per-dimension breakdown + overall drift score
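To make the comparison concrete, here is a minimal sketch of Jensen-Shannon divergence applied to one dimension, the tool mix. It assumes the tool mix is stored as per-tool call counts and uses base-2 logarithms, which keeps the result in the 0.0–1.0 range. The fingerprint layout, the function names, and the sample counts are hypothetical; the release notes do not describe how agent-strace encodes or combines the six dimensions.

```python
import math
from collections import Counter


def normalize(counts):
    """Turn raw counts into a probability distribution."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: value / total for key, value in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logs, so the result lies in [0, 1]."""
    keys = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2.
    mix = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl_to_mix(a):
        # KL(a || M); terms with zero probability contribute nothing.
        return sum(a[k] * math.log2(a[k] / mix[k]) for k in a if a[k] > 0)

    return 0.5 * kl_to_mix(p) + 0.5 * kl_to_mix(q)


# Hypothetical tool-call counts for two sessions.
session_a = normalize(Counter({"read_file": 12, "write_file": 3, "bash": 5}))
session_b = normalize(Counter({"read_file": 2, "write_file": 9, "bash": 9}))

print(js_divergence(session_a, session_a))  # identical sessions give 0.0
print(round(js_divergence(session_a, session_b), 3))
```

Two sessions that share no tools at all score 1.0 under this definition, which matches the upper end of the documented drift range.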

17 May 12:15

github-actions

v0.32.1

f404833

v0.32.1 — Behavioral Drift, Eval Trend Dashboard, Optimize, Langfuse Export, LLM Judge

Batch release covering the features listed below, shipped between v0.32.0 and v0.33.0. Each feature is also available as a standalone release (v0.33.0–v0.37.0).

Behavioral drift detection (agent-strace drift) — Jensen-Shannon divergence across 6 behavioral dimensions
Eval trend dashboard (agent-strace dashboard --trend) — sparkline HTML dashboard with annotations
Optimize (agent-strace optimize) — propose AGENTS.md improvements from trace failures
Langfuse export (agent-strace export --scores --backend langfuse) — push eval scores to Langfuse
OTLP behavioral metrics (agent-strace export --metrics --backend otlp) — gauge metrics to any OTLP backend
LLM-as-judge scorer — score sessions via any OpenAI-compatible endpoint
Dataset auto-sampling (eval dataset auto) — 6 signal filters to build eval datasets from traces
Eval CI baseline (eval ci --baseline) — regression gate with GitHub Actions PR comment output

19 Apr 11:45

Siddhant-K-code

v0.32.0

c72f97c

v0.32.0 — Self-Contained HTML Session Replay Viewer

agent-strace replay --format html generates a single-file HTML viewer for any recorded session. No server, no dependencies — open it in any browser.

agent-strace replay --format html
agent-strace replay --format html --output review.html SESSION_ID

Viewer features:

Animated event timeline with configurable playback speed (up to 4×)
Scrubber bar for jumping to any point in the session
Running cost counter updated as events play
Click-to-expand event detail (full JSON payload)
Color-coded event types: tool calls, LLM requests, file ops, errors
Pause/resume and show-all controls
Dark theme, zero external dependencies (no CDN, no fonts)

All event data is embedded as a JSON constant in the HTML file. Useful for sharing sessions with teammates or attaching to PR reviews without requiring them to install anything.
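The embedding idea can be sketched in a few lines: serialize the events to JSON and substitute them into an HTML template as a JavaScript constant. This is an illustration under assumptions, not the viewer's actual code. The template, the `EVENTS` constant name, and the `type`/`name` event fields are hypothetical.

```python
import json

# Minimal stand-in for a viewer page; __EVENTS__ is replaced with the JSON data.
TEMPLATE = """<!doctype html>
<html>
<head><meta charset="utf-8"><title>Session replay</title></head>
<body>
<pre id="log"></pre>
<script>
const EVENTS = __EVENTS__;
document.getElementById("log").textContent =
  EVENTS.map(e => e.type + ": " + e.name).join("\\n");
</script>
</body>
</html>
"""


def render_viewer(events):
    """Return a single HTML string with the events embedded inline."""
    data = json.dumps(events)
    # Escape "</" so event text such as "</script>" cannot end the script tag early.
    data = data.replace("</", "<\\/")
    return TEMPLATE.replace("__EVENTS__", data)


# Hypothetical events; real trace events carry more fields.
html = render_viewer(
    [
        {"type": "tool_call", "name": "read_file"},
        {"type": "error", "name": "</script> in a payload"},
    ]
)
print(len(html))
```

Because the data travels inside the file, the result can be written to disk and opened directly in a browser with no server involved.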

19 Apr 11:43

Siddhant-K-code

v0.31.0

55b6473

v0.31.0 — Agent Standup Report from Session Trace

agent-strace standup generates a structured standup report from a session trace — no LLM call required.

agent-strace standup
agent-strace standup --session SESSION_ID

Report sections:

What the agent did:

Files read and modified
Approaches tried, including abandoned ones (detected from retry patterns)
New dependencies added (npm install, pip install, etc.)

What it was uncertain about:

TODO / FIXME / assumption comments written into files

What to review carefully:

Large changes (>100 lines), new dependencies, auth and migration patterns

Stats: tool calls, context resets, retries, errors

Useful for async teams where the agent runs overnight and a human needs a quick brief before picking up the work.

19 Apr 11:42

Siddhant-K-code

v0.30.0

c6c7f7a

v0.30.0 — On-Call Readiness Report for Agent-Modified Files

When an agent has been writing code, the human on call may not have read it. agent-strace oncall cross-references agent-modified files from the trace store against git history to surface cognitive gaps before a rotation.

agent-strace oncall --rotation-start 2026-04-25
agent-strace oncall --rotation-start 2026-04-25 --scope "src/payments/**"

For each file the agent has written in the last N days, the report shows:

How long ago it was modified
How many lines changed (from git log --numstat)
Estimated reading time (~200 lines/minute)
Total catch-up time before rotation

--scope filters to a file glob. --since-days controls how far back to scan sessions (default: 30).
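The reading-time arithmetic can be illustrated with a short sketch that parses `git log --numstat` style rows and applies the ~200 lines/minute rate quoted above. Whether agent-strace counts added plus deleted lines, or only added lines, is not stated; this sketch assumes the sum. The function names and sample paths are hypothetical.

```python
READ_RATE_LINES_PER_MIN = 200  # rate quoted in the release notes


def parse_numstat(text):
    """Sum added + deleted lines per path from numstat rows (added<TAB>deleted<TAB>path)."""
    changed = {}
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # commit headers and blank lines
        added, deleted, path = parts
        if not (added.isdigit() and deleted.isdigit()):
            continue  # binary files report "-" instead of counts
        changed[path] = changed.get(path, 0) + int(added) + int(deleted)
    return changed


def reading_minutes(lines_changed):
    """Estimated minutes needed to read the given number of changed lines."""
    return lines_changed / READ_RATE_LINES_PER_MIN


# Hypothetical numstat output covering two text files and one binary file.
sample = (
    "10\t2\tsrc/payments/charge.py\n"
    "-\t-\tassets/logo.png\n"
    "388\t0\tsrc/payments/refund.py\n"
)

changed = parse_numstat(sample)
for path, lines in changed.items():
    print(f"{path}: {lines} lines, ~{reading_minutes(lines):.1f} min")
print(f"total catch-up: ~{reading_minutes(sum(changed.values())):.1f} min")
```

With the sample above, 400 changed lines work out to about two minutes of catch-up reading.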

19 Apr 11:42

Siddhant-K-code

v0.29.0

4625f4c

v0.29.0 — Context Freshness Check Before Starting a Session

Before handing a task to an agent, it helps to know how stale its last view of the codebase is. agent-strace freshness compares the current state against what the agent last saw, using git diff between the last session timestamp and HEAD.

agent-strace freshness
agent-strace freshness --since 2026-04-01 --scope "src/**"

Report includes:

Files changed since the last session (or since --since date)
Per-file change type (modified / added / deleted / renamed) and line count
Freshness score 0–100 (100 = nothing changed since last session)
Estimated catch-up reading time for in-scope files

Scope is auto-detected from CLAUDE.md / AGENTS.md scope sections, or overridden with --scope. No API calls required.

19 Apr 11:42

Siddhant-K-code

v0.28.0

f91168d

v0.28.0 — A2A Protocol Support with Cross-Agent Trace Correlation

First-class support for agent-to-agent calls following the Google A2A spec. A2A calls are captured as TOOL_CALL events with event_subtype=a2a_call, so they are backward-compatible with all existing replay and export tooling.

agent-strace a2a-tree
agent-strace a2a-tree SESSION_ID --format json

New capabilities:

Detects A2A calls by path, header, and body heuristics
Builds the full agent call graph by following sub_session_id links and parent_session_id back-references
Renders the call graph as an ASCII tree
Exports the graph as OTLP-compatible spans for Jaeger, Tempo, or any OpenTelemetry backend

Child sessions are linked via parent_session_id and parent_event_id in session metadata.
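As an illustration of how parent links can be turned into a call graph, the sketch below groups sessions by `parent_session_id` and renders an ASCII tree. The `session_id` key, the sample session names, and the function names are assumptions; the actual metadata layout and the a2a-tree output format may differ.

```python
from collections import defaultdict


def build_children(sessions):
    """Map each parent session ID to the IDs of its child sessions."""
    children = defaultdict(list)
    for session in sessions:
        children[session.get("parent_session_id")].append(session["session_id"])
    return children


def render_tree(children, node, prefix="", is_last=True, is_root=True):
    """Return the tree rooted at `node` as a list of text lines."""
    if is_root:
        lines = [node]
        child_prefix = prefix
    else:
        lines = [prefix + ("└── " if is_last else "├── ") + node]
        child_prefix = prefix + ("    " if is_last else "│   ")
    kids = children.get(node, [])
    for index, kid in enumerate(kids):
        lines.extend(
            render_tree(children, kid, child_prefix, index == len(kids) - 1, False)
        )
    return lines


# Hypothetical session metadata; only the parent link matters here.
sessions = [
    {"session_id": "planner", "parent_session_id": None},
    {"session_id": "coder", "parent_session_id": "planner"},
    {"session_id": "tester", "parent_session_id": "planner"},
    {"session_id": "linter", "parent_session_id": "coder"},
]

children = build_children(sessions)
for root in children.get(None, []):
    print("\n".join(render_tree(children, root)))
```

Sessions without a parent are treated as roots, so a trace store holding several independent call graphs prints one tree per root.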
