Summary
The GSD Journal captures structured events for auto-mode iterations but is missing several OpenTelemetry-inspired concepts that would significantly improve forensics diagnosis quality. This issue tracks five targeted improvements, ordered by value.
1. Correlate journal events to pi session (highest value)
Problem: The journal (unit-start/unit-end) and the pi session JSONL (LLM calls, tool executions) are completely disconnected. Forensics infers the link by timestamp proximity, which is fragile and wastes LLM context on full-file scanning.
Fix: Add sessionId and messageOffset to unit-start:
```typescript
{
  eventType: "unit-start",
  data: {
    unitId, unitType,
    sessionId: "abc123",   // pi session file identifier
    messageOffset: 42      // message count at unit start
  }
}
```
Impact: Forensics can jump directly from unit-end { status: "error" } to the exact tool call that failed, without scanning the whole session file.
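As a rough illustration of how forensics might consume these two fields, the sketch below slices a session file down to the messages belonging to one unit. It assumes the session JSONL holds one message per line and that messageOffset counts the messages already written when the unit began; `UnitStartData`, `sliceSessionForUnit`, and the `nextOffset` parameter are hypothetical names, not existing GSD APIs.

```typescript
// Hypothetical shape: only sessionId and messageOffset come from the proposal.
interface UnitStartData {
  unitId: string;
  unitType: string;
  sessionId: string;
  messageOffset: number;
}

/**
 * Return the session messages written after a unit started.
 * Assumes one JSON message per line. `nextOffset` (hypothetical) is the
 * messageOffset of the following unit-start; omit it to read to end of file.
 */
function sliceSessionForUnit(
  sessionJsonl: string,
  start: UnitStartData,
  nextOffset?: number,
): unknown[] {
  const lines = sessionJsonl.split("\n").filter((line) => line.trim() !== "");
  return lines
    .slice(start.messageOffset, nextOffset)
    .map((line) => JSON.parse(line));
}
```

Only the sliced messages would need to be handed to the LLM, rather than the whole session file.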
2. Explicit durations on unit-end
Problem: Duration must be computed by pairing the ts timestamps of the unit-start and unit-end events. Forensics can't query slow units directly.
Fix: Add durationMs to unit-end:
```typescript
{ eventType: "unit-end", data: { unitId, status, artifactVerified, durationMs: 142000 } }
```
Impact: Timeout anomaly detection (queryJournal({ eventType: "unit-end" }) + filter on durationMs) works from journal alone without cross-referencing activity logs.
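The filter step could look roughly like the sketch below. It assumes the unit-end entries have already been fetched (for example via queryJournal, whose return shape is not specified here); `UnitEndEntry` and `findSlowUnits` are hypothetical names used only for illustration.

```typescript
// Hypothetical, simplified entry shape; durationMs is optional so that
// journals written before this change still type-check.
interface UnitEndEntry {
  eventType: "unit-end";
  data: { unitId: string; status: string; durationMs?: number };
}

/** Return unit ids slower than thresholdMs, slowest first. */
function findSlowUnits(entries: UnitEndEntry[], thresholdMs: number): string[] {
  return entries
    // Entries without durationMs are treated as 0 ms and never flagged.
    .filter((entry) => (entry.data.durationMs ?? 0) > thresholdMs)
    // filter() returned a new array, so sorting does not mutate the input.
    .sort((a, b) => (b.data.durationMs ?? 0) - (a.data.durationMs ?? 0))
    .map((entry) => entry.data.unitId);
}
```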
3. Structured error detail on unit-end
Problem: unit-end { status: "error" } carries no error detail in the journal. Forensics must parse the pi session JSONL to find what went wrong.
Fix:
```typescript
{
  eventType: "unit-end",
  data: {
    unitId, status: "error",
    error: "Bash tool failed: permission denied on /etc/hosts",
    // errorType is one of: "tool-error" | "timeout" | "context-overflow" | "unknown"
    errorType: "tool-error"
  }
}
```
Impact: Forensics can classify failure modes and generate a summary section from journal-only data.
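One way the classification could feed a summary section is sketched below. The `ErrorType` union mirrors the four values proposed above; `UnitEndData` and `countFailureModes` are hypothetical names, and treating a missing errorType as "unknown" is an assumption of this sketch.

```typescript
type ErrorType = "tool-error" | "timeout" | "context-overflow" | "unknown";

// Hypothetical, simplified unit-end payload.
interface UnitEndData {
  unitId: string;
  status: string;
  error?: string;
  errorType?: ErrorType;
}

/** Count failed units per errorType; non-error units are ignored. */
function countFailureModes(units: UnitEndData[]): Record<ErrorType, number> {
  const counts: Record<ErrorType, number> = {
    "tool-error": 0,
    timeout: 0,
    "context-overflow": 0,
    unknown: 0,
  };
  for (const unit of units) {
    if (unit.status !== "error") continue;
    // Assumption: errors recorded without an errorType count as "unknown".
    counts[unit.errorType ?? "unknown"] += 1;
  }
  return counts;
}
```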
4. Resource attributes on iteration-start
Problem: Journal entries carry no metadata about the GSD version or model in use. Forensics fetches this from GSD_VERSION env and metrics.json separately, making regression correlation manual.
Fix: Add a resource block to iteration-start:
```typescript
{
  eventType: "iteration-start",
  data: { iteration },
  resource: { gsdVersion: "2.48.0", model: "anthropic/claude-sonnet-4-20250514", cwd: "/..." }
}
```
Impact: Forensics can answer "did this regression start after the model changed?" from journal alone.
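A minimal sketch of that query, assuming iteration-start entries are supplied in journal order: it reports each iteration at which gsdVersion or model differs from the previous iteration. `Resource`, `IterationStart`, `ResourceChange`, and `findResourceChanges` are hypothetical names; only the field names inside the resource block come from the proposal.

```typescript
interface Resource {
  gsdVersion: string;
  model: string;
  cwd: string;
}

// Hypothetical, simplified iteration-start entry.
interface IterationStart {
  eventType: "iteration-start";
  data: { iteration: number };
  resource: Resource;
}

interface ResourceChange {
  iteration: number;
  field: "gsdVersion" | "model";
  from: string;
  to: string;
}

const TRACKED_FIELDS = ["gsdVersion", "model"] as const;

/** List the iterations where a tracked resource field changed. */
function findResourceChanges(starts: IterationStart[]): ResourceChange[] {
  const changes: ResourceChange[] = [];
  let prev: Resource | undefined;
  for (const entry of starts) {
    const cur = entry.resource;
    if (prev) {
      for (const field of TRACKED_FIELDS) {
        if (prev[field] !== cur[field]) {
          changes.push({
            iteration: entry.data.iteration,
            field,
            from: prev[field],
            to: cur[field],
          });
        }
      }
    }
    prev = cur;
  }
  return changes;
}
```

Comparing the reported iteration numbers against the first failing iteration gives a quick yes/no on whether a model or version change preceded the regression.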
5. Cross-iteration causal links for recovery chains
Problem: causedBy only works within a single flowId. When stuck detection fires and the next iteration is a recovery attempt, there is no journal link between them.
Fix: Emit causedBy on the recovery iteration's iteration-start pointing to the stuck-detected event:
```typescript
// iteration N+1 recovery
{
  flowId: "flow-bbb", seq: 1, eventType: "iteration-start",
  causedBy: { flowId: "flow-aaa", seq: 5 }   // points to stuck-detected
}
```
Impact: Forensics reconstructs the full recovery chain (stuck → cache-invalidate → retry → still-stuck → hard-stop) from the data model rather than inferring it from timestamps.
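Reconstructing such a chain could be sketched as a walk over causedBy references that is allowed to cross flowId boundaries. `EventRef`, `LinkedEntry`, and `traceCauses` are hypothetical names, and the entry shape is a simplification rather than the real JournalEntry type.

```typescript
interface EventRef {
  flowId: string;
  seq: number;
}

// Hypothetical, simplified journal entry.
interface LinkedEntry extends EventRef {
  eventType: string;
  causedBy?: EventRef;
}

const refKey = (ref: EventRef): string => `${ref.flowId}:${ref.seq}`;

/**
 * Follow causedBy links backwards from `from`, newest event first.
 * Stops at a missing entry, a root event, or a cycle.
 */
function traceCauses(entries: LinkedEntry[], from: EventRef): LinkedEntry[] {
  const byKey = new Map<string, LinkedEntry>();
  for (const entry of entries) byKey.set(refKey(entry), entry);

  const chain: LinkedEntry[] = [];
  const seen = new Set<string>();
  let cursor: EventRef | undefined = from;
  while (cursor && !seen.has(refKey(cursor))) {
    seen.add(refKey(cursor));
    const entry: LinkedEntry | undefined = byKey.get(refKey(cursor));
    if (!entry) break;
    chain.push(entry);
    cursor = entry.causedBy;
  }
  return chain;
}
```

The `seen` set is a defensive choice: a malformed journal with circular causedBy references would otherwise loop forever.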
What NOT to add
- OTLP export / external collectors: GSD is a local tool
- Sampling: 100% event capture is correct for auto-mode frequency
- Combined span-style start/end: the two-event model is better for crash forensics (a crash leaves unit-start with no matching unit-end, which is precisely the signal)
Affected files
src/resources/extensions/gsd/journal.ts — JournalEntry type + emitJournalEvent
src/resources/extensions/gsd/auto/loop.ts — iteration-start emit
src/resources/extensions/gsd/auto/phases.ts — unit-start, unit-end emits
src/resources/extensions/gsd/auto/loop-deps.ts — LoopDeps.emitJournalEvent signature
src/resources/extensions/gsd/forensics.ts — update journal summary section to use new fields
src/resources/extensions/gsd/tests/journal*.test.ts — update fixtures