- **SQLite query layer**: `FlightDB` indexes JSONL files into SQLite for cross-session queries, aggregation by tool, and daily trends.
- **Alert detection**: Error-recovery anomalies (a different tool called after an error) and loop detection (the same tool called 5 times within 60 seconds).
- **Experiment registry**: `src/experiments.ts` stores one JSON file per experiment at `~/.flight/experiments/<name>.json`. Creation is race-safe via O_EXCL (`flag: "wx"`); `createOrUpdateExperiment` merges patches, with arrays replaced rather than concatenated. `flight run --experiment` auto-registers unknown experiments and prints a hint on first use.
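
The loop-detection rule above (the same tool called 5 times within 60 seconds) can be illustrated with a sliding window over call timestamps. This is only a minimal sketch, not the project's implementation: the `ToolCall` shape and the `detectLoops` function are hypothetical names, and it assumes the rule counts any 5 calls to the same tool inside a 60-second window, whether or not they are consecutive.

```typescript
// Hypothetical event shape; the real flight log schema may differ.
interface ToolCall {
  tool: string;
  timestampMs: number;
}

// Returns the names of tools that were called at least `threshold` times
// within some window of `windowMs` milliseconds.
function detectLoops(
  calls: ToolCall[],
  threshold: number = 5,
  windowMs: number = 60000,
): string[] {
  // Group call timestamps by tool name.
  const byTool = new Map<string, number[]>();
  for (const call of calls) {
    const times = byTool.get(call.tool) || [];
    times.push(call.timestampMs);
    byTool.set(call.tool, times);
  }

  const looping: string[] = [];
  byTool.forEach((times, tool) => {
    times.sort((a, b) => a - b);
    // A loop exists if the i-th call and the call `threshold - 1` positions
    // later fall inside the same window.
    const hit = times.some(
      (t, i) =>
        i + threshold - 1 < times.length &&
        times[i + threshold - 1] - t <= windowMs,
    );
    if (hit) {
      looping.push(tool);
    }
  });
  return looping;
}
```

Sorting per tool keeps the check independent of the order in which events were read from the log files.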
The experiment registry provides a lightweight, file-per-experiment store at `~/.flight/experiments/<name>.json`. It lets you group and compare runs across multiple sessions.
"notes": "Compare against bench-b with streaming enabled"
}
```

### Workflow

```bash
# Register an experiment with metadata
flight experiment new bench-a --description "Baseline" --tags fast,cheap --model claude-sonnet-4

# Start runs that belong to this experiment
flight run --agent my-agent --experiment bench-a
flight run --agent my-agent --experiment bench-b

# List all experiments with run counts
flight experiment list

# Inspect a specific experiment and its runs
flight experiment show bench-a

# Compare two experiments head-to-head
flight experiment diff bench-a bench-b

# Export all runs as research JSONL (for offline analysis)
flight experiment export bench-a | jq .
```

Unknown experiments are **auto-registered** on first `flight run --experiment <name>`, with a one-line stderr hint pointing to `flight experiment new` for adding metadata. The registry files are plain JSON and fully human-editable.
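
To make the registry semantics concrete, here is a minimal sketch of exclusive creation with `flag: "wx"` plus a patch merge in which arrays replace rather than concatenate. It is an illustration under assumptions, not the code in `src/experiments.ts`: the `Experiment` fields and the `createIfAbsent` and `upsertExperiment` helpers are hypothetical, and the example writes to a temporary directory instead of `~/.flight/experiments`.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical record shape; the real fields may differ.
interface Experiment {
  name: string;
  description?: string;
  tags?: string[];
}

// Create the file only if it does not exist yet. The "wx" flag opens with
// O_CREAT | O_EXCL, so two racing processes cannot both create the file.
function createIfAbsent(dir: string, exp: Experiment): boolean {
  fs.mkdirSync(dir, { recursive: true });
  const file = path.join(dir, `${exp.name}.json`);
  try {
    fs.writeFileSync(file, JSON.stringify(exp, null, 2), { flag: "wx" });
    return true;
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === "EEXIST") {
      return false; // someone else registered it first
    }
    throw err;
  }
}

// Shallow-merge a patch into the stored record. Because the merge is shallow,
// an array in the patch replaces the stored array instead of extending it.
function upsertExperiment(
  dir: string,
  name: string,
  patch: Partial<Experiment>,
): Experiment {
  const fresh: Experiment = { ...patch, name };
  if (createIfAbsent(dir, fresh)) {
    return fresh;
  }
  const file = path.join(dir, `${name}.json`);
  const current = JSON.parse(fs.readFileSync(file, "utf8")) as Experiment;
  const merged: Experiment = { ...current, ...patch, name };
  fs.writeFileSync(file, JSON.stringify(merged, null, 2));
  return merged;
}

// Usage against a throwaway directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "flight-exp-"));
upsertExperiment(dir, "bench-a", { description: "Baseline", tags: ["fast", "cheap"] });
const updated = upsertExperiment(dir, "bench-a", { tags: ["slow"] });
console.log(updated);
```

Only the creation step is race-safe in this sketch; the read-modify-write update path would need a lock or an atomic rename to be safe under concurrent writers.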