Two additions to `analyze`'s table output:
1. `--show-path` flag renders the full chain of node ids for each
longest path (previously only source and end were shown).
2. A Bottleneck column names the slowest model on each path and its
execution time. Directly answers "which model should I optimize
first" — the bottleneck on the #1 path is the first-order target.
When the same bottleneck appears across multiple rows it's a shared
upstream that several paths depend on; optimizing it benefits all
of them (shared-node leverage).
The weights are threaded through the existing Dag, so no new data
collection is needed. JSON / JSONL outputs are unchanged: they already
carry the full path, and a bottleneck field would only duplicate
information derivable from path + weights.
Tests: 4 new (2 formatter, 2 CLI). Suite: 46 passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
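The Bottleneck column and the shared-node observation described in the commit message can be illustrated with a minimal sketch. This is not the project's formatter code: the function names `bottleneck` and `shared_bottlenecks`, the node ids, and the timings are all hypothetical, and it assumes each path is a list of node ids and that weights map node id to execution time in seconds.

```python
from collections import Counter


def bottleneck(path, weights):
    """Return (node_id, seconds) for the slowest node on one path.

    Nodes missing from `weights` are treated as taking 0 seconds
    (an assumption made for this sketch).
    """
    node = max(path, key=lambda n: weights.get(n, 0.0))
    return node, weights.get(node, 0.0)


def shared_bottlenecks(paths, weights):
    """Map each node that is the bottleneck on more than one path
    to the number of paths it bottlenecks."""
    counts = Counter(bottleneck(p, weights)[0] for p in paths)
    return {node: n for node, n in counts.items() if n > 1}


# Hypothetical node ids and execution times, in seconds.
weights = {"model.a": 5.0, "model.b": 42.0, "model.c": 3.0, "model.d": 7.0}
paths = [
    ["model.a", "model.b", "model.c"],
    ["model.a", "model.b", "model.d"],
    ["model.a", "model.d"],
]

rows = [bottleneck(p, weights) for p in paths]
shared = shared_bottlenecks(paths, weights)
```

Here `model.b` is the bottleneck on the first two paths, so it shows up in `shared`; that is the repeated-row pattern the commit message calls shared-node leverage.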
CHANGELOG.md
2 additions & 0 deletions
@@ -9,6 +9,8 @@ All notable changes to this project will be documented in this file. The format
- `dbt-dag-opt replay` subcommand: reconstructs the observed schedule from `run_results.json`'s `thread_id` + per-phase `timing` data, joined against `manifest.json`'s `parent_map`. Reports per-thread utilization, observed critical path (walked backwards from the last-completing node), and top idle gaps with parent-node attribution.
- Output formats for `replay`: `text` (rich terminal summary, default) and `json` (full replay report, including raw events).
- Integration fixture at `tests/fixtures/dbt_dugout/` — a real Snowflake dbt run (57 nodes, 4 threads) used to smoke-test `replay` end-to-end.
+- `analyze --show-path`: render the full chain of node ids for each longest path in the table output.
+- `analyze` table now includes a **Bottleneck** column naming the slowest model on each path. A bottleneck that appears across multiple rows is a shared-node optimization target.
README.md
3 additions & 0 deletions
@@ -67,9 +67,12 @@ dbt-dag-opt analyze [OPTIONS]
--token TEXT dbt Cloud API token [env: DBT_CLOUD_TOKEN]
-f, --format [json|jsonl|table] Output format [default: table]
-n, --top INTEGER Show only top N paths (0 = all) [default: 10]
+--show-path Render the full chain of node ids (table format)
-o, --output PATH Write output to a file instead of stdout
```
+The table includes a **Bottleneck** column that names the slowest model on each path. First-order optimization target: the bottleneck model on the #1 path. Watch for a bottleneck that repeats across multiple paths: that's shared-node leverage (optimizing one model helps several paths at once).
+
### `replay` — what actually happened
`analyze` is theoretical — it reports the DAG-structural lower bound on wall-clock. `replay` reads the *observed* schedule. Every result in `run_results.json` carries a `thread_id` and per-phase `timing` with start/end timestamps, so we can reconstruct: