analyze: add --show-path flag + Bottleneck column #4

Merged
trouze merged 1 commit into main from feat/analyze-full-path on Apr 24, 2026

@trouze (Owner) commented Apr 24, 2026

Summary

Two additions to analyze's table output to answer the question "which model should I actually optimize?":

  1. --show-path flag renders the full chain of node ids for each longest path. Default output is unchanged (source + end of path).
  2. Bottleneck column names the slowest model on each path and its execution time. The bottleneck on the #1 row is the first-order optimization target. When the same bottleneck repeats across multiple rows, that's shared-node leverage: one upstream model that several paths depend on, so optimizing it benefits all of them.

Observed against the dbt_dugout fixture: fct_player_salaries is the bottleneck on 3 of the top 4 paths.
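
For readers skimming the PR, here is a minimal sketch of the two table additions and of how repeated bottlenecks surface. It is not the code in this PR: `render_path`, `bottleneck`, the ` -> ` separator, and the node ids and timings are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical per-node execution times in seconds (not from any real fixture).
WEIGHTS = {"stg_a": 1.0, "int_b": 7.5, "fct_c": 2.0, "stg_d": 0.5, "fct_e": 3.0}

# Hypothetical longest paths, each a list of node ids from source to end.
PATHS = [
    ["stg_a", "int_b", "fct_c"],
    ["stg_d", "int_b", "fct_e"],
    ["stg_d", "fct_e"],
]


def render_path(path, show_path=False):
    """Render the full chain when show_path is set, else only source and end."""
    shown = path if show_path or len(path) < 2 else [path[0], path[-1]]
    return " -> ".join(shown)


def bottleneck(path, weights):
    """Return the slowest node on the path and its execution time."""
    node = max(path, key=lambda n: weights[n])
    return node, weights[node]


# One table row per path: rendered path, bottleneck node, bottleneck time.
rows = [(render_path(p), *bottleneck(p, WEIGHTS)) for p in PATHS]

# Shared-node leverage: count how many rows name the same bottleneck.
leverage = Counter(node for _, node, _ in rows)
```

In this toy data, `int_b` is the bottleneck on two of the three paths, which is the pattern the Bottleneck column is meant to make visible.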

Design notes

  • Weights are threaded through via the existing Dag; no new data collection.
  • JSON / JSONL outputs are unchanged — they already carry the full path, and a redundant bottleneck field would be derivable from path + weights.
  • Slack-based analysis (how far can you speed up before another path becomes critical?) is deferred — it belongs with the whatif simulator, not the static analyzer.
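
To illustrate why a bottleneck field in JSON / JSONL would be redundant, the sketch below derives it on the consumer side from a path and a weights mapping. The record shape (`rank`, `path`), the added key names, and the way weights are supplied are assumptions for illustration only; the actual output schema is not shown in this PR.

```python
import json

# Hypothetical JSONL output: one object per longest path with a "path" list.
JSONL = (
    '{"rank": 1, "path": ["stg_a", "int_b", "fct_c"]}\n'
    '{"rank": 2, "path": ["stg_d", "fct_e"]}\n'
)

# Hypothetical per-node execution times in seconds.
WEIGHTS = {"stg_a": 1.0, "int_b": 7.5, "fct_c": 2.0, "stg_d": 0.5, "fct_e": 3.0}


def with_bottleneck(jsonl_text, weights):
    """Yield each record with a derived bottleneck node and its time."""
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between records
        record = json.loads(line)
        node = max(record["path"], key=lambda n: weights[n])
        yield {**record, "bottleneck": node, "bottleneck_seconds": weights[node]}


derived = list(with_bottleneck(JSONL, WEIGHTS))
```

A consumer that already has the path and the weights needs only the `max` call above, which is the reasoning behind leaving the machine-readable outputs unchanged.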

Test plan

  • pytest — 46 passed (4 new; 42 pre-existing)
  • ruff check . + mypy src — clean
  • Manual smoke test against tests/fixtures/dbt_dugout/ in both default and --show-path modes.

🤖 Generated with Claude Code

Two additions to `analyze`'s table output:

1. `--show-path` flag renders the full chain of node ids for each
   longest path (previously only source and end were shown).

2. A Bottleneck column names the slowest model on each path and its
   execution time. Directly answers "which model should I optimize
   first" — the bottleneck on the #1 path is the first-order target.
   When the same bottleneck appears across multiple rows it's a shared
   upstream that several paths depend on; optimizing it benefits all
   of them (shared-node leverage).

The weights are threaded through via the existing Dag; no new data
collection. JSON / JSONL outputs are unchanged — they already carry
the full path, and adding a redundant bottleneck field would duplicate
information derivable from path + weights.

Tests: 4 new (2 formatter, 2 CLI). Suite: 46 passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@trouze trouze merged commit e35072a into main Apr 24, 2026
3 checks passed