add replay command: reconstruct the observed schedule #2

Merged: trouze merged 1 commit into main from feat/replay on Apr 24, 2026
@trouze (Owner) commented Apr 24, 2026

`analyze` predicts a theoretical lower bound on wall-clock time from the DAG's critical path. `replay` does the complementary thing: it reads the observed schedule out of `run_results.json` (`thread_id` plus per-phase timing) and joins it against `manifest.json`'s `parent_map` to report:

  • per-thread utilization (busy vs. idle, events executed)
  • observed critical path, walked backwards from the last-completing node by picking, at each step, the parent whose completion time was closest to this node's start
  • top idle gaps, each attributed to the parent node the thread was waiting on (or flagged as scheduler overhead if all parents were already done)
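
The backward walk in the second bullet can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Event` record, the `observed_critical_path` name, and the sample schedule are all hypothetical, and times are plain seconds from the start of the run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One executed node (hypothetical shape); times are seconds from run start."""
    node: str
    thread: str
    start: float
    end: float


def observed_critical_path(events, parent_map):
    """Walk backwards from the last-completing node.

    At each step, pick the parent whose completion time is closest to the
    current node's start. Parents with no recorded event are ignored.
    """
    by_node = {e.node: e for e in events}
    current = max(events, key=lambda e: e.end)
    path = [current.node]
    while True:
        parents = [by_node[p] for p in parent_map.get(current.node, []) if p in by_node]
        if not parents:
            break
        start = current.start
        current = min(parents, key=lambda p: abs(start - p.end))
        path.append(current.node)
    return list(reversed(path))


# Made-up schedule: "d" finishes last and starts right after "a" completes.
events = [
    Event("a", "T1", 0.0, 4.0),
    Event("b", "T2", 0.0, 1.0),
    Event("c", "T1", 4.0, 6.0),
    Event("d", "T2", 4.2, 9.0),
]
parent_map = {"a": [], "b": [], "c": ["a"], "d": ["a", "b"]}
chain = observed_critical_path(events, parent_map)
```

Here the walk starts at `d`, prefers `a` over `b` because `a` finished 0.2 s before `d` started, and stops at `a` because it has no parents.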

CLI: `dbt-dag-opt replay --manifest ... --run-results ...` with `text` (default) and `json` output formats. It supports the same file/cloud input modes as `analyze`.
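
For readers unfamiliar with the two artifacts, here is a rough sketch of the join described above. It assumes the dbt artifact layout as I understand it (a `results` list whose entries carry `unique_id`, `thread_id`, and a `timing` list with `started_at`/`completed_at`, plus a top-level `parent_map` in the manifest). The helper names and sample data are hypothetical, and the parsing in this PR may differ.

```python
from datetime import datetime


def parse_ts(value):
    # Assumes ISO-8601 timestamps with a trailing "Z"; normalise the suffix
    # so datetime.fromisoformat accepts it on older Python versions too.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def extract_events(run_results, manifest):
    """Join per-node timing with each node's parents (hypothetical helper)."""
    events = []
    for result in run_results["results"]:
        timing = result.get("timing") or []
        if not timing:
            continue  # entries without timing (e.g. nodes that did not execute)
        events.append({
            "node": result["unique_id"],
            "thread": result["thread_id"],
            "start": min(parse_ts(t["started_at"]) for t in timing),
            "end": max(parse_ts(t["completed_at"]) for t in timing),
            "parents": manifest["parent_map"].get(result["unique_id"], []),
        })
    return events


# Made-up, already-parsed artifacts standing in for json.load(...) output.
sample_run_results = {"results": [{
    "unique_id": "model.demo.orders",
    "thread_id": "Thread-1",
    "timing": [
        {"name": "compile", "started_at": "2026-04-24T00:00:00.000000Z",
         "completed_at": "2026-04-24T00:00:01.000000Z"},
        {"name": "execute", "started_at": "2026-04-24T00:00:01.000000Z",
         "completed_at": "2026-04-24T00:00:05.000000Z"},
    ],
}]}
sample_manifest = {"parent_map": {"model.demo.orders": ["source.demo.raw_orders"]}}
sample_events = extract_events(sample_run_results, sample_manifest)
```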

Tests: a synthetic 4-node, 2-thread fixture with a known expected chain, plus an integration fixture at `tests/fixtures/dbt_dugout/` (a real 57-node Snowflake run) smoke-tested through the CLI. Suite: 42 passed (+13 replay tests).
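
To give a feel for what a small synthetic fixture exercises, here is a hypothetical 4-node, 2-thread schedule (not the fixture in this PR) along with a sketch of per-thread utilization and idle-gap attribution. Two simplifying assumptions: idle time is wall-clock minus busy time, and only gaps between consecutive events on the same thread are considered.

```python
from collections import defaultdict

# (node, thread, start, end) in seconds; made-up data.
EVENTS = [
    ("a", "T1", 0, 4),
    ("c", "T1", 5, 7),
    ("b", "T2", 0, 1),
    ("d", "T2", 5, 9),
]
PARENT_MAP = {"a": [], "b": [], "c": ["a"], "d": ["a", "b"]}


def thread_utilization(events):
    """Busy time, idle time, and event count per thread."""
    wall = max(e[3] for e in events) - min(e[2] for e in events)
    stats = defaultdict(lambda: {"busy": 0, "events": 0})
    for _, thread, start, end in events:
        stats[thread]["busy"] += end - start
        stats[thread]["events"] += 1
    return {t: {**s, "idle": wall - s["busy"]} for t, s in stats.items()}


def idle_gaps(events, parent_map):
    """Gaps between consecutive events on a thread, largest first.

    A gap is attributed to the latest-finishing parent of the next node,
    or to scheduler overhead (waiting_on=None) if every parent had
    already finished when the gap began.
    """
    end_of = {node: end for node, _, _, end in events}
    by_thread = defaultdict(list)
    for event in events:
        by_thread[event[1]].append(event)
    gaps = []
    for thread, items in by_thread.items():
        items.sort(key=lambda e: e[2])
        for prev, nxt in zip(items, items[1:]):
            gap_start, gap_end = prev[3], nxt[2]
            if gap_end <= gap_start:
                continue
            parents = [p for p in parent_map.get(nxt[0], []) if p in end_of]
            blocker = max(parents, key=end_of.get, default=None)
            if blocker is None or end_of[blocker] <= gap_start:
                blocker = None  # all parents already done: scheduler overhead
            gaps.append({"thread": thread, "length": gap_end - gap_start,
                         "next": nxt[0], "waiting_on": blocker})
    return sorted(gaps, key=lambda g: g["length"], reverse=True)


utilization = thread_utilization(EVENTS)
gaps = idle_gaps(EVENTS, PARENT_MAP)
```

In this schedule the 4-second gap on `T2` is attributed to `a`, which `d` was waiting on, while the 1-second gap on `T1` is flagged as overhead because `a` had already finished when the gap began.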

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@trouze merged commit 8ce1c16 into main on Apr 24, 2026
6 checks passed