replay: Snowflake cost overlay (--warehouse-size) by trouze · Pull Request #6 · trouze/dbt-dag-opt

trouze · 2026-04-24T22:15:53Z

Summary

Adds a cost overlay to replay that translates the observed schedule into dollars. Lands on main under [Unreleased]; eventual v0.2.0 will bundle this with the whatif simulator (no intermediate tag).

Four numbers frame the new output:

- Run cost: what this run actually billed (wall-clock × warehouse rate, with the 60s floor applied).
- Critical-path floor: the irreducible cost of the slowest dependency chain.
- Headroom: run cost − floor; what better parallelization could save.
- Idle cost: the dollar equivalent of thread-idle warehouse-seconds. This is distinct from headroom, because idleness includes the long-tail critical path that can't be parallelized away.
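To make the relationship between the four numbers concrete, here is a minimal sketch. It is not the code in cost.py: the names `FourNumbers` and `four_numbers` and all parameters are hypothetical. Two behaviors are assumptions inferred from the sample output further down rather than confirmed by the source: the 60s floor is applied to the critical path as well as to the run, and idle cost is the waste fraction times the run cost.

```python
from dataclasses import dataclass

# Snowflake bills a minimum of 60 seconds each time a warehouse starts.
MIN_BILLED_SECONDS = 60.0


@dataclass(frozen=True)
class FourNumbers:
    """Hypothetical container for the four framed numbers (all in USD)."""

    run_cost: float
    critical_path_floor: float
    headroom: float
    idle_cost: float


def four_numbers(
    wall_clock_s: float,
    critical_path_s: float,
    waste_fraction: float,
    credits_per_hour: float,
    rate_per_credit: float = 2.0,
    minimum_billing: bool = True,
) -> FourNumbers:
    # Convert credits/hour and USD/credit into a per-second dollar rate.
    usd_per_second = credits_per_hour * rate_per_credit / 3600.0
    floor_s = MIN_BILLED_SECONDS if minimum_billing else 0.0

    # ASSUMPTION: both the run and the critical path are raised to the floor.
    run_cost = max(wall_clock_s, floor_s) * usd_per_second
    critical_path_floor = max(critical_path_s, floor_s) * usd_per_second

    return FourNumbers(
        run_cost=run_cost,
        critical_path_floor=critical_path_floor,
        headroom=run_cost - critical_path_floor,
        # ASSUMPTION: idle cost scales the run cost by the idle share.
        idle_cost=waste_fraction * run_cost,
    )
```

For a hypothetical 600s run with a 450s critical path on a warehouse billed at 8 credits/hr, `four_numbers(600, 450, 0.2, 8.0)` gives a headroom equal to 150s of warehouse time.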

Design

- New src/dbt_dag_opt/cost.py: pure functions plus frozen dataclasses (CostInputs, CostReport). compute_cost() takes primitives, not a ReplayReport, so the future whatif simulator can call it against simulated schedules and diff two CostReports. A thin cost_inputs_from_replay() adapter keeps the CLI ergonomic.
- ReplayReport stays cost-free: the CLI composes replay + cost and passes both into render_replay(report, fmt, cost=...). This keeps replay.py and cost.py independently testable.
- The Snowflake 60s minimum-billing floor is applied by default (matching how Snowflake actually bills); opt out with --no-minimum-billing.
- --warehouse-size and --credits-per-hour (non-Snowflake escape hatch) are mutually exclusive. --rate-per-credit and --no-minimum-billing require one of them.
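The flag rules in the last point can be sketched as a small pure check. This is an illustration only, not the PR's CLI code: `validate_cost_flags` is a hypothetical name, it raises a plain `ValueError` rather than the CLI framework's BadParameter, and it assumes an unset `--rate-per-credit` arrives as `None` so that "explicitly passed" can be told apart from the 2.0 default.

```python
from typing import Optional


def validate_cost_flags(
    warehouse_size: Optional[str],
    credits_per_hour: Optional[float],
    rate_per_credit: Optional[float],
    no_minimum_billing: bool,
) -> bool:
    """Return True if a cost overlay was requested; raise on invalid combinations."""
    # The two rate sources cannot be combined.
    if warehouse_size is not None and credits_per_hour is not None:
        raise ValueError(
            "--warehouse-size and --credits-per-hour are mutually exclusive"
        )

    has_rate_source = warehouse_size is not None or credits_per_hour is not None

    # The modifier flags only make sense when a rate source is present.
    if not has_rate_source:
        if rate_per_credit is not None:
            raise ValueError(
                "--rate-per-credit requires --warehouse-size or --credits-per-hour"
            )
        if no_minimum_billing:
            raise ValueError(
                "--no-minimum-billing requires --warehouse-size or --credits-per-hour"
            )

    return has_rate_source
```

Keeping the check free of any CLI dependency means it can be unit-tested without invoking the command at all.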

New flags

--warehouse-size TEXT       Snowflake size (XS, S, M, L, XL, 2XL…6XL)
--credits-per-hour FLOAT    Raw credits/hour for non-Snowflake adapters
--rate-per-credit FLOAT     USD per credit (default 2.0)
--no-minimum-billing        Skip the 60s Snowflake floor
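A size lookup of the kind --warehouse-size needs might look like the sketch below. It is not the PR's credits_per_hour_for(): the function name `lookup_credits_per_hour`, the normalization rule, and the exact alias set are assumptions. The credit values follow Snowflake's standard-warehouse doubling per size (L = 8 credits/hr, matching the sample output).

```python
import re

# Credits per hour for standard Snowflake warehouses; each size doubles the last.
_CREDITS = {
    "XS": 1, "S": 2, "M": 4, "L": 8, "XL": 16,
    "2XL": 32, "3XL": 64, "4XL": 128, "5XL": 256, "6XL": 512,
}

# ASSUMPTION: a plausible set of long-form aliases, keyed by normalized spelling.
_ALIASES = {
    "XSMALL": "XS", "SMALL": "S", "MEDIUM": "M", "LARGE": "L",
    "XLARGE": "XL", "XXLARGE": "2XL", "XXXLARGE": "3XL",
}
for _n in range(2, 7):
    _ALIASES[f"{_n}XLARGE"] = f"{_n}XL"   # e.g. "2X-Large"
    _ALIASES[f"X{_n}LARGE"] = f"{_n}XL"   # e.g. "X2LARGE"


def lookup_credits_per_hour(size: str) -> float:
    """Map a warehouse size string to credits/hour, ignoring case and punctuation."""
    key = re.sub(r"[^A-Z0-9]", "", size.upper())
    key = _ALIASES.get(key, key)
    if key not in _CREDITS:
        raise ValueError(f"unknown warehouse size: {size!r}")
    return float(_CREDITS[key])
```

With this normalization, `"L"`, `"large"`, and `"Large"` all resolve to 8 credits/hr, and `"x-small"` resolves to 1.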

Sample output (dbt_dugout fixture, `--warehouse-size L`)

                        Cost estimate
┌─────────────────────┬─────────────────────────────────────┐
│ Warehouse           │ L (8 credits/hr @ $2.00/credit)     │
│ Effective rate      │ $0.00444/s                          │
│ Billed wall-clock   │ 60.000 s  (raised from 0.556 s)     │
│ Run cost            │ $0.27                               │
│ Critical-path floor │ $0.27                               │
│ Headroom            │ $0.0000  (0%)                       │
│ Thread idleness     │ 0.408 s  (18% of warehouse-seconds) │
│ Idle cost           │ $0.0489  (18%)                      │
└─────────────────────┴─────────────────────────────────────┘

The sub-second fixture run is floored at 60s, which is a real signal about how Snowflake bills ("at this runtime you're paying the billing minimum; no amount of parallelization matters unless you run longer"). Idle cost stays visible because waste_fraction is computed on raw wall-clock × thread_count, not on the floored 60s.
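The figures in the table can be re-derived by hand, which also shows why idle cost survives the floor. The sketch below uses only values printed in the sample output, plus one assumption: a thread count of 4, which is not stated in the source but is the value that makes 0.408 idle seconds come out at 18% of raw warehouse-seconds. Treating idle cost as waste_fraction × run cost is likewise inferred from the numbers, not confirmed.

```python
# Values read off the sample table.
credits_per_hour = 8.0      # warehouse size L
rate_per_credit = 2.0       # USD per credit
raw_wall_clock_s = 0.556
idle_thread_s = 0.408
thread_count = 4            # ASSUMPTION: inferred, not stated in the source

usd_per_second = credits_per_hour * rate_per_credit / 3600.0
billed_s = max(raw_wall_clock_s, 60.0)          # 60s Snowflake minimum
run_cost = billed_s * usd_per_second

# Waste is measured on the raw schedule, so the floor does not dilute it.
waste_fraction = idle_thread_s / (raw_wall_clock_s * thread_count)
idle_cost = waste_fraction * run_cost           # ASSUMPTION: inferred formula

print(f"Effective rate  ${usd_per_second:.5f}/s")   # $0.00444/s
print(f"Run cost        ${run_cost:.2f}")           # $0.27
print(f"Thread idleness {waste_fraction:.0%}")      # 18%
print(f"Idle cost       ${idle_cost:.4f}")          # $0.0489
```

Had waste_fraction been computed against the billed 60s instead, the idle share would round to 0% and the signal would disappear.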

Test plan

7 new tests in tests/test_cost.py (53 total; all pass).
ruff check . clean.
mypy --strict src clean.
CLI smoke: --warehouse-size L renders Cost estimate; JSON cost key has all 12 fields.
CLI smoke: --rate-per-credit without size → BadParameter, exit 2.
CLI smoke: --credits-per-hour 12 --no-minimum-billing renders custom (12 credits/hr) row with raw wall-clock.

🤖 Generated with Claude Code

Translates wall-clock + thread-idleness + critical-path-seconds into dollars so users can answer "what did this run cost, how much was waste, and what's the floor I'm working against?"

- New dbt_dag_opt.cost module: CostInputs / CostReport dataclasses, compute_cost(), credits_per_hour_for() with Snowflake size aliases (XS…6XL, long-form and punctuation tolerant), and cost_inputs_from_replay() adapter. Primitive-driven so a future whatif simulator can call compute_cost() against simulated schedules and diff two CostReports.
- replay CLI flags: --warehouse-size, --credits-per-hour (non-Snowflake escape hatch, mutually exclusive with size), --rate-per-credit (default $2.00, Standard On-Demand), and --no-minimum-billing (opt out of Snowflake's 60s floor, applied by default).
- render_replay() accepts optional cost and adds a "Cost estimate" table (text) or top-level "cost" key (JSON). Four framed numbers: Run cost, Critical-path floor, Headroom, Idle cost.
- 7 new tests (53 total). ruff + mypy --strict clean.

Does not bump __version__; ships alongside whatif in the eventual v0.2.0 release.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

CI (Linux, color-on) rendered the BadParameter error with inline ANSI escapes that split "--warehouse-size" across style boundaries, so the plain substring check failed. Local macOS runs didn't reproduce because rich skipped color (no TTY).

Fix: strip ANSI before searching. Also moved _ANSI_RE below imports (ruff E402) and added .actrc so contributors can reproduce CI locally via `act -j test`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
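For readers unfamiliar with this failure mode, here is a small self-contained reproduction. The regex is an assumption (the commit names _ANSI_RE but does not show its pattern), it covers only SGR color/style sequences, and the styled string is a made-up stand-in for what a color-enabled renderer might emit.

```python
import re

# ASSUMPTION: matches SGR sequences such as "\x1b[1;36m" and "\x1b[0m" only.
_ANSI_RE = re.compile(r"\x1b\[[0-9;]*m")


def strip_ansi(text: str) -> str:
    """Remove ANSI color/style escapes so substring checks see plain text."""
    return _ANSI_RE.sub("", text)


# Hypothetical styled output: the flag name is split across style boundaries.
styled = (
    "Invalid value for "
    "\x1b[1;36m-\x1b[0m\x1b[1;36m-warehouse\x1b[0m\x1b[1;36m-size\x1b[0m"
)

# The naive check fails on the styled text and passes once escapes are stripped.
assert "--warehouse-size" not in styled
assert "--warehouse-size" in strip_ansi(styled)
```

Stripping in the test, rather than forcing color off in CI, keeps the assertion valid regardless of how the terminal is detected.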

trouze and others added 2 commits April 24, 2026 17:15

trouze merged commit 5d912c1 into main Apr 24, 2026
3 checks passed

trouze merged 2 commits into main from feat/cost-overlay
