| 1 | +--- |
| 2 | +name: dynamic-tables-guidance |
| 3 | +title: Dynamic Tables Guidance |
| 4 | +summary: Decide when to use Dynamic Tables vs MVs, streams+tasks, or dbt, and design production-ready DT pipelines. |
| 5 | +description: "Use when choosing between Dynamic Tables and alternatives (materialized views, streams+tasks, dbt), designing multi-layer DT pipelines, debugging FULL-refresh fallback, or hardening DTs for production. Covers comparison matrices, decision flowcharts, common pitfalls, monitoring queries, and hybrid DT+task patterns. Triggers: dynamic tables guidance, when to use DT, DT vs MV, DT vs streams tasks, DT vs dbt, DT pitfalls, DT best practices, DT pipeline design, target lag, downstream lag, full refresh fallback." |
| 6 | +tools: |
| 7 | + - snowflake_sql_execute |
| 8 | + - Bash |
| 9 | + - Read |
| 10 | + - Write |
| 11 | + - Edit |
| 12 | + - Glob |
| 13 | + - Grep |
| 14 | +prompt: Should I use Dynamic Tables or streams+tasks for my CDC pipeline? |
| 15 | +language: en |
| 16 | +status: Published |
| 17 | +author: Snowflake Solutions Team |
| 18 | +type: snowflake |
| 19 | +--- |
| 20 | + |
| 21 | +# Dynamic Tables Guidance |
| 22 | + |
| 23 | +## Overview |
| 24 | + |
| 25 | +Dynamic Tables (DTs) are declarative, auto-refreshing materialized queries. You write a `SELECT`, set a `TARGET_LAG`, and Snowflake schedules refreshes to keep the results within that lag. This skill helps you decide when DTs are the right tool, design multi-layer pipelines, and avoid the failure modes that surface in production. |
| 26 | + |
| 27 | +Use this skill when picking between DTs, materialized views, streams+tasks, or dbt — or when an existing DT pipeline is misbehaving (full-refresh fallback, lag drift, runaway cost). |
| 28 | + |
| 29 | +## Quick Decision Flowchart |
| 30 | + |
| 31 | +``` |
| 32 | +Need to transform data in Snowflake? |
| 33 | + ├─ Single table, accelerate queries? → Materialized View |
| 34 | + ├─ Multi-step SQL pipeline, fresh data? → Dynamic Tables |
| 35 | + ├─ Stream-static joins / append-only? → Custom Incremental DTs (PrPr) |
| 36 | + ├─ Cross-warehouse portability or dbt tests? → dbt models |
| 37 | + ├─ Procedural logic, IF/ELSE, API calls? → Streams + Tasks |
| 38 | + └─ Sub-15-second latency? → Streams + Tasks |
| 39 | +``` |
| 40 | + |
| 41 | +## Comparison Matrix |
| 42 | + |
| 43 | +| Dimension | Dynamic Tables | Materialized Views | Streams + Tasks | dbt | |
| 44 | +|---|---|---|---|---| |
| 45 | +| Refresh | Target lag (15s+) | Auto, near-real-time | Manual schedule/trigger | Batch (`dbt run`) | |
| 46 | +| SQL support | Full SELECT, JOINs, windows | Single table only | Full + procedural | Full + Jinja | |
| 47 | +| Chaining | `TARGET_LAG = DOWNSTREAM` | No | Manual DAG | Ref graph | |
| 48 | +| Incremental | Built-in for supported ops | Auto | You write it | Manual `is_incremental` | |
| 49 | +| Side effects | None | None | Email, API, externals | None | |
| 50 | +| Cost | Your warehouse | Serverless | Your warehouse | Your warehouse | |
| 51 | + |
| 52 | +**Rule of thumb:** Start with DTs. Reach for streams+tasks only when you need procedural logic, side effects, or sub-15s latency. Use dbt when you need its testing framework or cross-warehouse portability. |
| 53 | + |
| 54 | +## Pipeline Pattern: Bronze → Silver → Gold |
| 55 | + |
| 56 | +```sql |
| 57 | +-- Bronze: parse raw VARIANT |
| 58 | +CREATE DYNAMIC TABLE bronze_events |
| 59 | + TARGET_LAG = DOWNSTREAM |
| 60 | + WAREHOUSE = pipeline_wh |
| 61 | + AS SELECT |
| 62 | + record_content:event_id::STRING AS event_id, |
| 63 | + record_content:event_type::STRING AS event_type, |
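| 64 | + record_content:timestamp::TIMESTAMP_NTZ AS event_ts, |
| | + record_content:payload AS payload -- assumed field: the silver join reads payload:product_id |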
| 65 | + FROM raw_events_topic; |
| 66 | + |
| 67 | +-- Silver: business logic + joins |
| 68 | +CREATE DYNAMIC TABLE silver_purchases |
| 69 | + TARGET_LAG = DOWNSTREAM |
| 70 | + WAREHOUSE = pipeline_wh |
| 71 | + AS SELECT e.event_id, e.event_ts, p.product_name, p.category |
| 72 | + FROM bronze_events e |
| 73 | + JOIN products p ON e.payload:product_id::STRING = p.product_id |
| 74 | + WHERE e.event_type = 'purchase'; |
| 75 | + |
| 76 | +-- Gold: only the leaf has a time-based lag |
| 77 | +CREATE DYNAMIC TABLE gold_hourly_sales |
| 78 | + TARGET_LAG = '5 minutes' |
| 79 | + WAREHOUSE = pipeline_wh |
| 80 | + AS SELECT DATE_TRUNC('hour', event_ts) AS sales_hour, category, |
| 81 | + COUNT(*) AS order_count |
| 82 | + FROM silver_purchases |
| 83 | + GROUP BY 1, 2; |
| 84 | +``` |
| 85 | + |
| 86 | +**Key rule:** Only the leaf DT has a time-based `TARGET_LAG`. Intermediates use `DOWNSTREAM` so Snowflake derives their lag from the leaf. |
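| | + |
| | +If an existing intermediate already carries a time-based lag, it can usually be switched in place rather than recreated. A minimal sketch, reusing the table names from the example above: |
| | + |
| | +```sql |
| | +-- Move intermediates to DOWNSTREAM so only the leaf drives the refresh cadence |
| | +ALTER DYNAMIC TABLE bronze_events SET TARGET_LAG = DOWNSTREAM; |
| | +ALTER DYNAMIC TABLE silver_purchases SET TARGET_LAG = DOWNSTREAM; |
| | + |
| | +-- Tighten or relax freshness for the whole chain by changing the leaf only |
| | +ALTER DYNAMIC TABLE gold_hourly_sales SET TARGET_LAG = '5 minutes'; |
| | +``` |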
| 87 | + |
| 88 | +## Monitoring Essentials |
| 89 | + |
| 90 | +```sql |
| 91 | +-- Health check |
| 92 | +SELECT name, scheduling_state, last_completed_refresh_state, |
| 93 | + time_within_target_lag_ratio -- refresh_mode and refresh_mode_reason are reported by SHOW DYNAMIC TABLES |
| 94 | +FROM TABLE(INFORMATION_SCHEMA.DYNAMIC_TABLES()) |
| 95 | +ORDER BY time_within_target_lag_ratio ASC; |
| 96 | + |
| 97 | +-- Recent refresh history |
| 98 | +SELECT name, state, refresh_action, |
| 99 | + DATEDIFF('second', refresh_start_time, refresh_end_time) AS duration_sec |
| 100 | +FROM TABLE(INFORMATION_SCHEMA.DYNAMIC_TABLE_REFRESH_HISTORY( |
| 101 | + NAME_PREFIX => '<db>.<schema>')) |
| 102 | +ORDER BY refresh_start_time DESC LIMIT 20; |
| 103 | + |
| 104 | +-- Errors only |
| 105 | +SELECT name, state, state_message |
| 106 | +FROM TABLE(INFORMATION_SCHEMA.DYNAMIC_TABLE_REFRESH_HISTORY( |
| 107 | + NAME_PREFIX => '<db>.<schema>', ERROR_ONLY => TRUE)); |
| 108 | +``` |
| 109 | + |
| 110 | +For account-wide DT cost, query `SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY` filtered to refresh queries. |
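| | + |
| | +When refreshes run on a dedicated warehouse, warehouse metering gives a simpler cost view than filtering query history. A rough sketch, assuming `pipeline_wh` from the pipeline example runs only DT refreshes: |
| | + |
| | +```sql |
| | +-- Daily credits for the DT warehouse over the last 7 days |
| | +-- Assumption: pipeline_wh serves DT refreshes only, so its credits approximate DT cost |
| | +SELECT DATE_TRUNC('day', start_time) AS usage_day, |
| | + SUM(credits_used) AS credits |
| | +FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY |
| | +WHERE warehouse_name = 'PIPELINE_WH' |
| | + AND start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP()) |
| | +GROUP BY 1 |
| | +ORDER BY 1; |
| | +``` |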
| 111 | + |
| 112 | +## Common Mistakes |
| 113 | + |
| 114 | +- **`SELECT *` in DT definition** — breaks incremental refresh on schema changes. Always use explicit column lists. |
| 115 | +- **Time-based lag on every layer** — only the leaf should have a time-based lag. Use `TARGET_LAG = DOWNSTREAM` on intermediates. |
| 116 | +- **Change tracking off on base tables** — DTs require `CHANGE_TRACKING = TRUE` for incremental refresh. Check with `SHOW TABLES`. |
| 117 | +- **Falling back to FULL refresh silently** — check `refresh_mode_reason` if `refresh_mode` shows `FULL` when you expected `INCREMENTAL`. |
| 118 | +- **Missing PRIMARY KEY RELY** — without it, `INSERT OVERWRITE` reprocesses everything downstream and you lose incremental chains. |
| 119 | +- **DISTINCT/UNION fanout** — these operators force full refresh. Refactor with `QUALIFY ROW_NUMBER()` or `UNION ALL` where possible. |
| 120 | +- **Sharing one warehouse with interactive queries** — DT refreshes will compete with user queries. Use a dedicated warehouse. |
| 121 | +- **No `INITIALIZATION_WAREHOUSE` for big initial loads** — first refresh on large DTs can OOM a small warehouse. Set a larger init WH, then unset. |
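| | + |
| | +Several of the checks above can be sketched in a few statements. The table names come from the pipeline example; the `'gold%'` filter is only an illustration. Note that the `QUALIFY` rewrite keeps one row per key, which is not always equivalent to `DISTINCT` over all columns. |
| | + |
| | +```sql |
| | +-- Enable change tracking on a base table |
| | +ALTER TABLE raw_events_topic SET CHANGE_TRACKING = TRUE; |
| | + |
| | +-- Inspect the effective refresh mode and the reason it was chosen |
| | +SHOW DYNAMIC TABLES LIKE 'gold%'; |
| | +SELECT "name", "refresh_mode", "refresh_mode_reason" |
| | +FROM TABLE(RESULT_SCAN(LAST_QUERY_ID())); |
| | + |
| | +-- Deduplicate with QUALIFY instead of DISTINCT (keeps the latest row per event_id) |
| | +SELECT event_id, event_type, event_ts |
| | +FROM bronze_events |
| | +QUALIFY ROW_NUMBER() OVER (PARTITION BY event_id ORDER BY event_ts DESC) = 1; |
| | +``` |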
| 122 | + |
| 123 | +## Production Readiness Checklist |
| 124 | + |
| 125 | +- [ ] Explicit column lists (no `SELECT *`) |
| 126 | +- [ ] `CHANGE_TRACKING = TRUE` on all base tables |
| 127 | +- [ ] Intermediates use `TARGET_LAG = DOWNSTREAM` |
| 128 | +- [ ] Leaf target lag ≥ all upstream lags |
| 129 | +- [ ] `refresh_mode` is `INCREMENTAL` (verify `refresh_mode_reason`) |
| 130 | +- [ ] Dedicated warehouse for DT refreshes |
| 131 | +- [ ] Monitoring on `time_within_target_lag_ratio > 0.95` |
| 132 | +- [ ] Alerting on refresh failures |
| 133 | +- [ ] `INITIALIZATION_WAREHOUSE` set for large initial loads |
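| | + |
| | +One possible way to cover the alerting item is a Snowflake alert over the refresh history. This is a sketch under assumptions, not a prescribed setup: `dt_refresh_failures`, `my_db`, and the notification integration `my_email_int` are hypothetical names, and the 5-minute window is arbitrary. |
| | + |
| | +```sql |
| | +-- Hypothetical names: dt_refresh_failures, my_db, my_email_int |
| | +CREATE OR REPLACE ALERT dt_refresh_failures |
| | + WAREHOUSE = ops_wh |
| | + SCHEDULE = '5 MINUTE' |
| | + IF (EXISTS ( |
| | + SELECT 1 |
| | + FROM TABLE(my_db.INFORMATION_SCHEMA.DYNAMIC_TABLE_REFRESH_HISTORY( |
| | + ERROR_ONLY => TRUE)) |
| | + WHERE refresh_start_time > DATEADD('minute', -5, CURRENT_TIMESTAMP()) |
| | + )) |
| | + THEN CALL SYSTEM$SEND_EMAIL('my_email_int', 'team@co.com', |
| | + 'DT refresh failure', 'At least one Dynamic Table refresh failed in the last 5 minutes.'); |
| | + |
| | +-- Alerts are created suspended; resume to start evaluating the condition |
| | +ALTER ALERT dt_refresh_failures RESUME; |
| | +``` |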
| 134 | + |
| 135 | +## Hybrid Pattern: DT + Task for Side Effects |
| 136 | + |
| 137 | +```sql |
| 138 | +CREATE STREAM gold_metrics_stream ON DYNAMIC TABLE gold_metrics; |
| 139 | + |
| 140 | +CREATE TASK notify_on_refresh |
| 141 | + WAREHOUSE = ops_wh |
| 142 | + WHEN SYSTEM$STREAM_HAS_DATA('gold_metrics_stream') |
| 143 | +AS |
| 144 | +BEGIN |
| 145 | + LET change_count INT := (SELECT COUNT(*) FROM gold_metrics_stream); |
| 146 | + CALL SYSTEM$SEND_EMAIL('my_email_int', 'team@co.com', 'Metrics Updated', -- 'my_email_int' is a placeholder notification integration |
| 147 | + :change_count || ' rows changed'); |
| 148 | + CREATE OR REPLACE TEMP TABLE _consume AS SELECT * FROM gold_metrics_stream; |
| 149 | +END; |
| 150 | +``` |
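| | + |
| | +A newly created task is suspended, so the pattern above does nothing until the task is resumed. A short follow-up sketch; the uppercase task name in the history lookup assumes the task was created with an unquoted identifier: |
| | + |
| | +```sql |
| | +-- Tasks are created in a suspended state; resume to activate |
| | +ALTER TASK notify_on_refresh RESUME; |
| | + |
| | +-- Check recent runs and any errors |
| | +SELECT name, state, scheduled_time, error_message |
| | +FROM TABLE(INFORMATION_SCHEMA.TASK_HISTORY(TASK_NAME => 'NOTIFY_ON_REFRESH')) |
| | +ORDER BY scheduled_time DESC |
| | +LIMIT 10; |
| | +``` |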
| 151 | + |
| 152 | +## Workflow |
| 153 | + |
| 154 | +1. **Assess fit** — run the decision flowchart. If DTs aren't the right tool, stop here. |
| 155 | +2. **Pick refresh mode** — `AUTO` (default), `INCREMENTAL` (force, fail if ineligible), `FULL`, or `CUSTOM_INCREMENTAL` (PrPr). |
| 156 | +3. **Design layers** — Bronze→Silver→Gold with `DOWNSTREAM` on intermediates. |
| 157 | +4. **Harden** — apply the production checklist. |
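| | + |
| | +Step 2 can be made explicit in the DDL. A minimal sketch, using a hypothetical table `silver_purchases_strict` built on `bronze_events` from the pipeline example: |
| | + |
| | +```sql |
| | +-- REFRESH_MODE = INCREMENTAL makes creation fail when the query is not eligible, |
| | +-- instead of silently falling back to FULL |
| | +CREATE DYNAMIC TABLE silver_purchases_strict |
| | + TARGET_LAG = DOWNSTREAM |
| | + WAREHOUSE = pipeline_wh |
| | + REFRESH_MODE = INCREMENTAL |
| | + AS SELECT event_id, event_type, event_ts |
| | + FROM bronze_events |
| | + WHERE event_type = 'purchase'; |
| | +``` |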
| 158 | + |
| 159 | +⚠️ STOPPING POINT: Before running `CREATE OR REPLACE DYNAMIC TABLE` against existing pipelines, show the user the planned DDL and confirm. Replacing a DT triggers a full reload and may invalidate downstream incremental chains. |
| 160 | + |
| 161 | +⚠️ STOPPING POINT: Before applying `ALTER DYNAMIC TABLE ... SUSPEND` or `DROP DYNAMIC TABLE`, confirm with the user — downstream DTs depending on the target will stop refreshing. |
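| | + |
| | +Before asking for confirmation, it can help to show which DTs read from the target. A sketch, assuming rows with a NULL `valid_to` describe the current graph; `silver_purchases` stands in for the target DT: |
| | + |
| | +```sql |
| | +-- List current DTs with their inputs to spot dependents of the target |
| | +SELECT qualified_name, inputs, scheduling_state |
| | +FROM TABLE(INFORMATION_SCHEMA.DYNAMIC_TABLE_GRAPH_HISTORY()) |
| | +WHERE valid_to IS NULL; |
| | + |
| | +-- Run only after the user confirms |
| | +ALTER DYNAMIC TABLE silver_purchases SUSPEND; |
| | +ALTER DYNAMIC TABLE silver_purchases RESUME; |
| | +``` |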
| 162 | + |
| 163 | +## Stopping Points |
| 164 | + |
| 165 | +- Workflow Step 4 — confirm DDL before `CREATE OR REPLACE DYNAMIC TABLE` on existing pipelines |
| 166 | +- Workflow Step 4 — confirm before `ALTER ... SUSPEND` or `DROP DYNAMIC TABLE` |
| 167 | + |
| 168 | +## References |
| 169 | + |
| 170 | +- `references/pitfalls-and-pks.md` — full pitfall deep-dive plus `PRIMARY KEY RELY`, `IMMUTABLE WHERE`, `BACKFILL FROM` |
| 171 | +- `references/custom-incremental.md` — Custom Incremental DTs (PrPr) syntax and patterns |
| 172 | +- `references/dcm-for-dts.md` — Database Change Management for git-native DT deployment |
| 173 | +- Built-in Cortex Code skill: `dynamic-tables` |