Commit 622bc38

Stage dynamic-tables-guidance from Snowflake-Solutions
1 parent be318e8 commit 622bc38

5 files changed

Lines changed: 619 additions & 0 deletions

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
1+
Snowflake Skills License
2+
3+
© 2026 Snowflake Inc. All rights reserved.
4+
5+
LICENSE: Use of these materials (including all code, prompts, assets, files, and other components of these skills (collectively, “Skills”)) is governed by your agreement with Snowflake for the Service. If no separate agreement exists, use is governed by Snowflake’s Terms of Service (available at: https://www.snowflake.com/en/legal/terms-of-service/).
6+
7+
Your applicable agreement is referred to as the "Agreement." "Service" is as defined in the Agreement.
8+
9+
ADDITIONAL RESTRICTIONS: Notwithstanding anything in the Agreement to the contrary, you may not:
10+
11+
* Extract from the Service or retain copies of the Skills outside use with the Service;
12+
* Reproduce or copy the Skills, except for temporary copies created automatically during authorized use of the Service;
13+
* Create derivative works based on the Skills;
14+
* Distribute, sublicense, or transfer the Skills to any third party;
15+
* Make, offer to sell, sell, or import any inventions embodied in the Skills; nor,
16+
* Reverse engineer, decompile, or disassemble the Skills.
17+
18+
The receipt, viewing, or possession of the Skills does not convey or imply any license or right beyond those expressly granted above.
19+
20+
Snowflake retains all rights, title, and interest in the Skills, including all copyrights, trademarks, patents, and all other applicable intellectual property rights.
21+
22+
THE SKILLS ARE PROVIDED “AS IS,” WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SKILLS OR THE USE OR OTHER DEALINGS IN THE SKILLS.
Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
1+
---
2+
name: dynamic-tables-guidance
3+
title: Design Dynamic Tables
4+
summary: Decide when Dynamic Tables fit, design the pipeline, and ship it production-ready without the usual full-refresh traps.
5+
description: "Use when you need to decide between Dynamic Tables, materialized views, streams+tasks, or dbt for a Snowflake pipeline, design a multi-layer DT DAG, debug a DT that fell back to FULL refresh, or harden a DT pipeline for production. Triggers: dynamic tables, DT design, DT vs MV, DT vs streams tasks, DT vs dbt, DT pitfalls, DT best practices, target lag, downstream lag, refresh mode, INCREMENTAL, FULL refresh, IMMUTABLE WHERE, BACKFILL FROM, primary key RELY, DT monitoring, DT pipeline."
6+
tools:
7+
- snowflake_sql_execute
8+
- snowflake_object_search
9+
- Read
10+
- Write
11+
- Edit
12+
- Grep
13+
prompt: Help me design a Dynamic Tables pipeline for my bronze/silver/gold workflow and avoid the full-refresh trap.
14+
language: en
15+
status: Published
16+
author: Snowflake Solutions Team
17+
type: snowflake
18+
---
19+
20+
# Design Dynamic Tables
21+
22+
## Overview
23+
24+
Dynamic Tables (DTs) are declarative, auto-refreshing materialized queries. You write a `SELECT`, set a `TARGET_LAG`, and Snowflake keeps the results fresh, picking INCREMENTAL or FULL refresh automatically. This skill helps you choose DTs over the alternatives, design a clean DAG, and ship it without the common gotchas.
25+
26+
## Quick Decision
27+
28+
| Need | Use |
29+
|------|-----|
30+
| Single-table query acceleration | Materialized View |
31+
| Multi-step SQL pipeline, continuous freshness | **Dynamic Tables** |
32+
| Stream-static joins, append-only patterns | Custom Incremental DTs (PrPr) |
33+
| Cross-warehouse portability, `dbt test` | dbt models |
34+
| Procedural logic, IF/ELSE, API calls, notifications | Streams + Tasks |
35+
| Sub-15-second latency | Streams + Tasks |
36+
37+
DTs win when the work is pure SQL transforms inside Snowflake and you want self-orchestration. Reach for streams+tasks only when you hit procedural logic, side effects, or sub-15s latency.
38+
39+
## Pipeline Pattern: Bronze → Silver → Gold
40+
41+
```sql
-- Intermediate layers: TARGET_LAG = DOWNSTREAM
CREATE DYNAMIC TABLE bronze_events
  TARGET_LAG = DOWNSTREAM
  WAREHOUSE = pipeline_wh
AS
  SELECT record_content:event_id::STRING         AS event_id,
         record_content:event_type::STRING       AS event_type,
         record_content:timestamp::TIMESTAMP_NTZ AS event_ts
  FROM raw_events;

-- Leaf layer: only one with a time-based lag
CREATE DYNAMIC TABLE gold_hourly_sales
  TARGET_LAG = '5 minutes'
  WAREHOUSE = pipeline_wh
AS
  SELECT DATE_TRUNC('hour', event_ts) AS sales_hour,
         COUNT(*)                     AS order_count
  FROM bronze_events
  GROUP BY 1;
```
62+
63+
Rule: only the leaf DT gets a time-based `TARGET_LAG`; everything upstream uses `DOWNSTREAM`. Use a dedicated warehouse to isolate refresh cost from interactive queries.
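The rule above can be checked mechanically before deployment. The following is a minimal sketch in Python, not a Snowflake API: the `DynamicTable` class and `check_lag_rule` function are hypothetical names introduced here, and the pipeline description is assumed to be maintained by hand or extracted from your own definitions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DynamicTable:
    """Hypothetical description of one DT in a pipeline."""
    name: str
    target_lag: str  # "DOWNSTREAM" or a duration such as "5 minutes"
    upstream: List[str] = field(default_factory=list)


def check_lag_rule(tables):
    """Return violations of the 'only leaves get a time-based lag' rule."""
    # Any name that appears in another table's upstream list is not a leaf.
    consumed = {u for t in tables for u in t.upstream}
    problems = []
    for t in tables:
        is_leaf = t.name not in consumed
        is_downstream = t.target_lag.upper() == "DOWNSTREAM"
        if is_leaf and is_downstream:
            problems.append(f"{t.name}: leaf DT has no time-based TARGET_LAG")
        if not is_leaf and not is_downstream:
            problems.append(f"{t.name}: intermediate DT should use TARGET_LAG = DOWNSTREAM")
    return problems


pipeline = [
    DynamicTable("bronze_events", "DOWNSTREAM", upstream=["raw_events"]),
    DynamicTable("gold_hourly_sales", "5 minutes", upstream=["bronze_events"]),
]
print(check_lag_rule(pipeline))  # an empty list means the rule holds
```

A check like this could run in CI next to the SQL definitions; it only inspects the declared lags and does not query Snowflake.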
64+
65+
## Monitoring
66+
67+
```sql
SELECT name, scheduling_state, last_completed_refresh_state,
       refresh_mode, time_within_target_lag_ratio
FROM TABLE(INFORMATION_SCHEMA.DYNAMIC_TABLES())
ORDER BY time_within_target_lag_ratio ASC;

SELECT name, state, state_message, refresh_action
FROM TABLE(INFORMATION_SCHEMA.DYNAMIC_TABLE_REFRESH_HISTORY(
  NAME_PREFIX => '<db>.<schema>', ERROR_ONLY => TRUE
))
ORDER BY refresh_start_time DESC LIMIT 10;
```
79+
80+
Alert when `time_within_target_lag_ratio < 0.95` or refresh failures appear in `SNOWFLAKE.ACCOUNT_USAGE.DYNAMIC_TABLE_REFRESH_HISTORY`.
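As an illustration of the alert threshold, here is a small Python sketch that filters rows shaped like the output of the first monitoring query. The row dictionaries and the `lagging_tables` helper are assumptions made for this example; how you fetch the rows (connector, scheduled job) is left open.

```python
def lagging_tables(rows, threshold=0.95):
    """Return names of DTs whose time_within_target_lag_ratio is below threshold."""
    return [
        r["name"]
        for r in rows
        # A ratio can be missing before the first completed refresh; skip those rows.
        if r["time_within_target_lag_ratio"] is not None
        and r["time_within_target_lag_ratio"] < threshold
    ]


# Hypothetical sample rows, not real query output.
sample = [
    {"name": "bronze_events", "time_within_target_lag_ratio": 0.99},
    {"name": "gold_hourly_sales", "time_within_target_lag_ratio": 0.90},
]
print(lagging_tables(sample))  # ['gold_hourly_sales']
```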
81+
82+
## Production Checklist
83+
84+
- Explicit column lists (no `SELECT *`; columns added upstream break incremental refresh)
85+
- Change tracking enabled on base tables
86+
- Intermediates use `TARGET_LAG = DOWNSTREAM`; leaf lag ≥ all upstream lags
87+
- `refresh_mode = INCREMENTAL` confirmed (check `refresh_mode_reason` if FULL)
88+
- Dedicated refresh warehouse; `INITIALIZATION_WAREHOUSE` for big first loads
89+
- `IMMUTABLE WHERE` on partitions that never change (compliance, cost)
90+
- `PRIMARY KEY ... RELY` set so downstream DTs stay incremental
91+
- Failure alerting wired up
92+
93+
## Common Mistakes
94+
95+
- **`SELECT *` everywhere.** Schema drift forces FULL refresh. Always list columns.
96+
- **Time-based lag on every layer.** Causes redundant refreshes. Only the leaf gets a time lag; intermediates use `DOWNSTREAM`.
97+
- **Leaf lag tighter than upstream.** Snowflake can't honor it. Leaf lag must be ≥ max upstream lag.
98+
- **Forgetting change tracking.** Without it, refreshes go FULL. Enable on base tables explicitly or let Snowflake auto-enable on first DT creation.
99+
- **No `PRIMARY KEY RELY`.** Causes `INSERT OVERWRITE` reprocessing and breaks incremental-after-full chains downstream.
100+
- **`DISTINCT` over wide rows.** Triggers fanout and FULL refresh. Pre-aggregate or use `QUALIFY ROW_NUMBER()`.
101+
- **Misusing `IMMUTABLE WHERE`.** It freezes rows; if upstream rows in that range change later, results drift silently.
102+
- **Treating DTs as a streaming engine.** Minimum lag is 15s (preview). Use streams+tasks for sub-15-second pipelines.
103+
- **Calling external functions with side effects.** Not supported. Wrap with a stream+task on the leaf DT.
104+
105+
## Workflow
106+
107+
1. Use the decision table to confirm DTs fit. Stop and route otherwise.
108+
2. Map layers (Bronze/Silver/Gold), pick `TARGET_LAG` per layer, assign warehouses.
109+
3. Apply the production checklist. Verify `refresh_mode = INCREMENTAL` after first refresh.
110+
4. For stream-static joins or append-only patterns, see `references/custom-incremental.md`.
111+
5. For git-native deployment via DCM, see `references/dcm-for-dts.md`.
112+
113+
## References
114+
115+
- `references/pitfalls-and-pks.md` — full pitfalls list, `PRIMARY KEY RELY`, `IMMUTABLE WHERE`, `BACKFILL FROM`
116+
- `references/custom-incremental.md` — Custom Incremental DTs (PrPr) syntax and patterns
117+
- `references/dcm-for-dts.md` — DCM `DEFINE DYNAMIC TABLE` workflow
Lines changed: 176 additions & 0 deletions
@@ -0,0 +1,176 @@
1+
# Custom Incremental Dynamic Tables (Private Preview)
2+
3+
Custom incremental DTs let you define refresh logic using **imperative DML** (MERGE or INSERT INTO) instead of a declarative SELECT. This unlocks patterns that standard DTs can't express efficiently.
4+
5+
**When to use:** Standard DTs should always be your first choice. Use custom incremental only when:
6+
- You need **stream-static joins** (fact stream + dimension snapshot)
7+
- You need **append-only pipelines** (only process inserts, ignore updates/deletes)
8+
- You need **user-defined semantics** (audit deletes, soft-delete, running aggregates)
9+
10+
## Syntax
11+
12+
```sql
CREATE OR REPLACE DYNAMIC TABLE my_dt (
  col1 TYPE, col2 TYPE          -- explicit columns required
)
TARGET_LAG = '5 minutes'
WAREHOUSE = my_wh
REFRESH_MODE = CUSTOM_INCREMENTAL
[ BACKFILL FROM existing_table ]
REFRESH USING (
  -- MERGE INTO SELF or INSERT INTO SELF
);
```
24+
25+
Key concepts:
26+
- `SELF` references the DT being created (you cannot use the DT's name)
27+
- `CHANGES(INFORMATION => { DEFAULT | APPEND_ONLY })` consumes changes since last refresh
28+
- Tables outside `CHANGES()` are read as static snapshots at refresh time
29+
- Explicit column schema is required (no `AS SELECT` inference)
30+
31+
## Pattern: Stream-Static Join (Append-Only)
32+
33+
Enrich new events with current dimension data. Only new events are processed — dimension changes don't trigger reprocessing.
34+
35+
```sql
CREATE OR REPLACE DYNAMIC TABLE enriched_clicks (
  click_id INT, user_id INT, page_title STRING,
  section STRING, click_ts TIMESTAMP
)
TARGET_LAG = DOWNSTREAM
WAREHOUSE = my_wh
REFRESH USING (
  INSERT INTO SELF
  SELECT c.click_id, c.user_id, p.page_title, p.section, c.click_ts
  FROM clicks CHANGES(INFORMATION => APPEND_ONLY) AS c
  LEFT OUTER JOIN pages AS p ON c.page_id = p.page_id
);
```
49+
50+
## Pattern: Stream-Static Join (MERGE with Updates/Deletes)
51+
52+
When the fact table has updates and deletes, use MERGE with `ROW_NUMBER()` dedup:
53+
54+
```sql
CREATE OR REPLACE DYNAMIC TABLE enriched_inventory (
  sku_id INT, product_name STRING, category STRING,
  warehouse_name STRING, region STRING, qty_on_hand INT
)
TARGET_LAG = DOWNSTREAM
WAREHOUSE = my_wh
REFRESH USING (
  MERGE INTO SELF AS tgt
  USING (
    SELECT sku_id, product_name, category, warehouse_name, region,
           qty_on_hand, action
    FROM (
      SELECT s.sku_id, p.product_name, p.category,
             w.warehouse_name, w.region, s.qty_on_hand,
             s.METADATA$ACTION AS action,
             ROW_NUMBER() OVER (
               PARTITION BY s.sku_id
               ORDER BY CASE s.METADATA$ACTION WHEN 'INSERT' THEN 0 ELSE 1 END
             ) AS rn
      FROM stock CHANGES(INFORMATION => DEFAULT) AS s
      LEFT OUTER JOIN products AS p ON s.product_id = p.product_id
      LEFT OUTER JOIN warehouses AS w ON s.warehouse_id = w.warehouse_id
    )
    WHERE rn = 1
  ) AS src
  ON tgt.sku_id = src.sku_id
  WHEN MATCHED AND src.action = 'DELETE' THEN DELETE
  WHEN MATCHED AND src.action = 'INSERT' THEN
    UPDATE SET tgt.product_name = src.product_name,
               tgt.category = src.category,
               tgt.warehouse_name = src.warehouse_name,
               tgt.region = src.region,
               tgt.qty_on_hand = src.qty_on_hand
  WHEN NOT MATCHED AND src.action = 'INSERT' THEN
    INSERT (sku_id, product_name, category, warehouse_name, region, qty_on_hand)
    VALUES (src.sku_id, src.product_name, src.category, src.warehouse_name,
            src.region, src.qty_on_hand)
);
```
94+
95+
## Example: Stream-Static Join End-to-End
96+
97+
A complete walkthrough showing how a stream-static join works in practice. Scenario: an IoT pipeline where sensor readings (high-volume, append-only) are enriched with device metadata (low-volume, rarely changes).
98+
99+
```sql
-- 1. Setup: fact table (append-only sensor readings) + dimension table (device registry)
CREATE TABLE sensor_readings (
  reading_id INT AUTOINCREMENT,
  device_id INT,
  temperature FLOAT,
  humidity FLOAT,
  reading_ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP()
);
ALTER TABLE sensor_readings SET CHANGE_TRACKING = TRUE;

CREATE TABLE devices (
  device_id INT PRIMARY KEY,
  device_name STRING,
  location STRING,
  floor INT
);

-- 2. Custom incremental DT: enrich readings with device info
--    - sensor_readings is the STREAM side (CHANGES(INFORMATION => APPEND_ONLY))
--    - devices is the STATIC side (read in full at each refresh, changes ignored)
CREATE OR REPLACE DYNAMIC TABLE enriched_readings (
  reading_id INT,
  device_id INT,
  device_name STRING,
  location STRING,
  floor INT,
  temperature FLOAT,
  humidity FLOAT,
  reading_ts TIMESTAMP
)
TARGET_LAG = '1 minute'
WAREHOUSE = iot_wh
REFRESH USING (
  INSERT INTO SELF
  SELECT
    r.reading_id, r.device_id,
    d.device_name, d.location, d.floor,
    r.temperature, r.humidity, r.reading_ts
  FROM sensor_readings CHANGES(INFORMATION => APPEND_ONLY) AS r
  LEFT OUTER JOIN devices AS d ON r.device_id = d.device_id
);
```
142+
143+
**What happens at each refresh:**
144+
1. `CHANGES(INFORMATION => APPEND_ONLY)` returns only new sensor readings since the last refresh
145+
2. Each new reading is joined to the **current** device metadata (static snapshot)
146+
3. Results are appended to the DT — previously enriched rows are never touched
147+
4. If a device name changes in `devices`, old readings keep the old name — only new readings pick up the update
148+
149+
**Why this matters:** A standard DT would reprocess ALL readings whenever a device name changes (since it depends on `devices`). The custom incremental version only processes new readings, making it orders of magnitude cheaper for high-volume fact tables with slowly-changing dimensions.
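The refresh semantics described above can be simulated in a few lines of Python. This is only a toy model under stated assumptions: `refresh` is a hypothetical stand-in for one custom incremental refresh, `devices` is an in-memory dict playing the static side, and each call receives just the readings that are new since the previous call.

```python
def refresh(enriched, new_readings, devices):
    """Append new readings joined to the current device snapshot (LEFT OUTER JOIN)."""
    for r in new_readings:
        d = devices.get(r["device_id"], {})  # a missing device yields NULL-like None
        enriched.append({**r, "device_name": d.get("device_name")})
    return enriched


devices = {1: {"device_name": "boiler-a"}}
enriched = []

refresh(enriched, [{"reading_id": 1, "device_id": 1}], devices)  # first refresh
devices[1]["device_name"] = "boiler-b"                            # dimension changes
refresh(enriched, [{"reading_id": 2, "device_id": 1}], devices)  # second refresh

print([row["device_name"] for row in enriched])  # ['boiler-a', 'boiler-b']
```

The first reading keeps the name that was current when it was enriched, which is the trade-off this pattern accepts in exchange for never reprocessing old rows.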
150+
151+
---
152+
153+
## Pattern: Audit Deletes Log
154+
155+
Append-only log of every deletion from a source table:
156+
157+
```sql
CREATE OR REPLACE DYNAMIC TABLE deletions_log (id INT, name STRING, email STRING)
TARGET_LAG = DOWNSTREAM
WAREHOUSE = my_wh
INITIALIZE = ON_SCHEDULE
REFRESH USING (
  INSERT INTO SELF
  -- Explicit columns: SELECT * would also carry METADATA$ROW_ID into a 3-column DT
  SELECT id, name, email
  FROM users CHANGES(INFORMATION => DEFAULT)
  WHERE NOT METADATA$ISUPDATE AND METADATA$ACTION = 'DELETE'
);
```
169+
170+
## Limitations (PrPr)
171+
172+
- No cloning or replication
173+
- No DCM/dbt integration yet
174+
- No data governance policies on custom incremental DTs
175+
- No CREATE OR ALTER — must use CREATE OR REPLACE
176+
- Correctness is the user's responsibility (not delayed-view semantics)
Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
1+
# DCM for Dynamic Tables
2+
3+
DCM (Database Change Management) provides **git-native infrastructure-as-code** for DT pipelines. Define your DTs declaratively, version them in git, and deploy with `snow dcm plan` followed by `snow dcm deploy`.
4+
5+
## Why DCM for DTs
6+
7+
- **Version controlled** — DT definitions live in git alongside your other infrastructure
8+
- **Repeatable deployments** — same definitions deploy to dev/staging/prod via templating
9+
- **Schema evolution** — change a DT definition, redeploy, DCM handles the diff
10+
- **Full pipeline IaC** — database, schema, warehouses, tables, DTs, roles, and grants in one project
11+
12+
## DCM DT Syntax (DEFINE)
13+
14+
```sql
DEFINE DYNAMIC TABLE {{ database }}.{{ schema }}.BRONZE_EVENTS
  TARGET_LAG = DOWNSTREAM
  WAREHOUSE = {{ database }}_DT_WH
AS
  SELECT
    record_content:event_id::STRING         AS event_id,
    record_content:event_type::STRING       AS event_type,
    record_content:user_id::STRING          AS user_id,
    record_content:timestamp::TIMESTAMP_NTZ AS event_ts,
    record_content:payload                  AS payload
  FROM {{ database }}.{{ schema }}.RAW_EVENTS_TOPIC;

DEFINE DYNAMIC TABLE {{ database }}.{{ schema }}.GOLD_HOURLY_SALES
  TARGET_LAG = '5 minutes'
  WAREHOUSE = {{ database }}_DT_WH
AS
  SELECT
    DATE_TRUNC('hour', event_ts) AS sales_hour,
    category,
    COUNT(DISTINCT event_id)     AS order_count,
    SUM(line_total)              AS revenue
  FROM {{ database }}.{{ schema }}.SILVER_PURCHASES
  GROUP BY 1, 2;
```
39+
40+
## DCM Manifest (manifest.yml)
41+
42+
```yaml
manifest_version: 2
type: DCM_PROJECT
default_target: 'DEV'
targets:
  DEV:
    project_name: '{{DATABASE}}.{{SCHEMA}}.MY_PROJECT'
    project_owner: SYSADMIN
    templating_config: 'DEV'
templating:
  defaults:
    database: 'MY_DB'
    schema: 'PUBLIC'
  configurations:
    DEV:
      database: 'MY_DB_DEV'
    PROD:
      database: 'MY_DB_PROD'
```
61+
62+
## DCM Workflow
63+
64+
```bash
snow dcm raw-analyze dcm/ -c <connection>
snow dcm plan dcm/ -c <connection> --save-output
snow dcm deploy dcm/ -c <connection> --alias "v1-initial"
```
69+
70+
**Tip:** Put all DT definitions in a single `dynamic_tables.sql` file within `dcm/definitions/`. DCM processes all `.sql` files in that directory. Use Jinja templating (`{{ database }}`, `{{ schema }}`) for environment portability.
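To see how the templated names resolve per environment, the Python sketch below substitutes `{{ name }}` placeholders with a small regular expression. It is a stand-in written for illustration, not DCM's renderer or a full Jinja implementation, and the assumption that a target's configuration overlays the manifest defaults is inferred from the manifest layout rather than confirmed here.

```python
import re


def render(template, variables):
    """Replace {{ name }} placeholders; raises KeyError on an undefined variable."""
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lambda m: variables[m.group(1)], template)


# Values copied from the manifest example.
defaults = {"database": "MY_DB", "schema": "PUBLIC"}
configs = {"DEV": {"database": "MY_DB_DEV"}, "PROD": {"database": "MY_DB_PROD"}}


def variables_for(target):
    """Overlay a target's configuration on the defaults (assumed precedence)."""
    return {**defaults, **configs[target]}


template = "DEFINE DYNAMIC TABLE {{ database }}.{{ schema }}.BRONZE_EVENTS"
print(render(template, variables_for("PROD")))
# DEFINE DYNAMIC TABLE MY_DB_PROD.PUBLIC.BRONZE_EVENTS
```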
