Commit a4a17fa

update warnings and docs
1 parent 2b81e68 commit a4a17fa

3 files changed

+41
-10
lines changed

docs/concepts/models/model_kinds.md

Lines changed: 40 additions & 8 deletions
@@ -935,7 +935,13 @@ SQLMesh achieves this by adding a `valid_from` and `valid_to` column to your mod
935935

936936
Therefore, you can use these models to not only tell you what the latest value is for a given record but also what the values were anytime in the past. Note that maintaining this history does come at a cost of increased storage and compute and this may not be a good fit for sources that change frequently since the history could get very large.
937937

938-
**Note**: Partial data [restatement](../plans.md#restatement-plans) is not supported for this model kind, which means that the entire table will be recreated from scratch if restated. This may lead to data loss, so data restatement is disabled for models of this kind by default.
938+
**Note**: SCD Type 2 models support [restatements](../plans.md#restatement-plans) with specific limitations:
939+
940+
- **Full restatements**: The entire table will be recreated from scratch when no start date is specified
941+
- **Partial restatements**: You can specify a start date to restate data from a certain point onwards to the latest interval. The end date will always be set to the latest interval's end date, regardless of what end date you specify
942+
- **Partial sections**: Restatements of specific sections (discontinued ranges) of the table are not supported
943+
944+
Data restatement is disabled for models of this kind by default (`disable_restatement true`). To enable restatements, set `disable_restatement false` in your model configuration.
939945

940946
There are two ways to track changes: By Time (Recommended) or By Column.
941947

@@ -1283,11 +1289,11 @@ This is the most accurate representation of the menu based on the source data pr
12831289

12841290
### Processing Source Table with Historical Data
12851291

1286-
The most common case for SCD Type 2 is creating history for a table that doesn't already have it.
1292+
The most common case for SCD Type 2 is creating history for a table that doesn't already have it.
12871293
In the example of the restaurant menu, the menu just tells you what is offered right now, but you want to know what was offered over time.
12881294
In this case, the default setting of `None` for `batch_size` is the best option.
12891295

1290-
Another use case though is processing a source table that already has history in it.
1296+
Another use case though is processing a source table that already has history in it.
12911297
A common example of this is a "daily snapshot" table that is created by a source system that takes a snapshot of the data at the end of each day.
12921298
If your source table has historical records, like a "daily snapshot" table, then set `batch_size` to `1` to process each interval (each day if a `@daily` cron) in sequential order.
12931299
That way the historical records will be properly captured in the SCD Type 2 table.
@@ -1433,11 +1439,14 @@ GROUP BY
14331439
id
14341440
```
14351441

1436-
### Reset SCD Type 2 Model (clearing history)
1442+
### SCD Type 2 Restatements
14371443

14381444
SCD Type 2 models are designed by default to protect the data that has been captured because it is not possible to recreate the history once it has been lost.
14391445
However, there are cases where you may want to clear the history and start fresh.
1440-
For this use use case you will want to start by setting `disable_restatement` to `false` in the model definition.
1446+
1447+
#### Enabling Restatements
1448+
1449+
To enable restatements for an SCD Type 2 model, set `disable_restatement` to `false` in the model definition:
14411450

14421451
```sql linenums="1" hl_lines="5"
14431452
MODEL (
@@ -1449,16 +1458,39 @@ MODEL (
14491458
);
14501459
```
14511460

1452-
Plan/apply this change to production.
1453-
Then you will want to [restate the model](../plans.md#restatement-plans).
1461+
#### Full Restatements (Clearing All History)
1462+
1463+
To clear all history and recreate the entire table from scratch:
14541464

14551465
```bash
14561466
sqlmesh plan --restate-model db.menu_items
14571467
```
14581468

14591469
!!! warning
14601470

1461-
This will remove the historical data on the model which in most situations cannot be recovered.
1471+
This will remove **all** historical data on the model which in most situations cannot be recovered.
1472+
1473+
#### Partial Restatements (From a Specific Date)
1474+
1475+
You can restate data from a specific start date onwards. This will:
1476+
- Delete all records with `valid_from >= start_date`
1477+
- Reprocess the data from the start date to the latest interval
1478+
1479+
```bash
1480+
sqlmesh plan --restate-model db.menu_items --start "2023-01-15"
1481+
```
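
As a rough illustration of the deletion step described above, the following Python sketch filters a hypothetical in-memory list of rows. It is not SQLMesh code: the `rows` data and the `rows_kept_after_partial_restatement` helper are assumptions made for this example, and it models only the `valid_from >= start_date` deletion, not the reprocessing that follows.

```python
from datetime import datetime

# Hypothetical SCD Type 2 rows for one menu item; in practice these live in the warehouse table.
rows = [
    {"id": 1, "price": 5.0, "valid_from": datetime(2023, 1, 1), "valid_to": datetime(2023, 1, 10)},
    {"id": 1, "price": 6.0, "valid_from": datetime(2023, 1, 10), "valid_to": datetime(2023, 1, 20)},
    {"id": 1, "price": 7.0, "valid_from": datetime(2023, 1, 20), "valid_to": None},
]


def rows_kept_after_partial_restatement(rows, start_date):
    """Keep only rows that began before the restatement start date.

    Illustrative helper (an assumption, not a SQLMesh function): rows with
    valid_from >= start_date are the ones the docs say get deleted.
    """
    return [row for row in rows if row["valid_from"] < start_date]


# Mirrors `--start "2023-01-15"` from the command above.
kept = rows_kept_after_partial_restatement(rows, datetime(2023, 1, 15))
```

Here the row starting on 2023-01-20 is dropped, while the two rows that started earlier are retained; the reprocessing step would then rebuild history from the start date onwards.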
1482+
1483+
!!! note
1484+
1485+
If you specify an end date for SCD Type 2 restatements, it will be ignored and automatically set to the latest interval's end date.
1486+
1487+
```bash
1488+
# This end date will be ignored and set to the latest interval
1489+
sqlmesh plan --restate-model db.menu_items --start "2023-01-15" --end "2023-01-20"
1490+
```
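
The end-date rule can be sketched in a few lines of Python. The helper `effective_restatement_interval` below is hypothetical and not part of SQLMesh; it only mirrors the rule stated in this note and the warning text in `sqlmesh/core/snapshot/definition.py`, and it assumes the latest interval's end is already known.

```python
from datetime import datetime


def effective_restatement_interval(requested_start, requested_end, latest_end):
    """Return (start, end, end_was_overridden) for an SCD Type 2 restatement.

    Illustrative only: the end is always the latest interval's end, and the
    flag reports whether a different end date was requested (the case in
    which a warning would be logged).
    """
    end_was_overridden = requested_end is not None and requested_end != latest_end
    return requested_start, latest_end, end_was_overridden


# Mirrors `--start "2023-01-15" --end "2023-01-20"`; the latest interval end is an assumed value.
start, end, overridden = effective_restatement_interval(
    requested_start=datetime(2023, 1, 15),
    requested_end=datetime(2023, 1, 20),
    latest_end=datetime(2023, 3, 1),
)
```

With these assumed inputs, the requested end of 2023-01-20 is discarded in favour of the latest interval end, and `overridden` signals that the user's value was not honoured.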
1491+
1492+
1493+
#### Re-enabling Protection
14621494

14631495
Once complete, remove `disable_restatement` from the model definition, which resets it to the default of `true` and prevents accidental data loss.
14641496

sqlmesh/core/plan/builder.py

Lines changed: 1 addition & 1 deletion
@@ -405,7 +405,7 @@ def _build_restatements(
405405
elif (not self._is_dev or not snapshot.is_paused) and snapshot.disable_restatement:
406406
self._console.log_warning(
407407
f"Cannot restate model '{snapshot.name}'. "
408-
"Restatement is disabled for this model to prevent possible data loss."
408+
"Restatement is disabled for this model to prevent possible data loss. "
409409
"If you want to restate this model, change the model's `disable_restatement` setting to `false`."
410410
)
411411
continue

sqlmesh/core/snapshot/definition.py

Lines changed: 0 additions & 1 deletion
@@ -807,7 +807,6 @@ def get_removal_interval(
807807
get_console().log_warning(
808808
f"SCD Type 2 model '{self.model.name}' does not support end date in restatements.\n"
809809
f"Requested end date [{to_ts(requested_end)}] doesn't match latest interval end date [{to_ts(latest_end)}].\n"
810-
f"You can set start date but end date must be the latest interval so ."
811810
)
812811

813812
removal_interval = self.inclusive_exclusive(requested_start, latest_end, strict)
