Commit 69bc361

docs: added microbatch documentation and references
1 parent 06769eb commit 69bc361

1 file changed: +26 -1 lines changed

Diff for: README.md

@@ -31,6 +31,7 @@ pip install dbt-clickhouse
 - [x] Table materialization
 - [x] View materialization
 - [x] Incremental materialization
+- [x] Microbatch incremental materialization
 - [x] Materialized View materializations (uses the `TO` form of MATERIALIZED VIEW, experimental)
 - [x] Seeds
 - [x] Sources
@@ -114,13 +115,23 @@ your_profile_name:
114115
| primary_key | Like order_by, a ClickHouse primary key expression. If not specified, ClickHouse will use the order by expression as the primary key | |
115116
| unique_key | A tuple of column names that uniquely identify rows. Used with incremental models for updates. | |
116117
| inserts_only | If set to True for an incremental model, incremental updates will be inserted directly to the target table without creating intermediate table. It has been deprecated in favor of the `append` incremental `strategy`, which operates in the same way. If `inserts_only` is set, `incremental_strategy` is ignored. | |
117-
| incremental_strategy | Incremental model update strategy: `delete+insert`, `append`, or `insert_overwrite`. See the following Incremental Model Strategies | `default` |
118+
| incremental_strategy | Incremental model update strategy: `delete+insert`, `append`, `insert_overwrite`, or `microbatch`. See the following Incremental Model Strategies | `default` |
118119
| incremental_predicates | Additional conditions to be applied to the incremental materialization (only applied to `delete+insert` strategy | |
119120
| settings | A map/dictionary of "TABLE" settings to be used to DDL statements like 'CREATE TABLE' with this model | |
120121
| query_settings | A map/dictionary of ClickHouse user level settings to be used with `INSERT` or `DELETE` statements in conjunction with this model | |
121122
| ttl | A TTL expression to be used with the table. The TTL expression is a string that can be used to specify the TTL for the table. | |
122123
| indexes | A list of indexes to create, available only for `table` materialization. For examples look at ([#397](https://github.com/ClickHouse/dbt-clickhouse/pull/397)) | |
123124

125+
## Microbatch Configuration
126+
127+
| Option | Description | Default if any |
128+
|--------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------|
129+
| event_time | The column indicating "at what time did the row occur." Required for your microbatch model and any direct parents that should be filtered. | |
130+
| begin | The "beginning of time" for the microbatch model. This is the starting point for any initial or full-refresh builds. For example, a daily-grain microbatch model run on 2024-10-01 with begin = '2023-10-01 will process 366 batches (it's a leap year!) plus the batch for "today." | |
131+
| batch_size | The granularity of your batches. Supported values are `hour`, `day`, `month`, and `year` | |
132+
| lookback | Process X batches prior to the latest bookmark to capture late-arriving records. | 1 |
133+
| concurrent_batches | Overrides dbt's auto detect for running batches concurrently (at the same time). Read more about [configuring concurrent batches](https://docs.getdbt.com/docs/build/incremental-microbatch#configure-concurrent_batches). Setting to true runs batches concurrently (in parallel). false runs batches sequentially (one after the other). | |
134+
124135
## Column Configuration
125136

126137
| Option | Description | Default if any |
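The batch arithmetic in the `begin` row above can be checked with a short Python sketch. It assumes a daily `batch_size` and half-open `[start, end)` windows; `daily_batches` is a hypothetical helper written for this illustration, not part of dbt or dbt-clickhouse.

```python
from datetime import date, timedelta


def daily_batches(begin: date, run_date: date) -> list[tuple[date, date]]:
    """Return half-open [start, end) daily windows from `begin` through `run_date`.

    Hypothetical helper for illustration only; dbt computes its batches internally.
    """
    batches = []
    start = begin
    while start <= run_date:
        batches.append((start, start + timedelta(days=1)))
        start += timedelta(days=1)
    return batches


# The README example: begin = '2023-10-01', run on 2024-10-01.
batches = daily_batches(date(2023, 10, 1), date(2024, 10, 1))
historical = len(batches) - 1  # every batch before "today"
print(historical, len(batches))  # 366 historical batches, 367 including today
```

The range covers 29 February 2024, which is why there are 366 historical batches and not 365.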
@@ -221,6 +232,20 @@ caveats to using this strategy:
 incremental predicates should only include sub-queries on data that will not be modified during the incremental
 materialization.
 
+### The Microbatch Strategy (Requires dbt-core >= 1.9)
+
+The incremental strategy `microbatch` has been a dbt-core feature since version 1.9, designed to handle large
+time-series data transformations efficiently. In dbt-clickhouse, it builds on top of the existing `delete+insert`
+incremental strategy by splitting the increment into predefined time-series batches based on the `event_time` and
+`batch_size` model configurations.
+
+Beyond handling large transformations, microbatch provides the ability to:
+- [Reprocess failed batches](https://docs.getdbt.com/docs/build/incremental-microbatch#retry).
+- Auto-detect [parallel batch execution](https://docs.getdbt.com/docs/build/parallel-batch-execution).
+- Eliminate the need for complex conditional logic in [backfilling](https://docs.getdbt.com/docs/build/incremental-microbatch#backfills).
+
+For detailed microbatch usage, refer to the [official documentation](https://docs.getdbt.com/docs/build/incremental-microbatch).
+
 ### The Append Strategy
 
 This strategy replaces the `inserts_only` setting in previous versions of dbt-clickhouse. This approach simply appends
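To make the batch splitting in the Microbatch Strategy section concrete, here is a minimal Python sketch of which daily windows an incremental run would cover with `lookback = 1`, and the kind of `event_time` range filter each window implies. It assumes a daily `batch_size` and that `lookback` adds that many prior batches to the current one. The helper names and the generated filter text are hypothetical and do not reproduce the SQL that dbt or dbt-clickhouse actually emits.

```python
from datetime import datetime, timedelta


def incremental_batch_starts(run_time: datetime, lookback: int = 1) -> list[datetime]:
    """Start times of the daily batches covered by one incremental run.

    Assumption: `lookback` prior batches are reprocessed along with the current one.
    """
    current = run_time.replace(hour=0, minute=0, second=0, microsecond=0)
    return [current - timedelta(days=n) for n in range(lookback, -1, -1)]


def batch_filter(start: datetime, column: str = "event_time") -> str:
    """Illustrative half-open range predicate for one daily batch (hypothetical)."""
    end = start + timedelta(days=1)
    return (
        f"{column} >= '{start:%Y-%m-%d %H:%M:%S}' "
        f"AND {column} < '{end:%Y-%m-%d %H:%M:%S}'"
    )


starts = incremental_batch_starts(datetime(2024, 10, 1, 14, 30), lookback=1)
filters = [batch_filter(s) for s in starts]
for f in filters:
    print(f)
```

Because each batch owns a non-overlapping time range, rerunning one batch only needs to touch the rows inside that range, which is what makes retries and backfills independent of one another.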
