
feat: microbatch strategy #404


Open
canbekley wants to merge 15 commits into main from feat/microbatch-strategy

Conversation

@canbekley (Contributor) commented Jan 6, 2025

Adds support for the incremental "microbatch" strategy (#398) that comes with dbt-core 1.9.0 (#403).
Currently this only works for non-distributed incremental models, as dbt-core has hard-coded config.materialized = 'incremental' into its microbatching condition. Maybe this can be patched by the adapter, though?
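For illustration, a minimal sketch of what a microbatch model could look like with this adapter. The model name, column names, begin date, and source ref are made up; `event_time`, `begin`, and `batch_size` are the standard dbt-core 1.9 microbatch configs, and `unique_key` is what this implementation requires:

```sql
-- models/events_daily.sql (hypothetical model)
-- event_time slices the data into batches, begin sets the earliest batch
-- to (back)fill, and batch_size picks the window; unique_key is required
-- by this adapter since each batch runs as a delete+insert.
{{
    config(
        materialized='incremental',
        incremental_strategy='microbatch',
        event_time='event_time',
        begin='2025-01-01',
        batch_size='day',
        unique_key='event_id'
    )
}}

select event_id, event_time, payload
from {{ ref('raw_events') }}
```

With microbatch, dbt filters upstream refs that declare an `event_time` to the current batch window automatically, so the model needs no `is_incremental()` block.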

Checklist

  • Unit and integration tests covering the common scenarios were added
  • A human-readable description of the changes was provided to include in CHANGELOG

@canbekley (Contributor, Author) commented:

@BentsiLeviav May I ask for a review on this? It would be an important addition to our pipelines.

@mshustov mshustov requested a review from BentsiLeviav January 20, 2025 16:51
@BentsiLeviav (Contributor) commented:

Hi @canbekley

I'll review this in the next few days.
This feature depends on #403, so feel free to answer there once you have time.
In addition, could you please add an indication of this feature to the README.md file? An explanation of how it was implemented would be appreciated.

@pheepa (Contributor) commented Feb 11, 2025

Hi, that's a great feature - thanks for implementing it!

I don't quite understand the logic behind requiring partition_by for this method if microbatching is handled via a series of delete_insert strategy operations.

Wouldn't it make more sense to either remove this restriction or implement it using insert_overwrite, similar to how it's done in dbt-bigquery? Reference.

In Snowflake, delete_insert is used without requiring partition_by: Reference.

Since ClickHouse struggles when handling a large number of complex mutations (in my experience), I assume it would be better to go with insert_overwrite.
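For comparison, a rough sketch of what one batch could look like under an insert_overwrite-style approach in ClickHouse. Table and column names are invented, and this is not the adapter's actual generated SQL:

```sql
-- insert_overwrite sketch: stage the batch, then swap the matching
-- partition. Requires the target to be partitioned along the batch
-- boundary (here: partitioned by toDate(event_time)).
CREATE TABLE events__staging AS events;

INSERT INTO events__staging
SELECT event_id, event_time, payload
FROM raw_events
WHERE toDate(event_time) = toDate('2025-02-10');

ALTER TABLE events REPLACE PARTITION '2025-02-10' FROM events__staging;
DROP TABLE events__staging;
```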

@canbekley canbekley force-pushed the feat/microbatch-strategy branch from 4f99c6e to 4041b02 Compare February 18, 2025 13:58
@canbekley (Contributor, Author) commented:

@pheepa I don't see where the microbatch strategy requires partition_by. You can also check the integration test model, which doesn't have a partition_by configuration but a required unique_key instead. I have refactored some of the existing incremental strategy validations by moving them to an adapter method, including some of the insert_overwrite validations. Maybe that part was confusing?

I opted for delete_insert here because I think it is more flexible: it doesn't require a partitioned table, and we can have arbitrary batch sizes independent of existing partition keys.
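Concretely, each microbatch under the delete_insert approach boils down to something like the following per batch window (again with invented names, and daily batches assumed):

```sql
-- delete+insert sketch: replace one batch window on any table,
-- partitioned or not, for an arbitrary batch size.
ALTER TABLE events DELETE
WHERE event_time >= toDateTime('2025-02-10 00:00:00')
  AND event_time <  toDateTime('2025-02-11 00:00:00');

INSERT INTO events
SELECT event_id, event_time, payload
FROM raw_events
WHERE event_time >= toDateTime('2025-02-10 00:00:00')
  AND event_time <  toDateTime('2025-02-11 00:00:00');
```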

@pheepa (Contributor) commented Feb 19, 2025

@canbekley I'm sorry, the refactoring of the incremental strategy validation confused me a little bit (though as an adapter method it makes more sense). It looks good to me now.
Let's see how delete_insert goes.

@canbekley (Contributor, Author) commented:

Hi @BentsiLeviav, I've added documentation and fixed an issue with older Python versions. Could you have a second look?

@canbekley (Contributor, Author) commented Mar 5, 2025

@BentsiLeviav Do you maybe have an idea how to circumvent this check in MicrobatchModelRunner._is_incremental(), which currently prevents distributed_incremental models from being run in microbatches?

@canbekley canbekley force-pushed the feat/microbatch-strategy branch from 69bc361 to 91fc08e Compare March 14, 2025 14:48
@maowerner commented:

Hi, is there any update on this? I believe this feature is waiting for review :)
