website/docs/engine-flink/delta-joins.md: 11 additions & 8 deletions (+11 -8)
@@ -158,9 +158,9 @@ There is a known issue ([FLINK-38399](https://issues.apache.org/jira/browse/FLIN
 
 - The primary key or the prefix key of the tables must be included as part of the equivalence conditions in the join.
 - The join must be a INNER join.
-- The downstream nodes of the join can accept duplicate changes, such as a sink that provides UPSERT mode without `upsertMaterialize`.
--When the pk of the sink does not align with (or does not include) the upstream upsert key, the sink will produce a sink materialization (called `upsertMaterialize`).
--About upsert key and `upsertMaterialize`, more details can be found in this [blog](https://www.ververica.com/blog/flink-sql-secrets-mastering-the-art-of-changelog-events).
+- The downstream node of the join must support idempotent updates, typically it's an upsert sink and should not have a `SinkUpsertMaterializer` node before it.
+-Flink planner automatically inserts a `SinkUpsertMaterializer` when the sink’s primary key does not fully cover the upstream update key.
+-You can learn more details about `SinkUpsertMaterializer` by reading this [blog](https://www.ververica.com/blog/flink-sql-secrets-mastering-the-art-of-changelog-events).
 - All join inputs should be INSERT-ONLY streams.
 - This is why the option `'table.merge-engine' = 'first_row'` is added to the source table DDL.
 - All upstream nodes of the join should be `TableSourceScan` or `Exchange`.
@@ -179,11 +179,14 @@ There is a known issue ([FLINK-38399](https://issues.apache.org/jira/browse/FLIN
 
 - The primary key or the prefix key of the tables must be included as part of the equivalence conditions in the join.
 - The join must be a INNER join.
-- The downstream nodes of the join can accept duplicate changes, such as a sink that provides UPSERT mode without `upsertMaterialize`.
-- When the pk of the sink does not align with (or does not include) the upstream upsert key, the sink will produce a sink materialization (called `upsertMaterialize`).
-- About upsert key and `upsertMaterialize`, more details can be found in this [blog](https://www.ververica.com/blog/flink-sql-secrets-mastering-the-art-of-changelog-events).
-- When consuming a CDC stream, the join key used in the delta join must be part of the primary key.
-- All filters must be applied on the upsert key, and neither filters nor projections should contain non-deterministic functions.
+- The downstream node of the join must support idempotent updates, typically it's an upsert sink and should not have a `SinkUpsertMaterializer` node before it.
+- Flink planner automatically inserts a `SinkUpsertMaterializer` when the sink’s primary key does not fully cover the upstream update key.
+- You can learn more details about `SinkUpsertMaterializer` by reading this [blog](https://www.ververica.com/blog/flink-sql-secrets-mastering-the-art-of-changelog-events).
+- Since delta join does not support to handle update-before messages, it is necessary to ensure that the entire pipeline can safely discard update-before messages. That means when consuming a CDC stream:
+- The join key used in the delta join must be part of the primary key.
+- The sink's primary key must be the same as the upstream update key.
+- All filters must be applied on the upsert key.
+- Neither filters nor projections should contain non-deterministic functions.
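
The key-related conditions in the added lines can be read as simple set relations between column lists. The sketch below is only an illustration of the rules as worded in this diff, not Flink planner code: the function names and the example columns (`city_id`, `order_id`) are hypothetical, and the handling of an empty upsert key is an assumption that the diff does not state.

```python
from typing import Iterable


def needs_sink_upsert_materializer(sink_pk: Iterable[str],
                                   upstream_upsert_key: Iterable[str]) -> bool:
    """Model of the rule in the diff: a materializer is inserted when the
    sink's primary key does not fully cover the upstream upsert key."""
    sink = set(sink_pk)
    upsert = set(upstream_upsert_key)
    if not upsert:
        # Assumption (not stated in the diff): with no upstream upsert key
        # there is nothing the sink key can cover, so treat it as "needed".
        return True
    return not upsert.issubset(sink)


def join_key_within_primary_key(join_key: Iterable[str],
                                primary_key: Iterable[str]) -> bool:
    """CDC condition: the delta join key must be part of the primary key."""
    return set(join_key).issubset(set(primary_key))


def cdc_sink_key_matches(sink_pk: Iterable[str],
                         upstream_upsert_key: Iterable[str]) -> bool:
    """CDC condition: the sink's primary key must be the same as the
    upstream upsert key (compared here as unordered column sets)."""
    return set(sink_pk) == set(upstream_upsert_key)


if __name__ == "__main__":
    # The sink key covers the upstream key, so no materializer is expected.
    print(needs_sink_upsert_materializer(["city_id", "order_id"], ["order_id"]))
    # The sink key misses order_id, so a materializer would be inserted.
    print(needs_sink_upsert_materializer(["city_id"], ["city_id", "order_id"]))
```

Note that the CDC condition is stricter than the general one: covering the upstream key is enough to avoid a `SinkUpsertMaterializer`, while the CDC rule in the diff asks for the two keys to be the same.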