Skip to content

Commit 6033e1c

Browse files
committed
improve
1 parent 89db61f commit 6033e1c

File tree

1 file changed

+11
-8
lines changed

1 file changed

+11
-8
lines changed

website/docs/engine-flink/delta-joins.md

Lines changed: 11 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -158,9 +158,9 @@ There is a known issue ([FLINK-38399](https://issues.apache.org/jira/browse/FLIN
158158

159159
- The primary key or the prefix key of the tables must be included as part of the equivalence conditions in the join.
160160
- The join must be a INNER join.
161-
- The downstream nodes of the join can accept duplicate changes, such as a sink that provides UPSERT mode without `upsertMaterialize`.
162-
- When the pk of the sink does not align with (or does not include) the upstream upsert key, the sink will produce a sink materialization (called `upsertMaterialize`).
163-
- About upsert key and `upsertMaterialize`, more details can be found in this [blog](https://www.ververica.com/blog/flink-sql-secrets-mastering-the-art-of-changelog-events).
161+
- The downstream node of the join must support idempotent updates, typically it's an upsert sink and should not have a `SinkUpsertMaterializer` node before it.
162+
- Flink planner automatically inserts a `SinkUpsertMaterializer` when the sink’s primary key does not fully cover the upstream update key.
163+
- You can learn more details about `SinkUpsertMaterializer` by reading this [blog](https://www.ververica.com/blog/flink-sql-secrets-mastering-the-art-of-changelog-events).
164164
- All join inputs should be INSERT-ONLY streams.
165165
- This is why the option `'table.merge-engine' = 'first_row'` is added to the source table DDL.
166166
- All upstream nodes of the join should be `TableSourceScan` or `Exchange`.
@@ -179,11 +179,14 @@ There is a known issue ([FLINK-38399](https://issues.apache.org/jira/browse/FLIN
179179

180180
- The primary key or the prefix key of the tables must be included as part of the equivalence conditions in the join.
181181
- The join must be a INNER join.
182-
- The downstream nodes of the join can accept duplicate changes, such as a sink that provides UPSERT mode without `upsertMaterialize`.
183-
- When the pk of the sink does not align with (or does not include) the upstream upsert key, the sink will produce a sink materialization (called `upsertMaterialize`).
184-
- About upsert key and `upsertMaterialize`, more details can be found in this [blog](https://www.ververica.com/blog/flink-sql-secrets-mastering-the-art-of-changelog-events).
185-
- When consuming a CDC stream, the join key used in the delta join must be part of the primary key.
186-
- All filters must be applied on the upsert key, and neither filters nor projections should contain non-deterministic functions.
182+
- The downstream node of the join must support idempotent updates, typically it's an upsert sink and should not have a `SinkUpsertMaterializer` node before it.
183+
- Flink planner automatically inserts a `SinkUpsertMaterializer` when the sink’s primary key does not fully cover the upstream update key.
184+
- You can learn more details about `SinkUpsertMaterializer` by reading this [blog](https://www.ververica.com/blog/flink-sql-secrets-mastering-the-art-of-changelog-events).
185+
- Since delta join does not support to handle update-before messages, it is necessary to ensure that the entire pipeline can safely discard update-before messages. That means when consuming a CDC stream:
186+
- The join key used in the delta join must be part of the primary key.
187+
- The sink's primary key must be the same as the upstream update key.
188+
- All filters must be applied on the upsert key.
189+
- Neither filters nor projections should contain non-deterministic functions.
187190

188191
## Future Plan
189192

0 commit comments

Comments
 (0)