Skip to content

Commit 752c551

Browse files
committed
[docs] Clarify the supported flink versions for delta join (#2000)
(cherry picked from commit 4f00709)
1 parent c6d7d8c commit 752c551

File tree

1 file changed

+16
-6
lines changed

1 file changed

+16
-6
lines changed

website/docs/engine-flink/delta-joins.md

Lines changed: 16 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -144,6 +144,9 @@ The delta join feature is introduced since Flink 2.1 and still evolving, and its
144144

145145
Refer to the [Delta Join Issue](https://issues.apache.org/jira/browse/FLINK-37836) for the most up-to-date information.
146146

147+
:::warning
148+
There is a known issue ([FLINK-38399](https://issues.apache.org/jira/browse/FLINK-38399)) in Flink ≤ 2.1.1 that prevents certain queries from being translated into delta joins. This has been fixed in Flink 2.1.2 and Flink 2.2.
149+
:::
147150

148151
### Flink 2.1
149152

@@ -155,12 +158,14 @@ Refer to the [Delta Join Issue](https://issues.apache.org/jira/browse/FLINK-3783
155158

156159
- The primary key or the prefix key of the tables must be included as part of the equivalence conditions in the join.
157160
- The join must be a INNER join.
158-
- The downstream nodes of the join can accept duplicate changes, such as a sink that provides UPSERT mode without `upsertMaterialize`.
161+
- The downstream node of the join must support idempotent updates, typically it's an upsert sink and should not have a `SinkUpsertMaterializer` node before it.
162+
- Flink planner automatically inserts a `SinkUpsertMaterializer` when the sink’s primary key does not fully cover the upstream update key.
163+
- You can learn more details about `SinkUpsertMaterializer` by reading this [blog](https://www.ververica.com/blog/flink-sql-secrets-mastering-the-art-of-changelog-events).
159164
- All join inputs should be INSERT-ONLY streams.
160165
- This is why the option `'table.merge-engine' = 'first_row'` is added to the source table DDL.
161166
- All upstream nodes of the join should be `TableSourceScan` or `Exchange`.
162167

163-
### Flink 2.2 (upcoming)
168+
### Flink 2.2
164169

165170
#### Supported Features
166171

@@ -172,11 +177,16 @@ Refer to the [Delta Join Issue](https://issues.apache.org/jira/browse/FLINK-3783
172177

173178
#### Limitations
174179

175-
- The primary key or the prefix lookup key of the tables must be included as part of the equivalence conditions in the join.
180+
- The primary key or the prefix key of the tables must be included as part of the equivalence conditions in the join.
176181
- The join must be a INNER join.
177-
- The downstream nodes of the join can accept duplicate changes, such as a sink that provides UPSERT mode.
178-
- When consuming a CDC stream, the join key used in the delta join must be part of the primary key.
179-
- All filters must be applied on the upsert key, and neither filters nor projections should contain non-deterministic functions.
182+
- The downstream node of the join must support idempotent updates, typically it's an upsert sink and should not have a `SinkUpsertMaterializer` node before it.
183+
- Flink planner automatically inserts a `SinkUpsertMaterializer` when the sink’s primary key does not fully cover the upstream update key.
184+
- You can learn more details about `SinkUpsertMaterializer` by reading this [blog](https://www.ververica.com/blog/flink-sql-secrets-mastering-the-art-of-changelog-events).
185+
- Since delta join does not support to handle update-before messages, it is necessary to ensure that the entire pipeline can safely discard update-before messages. That means when consuming a CDC stream:
186+
- The join key used in the delta join must be part of the primary key.
187+
- The sink's primary key must be the same as the upstream update key.
188+
- All filters must be applied on the upsert key.
189+
- Neither filters nor projections should contain non-deterministic functions.
180190

181191
## Future Plan
182192

0 commit comments

Comments
 (0)