You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: website/docs/engine-flink/delta-joins.md
+16-6Lines changed: 16 additions & 6 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -144,6 +144,9 @@ The delta join feature is introduced since Flink 2.1 and still evolving, and its
144
144
145
145
Refer to the [Delta Join Issue](https://issues.apache.org/jira/browse/FLINK-37836) for the most up-to-date information.
146
146
147
+
:::warning
148
+
There is a known issue ([FLINK-38399](https://issues.apache.org/jira/browse/FLINK-38399)) in Flink ≤ 2.1.1 that prevents certain queries from being translated into delta joins. This has been fixed in Flink 2.1.2 and Flink 2.2.
149
+
:::
147
150
148
151
### Flink 2.1
149
152
@@ -155,12 +158,14 @@ Refer to the [Delta Join Issue](https://issues.apache.org/jira/browse/FLINK-3783
155
158
156
159
- The primary key or the prefix key of the tables must be included as part of the equivalence conditions in the join.
157
160
- The join must be a INNER join.
158
-
- The downstream nodes of the join can accept duplicate changes, such as a sink that provides UPSERT mode without `upsertMaterialize`.
161
+
- The downstream node of the join must support idempotent updates, typically it's an upsert sink and should not have a `SinkUpsertMaterializer` node before it.
162
+
- Flink planner automatically inserts a `SinkUpsertMaterializer` when the sink’s primary key does not fully cover the upstream update key.
163
+
- You can learn more details about `SinkUpsertMaterializer` by reading this [blog](https://www.ververica.com/blog/flink-sql-secrets-mastering-the-art-of-changelog-events).
159
164
- All join inputs should be INSERT-ONLY streams.
160
165
- This is why the option `'table.merge-engine' = 'first_row'` is added to the source table DDL.
161
166
- All upstream nodes of the join should be `TableSourceScan` or `Exchange`.
162
167
163
-
### Flink 2.2 (upcoming)
168
+
### Flink 2.2
164
169
165
170
#### Supported Features
166
171
@@ -172,11 +177,16 @@ Refer to the [Delta Join Issue](https://issues.apache.org/jira/browse/FLINK-3783
172
177
173
178
#### Limitations
174
179
175
-
- The primary key or the prefix lookup key of the tables must be included as part of the equivalence conditions in the join.
180
+
- The primary key or the prefix key of the tables must be included as part of the equivalence conditions in the join.
176
181
- The join must be a INNER join.
177
-
- The downstream nodes of the join can accept duplicate changes, such as a sink that provides UPSERT mode.
178
-
- When consuming a CDC stream, the join key used in the delta join must be part of the primary key.
179
-
- All filters must be applied on the upsert key, and neither filters nor projections should contain non-deterministic functions.
182
+
- The downstream node of the join must support idempotent updates, typically it's an upsert sink and should not have a `SinkUpsertMaterializer` node before it.
183
+
- Flink planner automatically inserts a `SinkUpsertMaterializer` when the sink’s primary key does not fully cover the upstream update key.
184
+
- You can learn more details about `SinkUpsertMaterializer` by reading this [blog](https://www.ververica.com/blog/flink-sql-secrets-mastering-the-art-of-changelog-events).
185
+
- Since delta join does not support to handle update-before messages, it is necessary to ensure that the entire pipeline can safely discard update-before messages. That means when consuming a CDC stream:
186
+
- The join key used in the delta join must be part of the primary key.
187
+
- The sink's primary key must be the same as the upstream update key.
188
+
- All filters must be applied on the upsert key.
189
+
- Neither filters nor projections should contain non-deterministic functions.
0 commit comments