-
Notifications
You must be signed in to change notification settings - Fork 458
[doc] Add docs for delta join support with Flink 2.1 #1875
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
|
||
| The work on Delta Join is still ongoing, so the support for more sql patterns that can be optimized into delta join varies across different versions of Flink. More details can be found at [Delta Join](https://issues.apache.org/jira/browse/FLINK-37836). | ||
|
|
||
| ### Flink 2.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the upcoming Flink 2.2, additional patterns will be supported, as the relevant PRs have already been merged. I'm uncertain whether I should include a description of Flink 2.2 here. So it's entirely up to you.
### Flink 2.2 (upcoming)
#### Supported Features
- Support for optimizing a dual-stream join from CDC sources that do not include delete messages into a delta join.
- Include the `table.delete.behavior` with a non-`ALLOW` option in the source table DDL to ensure that the source table does not emit delete messages.
- Support `Project` and `Filter` between source and delta join.
- Support cache in delta join.
#### Limitations
- The primary key or the prefix lookup key of the tables must be included as part of the equivalence conditions in the join.
- The join must be a INNER join.
- The downstream nodes of the join can accept duplicate changes, such as a sink that provides UPSERT mode.
- When consuming a CDC stream, the join key used in the delta join must be part of the primary key.
- All filters must be applied on the upsert key, and neither filters nor projections should contain non-deterministic functions.
| --- | ||
| title: "DataStream API" | ||
| sidebar_position: 6 | ||
| sidebar_position: 7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
@xuyangzhong i added a few improvements.. please take a look to verify and let me know if you are ok with these |
|
@polyzos Thanks for reviewing. I just do a few minor adjustments on your great improvements. By the way, I think you're right we maybe need a new page to describe prefix keys. Prior to this, only the Lookups page had a brief description: https://fluss.apache.org/docs/0.8/engine-flink/lookups/#prefix-lookup. |
wuchong
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I improved the delta join documentation a bit, including
- Add a image to explain the delta join
Understanding Prefix Keys->Understanding Index Keysas the index key is more generic, Prefix Key is just an implementation of Index Key, and we are supporting the general index key in the next version.- Add "Future Plan"
(cherry picked from commit aa4afe8)

Purpose
Linked issue: close #1739
Brief change log
Add docs about delta join support with Flink 2.1
Tests
Run
npm run startto check.API and Format
None
Documentation
None