[FLINK-38729] Add support for Flink 2.2.0 #4294

Merged: lvyanquan merged 12 commits into master from FLINK-38729-2 on Mar 13, 2026

@lvyanquan (Contributor) commented Mar 2, 2026

Introduction

Existing modules continue to depend on Flink 1.20. Support for Flink 2.x is provided through newly added modules.

Development Plan

I plan to complete full support for Flink 2.x in three steps:

  1. The first step is to provide support for Flink 2.x versions in the common/runtime/composer modules,
    and perform integration tests and end-to-end tests on these modules based on a simple values pipeline
    connector to verify correctness.
  2. The second step is to implement a MySQL Pipeline connector that supports Flink 2.x versions, as it is
    our most commonly used CDC connector.
  3. The third step is to add support for Flink 2.x versions to existing source/pipeline connectors, if
    feasible.

This PR will complete the work of the first step.

Topics for Discussion

1. Module Design

Question: Is it necessary to design each module with a structure consisting of a common module, a module with 1.x API, and a module with 2.x API, as Paimon does?
My Answer: This would require creating three modules for every module in the project. I think this introduces too many additional modules. Therefore, I will keep existing modules' dependency on Flink 1.x unchanged, and only add new modules that depend on Flink 2.x. I will rewrite classes that depend on the new API, and use the shade plugin to reduce the number of classes that need to be rewritten in the new modules.

2. Test Coverage

Question: Is it necessary to add tests equivalent to those in the 1.x modules for each newly added 2.x module?
My Answer: This is a difficult decision point. Adding sufficient tests can guarantee the correctness and reliability of 2.x modules, but it would introduce a large amount of duplicate code and also increase the time required for CI runs. To avoid the burden of review, I have only added composer tests and e2e tests in this PR to ensure that the support for Flink 2.x is functional. I plan to add more complete tests in subsequent PRs (if necessary).

The above lists the points that I still consider uncertain during the implementation of this PR. Discussions are welcome.

@yuxiqian (Member) left a comment

I just investigated the implementation in apache/fluss#1176. Instead of creating a replica of each file that uses an API that is incompatible between Flink 1.x and 2.x, Hongshun introduces a compatibility layer in the fluss-flink-common package, puts all incompatible API usages inside it, and dispatches to fluss-flink-1.x and fluss-flink-2.x. I think if we took this approach, the only module that would need to be versioned is flink-cdc-common.

The obvious advantage is that we would not need so much code duplication. This PR adds ~20k SLOC (mostly duplicated, which makes it hard to review what has changed), while the Fluss support is merely +856 -36. Though the two codebases cannot be compared directly, such a huge difference is not negligible. The design in this PR also causes a maintenance issue: any later PR that modifies a related file must remember to update both the 1.x and 2.x implementations and keep them in sync.

I'm not strongly against the approach used in this PR, just wondering if the alternative solution is possible / why it's impossible.
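
For illustration, the compatibility-layer idea described in this comment could be sketched roughly as below. This is only a minimal sketch, not the Fluss or Flink CDC implementation: `FlinkShim`, `Flink1Shim`, `Flink2Shim`, and `FlinkShims` are hypothetical names, and the shims only return strings describing the version-specific call instead of touching real Flink classes.

```java
// Hypothetical sketch: none of these types exist in Flink, Fluss, or Flink CDC.

/** Shared interface; it would live in the version-independent common module. */
interface FlinkShim {
    /** Major Flink version this shim targets, e.g. "1" or "2". */
    String majorVersion();

    /** Describes the call chain used to read the subtask index on this version. */
    String subtaskIndexAccessor();
}

/** Would live in a Flink 1.x-specific module. */
final class Flink1Shim implements FlinkShim {
    @Override
    public String majorVersion() {
        return "1";
    }

    @Override
    public String subtaskIndexAccessor() {
        return "getRuntimeContext().getIndexOfThisSubtask()";
    }
}

/** Would live in a Flink 2.x-specific module. */
final class Flink2Shim implements FlinkShim {
    @Override
    public String majorVersion() {
        return "2";
    }

    @Override
    public String subtaskIndexAccessor() {
        return "getRuntimeContext().getTaskInfo().getIndexOfThisSubtask()";
    }
}

/** Single dispatch point, so callers never reference a versioned class directly. */
final class FlinkShims {
    private FlinkShims() {}

    static FlinkShim forVersion(String flinkVersion) {
        int dot = flinkVersion.indexOf('.');
        String major = dot < 0 ? flinkVersion : flinkVersion.substring(0, dot);
        switch (major) {
            case "1":
                return new Flink1Shim();
            case "2":
                return new Flink2Shim();
            default:
                throw new IllegalArgumentException(
                        "Unsupported Flink version: " + flinkVersion);
        }
    }

    public static void main(String[] args) {
        FlinkShim shim = forVersion("2.2.0");
        System.out.println(shim.majorVersion() + " -> " + shim.subtaskIndexAccessor());
    }
}
```

With this shape, only the small shim classes are version-specific, and the rest of the code depends on the shared interface alone.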

Comment thread flink-cdc-dist-2.x/src/main/flink-cdc-bin/bin/flink-cdc.sh Outdated
@macdoor commented Mar 5, 2026

Hi @yuxiqian,

We've been working on Flink 2.2 compatibility in a fork and have a working implementation following the approach referenced in fluss#1176 — i.e., introducing a version-specific compat module with runtime bridges.

Our branch: https://github.com/macdoor/flink-cdc/tree/feature/opengauss-flink22-compat

Key changes we made to get OpenGauss → Paimon pipelines running on Flink 2.2:

- flink-cdc-flink-compat module: two sub-modules, flink-cdc-flink-compat-flink1 (Flink 1.x bridge with SinkFunction/SourceFunction) and flink-cdc-flink-compat-flink2 (Flink 2.x stub classes: Sink$InitContext, CatalogFactory, Catalog).
- DataSinkWriterOperator: reflection-based wrapping of SinkWriterOperator to find compatible constructors; fixed getSubtaskIndexCompat() to use getTaskInfo().getIndexOfThisSubtask() (it previously returned a hardcoded 0, which broke SchemaCoordinator flush synchronization).
- DataSinkTranslator: uses getMethods() (not getDeclaredMethods()) to detect two-phase commit across superclasses; adds a serializable SupportsCommitter adapter proxy for sinks that declare createCommitter() without implementing the interface.
- Serializers: added resolveSchemaCompatibility(TypeSerializerSnapshot) (a new abstract method in Flink 2.x) to all custom TypeSerializerSnapshot implementations without @Override, so they compile against both 1.x and 2.x.
- SourceSplitSerializer: reflection-based LogicalTypeParser.parse() to handle the removed single-arg overload.
- PreCommitOperator / schema operators: replaced getRuntimeContext().getIndexOfThisSubtask() with getRuntimeContext().getTaskInfo().getIndexOfThisSubtask().
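
The subtask-index change in the list above can be sketched with plain reflection. This is a minimal sketch under stated assumptions, not the code from the branch: `LegacyContext`, `ModernContext`, and `TaskInfo` are stand-in classes defined here (they are not Flink classes), and `subtaskIndexCompat` is a hypothetical helper name. It tries the `getTaskInfo().getIndexOfThisSubtask()` path first and falls back to the direct `getIndexOfThisSubtask()` call.

```java
import java.lang.reflect.Method;

/** Hypothetical helper; the nested context classes are stand-ins, not Flink classes. */
final class SubtaskIndexCompat {
    private SubtaskIndexCompat() {}

    /** Flink 1.x-style shape: the index is exposed directly on the context. */
    public static final class LegacyContext {
        public int getIndexOfThisSubtask() {
            return 3;
        }
    }

    /** Stand-in for the object returned by getTaskInfo(). */
    public static final class TaskInfo {
        public int getIndexOfThisSubtask() {
            return 5;
        }
    }

    /** Flink 2.x-style shape: the index is reached through getTaskInfo(). */
    public static final class ModernContext {
        public TaskInfo getTaskInfo() {
            return new TaskInfo();
        }
    }

    /** Reads the subtask index from either context shape via reflection. */
    static int subtaskIndexCompat(Object context) {
        try {
            Method getTaskInfo;
            try {
                getTaskInfo = context.getClass().getMethod("getTaskInfo");
            } catch (NoSuchMethodException e) {
                // Older shape: no getTaskInfo(), so call the direct accessor.
                Method legacy = context.getClass().getMethod("getIndexOfThisSubtask");
                return (Integer) legacy.invoke(context);
            }
            Object taskInfo = getTaskInfo.invoke(context);
            Method getIndex = taskInfo.getClass().getMethod("getIndexOfThisSubtask");
            return (Integer) getIndex.invoke(taskInfo);
        } catch (ReflectiveOperationException e) {
            // Fail loudly instead of silently returning a default such as 0.
            throw new IllegalStateException("Cannot resolve subtask index", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(subtaskIndexCompat(new LegacyContext()));
        System.out.println(subtaskIndexCompat(new ModernContext()));
    }
}
```

Throwing when neither accessor is found avoids the failure mode mentioned above, where a hardcoded 0 made every subtask report the same index.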

The pipeline now runs end-to-end on Flink 2.2.0 with a standalone session cluster (OpenGauss source → Paimon sink, all 4 operator stages visible including Sink Committer).
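
The committer detection mentioned in the list above relies on a standard difference in Java reflection: `getDeclaredMethods()` only returns methods declared on the class itself, while `getMethods()` also returns public methods inherited from superclasses. A minimal sketch follows, with hypothetical stand-in classes `BaseSink` and `DerivedSink` (they are not Flink or Flink CDC types):

```java
import java.lang.reflect.Method;

/** Hypothetical demo; BaseSink and DerivedSink are stand-ins, not Flink types. */
final class CommitterDetection {
    private CommitterDetection() {}

    /** Base class that declares createCommitter(). */
    public static class BaseSink {
        public Object createCommitter() {
            return new Object();
        }
    }

    /** Subclass that inherits createCommitter() without redeclaring it. */
    public static class DerivedSink extends BaseSink {}

    /** True only if the method is declared on this exact class. */
    static boolean declaresDirectly(Class<?> type, String methodName) {
        for (Method m : type.getDeclaredMethods()) {
            if (m.getName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    /** True if the class exposes the public method, including inherited ones. */
    static boolean exposesPublicly(Class<?> type, String methodName) {
        for (Method m : type.getMethods()) {
            if (m.getName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(declaresDirectly(DerivedSink.class, "createCommitter"));
        System.out.println(exposesPublicly(DerivedSink.class, "createCommitter"));
    }
}
```

A check based on `getDeclaredMethods()` would miss `DerivedSink` here, which is the kind of case the switch to `getMethods()` is meant to cover.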

Happy to share details, open a draft PR, or contribute directly to this effort. Let us know what would be most helpful!

@lvyanquan (Contributor, Author) commented

Hi @macdoor, I pulled your branch, but the code in this branch doesn't seem to be complete (it doesn't even compile). Did I miss something? Of course, your implementation in the PR is more concise. If you can get it to compile and run on Flink 2.2, feel free to submit a PR first so we can verify that the tests pass.

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 74 out of 74 changed files in this pull request and generated 6 comments.



Comment thread flink-cdc-e2e-tests/flink-cdc-pipeline-e2e-tests/pom.xml Outdated
@yuxiqian (Member) left a comment

Looks good to me. This PR is already in good shape, and most code duplication has been eliminated. As some popular connectors need to be ported to Flink 2.x before the code freeze, I hope this could be merged soon, and we can polish the shim layer later.

@leonardBang (Contributor) commented

@lvyanquan Thanks for the refactor and for aligning with the flussxflink adapter solution. The current work looks good to me, +1.

@lvyanquan lvyanquan merged commit 323b2bc into master Mar 13, 2026
34 of 35 checks passed
@ferenc-csaky (Contributor) commented

@lvyanquan I was swamped and did not have time to check this thoroughly yet, although I'm happy to see this merged and moved forward in general!

1 question:
I think cutting a new CDC major version would enable us to clear out a lot of code and compatibility complexity, and even JDK8. WDYT?

@lvyanquan (Contributor, Author) commented

> 1 question: I think cutting a new CDC major version would enable us to clear out a lot of code and compatibility complexity, and even JDK8. WDYT?

Hi, ferenc. Thank you for your suggestion.

Maintaining two separate branches (major versions)—one for Flink 1.x (JDK 8) and another for Flink 2.x (JDK 11 or 17)—would result in cleaner code, but it would complicate code merging (since changes unrelated to the Flink version would need to be merged into both branches) and make version releases more difficult.

I think the key issue is whether we will have many new features that depend on Flink 2.x APIs. Based on current feedback from the community, users primarily want a CDC connector that works with Flink 2.x, and there hasn’t been much demand yet for adapting to the new connector capabilities introduced in Flink 2.x. As a result, a new CDC 4.0 branch might not differ significantly in functionality from CDC 3.x. In the current PR, we’ve introduced an Adapter module to provide compatibility with both Flink 1.x and Flink 2.x for existing code, requiring minimal changes to existing modules—sufficient to meet the community’s needs.

I prefer to wait until we either adopt Flink 2.x as our primary dependency version or have a clear need to leverage new features introduced in Flink 2.x—requiring substantial changes to the adapter layer or even causing complete incompatibility—before switching to a major CDC 4.x release. In such a version, we could also introduce significant changes like upgrading the Debezium version. This approach would make it easier for users to understand the necessity of introducing a new major version.

@ferenc-csaky (Contributor) commented

@lvyanquan That sounds reasonable based on the current demand, thanks for the write up!

@lvyanquan lvyanquan deleted the FLINK-38729-2 branch March 20, 2026 08:02
Mrart pushed a commit to Mrart/flink-cdc that referenced this pull request Mar 26, 2026
Co-authored-by: xiaoxiong.duan@zznode.com <xiaoxiong.duan@zznode.com>
ThorneANN pushed a commit to ThorneANN/flink-cdc that referenced this pull request Mar 31, 2026
Co-authored-by: xiaoxiong.duan@zznode.com <xiaoxiong.duan@zznode.com>