[FLINK-36796][pipeline-connector][oracle]add oracle pipeline connector. #3995

linjianchang · 2025-04-18T09:07:19Z

Add Oracle pipeline connector.

joyCurry30

Thank you for your contribution. I just left some comments.

joyCurry30 · 2025-04-19T00:48:31Z

flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-oracle/pom.xml

+        <dependency>
+            <groupId>io.debezium</groupId>
+            <artifactId>debezium-core</artifactId>
+            <version>1.9.8.Final</version>


Use ${debezium.version}.

And the scope should be "provided".

This has been modified.

joyCurry30 · 2025-04-19T00:51:27Z

flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-oracle/pom.xml

+        <dependency>
+            <groupId>com.ververica</groupId>
+            <artifactId>flink-cdc-source-e2e-tests</artifactId>
+            <version>cty-3.0-2.2-SNAPSHOT</version>


Why do you depend on "flink-cdc-source-e2e-tests"?

This has been removed.

joyCurry30 · 2025-04-19T00:52:08Z

flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-oracle/pom.xml

+        <dependency>
+            <groupId>org.apache.flink</groupId>
+            <artifactId>flink-connector-test-util</artifactId>
+            <version>3.4-SNAPSHOT</version>


Please use "${project.version}".

This has been modified.

joyCurry30 · 2025-04-19T00:56:02Z

flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-oracle/pom.xml

+                                    <include>io.debezium:debezium-core</include>
+                                    <include>io.debezium:debezium-ddl-parser</include>
+                                    <include>io.debezium:debezium-connector-oracle</include>
+                                    <include>io.debezium:debezium-connector-mysql</include>


Could you explain the rationale behind having a dependency on the MySQL CDC connector within the Oracle CDC connector implementation? I'd like to better understand how these components interact in this context.

This has been removed.

joyCurry30 · 2025-04-19T00:56:24Z

flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-oracle/pom.xml

+                                    <include>io.debezium:debezium-connector-oracle</include>
+                                    <include>io.debezium:debezium-connector-mysql</include>
+                                    <include>com.ververica:flink-connector-debezium</include>
+                                    <include>com.ververica:flink-connector-mysql-cdc</include>


Same as above.

This has been removed.

joyCurry30 · 2025-04-19T00:56:49Z

flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-oracle/pom.xml

+                                    <include>org.antlr:antlr4-runtime</include>
+                                    <include>org.apache.kafka:*</include>
+                                    <include>mysql:mysql-connector-java</include>
+                                    <include>com.zendesk:mysql-binlog-connector-java</include>


Same as above.

This has been removed.

joyCurry30 · 2025-04-19T01:08:37Z

...cle/src/main/java/com/apache/flink/cdc/connectors/oracle/source/OracleDataSourceOptions.java

+    public static final ConfigOption<String> JDBC_URL =
+            ConfigOptions.key("jdbc.url")
+                    .stringType()
+                    .noDefaultValue()
+                    .withDescription("The jdbc url.");


Could we clarify the relationship between the JDBC URL and the individual hostname/port parameters?

If we already have a "jdbc.url" configuration field, is there still value in maintaining separate "hostname" and "port" parameters? Should these be mutually exclusive?

When using the "hostname" and "port" configuration, is there a parameter to explicitly specify the driver type (Thin vs. OCI)? This would be crucial for constructing the correct JDBC connection string format.

In the original Oracle module (flink-connector-oracle-cdc), the OracleJdbcUrlUtils#getConnectionUrlWithSid method does not treat url, hostname, and port as mutually exclusive. When the url option is null, it builds url = "jdbc:oracle:thin:@" + hostname + ":" + port + ":" + dbname;. Oracle 19c, however, has to be connected to through the service name, as in url = "jdbc:oracle:thin:@" + hostname + ":" + port + "/" + dbname;. So you can configure the URL explicitly to adapt to Oracle 19c.
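The fallback described in the reply above could be sketched roughly as follows. This is an illustration only, not the code in OracleJdbcUrlUtils: the class name `OracleJdbcUrlSketch` and its methods are hypothetical, and the two string formats are copied from the reply.

```java
import java.util.Objects;

/** Hypothetical helper for illustration; not the connector's actual implementation. */
final class OracleJdbcUrlSketch {

    private OracleJdbcUrlSketch() {}

    /** SID form quoted in the reply: jdbc:oracle:thin:@host:port:SID */
    static String sidUrl(String hostname, int port, String sid) {
        return "jdbc:oracle:thin:@" + hostname + ":" + port + ":" + sid;
    }

    /** Service-name form quoted in the reply: jdbc:oracle:thin:@host:port/service */
    static String serviceNameUrl(String hostname, int port, String serviceName) {
        return "jdbc:oracle:thin:@" + hostname + ":" + port + "/" + serviceName;
    }

    /**
     * Prefers an explicitly configured URL. If none is set, falls back to the
     * SID form built from hostname, port and database name, and fails fast
     * when any of those is missing.
     */
    static String resolveUrl(String explicitUrl, String hostname, Integer port, String database) {
        if (explicitUrl != null && !explicitUrl.trim().isEmpty()) {
            return explicitUrl.trim();
        }
        Objects.requireNonNull(hostname, "hostname is required when no URL is configured");
        Objects.requireNonNull(port, "port is required when no URL is configured");
        Objects.requireNonNull(database, "database is required when no URL is configured");
        return sidUrl(hostname, port, database);
    }
}
```

With this shape, a user who needs the service-name form passes the full URL explicitly, and the hostname/port options only matter when no URL is given.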

joyCurry30 · 2025-04-19T01:14:43Z

...cle/src/main/java/com/apache/flink/cdc/connectors/oracle/source/OracleEventDeserializer.java

+        if (isAddMeta) {
+            map.put(OracleDataSourceOptions.HOSTNAME.key(), hostname);
+            map.put(OracleDataSourceOptions.PORT.key(), port);
+        }


Could you clarify the design intent behind treating hostname and port as common metadata fields?

Since we don't currently support multi-source CDC synchronization, are these fields intended for future extensibility?

This has been removed.

linjianchang · 2025-04-23T01:19:18Z

@joyCurry30 I have made the changes. Please review again, thanks!

joyCurry30

Hi, thank you for your contribution. I left some comments for the doc.

joyCurry30 · 2025-04-27T01:58:30Z

docs/content.zh/docs/connectors/pipeline-connectors/oracle.md

+
+```yaml
+source:
+  type: mysql


"type" should be "oracle".

joyCurry30 · 2025-04-27T01:59:19Z

docs/content.zh/docs/connectors/pipeline-connectors/oracle.md

+- `initial` （默认）：在第一次启动时对受监视的数据库表执行初始快照，并继续读取最新的 binlog。
+- `latest-offset`：首次启动时，从不对受监视的数据库表执行快照， 连接器仅从 binlog 的结尾处开始读取，这意味着连接器只能读取在连接器启动之后的数据更改。


Does Oracle CDC use binlog?

joyCurry30 · 2025-04-27T02:01:52Z

docs/content.zh/docs/connectors/pipeline-connectors/oracle.md

+    <tr>
+      <td>
+       XMLTYPE
+        </td>
+      <td>VARCHAR(n)</td>
+      <td></td>
+    </tr>
+    <tr>
+      <td>
+        VARCHAR(n)<br>
+        VARCHAR2(n)<br>
+        NVARCHAR2(n)<br>
+        NCHAR(n)<br>
+        CHAR(n)<br>
+      </td>
+      <td>VARCHAR(n)</td>
+      <td></td>
+    </tr>


These two elements should be merged.

joyCurry30 · 2025-04-27T02:09:56Z

docs/content/docs/connectors/pipeline-connectors/oracle.md

+
+```yaml
+source:
+  type: mysql


Same as above.

joyCurry30 · 2025-04-27T02:09:56Z

docs/content/docs/connectors/pipeline-connectors/oracle.md

+- `initial` (default): Performs an initial snapshot on the monitored database tables upon first startup, and continue to read the latest binlog.
+- `latest-offset`: Never to perform snapshot on the monitored database tables upon first startup, just read from
+


Same as above.

joyCurry30 · 2025-04-27T02:10:17Z

docs/content/docs/connectors/pipeline-connectors/oracle.md

+## 数据类型映射
+


Use English, please.

joyCurry30 · 2025-07-03T02:21:48Z

...le/src/main/java/com/apache/flink/cdc/connectors/oracle/factory/OracleDataSourceFactory.java

+                                .hostname(
+                                        config.getOptional(OracleDataSourceOptions.HOSTNAME).get())
+                                .port(config.getOptional(OracleDataSourceOptions.PORT).get())
+                                .databaseList(
+                                        config.getOptional(OracleDataSourceOptions.DATABASE)
+                                                .get()) // monitor oracledatabase
+                                .tableList(
+                                        config.getOptional(OracleDataSourceOptions.TABLES)
+                                                .get()) // monitor productstable
+                                .username(
+                                        config.getOptional(OracleDataSourceOptions.USERNAME).get())
+                                .password(
+                                        config.getOptional(OracleDataSourceOptions.PASSWORD).get())
+                                .includeSchemaChanges(true);


Use config.get(ConfigOption option).
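One likely reason for this request is that calling `get()` on an empty `Optional` throws `NoSuchElementException` with no hint about which option was missing. The sketch below shows that behaviour with a small stand-in class. `ConfigAccessSketch` is hypothetical and is not Flink CDC's `Configuration`; it only mimics the two access styles.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for a configuration object; not the Flink CDC API. */
final class ConfigAccessSketch {

    private final Map<String, String> values = new HashMap<>();

    ConfigAccessSketch set(String key, String value) {
        values.put(key, value);
        return this;
    }

    /** Returns an empty Optional when the key is unset. */
    Optional<String> getOptional(String key) {
        return Optional.ofNullable(values.get(key));
    }

    /** Returns the configured value, or the supplied default when the key is unset. */
    String get(String key, String defaultValue) {
        return getOptional(key).orElse(defaultValue);
    }
}
```

Here `new ConfigAccessSketch().getOptional("port").get()` throws, while `get("port", "1521")` returns the default.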

joyCurry30 · 2025-07-03T02:24:57Z

...le/src/main/java/com/apache/flink/cdc/connectors/oracle/factory/OracleDataSourceFactory.java

+        options.add(OracleDataSourceOptions.SCHEMALIST);
+        options.add(OracleDataSourceOptions.DATABASE);
+        options.add(OracleDataSourceOptions.TABLES);
+        options.add(METADATA_LIST);


Use OracleDataSourceOptions.METADATA_LIST.

joyCurry30 · 2025-07-03T02:26:40Z

...le/src/main/java/com/apache/flink/cdc/connectors/oracle/factory/OracleDataSourceFactory.java

+    private static final String SCAN_STARTUP_MODE_VALUE_INITIAL = "initial";
+    private static final String SCAN_STARTUP_MODE_VALUE_EARLIEST = "earliest-offset";
+    private static final String SCAN_STARTUP_MODE_VALUE_LATEST = "latest-offset";
+    private static final String SCAN_STARTUP_MODE_VALUE_SPECIFIC_OFFSET = "specific-offset";
+    private static final String SCAN_STARTUP_MODE_VALUE_TIMESTAMP = "timestamp";


Use an enum to replace the switch-case.
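A minimal sketch of what such an enum could look like, assuming the five string values from the constants in the diff. The name `StartupModeSketch` and the `fromValue` method are hypothetical, and the mapping to `StartupOptions` is left out because it depends on the connector's classes.

```java
import java.util.Locale;

/** Hypothetical enum for illustration; the string values come from the constants in the diff. */
enum StartupModeSketch {
    INITIAL("initial"),
    EARLIEST_OFFSET("earliest-offset"),
    LATEST_OFFSET("latest-offset"),
    SPECIFIC_OFFSET("specific-offset"),
    TIMESTAMP("timestamp");

    private final String value;

    StartupModeSketch(String value) {
        this.value = value;
    }

    String value() {
        return value;
    }

    /** Case-insensitive lookup that rejects unknown or missing modes instead of ignoring them. */
    static StartupModeSketch fromValue(String raw) {
        if (raw == null) {
            throw new IllegalArgumentException("Startup mode must not be null");
        }
        String normalized = raw.trim().toLowerCase(Locale.ROOT);
        for (StartupModeSketch mode : values()) {
            if (mode.value.equals(normalized)) {
                return mode;
            }
        }
        throw new IllegalArgumentException("Unsupported startup mode: " + raw);
    }
}
```

Keeping the lookup in one place means every declared mode is either handled or rejected with a clear message.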

joyCurry30 · 2025-07-03T02:30:38Z

...tor-oracle/src/main/java/com/apache/flink/cdc/connectors/oracle/source/OracleDataSource.java

+                    builder.hostname(config.getOptional(OracleDataSourceOptions.HOSTNAME).get())
+                            .port(config.getOptional(OracleDataSourceOptions.PORT).get())
+                            .database(
+                                    config.getOptional(OracleDataSourceOptions.DATABASE)
+                                            .get()) // monitor  database
+                            .schemaList(
+                                    config.getOptional(OracleDataSourceOptions.SCHEMALIST)
+                                            .get()) // monitor  schema
+                            .tableList(capturedTables) // monitor
+                            // EMP table
+                            .username(config.getOptional(OracleDataSourceOptions.USERNAME).get())
+                            .password(config.getOptional(OracleDataSourceOptions.PASSWORD).get())


Use config.get(ConfigOption option)

aiwenmo · 2025-07-09T01:54:18Z

flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-oracle/pom.xml

+                                    <shadedPattern>
+                                        com.ververica.cdc.connectors.shaded.org.apache.kafka
+                                    </shadedPattern>
+                                </relocation>
+                                <relocation>
+                                    <pattern>org.antlr</pattern>
+                                    <shadedPattern>
+                                        com.ververica.cdc.connectors.shaded.org.antlr
+                                    </shadedPattern>
+                                </relocation>
+                                <relocation>
+                                    <pattern>com.fasterxml</pattern>
+                                    <shadedPattern>
+                                        com.ververica.cdc.connectors.shaded.com.fasterxml
+                                    </shadedPattern>
+                                </relocation>
+                                <relocation>
+                                    <pattern>com.google</pattern>
+                                    <shadedPattern>
+                                        com.ververica.cdc.connectors.shaded.com.google
+                                    </shadedPattern>
+                                </relocation>
+                                <relocation>
+                                    <pattern>com.esri.geometry</pattern>
+                                    <shadedPattern>com.ververica.cdc.connectors.shaded.com.esri.geometry</shadedPattern>
+                                </relocation>
+                                <relocation>
+                                    <pattern>com.zaxxer</pattern>
+                                    <shadedPattern>
+                                        com.ververica.cdc.connectors.shaded.com.zaxxer
+                                    </shadedPattern>


Why relocate to com.ververica.cdc.connectors?
Please use the correct package path.

aiwenmo · 2025-07-09T01:55:58Z

...ne-connector-oracle/src/main/java/com/apache/flink/cdc/connectors/oracle/dto/ColumnInfo.java

+package com.apache.flink.cdc.connectors.oracle.dto;
+


Please use org.apache.

aiwenmo · 2025-07-09T01:56:59Z

...le/src/main/java/com/apache/flink/cdc/connectors/oracle/factory/OracleDataSourceFactory.java

+import com.apache.flink.cdc.connectors.oracle.source.OracleDataSource;
+import com.apache.flink.cdc.connectors.oracle.source.OracleDataSourceOptions;
+import com.apache.flink.cdc.connectors.oracle.utils.OracleSchemaUtils;


Please use org.apache.

aiwenmo · 2025-07-09T01:57:26Z

...le/src/main/java/com/apache/flink/cdc/connectors/oracle/factory/OracleDataSourceFactory.java

+import java.util.Set;
+import java.util.stream.Collectors;
+
+import static com.apache.flink.cdc.connectors.oracle.source.OracleDataSourceOptions.METADATA_LIST;


Please use org.apache.

aiwenmo · 2025-07-09T02:00:07Z

...le/src/main/java/com/apache/flink/cdc/connectors/oracle/factory/OracleDataSourceFactory.java

+        }
+        throw new IllegalArgumentException(
+                String.format(
+                        "[%s] cannot be found in mysql metadata.",


This should be "oracle metadata".

Thanks for the review! This has been modified.

aiwenmo · 2025-10-12T05:16:20Z

...tor-oracle/src/main/java/org/apache/flink/cdc/connectors/oracle/source/OracleDataSource.java

+        dbzProperties.setProperty(
+                "snapshot.locking.mode", config.get(OracleDataSourceOptions.SNAPSHOT_LOCKING_MODE));
+        dbzProperties.setProperty(
+                "snapshot.locking.mode", config.get(OracleDataSourceOptions.SNAPSHOT_LOCKING_MODE));


This is a duplicate code snippet.

aiwenmo · 2025-10-12T05:20:51Z

...tor-oracle/src/main/java/org/apache/flink/cdc/connectors/oracle/source/OracleDataSource.java

+    @Override
+    public EventSourceProvider getEventSourceProvider() {
+        String url = config.get(OracleDataSourceOptions.JDBC_URL);


How should null parameters be handled?

aiwenmo · 2025-10-12T05:23:37Z

...tor-oracle/src/main/java/org/apache/flink/cdc/connectors/oracle/source/OracleDataSource.java

+        switch (modeString.toLowerCase()) {
+            case SCAN_STARTUP_MODE_VALUE_INITIAL:
+                return StartupOptions.initial();
+
+            case SCAN_STARTUP_MODE_VALUE_LATEST:
+                return StartupOptions.latest();
+


How should the other StartupOptions cases be handled?

aiwenmo · 2025-10-12T05:26:11Z

...cle/src/main/java/org/apache/flink/cdc/connectors/oracle/source/OracleDataSourceOptions.java

+    public static final ConfigOption<String> SERVER_TIME_ZONE =
+            ConfigOptions.key("server-time-zone")
+                    .stringType()
+                    .noDefaultValue()
+                    .withDescription(
+                            "The session time zone in database server. If not set, then "
+                                    + "ZoneId.systemDefault() is used to determine the server time zone.");
+
+    public static final ConfigOption<String> SERVER_ID =
+            ConfigOptions.key("server-id")
+                    .stringType()


Does Oracle CDC have these configurations?

aiwenmo · 2025-10-12T05:28:52Z

...cle/src/main/java/org/apache/flink/cdc/connectors/oracle/source/OracleEventDeserializer.java

+    @Override
+    protected Object convertToString(Object dbzObj, Schema schema) {
+        // the Geometry datatype in oracle will be converted to
+        // a String with Json format
+        if (Point.LOGICAL_NAME.equals(schema.name())


I'm not sure if Oracle also needs to be handled in this way.

aiwenmo · 2025-10-12T05:32:47Z

Hi. Thanks for your contribution. @linjianchang

I've found that some of the code might be unnecessary for Oracle CDC. Please go through it carefully and simplify it to make it as concise as possible.

github-actions bot added base oracle-cdc-connector debezium labels Apr 18, 2025

joyCurry30 reviewed Apr 19, 2025

linjianchang force-pushed the master-36796 branch from 820b720 to 5015bbb Compare April 21, 2025 06:27

github-actions bot added the docs Improvements or additions to documentation label Apr 21, 2025

joyCurry30 reviewed Apr 27, 2025

linjianchang force-pushed the master-36796 branch from 5015bbb to f284d27 Compare April 27, 2025 02:55

linjianchang force-pushed the master-36796 branch from f284d27 to f6b25cc Compare May 9, 2025 09:26

linjianchang force-pushed the master-36796 branch from f6b25cc to 4f73a6f Compare June 12, 2025 10:38

joyCurry30 reviewed Jul 3, 2025

linjianchang force-pushed the master-36796 branch from 4f73a6f to 3479f43 Compare July 4, 2025 03:35

aiwenmo suggested changes Jul 9, 2025

linjianchang force-pushed the master-36796 branch from 13c6586 to ba183e3 Compare July 10, 2025 06:07

linjianchang force-pushed the master-36796 branch from ba183e3 to f451287 Compare August 5, 2025 09:54

[FLINK-36796][pipeline-connector][oracle]add oracle pipeline connector.

0237401

linjianchang force-pushed the master-36796 branch from 0c77212 to 0237401 Compare August 5, 2025 10:39

linjianchang requested review from aiwenmo and joyCurry30 October 10, 2025 08:26

aiwenmo reviewed Oct 12, 2025
