Commit 7ddb4cf (parent 619ab90)

[FLINK-39288][docs] Add JDK/Flink version requirements, connectors summary and reorganize quickstart docs for Flink 1.20 and 2.2
File tree

32 files changed: +4429 −9 lines

docs/content.zh/docs/connectors/pipeline-connectors/fluss.md

Lines changed: 24 additions & 1 deletion
@@ -30,6 +30,7 @@ The Fluss Pipeline connector can be used as a pipeline *Data Sink* to write data to [Fluss]
 ## What can the connector do?
 * Automatically create tables that do not exist
 * Data synchronization
+* Schema change synchronization (lenient mode)

 How to create Pipeline
 ----------------
@@ -60,6 +61,7 @@ sink:
 pipeline:
   name: MySQL to Fluss Pipeline
   parallelism: 2
+  schema.change.behavior: LENIENT
 ```

Pipeline Connector Options
@@ -140,7 +142,13 @@ Pipeline Connector Options
 * The number of buckets is controlled by the `bucket.num` option
 * Data distribution is controlled by the `bucket.key` option. For primary-key tables, if no bucket key is specified, it defaults to the primary key (excluding partition keys); for log tables without a primary key, if no bucket key is specified, data is distributed randomly across the buckets.

-* Schema change synchronization is not supported. To ignore schema changes, use `schema.change.behavior: IGNORE`.
+* Schema change synchronization is supported in `lenient` mode, configured via `schema.change.behavior: lenient`. The following schema change events are supported:
+  * **Add column**: the new column is appended to the Fluss table.
+  * **Drop column**: in lenient mode the column is not actually dropped; the drop is ignored, and subsequent writes set that column's value to null.
+  * **Rename column**: in lenient mode this is converted into a sequence of adding a new column and changing the old column's type to nullable.
+  * **Alter column type**: not supported.
+
+To enable schema change synchronization, configure `schema.change.behavior: lenient` in the pipeline. To ignore all schema changes, use `schema.change.behavior: IGNORE`.

 * For data synchronization, the pipeline connector writes data to Fluss using the [Fluss Java Client](https://fluss.apache.org/docs/apis/java-client/).
@@ -236,6 +244,21 @@ Data Type Mapping
       <td>BYTES</td>
       <td></td>
     </tr>
+    <tr>
+      <td>ARRAY</td>
+      <td>ARRAY</td>
+      <td>Element types are mapped recursively.</td>
+    </tr>
+    <tr>
+      <td>MAP</td>
+      <td>MAP</td>
+      <td>Key and value types are mapped recursively.</td>
+    </tr>
+    <tr>
+      <td>ROW</td>
+      <td>ROW</td>
+      <td>Field types are mapped recursively.</td>
+    </tr>
   </tbody>
 </table>
 </div>

docs/content.zh/docs/connectors/pipeline-connectors/postgres.md

Lines changed: 14 additions & 2 deletions
@@ -27,15 +27,14 @@ under the License.
 # Postgres Connector

 The Postgres CDC Pipeline connector reads snapshot and incremental data from a Postgres database and provides end-to-end full-database synchronization. This document describes how to set up the Postgres CDC Pipeline connector.
-Note: because schema change records cannot be parsed from the Postgres WAL, the Postgres CDC Pipeline source does not currently support synchronizing schema changes.

 ## Example

 A pipeline that reads data from Postgres and synchronizes it to Fluss can be defined as follows:

 ```yaml
 source:
-  type: posgtres
+  type: postgres
   name: Postgres Source
   hostname: 127.0.0.1
   port: 5432
@@ -45,6 +44,7 @@ source:
   tables: adb.\.*.\.*
   decoding.plugin.name: pgoutput
   slot.name: pgtest
+  schema-change.enabled: true

 sink:
   type: fluss
@@ -59,6 +59,7 @@ sink:
 pipeline:
   name: Postgres to Fluss Pipeline
   parallelism: 4
+  schema.change.behavior: lenient
 ```

## Connector Options
@@ -282,6 +283,17 @@ pipeline:
       Defaults to false.
     </td>
   </tr>
+  <tr>
+    <td>schema-change.enabled</td>
+    <td>optional</td>
+    <td style="word-wrap: break-word;">false</td>
+    <td>Boolean</td>
+    <td>
+      Whether to enable schema change inference for the Postgres source. When enabled, the connector infers schema change events (add column, drop column, rename column, alter column type) by comparing pgoutput Relation messages against the cached schema.<br>
+      Requires <code>decoding.plugin.name</code> to be set to <code>pgoutput</code>.<br>
+      Defaults to false.
+    </td>
+  </tr>
  </tbody>
 </table>
 </div>
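Since `schema-change.enabled` relies on pgoutput Relation messages, the database must be set up for logical decoding. A sketch of the usual server-side prerequisites (the publication name is an example, the slot name matches `slot.name` in the YAML above, and the connector may create some of these objects automatically depending on its configuration):

```sql
-- WAL must carry logical decoding information (requires a server restart)
ALTER SYSTEM SET wal_level = logical;

-- pgoutput streams changes for tables in a publication;
-- FOR ALL TABLES is the simplest form
CREATE PUBLICATION example_publication FOR ALL TABLES;

-- create the replication slot with the pgoutput plugin;
-- the slot name must match slot.name in the pipeline definition
SELECT pg_create_logical_replication_slot('pgtest', 'pgoutput');
```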

docs/content.zh/docs/get-started/introduction.md

Lines changed: 44 additions & 2 deletions
@@ -37,6 +37,46 @@ Flink CDC is deeply integrated with and powered by Apache Flink, and provides the following core features:
 * ✅ Full-database synchronization
 * ✅ Automatic synchronization of table schema changes (schema evolution)

+## Requirements
+
+Flink CDC has the following requirements:
+
+* **JDK**: JDK 11 or later (Flink CDC is built on JDK 11 starting from version 3.6.0)
+* **Apache Flink**: Flink 1.20.x or Flink 2.2.x
+
+{{< hint info >}}
+Make sure the correct JDK version is installed before running Flink CDC. You can verify your Java version with the `java -version` command.
+{{< /hint >}}
+
+## Supported Connectors
+
+Flink CDC provides a rich connector ecosystem for integrating with a variety of external systems:
+
+| Connector | Type |
+|-----------|------|
+| MySQL | [Source Connector]({{< ref "docs/connectors/flink-sources/mysql-cdc" >}}) / [Pipeline Source Connector]({{< ref "docs/connectors/pipeline-connectors/mysql" >}}) |
+| Oracle | [Source Connector]({{< ref "docs/connectors/flink-sources/oracle-cdc" >}}) / [Pipeline Source Connector]({{< ref "docs/connectors/pipeline-connectors/oracle" >}}) |
+| PostgreSQL | [Source Connector]({{< ref "docs/connectors/flink-sources/postgres-cdc" >}}) / [Pipeline Source Connector]({{< ref "docs/connectors/pipeline-connectors/postgres" >}}) |
+| Db2 | [Source Connector]({{< ref "docs/connectors/flink-sources/db2-cdc" >}}) |
+| MongoDB | [Source Connector]({{< ref "docs/connectors/flink-sources/mongodb-cdc" >}}) |
+| SQL Server | [Source Connector]({{< ref "docs/connectors/flink-sources/sqlserver-cdc" >}}) |
+| TiDB | [Source Connector]({{< ref "docs/connectors/flink-sources/tidb-cdc" >}}) |
+| Vitess | [Source Connector]({{< ref "docs/connectors/flink-sources/vitess-cdc" >}}) |
+| Apache Doris | [Pipeline Sink Connector]({{< ref "docs/connectors/pipeline-connectors/doris" >}}) |
+| Elasticsearch | [Pipeline Sink Connector]({{< ref "docs/connectors/pipeline-connectors/elasticsearch" >}}) |
+| Fluss | [Pipeline Sink Connector]({{< ref "docs/connectors/pipeline-connectors/fluss" >}}) |
+| Hudi | [Pipeline Sink Connector]({{< ref "docs/connectors/pipeline-connectors/hudi" >}}) |
+| Iceberg | [Pipeline Sink Connector]({{< ref "docs/connectors/pipeline-connectors/iceberg" >}}) |
+| Kafka | [Pipeline Sink Connector]({{< ref "docs/connectors/pipeline-connectors/kafka" >}}) |
+| MaxCompute | [Pipeline Sink Connector]({{< ref "docs/connectors/pipeline-connectors/maxcompute" >}}) |
+| OceanBase | [Pipeline Sink Connector]({{< ref "docs/connectors/pipeline-connectors/oceanbase" >}}) |
+| Paimon | [Pipeline Sink Connector]({{< ref "docs/connectors/pipeline-connectors/paimon" >}}) |
+| StarRocks | [Pipeline Sink Connector]({{< ref "docs/connectors/pipeline-connectors/starrocks" >}}) |
+
+{{< hint info >}}
+For details on each connector, including supported versions, features, and configuration options, see the [Connectors]({{< ref "docs/connectors" >}}) section.
+{{< /hint >}}
+
 ## How to use Flink CDC

 Flink CDC provides a `YAML`-based user API that is well suited to data integration scenarios. Below is an example `YAML` file that defines a data pipeline capturing real-time changes from MySQL and synchronizing them to Apache Doris:
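The JDK check mentioned in the requirements hint can be scripted. A minimal sketch, assuming the common `version "11.0.x"` (or legacy `"1.8.0_x"`) format printed by `java -version`:

```shell
#!/bin/sh
# Extract the major Java version and verify it meets the JDK 11 minimum.
parse_major() {
  # "11.0.21" -> 11 ; legacy "1.8.0_392" -> 8
  ver="$1"
  case "$ver" in
    1.*) echo "$ver" | cut -d. -f2 ;;
    *)   echo "$ver" | cut -d. -f1 ;;
  esac
}

raw=$(java -version 2>&1 | sed -n 's/.*version "\([0-9._]*\)".*/\1/p')
major=$(parse_major "$raw")
if [ "${major:-0}" -ge 11 ]; then
  echo "JDK $major is sufficient for Flink CDC"
else
  echo "JDK 11+ required, found: ${major:-none}" >&2
fi
```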
@@ -77,8 +117,10 @@ pipeline:

 Check out the quickstart guides to learn how to set up a Flink CDC pipeline:

-- [MySQL to Apache Doris]({{< ref "docs/get-started/quickstart/mysql-to-doris" >}})
-- [MySQL to StarRocks]({{< ref "docs/get-started/quickstart/mysql-to-starrocks" >}})
+| Example | Version |
+|---------|---------|
+| MySQL to Apache Doris | [1.20.x]({{< ref "docs/get-started/quickstart-for-1.20/mysql-to-doris" >}}) / [2.2.x]({{< ref "docs/get-started/quickstart-for-2.2/mysql-to-doris" >}}) |
+| MySQL to StarRocks | [1.20.x]({{< ref "docs/get-started/quickstart-for-1.20/mysql-to-starrocks" >}}) / [2.2.x]({{< ref "docs/get-started/quickstart-for-2.2/mysql-to-starrocks" >}}) |

 ### Understanding core concepts
docs/content.zh/docs/get-started/quickstart/_index.md renamed to docs/content.zh/docs/get-started/quickstart-for-1.20/_index.md

File renamed without changes.

docs/content.zh/docs/get-started/quickstart/cdc-up-quickstart-guide.md renamed to docs/content.zh/docs/get-started/quickstart-for-1.20/cdc-up-quickstart-guide.md

File renamed without changes.

docs/content.zh/docs/get-started/quickstart/mysql-to-doris.md renamed to docs/content.zh/docs/get-started/quickstart-for-1.20/mysql-to-doris.md

File renamed without changes.

docs/content.zh/docs/get-started/quickstart/mysql-to-kafka.md renamed to docs/content.zh/docs/get-started/quickstart-for-1.20/mysql-to-kafka.md

File renamed without changes.

docs/content.zh/docs/get-started/quickstart/mysql-to-starrocks.md renamed to docs/content.zh/docs/get-started/quickstart-for-1.20/mysql-to-starrocks.md

File renamed without changes.
