
Commit 41f8726

Merge branch 'apache:dev' into dev
2 parents 024b048 + 3cf09f6 commit 41f8726


104 files changed (+4648, -1030 lines)


.licenserc.yaml

Lines changed: 1 addition & 0 deletions
@@ -38,6 +38,7 @@ header:
 - '**/*.ini'
 - '**/*.svg'
 - '**/*.txt'
+- '**/*.csv'
 - '**/.gitignore'
 - '**/LICENSE'
 - '**/NOTICE'

docs/en/connector-v2/sink/Doris.md

Lines changed: 1 addition & 1 deletion
@@ -47,7 +47,7 @@ The internal implementation of Doris sink connector is cached and imported by st
 | table | String | Yes | - | The table name of `Doris` table, use `${table_name}` to represent the upstream table name |
 | table.identifier | String | Yes | - | The name of `Doris` table, it will deprecate after version 2.3.5, please use `database` and `table` instead. |
 | sink.label-prefix | String | Yes | - | The label prefix used by stream load imports. In the 2pc scenario, global uniqueness is required to ensure the EOS semantics of SeaTunnel. |
-| sink.enable-2pc | bool | No | false | Whether to enable two-phase commit (2pc), the default is false. For two-phase commit, please refer to [here](https://doris.apache.org/docs/dev/sql-manual/sql-statements/Data-Manipulation-Statements/Load/STREAM-LOAD/). |
+| sink.enable-2pc | bool | No | false | Whether to enable two-phase commit (2pc), the default is false. For two-phase commit, please refer to [here](https://doris.apache.org/docs/data-operate/transaction?_highlight=two&_highlight=phase#stream-load-2pc). |
 | sink.enable-delete | bool | No | - | Whether to enable deletion. This option requires Doris table to enable batch delete function (0.15+ version is enabled by default), and only supports Unique model. you can get more detail at this [link](https://doris.apache.org/docs/dev/data-operate/delete/batch-delete-manual/) |
 | sink.check-interval | int | No | 10000 | check exception with the interval while loading |
 | sink.max-retries | int | No | 3 | the max retry times if writing records to database failed |
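For orientation, the options above combine into a SeaTunnel sink block roughly as follows. This is a minimal sketch, not taken from the commit: the database, label prefix, and option values are hypothetical, and connection settings such as the FE endpoint and credentials are omitted.

```hocon
sink {
  Doris {
    # Target database and table; `${table_name}` forwards the upstream table name
    database = "test_db"
    table = "${table_name}"

    # The label prefix must be globally unique when 2pc is enabled,
    # so stream load labels do not collide and EOS semantics hold
    sink.label-prefix = "seatunnel-doris-job"
    sink.enable-2pc = true

    # Deletes require the Unique model with batch delete enabled (default since 0.15)
    sink.enable-delete = true
    sink.check-interval = 10000
    sink.max-retries = 3
  }
}
```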

docs/en/connector-v2/source/Iceberg.md

Lines changed: 29 additions & 21 deletions
@@ -71,11 +71,12 @@ libfb303-xxx.jar
 
 ## Source Options
 
-| Name | Type | Required | Default | Description |
+| Name | Type | Required | Default | Description |
 |--------------------------|---------|----------|----------------------|-------------|
 | catalog_name | string | yes | - | User-specified catalog name. |
 | namespace | string | yes | - | The iceberg database name in the backend catalog. |
-| table | string | yes | - | The iceberg table name in the backend catalog. |
+| table | string | no | - | The iceberg table name in the backend catalog. |
+| table_list | string | no | - | The iceberg table list in the backend catalog. |
 | iceberg.catalog.config | map | yes | - | Specify the properties for initializing the Iceberg catalog, which can be referenced in this file:"https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/CatalogProperties.java" |
 | hadoop.config | map | no | - | Properties passed through to the Hadoop configuration |
 | iceberg.hadoop-conf-path | string | no | - | The specified loading paths for the 'core-site.xml', 'hdfs-site.xml', 'hive-site.xml' files. |
@@ -87,6 +88,7 @@ libfb303-xxx.jar
 | use_snapshot_id | long | no | - | Instructs this scan to look for use the given snapshot ID. |
 | use_snapshot_timestamp | long | no | - | Instructs this scan to look for use the most recent snapshot as of the given time in milliseconds. timestamp – the timestamp in millis since the Unix epoch |
 | stream_scan_strategy | enum | no | FROM_LATEST_SNAPSHOT | Starting strategy for stream mode execution, Default to use `FROM_LATEST_SNAPSHOT` if don’t specify any value,The optional values are:<br/>TABLE_SCAN_THEN_INCREMENTAL: Do a regular table scan then switch to the incremental mode.<br/>FROM_LATEST_SNAPSHOT: Start incremental mode from the latest snapshot inclusive.<br/>FROM_EARLIEST_SNAPSHOT: Start incremental mode from the earliest snapshot inclusive.<br/>FROM_SNAPSHOT_ID: Start incremental mode from a snapshot with a specific id inclusive.<br/>FROM_SNAPSHOT_TIMESTAMP: Start incremental mode from a snapshot with a specific timestamp inclusive. |
+| increment.scan-interval | long | no | 2000 | The interval of increment scan(mills) |
 | common-options | | no | - | Source plugin common parameters, please refer to [Source Common Options](../source-common-options.md) for details. |
 
 ## Task Example
@@ -101,25 +103,6 @@ env {
 
 source {
   Iceberg {
-    schema {
-      fields {
-        f2 = "boolean"
-        f1 = "bigint"
-        f3 = "int"
-        f4 = "bigint"
-        f5 = "float"
-        f6 = "double"
-        f7 = "date"
-        f9 = "timestamp"
-        f10 = "timestamp"
-        f11 = "string"
-        f12 = "bytes"
-        f13 = "bytes"
-        f14 = "decimal(19,9)"
-        f15 = "array<int>"
-        f16 = "map<string, int>"
-      }
-    }
     catalog_name = "seatunnel"
     iceberg.catalog.config={
       type = "hadoop"
@@ -141,6 +124,31 @@ sink {
 }
 ```
 
+### Multi-Table Read:
+
+```hocon
+source {
+  Iceberg {
+    catalog_name = "seatunnel"
+    iceberg.catalog.config = {
+      type = "hadoop"
+      warehouse = "file:///tmp/seatunnel/iceberg/hadoop/"
+    }
+    namespace = "database1"
+    table_list = [
+      {
+        table = "table_1"
+      },
+      {
+        table = "table_2"
+      }
+    ]
+
+    plugin_output = "iceberg"
+  }
+}
+```
+
 ### Hadoop S3 Catalog:
 
 ```hocon
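Alongside the Multi-Table Read example added above, the new `increment.scan-interval` option applies to stream-mode reads, where the source repeatedly scans for newly committed snapshots. Below is a minimal single-table sketch, assuming streaming mode via `job.mode`; the warehouse path and table name are illustrative and not taken from this commit.

```hocon
env {
  # Incremental scanning only applies to streaming jobs; batch jobs scan once
  job.mode = "STREAMING"
}

source {
  Iceberg {
    catalog_name = "seatunnel"
    iceberg.catalog.config = {
      type = "hadoop"
      warehouse = "file:///tmp/seatunnel/iceberg/hadoop/"
    }
    namespace = "database1"
    table = "table_1"

    # Start from the latest snapshot, then check for new snapshots every 2000 ms
    stream_scan_strategy = "FROM_LATEST_SNAPSHOT"
    increment.scan-interval = 2000
  }
}
```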

docs/zh/connector-v2/sink/Doris.md

Lines changed: 1 addition & 1 deletion
@@ -46,7 +46,7 @@ The internal implementation of the Doris sink connector caches and imports data in batches via stream load
 | table | String | Yes | - | The table name of the `Doris` table; use `${table_name}` to represent the upstream table name. |
 | table.identifier | String | Yes | - | The name of the `Doris` table; it will be deprecated after version 2.3.5, please use `database` and `table` instead. |
 | sink.label-prefix | String | Yes | - | The label prefix used by stream load imports. In the 2pc scenario, global uniqueness is required to guarantee the EOS semantics of SeaTunnel. |
-| sink.enable-2pc | bool | No | false | Whether to enable two-phase commit (2pc), the default is false. For two-phase commit, please refer to [here](https://doris.apache.org/docs/dev/sql-manual/sql-statements/Data-Manipulation-Statements/Load/STREAM-LOAD/). |
+| sink.enable-2pc | bool | No | false | Whether to enable two-phase commit (2pc), the default is false. For two-phase commit, please refer to [here](https://doris.apache.org/docs/data-operate/transaction?_highlight=two&_highlight=phase#stream-load-2pc). |
 | sink.enable-delete | bool | No | - | Whether to enable deletion. This option requires the Doris table to have the batch delete feature enabled (enabled by default in version 0.15+) and only supports the Unique model. You can get more details at this [link](https://doris.apache.org/docs/dev/data-operate/delete/batch-delete-manual/). |
 | sink.check-interval | int | No | 10000 | The interval at which to check for exceptions while loading. |
 | sink.max-retries | int | No | 3 | The maximum number of retries when writing records to the database fails. |
