Commit 93bba1a

[Improve][Docs] Iceberg connector adds parameter documentation related to Kerberos authentication (#10704)
1 parent: 815cfc0

4 files changed

Lines changed: 186 additions & 5 deletions

docs/en/connectors/sink/Iceberg.md

Lines changed: 53 additions & 0 deletions
```diff
@@ -80,6 +80,23 @@ libfb303-xxx.jar
 | data_save_mode | Enum | no | APPEND_DATA | the data save mode, please refer to `data_save_mode` below |
 | custom_sql | string | no | - | Custom `delete` data sql for data save mode. e.g: `delete from ... where ...` |
 | iceberg.table.commit-branch | string | no | - | Default branch for commits |
+| krb5_path | string | no | /etc/krb5.conf | The path of `krb5.conf`, used for Kerberos authentication. |
+| kerberos_principal | string | no | - | The principal for Kerberos authentication. |
+| kerberos_keytab_path | string | no | - | The keytab file path for Kerberos authentication. |
+
+## Sink Option descriptions
+
+### krb5_path [string]
+
+The path of `krb5.conf`, used for Kerberos authentication.
+
+### kerberos_principal [string]
+
+The principal for Kerberos authentication.
+
+### kerberos_keytab_path [string]
+
+The keytab file path for Kerberos authentication.
 
 ## Task Example
 
```

````diff
@@ -234,6 +251,42 @@ sink {
 }
 ```
 
+### Kerberos Authentication
+
+The following example demonstrates how to configure the Iceberg sink with Kerberos authentication when using a Hadoop catalog with HDFS:
+
+```hocon
+sink {
+  Iceberg {
+    catalog_name = "seatunnel_test"
+    iceberg.catalog.config = {
+      type = "hadoop"
+      warehouse = "hdfs://your_cluster/tmp/seatunnel/iceberg/"
+    }
+    namespace = "seatunnel_namespace"
+    table = "iceberg_sink_table"
+    iceberg.table.write-props = {
+      write.format.default = "parquet"
+      write.target-file-size-bytes = 536870912
+    }
+    krb5_path = "/etc/krb5.conf"
+    kerberos_principal = "hive/your_host@EXAMPLE.COM"
+    kerberos_keytab_path = "/path/to/your.keytab"
+    iceberg.table.primary-keys = "id"
+    iceberg.table.partition-keys = "f_datetime"
+    iceberg.table.upsert-mode-enabled = true
+    iceberg.table.schema-evolution-enabled = true
+    case_sensitive = true
+  }
+}
+```
+
+Description:
+
+- `krb5_path`: The path to the `krb5.conf` file used for Kerberos authentication.
+- `kerberos_principal`: The principal for Kerberos authentication, in the format `primary/instance@REALM`.
+- `kerberos_keytab_path`: The keytab file path for Kerberos authentication.
+
 ### Multiple table
 
 #### example1
````
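The patch documents the sink-side Kerberos options only with a Hadoop catalog example. If the same three options also apply when a Hive catalog is used (an assumption, not stated in the patch), a minimal sketch would look like the following; the `uri` key and all host, path, and principal values are hypothetical placeholders:

```hocon
# Hedged sketch: Hive catalog combined with the Kerberos options above.
# The metastore URI and all host/path/principal values are placeholders.
sink {
  Iceberg {
    catalog_name = "seatunnel_hive"
    iceberg.catalog.config = {
      type = "hive"
      uri = "thrift://metastore_host:9083"
      warehouse = "hdfs://your_cluster/user/hive/warehouse"
    }
    namespace = "seatunnel_namespace"
    table = "iceberg_sink_table"
    krb5_path = "/etc/krb5.conf"
    kerberos_principal = "hive/your_host@EXAMPLE.COM"
    kerberos_keytab_path = "/path/to/your.keytab"
  }
}
```

If correct, only the `iceberg.catalog.config` block would change between catalog types; the three authentication options keep their documented defaults and formats.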

docs/en/connectors/source/Iceberg.md

Lines changed: 35 additions & 4 deletions
```diff
@@ -89,10 +89,13 @@ libfb303-xxx.jar
 | end_snapshot_id | long | no | - | Instructs this scan to look for changes up to a particular snapshot (inclusive). |
 | use_snapshot_id | long | no | - | Instructs this scan to use the given snapshot ID. |
 | use_snapshot_timestamp | long | no | - | Instructs this scan to use the most recent snapshot as of the given time, in milliseconds since the Unix epoch. |
-| stream_scan_strategy | enum | no | FROM_LATEST_SNAPSHOT | Starting strategy for stream mode execution, Default to use `FROM_LATEST_SNAPSHOT` if dont specify any value,The optional values are:<br/>TABLE_SCAN_THEN_INCREMENTAL: Do a regular table scan then switch to the incremental mode.<br/>FROM_LATEST_SNAPSHOT: Start incremental mode from the latest snapshot inclusive.<br/>FROM_EARLIEST_SNAPSHOT: Start incremental mode from the earliest snapshot inclusive.<br/>FROM_SNAPSHOT_ID: Start incremental mode from a snapshot with a specific id inclusive.<br/>FROM_SNAPSHOT_TIMESTAMP: Start incremental mode from a snapshot with a specific timestamp inclusive. |
+| stream_scan_strategy | enum | no | FROM_LATEST_SNAPSHOT | Starting strategy for stream mode execution. Defaults to `FROM_LATEST_SNAPSHOT` if no value is specified. The optional values are:<br/>TABLE_SCAN_THEN_INCREMENTAL: Do a regular table scan, then switch to incremental mode.<br/>FROM_LATEST_SNAPSHOT: Start incremental mode from the latest snapshot (inclusive).<br/>FROM_EARLIEST_SNAPSHOT: Start incremental mode from the earliest snapshot (inclusive).<br/>FROM_SNAPSHOT_ID: Start incremental mode from a snapshot with a specific id (inclusive).<br/>FROM_SNAPSHOT_TIMESTAMP: Start incremental mode from a snapshot with a specific timestamp (inclusive). |
 | increment.scan-interval | long | no | 2000 | The interval of the incremental scan, in milliseconds |
 | common-options | | no | - | Source plugin common parameters, please refer to [Source Common Options](../common-options/source-common-options.md) for details. |
-| query | String | no | - | The select DML to select the iceberg data. It mustn't contain the table name, and doesn't support alias. For example: `select * from table where f1 > 100`, `select fn from table where f1 > 100`. The current support for the LIKE syntax is limited: the LIKE clause shouldn't start with `%`. The supported one is: `select f1 from t where f2 like 'tom%' ` |
+| query | String | no | - | The select DML used to read the Iceberg data. It must not contain the table name and does not support aliases. For example: `select * from table where f1 > 100`, `select fn from table where f1 > 100`. Support for the LIKE syntax is currently limited: the LIKE pattern should not start with `%`. A supported example: `select f1 from t where f2 like 'tom%'` |
+| krb5_path | string | no | /etc/krb5.conf | The path to the `krb5.conf` file for Kerberos authentication. |
+| kerberos_principal | string | no | - | The principal for Kerberos authentication. |
+| kerberos_keytab_path | string | no | - | The path to the keytab file for Kerberos authentication. |
 
 
 ## Task Example
```
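The streaming options in the table above (`stream_scan_strategy`, `increment.scan-interval`) can be combined into a source config. A hedged sketch; the catalog values are placeholders and the `env` job-mode block is an assumption carried over from typical SeaTunnel jobs, not from this patch:

```hocon
# Hedged sketch: streaming read using options from the table above.
# Catalog/host values are placeholders; the env block layout is assumed.
env {
  job.mode = "STREAMING"
}
source {
  Iceberg {
    catalog_name = "seatunnel"
    iceberg.catalog.config = {
      type = "hadoop"
      warehouse = "hdfs://your_cluster/tmp/seatunnel/iceberg/"
    }
    namespace = "your_iceberg_database"
    table = "your_iceberg_table"
    # Start incremental mode from the earliest snapshot (inclusive)
    stream_scan_strategy = "FROM_EARLIEST_SNAPSHOT"
    # Poll for new snapshots every 5 seconds (default is 2000 ms)
    increment.scan-interval = 5000
  }
}
```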
```diff
@@ -149,7 +152,7 @@ source {
       query = "select fn from table where f1 > 100"
     }
   ]
-
+
   plugin_output = "iceberg"
   }
 }
```
````diff
@@ -191,13 +194,41 @@ source {
       warehouse = "hdfs://your_cluster//tmp/seatunnel/iceberg/"
     }
     catalog_type = "hive"
-
+
+    namespace = "your_iceberg_database"
+    table = "your_iceberg_table"
+  }
+}
+```
+
+### Kerberos Authentication
+
+The following example demonstrates how to configure Kerberos authentication for the Iceberg source when using a Hadoop catalog and HDFS:
+
+```hocon
+source {
+  Iceberg {
+    catalog_name = "seatunnel"
+    iceberg.catalog.config = {
+      type = "hadoop"
+      warehouse = "hdfs://your_cluster/tmp/seatunnel/iceberg/"
+    }
     namespace = "your_iceberg_database"
     table = "your_iceberg_table"
+    krb5_path = "/etc/krb5.conf"
+    kerberos_principal = "hive/your_host@EXAMPLE.COM"
+    kerberos_keytab_path = "/path/to/your.keytab"
+    plugin_output = "iceberg_kerberos"
   }
 }
 ```
 
+Description:
+
+- `krb5_path`: The path to the `krb5.conf` file for Kerberos authentication.
+- `kerberos_principal`: The principal for Kerberos authentication, in the format `primary/instance@REALM`.
+- `kerberos_keytab_path`: The path to the keytab file for Kerberos authentication.
+
 ### Column Projection
 
 ```hocon
````
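One detail worth surfacing from the source option table: `krb5_path` has a documented default of `/etc/krb5.conf`, so the source-side Kerberos configuration can be trimmed when the system default location is in use. A minimal sketch; host names and paths are placeholders:

```hocon
# Hedged sketch: krb5_path is omitted and falls back to its documented
# default of /etc/krb5.conf; all host/path values are placeholders.
source {
  Iceberg {
    catalog_name = "seatunnel"
    iceberg.catalog.config = {
      type = "hadoop"
      warehouse = "hdfs://your_cluster/tmp/seatunnel/iceberg/"
    }
    namespace = "your_iceberg_database"
    table = "your_iceberg_table"
    kerberos_principal = "hive/your_host@EXAMPLE.COM"
    kerberos_keytab_path = "/path/to/your.keytab"
  }
}
```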

docs/zh/connectors/sink/Iceberg.md

Lines changed: 54 additions & 1 deletion
```diff
@@ -80,6 +80,23 @@ libfb303-xxx.jar
 | data_save_mode | Enum | no | APPEND_DATA | The data save mode; please refer to `data_save_mode` below |
 | custom_sql | string | no | - | Custom `delete` SQL statement used by the data save mode. E.g. `delete from ... where ...` |
 | iceberg.table.commit-branch | string | no | - | Default branch for commits |
+| krb5_path | string | no | /etc/krb5.conf | The path of the `krb5.conf` file, used for Kerberos authentication. |
+| kerberos_principal | string | no | - | The principal for Kerberos authentication. |
+| kerberos_keytab_path | string | no | - | The keytab file path for Kerberos authentication. |
+
+## Sink Option descriptions
+
+### krb5_path [string]
+
+The path of the `krb5.conf` file, used for Kerberos authentication.
+
+### kerberos_principal [string]
+
+The principal for Kerberos authentication.
+
+### kerberos_keytab_path [string]
+
+The keytab file path for Kerberos authentication.
 
 ## Task Example
 
```
````diff
@@ -207,6 +224,42 @@ sink {
 }
 ```
 
+### Kerberos Authentication
+
+The following example demonstrates how to configure Kerberos authentication for the Iceberg sink when using a Hadoop catalog and HDFS:
+
+```hocon
+sink {
+  Iceberg {
+    catalog_name = "seatunnel_test"
+    iceberg.catalog.config = {
+      type = "hadoop"
+      warehouse = "hdfs://your_cluster/tmp/seatunnel/iceberg/"
+    }
+    namespace = "seatunnel_namespace"
+    table = "iceberg_sink_table"
+    iceberg.table.write-props = {
+      write.format.default = "parquet"
+      write.target-file-size-bytes = 536870912
+    }
+    krb5_path = "/etc/krb5.conf"
+    kerberos_principal = "hive/your_host@EXAMPLE.COM"
+    kerberos_keytab_path = "/path/to/your.keytab"
+    iceberg.table.primary-keys = "id"
+    iceberg.table.partition-keys = "f_datetime"
+    iceberg.table.upsert-mode-enabled = true
+    iceberg.table.schema-evolution-enabled = true
+    case_sensitive = true
+  }
+}
+```
+
+Description:
+
+- `krb5_path`: The path to the `krb5.conf` file used for Kerberos authentication.
+- `kerberos_principal`: The principal for Kerberos authentication, in the format `primary/instance@REALM`.
+- `kerberos_keytab_path`: The keytab file path for Kerberos authentication.
+
 ### Multiple table (multi-table writes)
 
 #### Example 1
````
```diff
@@ -223,7 +276,7 @@ source {
     url = "jdbc:mysql://127.0.0.1:3306/seatunnel"
     username = "root"
     password = "******"
-
+
     table-names = ["seatunnel.role","seatunnel.user","galileo.Bucket"]
   }
 }
```
