Commit 948d588

[Feature][connector-elasticsearch] elasticsearch source support PIT (#9150)
1 parent 879b1e2 commit 948d588

14 files changed, +634 -27 lines changed

Diff for: docs/en/connector-v2/source/Elasticsearch.md

+45 -6
@@ -30,8 +30,9 @@ support version >= 2.x and <= 8.x.
 | index_list | array | no | used to define a multiple table task |
 | source | array | no | - |
 | query | json | no | {"match_all": {}} |
-| search_type | json | no | Search method,sql or dsl,default dsl |
-| sql_query | json | no | sql query |
+| search_type | enum | no | Query type, SQL or DSL, default DSL |
+| search_api_type | enum | no | Pagination API type, SCROLL or PIT, default SCROLL |
+| sql_query | json | no | SQL query, required when search_type is SQL |
 | scroll_time | string | no | 1m |
 | scroll_size | int | no | 100 |
 | tls_verify_certificate | boolean | no | true |
@@ -41,6 +42,8 @@ support version >= 2.x and <= 8.x.
 | tls_keystore_password | string | no | - |
 | tls_truststore_path | string | no | - |
 | tls_truststore_password | string | no | - |
+| pit_keep_alive | long | no | 60000 (1 minute) |
+| pit_batch_size | int | no | 100 |
 | common-options | | no | - |


@@ -113,6 +116,22 @@ The path to PEM or JKS trust store. This file must be readable by the operating

 The key password for the trust store specified

+### search_type [enum]
+Query type, available values:
+- DSL: Use Domain Specific Language query (default)
+- SQL: Use SQL query
+
+### search_api_type [enum]
+Pagination API type, available values:
+- SCROLL: Use Scroll API for pagination (default)
+- PIT: Use Point in Time (PIT) API for pagination
+
+### pit_keep_alive [long]
+How long, in milliseconds, the PIT should be kept alive
+
+### pit_batch_size [int]
+Maximum number of hits to be returned with each PIT search request
+
 ### common options

 Source plugin common parameters, please refer to [Source Common Options](../source-common-options.md) for details
@@ -177,7 +196,7 @@ source {
 c_date2,
 c_null
 ]
-
+
 }

 ]
@@ -214,7 +233,7 @@ source {
 hosts = ["https://localhost:9200"]
 username = "elastic"
 password = "elasticsearch"
-
+
 tls_verify_certificate = false
 }
 }
@@ -228,7 +247,7 @@ source {
 hosts = ["https://localhost:9200"]
 username = "elastic"
 password = "elasticsearch"
-
+
 tls_verify_hostname = false
 }
 }
@@ -242,7 +261,7 @@ source {
 hosts = ["https://localhost:9200"]
 username = "elastic"
 password = "elasticsearch"
-
+
 tls_keystore_path = "${your elasticsearch home}/config/certs/http.p12"
 tls_keystore_password = "${your password}"
 }
@@ -266,6 +285,26 @@ source {
 }
 ```

+Demo7: PIT
+```hocon
+source {
+    Elasticsearch {
+        hosts = ["https://elasticsearch:9200"]
+        username = "elastic"
+        password = "elasticsearch"
+        tls_verify_certificate = false
+        tls_verify_hostname = false
+
+        index = "st_index"
+        query = {"range": {"c_int": {"gte": 10, "lte": 20}}}
+
+        # Use DSL query with PIT API
+        search_type = DSL
+        search_api_type = PIT
+        pit_keep_alive = 60000 # 1 minute in milliseconds
+        pit_batch_size = 100
+    }
+}
+```
+
 ## Changelog

 <ChangeLog />
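
For readers unfamiliar with the PIT option this diff documents, here is a minimal Python sketch of how point-in-time pagination with `search_after` generally proceeds. It is not the connector's Java implementation. It assumes a PIT has already been opened, and that `search` is a hypothetical callable which sends a search request body and returns the parsed JSON response. The names `paginate_with_pit` and `make_fake_search` are illustrative only, and sorting on `_shard_doc` is an assumption about a suitable tiebreaker. `keep_alive_ms` and `batch_size` play the roles of `pit_keep_alive` and `pit_batch_size`.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

SearchFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def paginate_with_pit(
    search: SearchFn,
    pit_id: str,
    query: Dict[str, Any],
    keep_alive_ms: int = 60000,
    batch_size: int = 100,
) -> Iterator[Dict[str, Any]]:
    """Yield every hit by walking pages with `search_after` inside one PIT."""
    search_after: Optional[List[Any]] = None
    while True:
        body: Dict[str, Any] = {
            "size": batch_size,
            "query": query,
            # Each request extends the PIT lifetime by keep_alive.
            "pit": {"id": pit_id, "keep_alive": f"{keep_alive_ms}ms"},
            # Assumed tiebreaker sort; any total order over hits would do.
            "sort": [{"_shard_doc": "asc"}],
        }
        if search_after is not None:
            body["search_after"] = search_after
        response = search(body)
        # The server may hand back a refreshed PIT id; use it for the next page.
        pit_id = response.get("pit_id", pit_id)
        hits = response["hits"]["hits"]
        if not hits:
            return
        yield from hits
        # Resume after the sort values of the last hit on this page.
        search_after = hits[-1]["sort"]


def make_fake_search(docs: List[Dict[str, Any]]) -> SearchFn:
    """In-memory stand-in for a cluster, used only to exercise the loop."""

    def search(body: Dict[str, Any]) -> Dict[str, Any]:
        after = body.get("search_after")
        start = 0 if after is None else after[0] + 1
        page = docs[start:start + body["size"]]
        hits = [{"_source": d, "sort": [start + i]} for i, d in enumerate(page)]
        return {"pit_id": body["pit"]["id"], "hits": {"hits": hits}}

    return search
```

Calling `list(paginate_with_pit(make_fake_search(docs), "fake-pit-id", {"match_all": {}}))` walks the fake index in pages of 100. Against a real cluster, `search` would wrap an HTTP call, and the PIT should be closed once iteration finishes.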

Diff for: docs/zh/connector-v2/source/Elasticsearch.md

+45 -6
@@ -28,8 +28,9 @@ import ChangeLog from '../changelog/connector-elasticsearch.md';
 | index_list | array | no | Used to define a multi-index synchronization task |
 | source | array | no | - |
 | query | json | no | {"match_all": {}} |
-| search_type | json | no | Query method, sql or dsl, default dsl |
-| sql_query | json | no | SQL query statement |
+| search_type | enum | no | Query type, SQL or DSL, default DSL |
+| search_api_type | enum | no | Pagination API type, SCROLL or PIT, default SCROLL |
+| sql_query | json | no | SQL query statement, required when search_type is SQL |
 | scroll_time | string | no | 1m |
 | scroll_size | int | no | 100 |
 | tls_verify_certificate | boolean | no | true |
@@ -39,6 +40,8 @@ import ChangeLog from '../changelog/connector-elasticsearch.md';
 | tls_keystore_password | string | no | - |
 | tls_truststore_path | string | no | - |
 | tls_truststore_password | string | no | - |
+| pit_keep_alive | long | no | 60000 (1 minute) |
+| pit_batch_size | int | no | 100 |
 | common-options | | no | - |

 ### hosts [array]
@@ -115,6 +118,22 @@ PEM 或 JKS 信任库的路径。该文件必须对运行 SeaTunnel 的操作系

 The key password for the specified trust store.

+### search_type [enum]
+Query type, available values:
+- DSL: Use a Domain Specific Language query (default)
+- SQL: Use a SQL query
+
+### search_api_type [enum]
+Pagination API type, available values:
+- SCROLL: Use the Scroll API for pagination (default)
+- PIT: Use the Point in Time (PIT) API for pagination
+
+### pit_keep_alive [long]
+How long, in milliseconds, the PIT should be kept alive
+
+### pit_batch_size [int]
+Maximum number of hits returned by each PIT search request
+
 ### common options

 Common parameters of the Source plugin; for details, see [Source Common Options](../source-common-options.md)
@@ -180,7 +199,7 @@ source {
 c_date2,
 c_null
 ]
-
+
 }

 ]
@@ -215,7 +234,7 @@ source {
 hosts = ["https://localhost:9200"]
 username = "elastic"
 password = "elasticsearch"
-
+
 tls_verify_certificate = false
 }
 }
@@ -229,7 +248,7 @@ source {
 hosts = ["https://localhost:9200"]
 username = "elastic"
 password = "elasticsearch"
-
+
 tls_verify_hostname = false
 }
 }
@@ -243,7 +262,7 @@ source {
 hosts = ["https://localhost:9200"]
 username = "elastic"
 password = "elasticsearch"
-
+
 tls_keystore_path = "${your elasticsearch home}/config/certs/http.p12"
 tls_keystore_password = "${your password}"
 }
@@ -267,6 +286,26 @@ source {
 }
 ```

+Demo7: Paginated query using PIT
+```hocon
+source {
+    Elasticsearch {
+        hosts = ["https://elasticsearch:9200"]
+        username = "elastic"
+        password = "elasticsearch"
+        tls_verify_certificate = false
+        tls_verify_hostname = false
+
+        index = "st_index"
+        query = {"range": {"c_int": {"gte": 10, "lte": 20}}}
+
+        # Use DSL query with the PIT API
+        search_type = DSL
+        search_api_type = PIT
+        pit_keep_alive = 60000 # 1 minute in milliseconds
+        pit_batch_size = 100
+    }
+}
+```
+
 ## Changelog

 <ChangeLog />
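
The option tables in both doc diffs imply a few defaults and one cross-field rule (`sql_query` is required when `search_type` is SQL). The Python sketch below spells those out for illustration only. It is not the connector's validation code. The function name `resolve_options` is hypothetical, and treating enum values case-insensitively is an assumption, not something the docs state.

```python
from typing import Any, Dict

# Defaults as listed in the option tables of the documented diff.
DEFAULTS: Dict[str, Any] = {
    "search_type": "DSL",
    "search_api_type": "SCROLL",
    "scroll_time": "1m",
    "scroll_size": 100,
    "pit_keep_alive": 60000,  # milliseconds
    "pit_batch_size": 100,
}


def resolve_options(user_options: Dict[str, Any]) -> Dict[str, Any]:
    """Merge user options over the defaults and check the documented rules."""
    options = {**DEFAULTS, **user_options}
    # Assumption: enum values are compared case-insensitively.
    search_type = str(options["search_type"]).upper()
    api_type = str(options["search_api_type"]).upper()
    if search_type not in ("DSL", "SQL"):
        raise ValueError(f"search_type must be DSL or SQL, got {search_type!r}")
    if api_type not in ("SCROLL", "PIT"):
        raise ValueError(f"search_api_type must be SCROLL or PIT, got {api_type!r}")
    if search_type == "SQL" and not options.get("sql_query"):
        raise ValueError("sql_query is required when search_type is SQL")
    options["search_type"] = search_type
    options["search_api_type"] = api_type
    return options
```

With this sketch, `resolve_options({"search_api_type": "PIT"})` keeps `search_type` at `DSL` and fills in `pit_keep_alive = 60000` and `pit_batch_size = 100`, mirroring the Demo7 configuration.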
