Add configurable ClickHouse query timeout #542
@ShimonSte, before we review this: I saw you extended the repo's docs, which are deprecated. Can you add a link here to the docs PR covering this?
The docs were updated because our tests fail if we introduce a new config without mentioning it in the repo's documentation. I haven't opened the documentation PR yet, since I introduced the old timeout behavior in the open PR and I do not want to tie publishing the docs PR to a new Spark version.
    assert(nodeSpec.database === "testing")
    assert(nodeSpec.options.get("ssl") === "true")
  }
Maybe I missed it, but should we add a test to verify our default timeout value?
We could, I'll add it.
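A rough sketch of what such a check could assert, not the connector's real test. `TimeoutDefaults` and `resolveTimeoutMs` are hypothetical names, and the sketch assumes the option value is a plain millisecond number; only the config key and the 60000 ms default come from this PR.

```scala
object TimeoutDefaults {
  // Key and default value as stated in this PR's summary.
  val QueryTimeoutKey = "spark.clickhouse.client.queryTimeout"
  val DefaultQueryTimeoutMs: Long = 60000L

  // Hypothetical helper: fall back to the default when the key is unset.
  // Assumes the value is a plain number of milliseconds.
  def resolveTimeoutMs(conf: Map[String, String]): Long =
    conf.get(QueryTimeoutKey).map(_.toLong).getOrElse(DefaultQueryTimeoutMs)
}

object TimeoutDefaultsCheck extends App {
  import TimeoutDefaults._

  // An unset config must yield the default.
  assert(resolveTimeoutMs(Map.empty) == 60000L)
  // An explicit value must override the default.
  assert(resolveTimeoutMs(Map(QueryTimeoutKey -> "5000")) == 5000L)
}
```

In the real suite the same two assertions would presumably go through the existing option-parsing path rather than a standalone helper.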
Document under [Unreleased]:
- Added: top-N pushdown (#543), configurable query timeout (#542)
- Fixed: use_time_zone option (#548), nested VariantType via Arrow (#541), filter-expr quoting (#538), useNullableQuerySchema moved to catalog (#516)

Also fix the [Unreleased] compare link (v0.10.0...HEAD) and add the missing [0.10.0] tag link.
Summary
Adds a configurable timeout for ClickHouse client query and ping operations via a new SQL config `spark.clickhouse.client.queryTimeout` (default 60s). (#530)
Changes
- `spark.clickhouse.client.queryTimeout` (`ConfigEntry[Long]`, default 60000 ms), defined in `ClickHouseSQLConf` and exposed on `SparkOptions` for read/write paths.
- `NodeClient`, `NodesClient`, `ClusterClient`: constructors now accept `queryTimeoutMs: Long` (defaulting to `DEFAULT_QUERY_TIMEOUT_MS = 60000L`, preserving prior behavior) and propagate it to `client.query(...).get(timeout, MS)` and `client.ping(timeout)`.
- `ClickHouseCatalog`, `ClickHouseTable`, `ClickHouseCommandRunner`: read from `ClickHouseHelper.clientQueryTimeoutMs` (driver-side `SQLConf`).
- `ClickHouseRead`, `ClickHouseReader`, `ClickHouseWrite`, `ClickHouseWriter`: read from `scanJob.readOptions.clientQueryTimeout` / `writeJob.writeOptions.clientQueryTimeout`, which are serialized into the job description and safe on executors.
- spark-3.3, spark-3.4, spark-3.5, spark-4.0.
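The propagation pattern described in the changes can be sketched roughly as follows. This is an illustration under stated assumptions, not the connector's code: `SketchClient` is a hypothetical stand-in for `NodeClient`, and `runQuery` stands in for the asynchronous `client.query(...)` call. Only the `queryTimeoutMs` parameter name and the `DEFAULT_QUERY_TIMEOUT_MS = 60000L` default are taken from this PR.

```scala
import java.util.concurrent.{CompletableFuture, TimeUnit, TimeoutException}

object SketchClient {
  // Default from the PR description; preserves the prior 60 s behavior.
  val DEFAULT_QUERY_TIMEOUT_MS: Long = 60000L
}

// Hypothetical stand-in for NodeClient: the timeout is a constructor
// parameter with a default, so existing call sites keep compiling unchanged.
class SketchClient(val queryTimeoutMs: Long = SketchClient.DEFAULT_QUERY_TIMEOUT_MS) {

  // `runQuery` stands in for the asynchronous client.query(...) call.
  // The configured timeout bounds how long we block on the future.
  def query(runQuery: () => CompletableFuture[String]): Either[String, String] =
    try Right(runQuery().get(queryTimeoutMs, TimeUnit.MILLISECONDS))
    catch {
      case _: TimeoutException =>
        Left(s"query timed out after $queryTimeoutMs ms")
    }
}
```

Making the timeout a defaulted constructor parameter is what lets driver-side callers pass the `SQLConf` value and executor-side callers pass the serialized read/write option without changing behavior for callers that pass nothing.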