Bug description
When I set the following config, Velox throws an error:
spark.hadoop.fs.s3a.connection.timeout=200000
Error message:
org.apache.gluten.exception.GlutenException: org.apache.gluten.exception.GlutenException: Exception: VeloxUserError
Error Source: USER
Error Code: INVALID_ARGUMENT
Reason: Invalid duration '200000'
Retriable: False
Context: Split [Hive: s3a://xxxxxxxxxx/part-00000-xxxxx.zstd.parquet 0 - 1442] Task Gluten_Stage_1_TID_1_VTID_0
Function: toDuration
File: /home/gitlab-runner/builds/2Grm8K_1/0/gluten/ep/build-velox/build/velox_ep/velox/common/config/Config.cpp
Line: 88
The error comes from the toDuration function, which throws when the value string has no time unit:
https://github.com/facebookincubator/velox/blob/main/velox/common/config/Config.cpp#L88
static const RE2 kPattern(R"(^\s*(\d+(?:\.\d+)?)\s*([a-zA-Z]+)\s*)");
double value;
std::string unit;
if (!RE2::FullMatch(str, kPattern, &value, &unit)) {
  VELOX_USER_FAIL("Invalid duration '{}'", str);
}
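To make the failure easy to reproduce outside Velox, here is a minimal sketch that applies the same pattern with std::regex instead of RE2 (a substitution made only so the snippet has no external dependency). The helper name matchesDurationPattern is hypothetical and is not a Velox function. Like RE2::FullMatch, std::regex_match must match the entire string, so the mandatory ([a-zA-Z]+) group rejects a bare number.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of Velox): returns true when `str` has the
// "<number><unit>" shape required by the pattern quoted in this issue.
// std::regex stands in for RE2 here; regex_match requires a full-string match.
bool matchesDurationPattern(const std::string& str) {
  static const std::regex kPattern(R"(^\s*(\d+(?:\.\d+)?)\s*([a-zA-Z]+)\s*)");
  return std::regex_match(str, kPattern);
}
```

Under this sketch, matchesDurationPattern("200000") is false, while matchesDurationPattern("200000ms") is true, which lines up with the "Invalid duration '200000'" message above.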
Expected behavior:
In vanilla Spark, setting spark.hadoop.fs.s3a.connection.timeout=200000 works.
Velox should accept this value without a time unit as well.
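One possible shape for the requested behavior is a parser that makes the unit optional and falls back to a default unit. The sketch below is not the Velox implementation and not a proposed patch: parseDurationLenient is a hypothetical name, it uses std::regex rather than RE2, it handles only a few units (ms, s, m, h), and it assumes a unit-less value means milliseconds, which is my understanding of how Hadoop documents fs.s3a.connection.timeout.

```cpp
#include <chrono>
#include <optional>
#include <regex>
#include <string>

// Hypothetical helper (not part of Velox): parses "<number>[unit]".
// Assumption: a missing unit is interpreted as milliseconds.
// Only ms, s, m and h are handled; any other unit yields std::nullopt.
std::optional<std::chrono::milliseconds> parseDurationLenient(
    const std::string& str) {
  // Same shape as the quoted pattern, but the unit group may be empty.
  static const std::regex kPattern(R"(^\s*(\d+(?:\.\d+)?)\s*([a-zA-Z]*)\s*$)");
  std::smatch match;
  if (!std::regex_match(str, match, kPattern)) {
    return std::nullopt;
  }
  const double value = std::stod(match[1].str());
  const std::string unit = match[2].str();

  double millis = 0;
  if (unit.empty() || unit == "ms") {
    millis = value;  // unit-less input falls back to milliseconds
  } else if (unit == "s") {
    millis = value * 1000.0;
  } else if (unit == "m") {
    millis = value * 60.0 * 1000.0;
  } else if (unit == "h") {
    millis = value * 60.0 * 60.0 * 1000.0;
  } else {
    return std::nullopt;  // unit not covered by this sketch
  }
  return std::chrono::milliseconds(
      static_cast<std::chrono::milliseconds::rep>(millis));
}
```

With this fallback, parseDurationLenient("200000") and parseDurationLenient("200000ms") both yield 200000 milliseconds. Whether the default unit should be fixed or chosen per config key is a design decision for the maintainers.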