saveAsTable doesn't create a table after updating to Spark 3.1.1 #349
Open
Description
Hello,
I have a problem after updating Spark from 2.4.3 to 3.1.1. Previously I used the following code to save Parquet files and create the corresponding table, and everything worked fine in tests:
df.write
.mode(SaveMode.ErrorIfExists)
.format("parquet").option("path", location)
.saveAsTable(tableFqn)
But after I moved to Spark 3.1.1, the last line stopped creating the corresponding table (in tests, at least). The command spark.table(tableFqn) returns an empty DataFrame.
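To narrow this down, a minimal standalone check along the following lines may help separate "the table was not registered" from "the table exists but reads as empty". This is only a sketch under assumptions: the table name `example_table`, the temporary warehouse and data directories, and the two-row sample DataFrame are hypothetical, and the session uses Spark's default in-memory catalog rather than the Hive-enabled session that the test setup in this issue appears to use.

```scala
import java.nio.file.Files

import org.apache.spark.sql.{SaveMode, SparkSession}

object SaveAsTableCheck {
  def main(args: Array[String]): Unit = {
    // Hypothetical local session; warehouse dir is a throwaway temp directory.
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("saveAsTable-check")
      .config("spark.sql.warehouse.dir", Files.createTempDirectory("warehouse").toString)
      .getOrCreate()
    import spark.implicits._

    // Hypothetical names; the database name is taken from the warnings in the log.
    val db = "enriched_db"
    val tableFqn = s"$db.example_table"
    // A not-yet-existing subdirectory, so the write does not target existing files.
    val location = Files.createTempDirectory("table").resolve("data").toString

    // Make sure the target database exists before writing into it.
    spark.sql(s"CREATE DATABASE IF NOT EXISTS $db")

    val df = Seq((1, "a"), (2, "b")).toDF("id", "value")

    // Same write pattern as in the issue.
    df.write
      .mode(SaveMode.ErrorIfExists)
      .format("parquet")
      .option("path", location)
      .saveAsTable(tableFqn)

    // Is the table registered in the catalog of this session?
    println(s"table exists: ${spark.catalog.tableExists(tableFqn)}")
    // Does reading it back return the rows that were written?
    println(s"row count: ${spark.table(tableFqn).count()}")
    // Shows the provider and the location the catalog recorded for the table.
    spark.sql(s"DESCRIBE FORMATTED $tableFqn").show(100, truncate = false)

    spark.stop()
  }
}
```

If the table exists here but not in the failing test, comparing the SparkSession that writes with the one that reads (same session, same catalog, same warehouse directory) would be a reasonable next step.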
I also get new warnings, and I suspect they point to the root cause of the problem:
2022-02-24 18:37:29.736 [WARN] SparkContext - Using an existing SparkContext; some configuration may not take effect. <ScalaTest-run>
2022-02-24 18:37:33.148 [WARN] HiveConf - HiveConf of name hive.stats.jdbc.timeout does not exist <ScalaTest-run>
2022-02-24 18:37:33.148 [WARN] HiveConf - HiveConf of name hive.stats.retries.wait does not exist <ScalaTest-run>
2022-02-24 18:37:36.709 [WARN] ObjectStore - Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 2.3.0 <ScalaTest-run>
2022-02-24 18:37:36.709 [WARN] ObjectStore - setMetaStoreSchemaVersion called but recording version is disabled: version = 2.3.0, comment = Set by MetaStore [email protected] <ScalaTest-run>
2022-02-24 18:37:36.725 [WARN] ObjectStore - Failed to get database default, returning NoSuchObjectException <ScalaTest-run>
2022-02-24 18:37:37.158 [WARN] ObjectStore - Failed to get database enriched_db, returning NoSuchObjectException <ScalaTest-run>
2022-02-24 18:37:37.174 [WARN] ObjectStore - Failed to get database enriched_db, returning NoSuchObjectException <ScalaTest-run>
2022-02-24 18:37:37.213 [WARN] ObjectStore - Failed to get database global_temp, returning NoSuchObjectException <ScalaTest-run>
2022-02-24 18:37:37.217 [WARN] ObjectStore - Failed to get database enriched_db, returning NoSuchObjectException <ScalaTest-run>
Full log output:
2022-02-24 18:37:29.736 [WARN] SparkContext - Using an existing SparkContext; some configuration may not take effect. <ScalaTest-run>
2022-02-24 18:37:33.148 [WARN] HiveConf - HiveConf of name hive.stats.jdbc.timeout does not exist <ScalaTest-run>
2022-02-24 18:37:33.148 [WARN] HiveConf - HiveConf of name hive.stats.retries.wait does not exist <ScalaTest-run>
2022-02-24 18:37:36.709 [WARN] ObjectStore - Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 2.3.0 <ScalaTest-run>
2022-02-24 18:37:36.709 [WARN] ObjectStore - setMetaStoreSchemaVersion called but recording version is disabled: version = 2.3.0, comment = Set by MetaStore [email protected] <ScalaTest-run>
2022-02-24 18:37:36.725 [WARN] ObjectStore - Failed to get database default, returning NoSuchObjectException <ScalaTest-run>
2022-02-24 18:37:37.158 [WARN] ObjectStore - Failed to get database enriched_db, returning NoSuchObjectException <ScalaTest-run>
2022-02-24 18:37:37.174 [WARN] ObjectStore - Failed to get database enriched_db, returning NoSuchObjectException <ScalaTest-run>
2022-02-24 18:37:37.213 [WARN] ObjectStore - Failed to get database global_temp, returning NoSuchObjectException <ScalaTest-run>
2022-02-24 18:37:37.217 [WARN] ObjectStore - Failed to get database enriched_db, returning NoSuchObjectException <ScalaTest-run>
2022-02-24 18:37:38.694 [WARN] package - Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'. <ScalaTest-run-running-KimtoxTest>
2022-02-24 18:37:43.584 [WARN] ProcfsMetricsGetter - Exception when trying to compute pagesize, as a result reporting of ProcessTree metrics is stopped <driver-heartbeater>
2022-02-24 18:37:44.381 [WARN] SessionState - METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory. <ScalaTest-run-running-KimtoxTest>
2022-02-24 18:37:44.448 [WARN] HiveConf - HiveConf of name hive.internal.ss.authz.settings.applied.marker does not exist <ScalaTest-run-running-KimtoxTest>
2022-02-24 18:37:44.448 [WARN] HiveConf - HiveConf of name hive.stats.jdbc.timeout does not exist <ScalaTest-run-running-KimtoxTest>
2022-02-24 18:37:44.448 [WARN] HiveConf - HiveConf of name hive.stats.retries.wait does not exist <ScalaTest-run-running-KimtoxTest>
-chgrp: '<MY_COMPANY_NAME>\<MY_USERNAME>' does not match expected pattern for group
Usage: hadoop fs [generic options]
[-appendToFile <localsrc> ... <dst>]
[-cat [-ignoreCrc] <src> ...]
[-checksum <src> ...]
[-chgrp [-R] GROUP PATH...]
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
[-copyFromLocal [-f] [-p] [-l] [-d] <localsrc> ... <dst>]
[-copyToLocal [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-count [-q] [-h] [-v] [-t [<storage type>]] [-u] [-x] <path> ...]
[-cp [-f] [-p | -p[topax]] [-d] <src> ... <dst>]
[-createSnapshot <snapshotDir> [<snapshotName>]]
[-deleteSnapshot <snapshotDir> <snapshotName>]
[-df [-h] [<path> ...]]
[-du [-s] [-h] [-x] <path> ...]
[-expunge]
[-find <path> ... <expression> ...]
[-get [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-getfacl [-R] <path>]
[-getfattr [-R] {-n name | -d} [-e en] <path>]
[-getmerge [-nl] [-skip-empty-file] <src> <localdst>]
[-help [cmd ...]]
[-ls [-C] [-d] [-h] [-q] [-R] [-t] [-S] [-r] [-u] [<path> ...]]
[-mkdir [-p] <path> ...]
[-moveFromLocal <localsrc> ... <dst>]
[-moveToLocal <src> <localdst>]
[-mv <src> ... <dst>]
[-put [-f] [-p] [-l] [-d] <localsrc> ... <dst>]
[-renameSnapshot <snapshotDir> <oldName> <newName>]
[-rm [-f] [-r|-R] [-skipTrash] [-safely] <src> ...]
[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
[-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
[-setfattr {-n name [-v value] | -x name} <path>]
[-setrep [-R] [-w] <rep> <path> ...]
[-stat [format] <path> ...]
[-tail [-f] <file>]
[-test -[defsz] <path>]
[-text [-ignoreCrc] <src> ...]
[-touchz <path> ...]
[-truncate [-w] <length> <path> ...]
[-usage [cmd ...]]
Generic options supported are
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port> specify a ResourceManager
-files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars> specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
command [genericOptions] [commandOptions]
Usage: hadoop fs [generic options] -chgrp [-R] GROUP PATH...
2022-02-24 18:37:46.942 [WARN] ApacheUtils - NoSuchMethodException was thrown when disabling normalizeUri. This indicates you are using an old version (< 4.5.8) of Apache http client. It is recommended to use http client version >= 4.5.9 to avoid the breaking change introduced in apache client 4.5.7 and the latency in exception handling. See https://github.com/aws/aws-sdk-java/issues/1919 for more information <ScalaTest-run-running-FactActualsToEnrichedIntegrationTest>
.....
[]: Expected 5 values but got 0
java.lang.AssertionError: []: Expected 5 values but got 0
Does anybody have any ideas about this behavior?
Versions:
Scala - 2.12.15
Spark - 3.1.1
spark-testing-base_2.12 - 3.1.1_1.1.1