-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Core, Hive: Double check commit status in case of commit conflict for NoLock #12637
base: main
Are you sure you want to change the base?
Conversation
I need to make these changes to enable a test with NoLock:
If these changes turn out to be unacceptable, I suppose we can just mock the exception in the HMS client. |
core/src/main/java/org/apache/iceberg/BaseMetastoreOperations.java
Outdated
Show resolved
Hide resolved
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java
Outdated
Show resolved
Hide resolved
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java
Show resolved
Hide resolved
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java
Outdated
Show resolved
Hide resolved
* @return Empty if locations cannot be checked, e.g. unable to refresh. True if the new location | ||
* is committed, false otherwise. | ||
*/ | ||
protected Optional<Boolean> metadataLocationCommitted( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of returning an Optional<Boolean>
, would it be better to return the current commit status?
Like:
/**
* Attempt to load the content and see if any current or past metadata location matches the one we
* were attempting to set. This is used as a last resort when we are dealing with exceptions that
* may indicate the commit has failed and don't have proof that this is the case, but we can be sure that
* no retry attempts for the commit will be successful later. Note that all the previous locations must also
* be searched on the chance that a second committer was able to successfully commit on top of our commit.
* When the {@code newMetadataLocation} is not in the history the method returns
* {@link CommitStatus.FAILURE}, when the {@link commitStatusSupplier} fails repeatedly the method returns
* {@link CommitStatus.UNKNOWN}.
*
* @param tableOrViewName full name of the Table/View
* @param newMetadataLocation the path of the new commit file
* @param properties properties for retry
* @param commitStatusSupplier check if the latest metadata presents or not using metadata
* location for table.
* @return Commit Status of Success, Failure or Unknown
*/
protected CommitStatus checkCommitStatusStrict(
String tableOrViewName,
String newMetadataLocation,
Map<String, String> properties,
Supplier<Boolean> commitStatusSupplier) {
And we need to update the javadoc for the checkCommitStatus
too, to describe that it will never return CommitStatus.FAILURE
, like:
/**
* Attempt to load the content and see if any current or past metadata location matches the one we
* were attempting to set. This is used as a last resort when we are dealing with exceptions that
* may indicate the commit has failed but don't have proof that this is the case. Note that all
* the previous locations must also be searched on the chance that a second committer was able to
* successfully commit on top of our commit. When the {@code newMetadataLocation} is not in the
* history or the {@link commitStatusSupplier} fails repeatedly the method returns
* {@link CommitStatus.UNKNOWN}, because possible pending retires might still commit the change.
*
* @param tableOrViewName full name of the Table/View
* @param newMetadataLocation the path of the new commit file
* @param properties properties for retry
* @param commitStatusSupplier check if the latest metadata presents or not using metadata
* location for table.
* @return Commit Status of Success or Unknown
*/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah but I would like checkCommitStatusStrict
and checkCommitStatus
to share the same underlying check logic. The difference is checkCommitStatusStrict
treats false as CommitStatus.FAILURE
while checkCommitStatus
treats false as CommitStatus.UNKNOWN
. So I think we can keep the method returning Optional<Boolean>
, but make it private?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or could we just call the checkCommitStatusStrict
inside the checkCommitStatus
and reinterpret the result?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yes, good idea!
core/src/main/java/org/apache/iceberg/BaseMetastoreTableOperations.java
Outdated
Show resolved
Hide resolved
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java
Outdated
Show resolved
Hide resolved
metastore = new TestHiveMetastore(); | ||
HiveConf hiveConfWithOverrides = new HiveConf(TestHiveMetastore.class); | ||
if (hiveConfOverride != null) { | ||
for (Map.Entry<String, String> kv : hiveConfOverride.entrySet()) { | ||
hiveConfWithOverrides.set(kv.getKey(), kv.getValue()); | ||
} | ||
} | ||
metastore = new TestHiveMetastore(hiveConfOverride); | ||
|
||
metastore.start(hiveConfWithOverrides); | ||
metastoreClient = new HiveMetaStoreClient(hiveConfWithOverrides); | ||
metastore.start(new HiveConf(TestHiveMetastore.class)); | ||
metastoreClient = new HiveMetaStoreClient(metastore.hiveConf()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
METASTORE_TRY_DIRECT_SQL
is hard coded to false in TestHiveMetastore::initConf
. This change is to make it possible to override the config in our test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please try to set this globally, and see if there is an issue with it? This is just test code, and I would like to try to keep it as simple as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I'll try setting the default value to true, if everything is fine, I'll revert this change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Enabling direct SQL globally seems to fix the weird test case. I did some debug and believe it's because direct SQL gets the partition filter pushdown correct and retrieves the corresponding partitions. Should I update the test cases here, or leave it to another PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the Spark issue still persists when the directSql is turned off, so we don't want to change that before talking to the owners of the test.
Just create this method, and the old methods should use this:
/**
* Starts a TestHiveMetastore with a provided connection pool size and HiveConf.
*
* @param conf The hive configuration to use
* @param poolSize The number of threads in the executor pool
* @param directSql Used to turn on directSql
*/
public void start(HiveConf conf, int poolSize, boolean directSql) {
[..]
initConf(conf, port, directSql);
[..]
}
/**
* Starts a TestHiveMetastore with a provided connection pool size and HiveConf.
*
* @param conf The hive configuration to use
* @param poolSize The number of threads in the executor pool
*/
public void start(HiveConf conf, int poolSize) {
start(conf, poolSize, false);
}
@@ -130,7 +130,7 @@ CREATE TABLE "APP"."TAB_COL_STATS"( | |||
"BIT_VECTOR" BLOB | |||
); | |||
|
|||
CREATE TABLE "APP"."TABLE_PARAMS" ("TBL_ID" BIGINT NOT NULL, "PARAM_KEY" VARCHAR(256) NOT NULL, "PARAM_VALUE" CLOB); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to change the other CLOB
s in this file as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add more changes to be consistent with HIVE-16667 and HIVE-25574
Thanks @lirui-apache for all your work on this! We can ping Russell when we are ready, but if he doesn't have time, then I will be happy to merge it later this week. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 pending tests
Looks good to me. |
Fixes #11814