Core, Hive: Double check commit status in case of commit conflict for NoLock #12637
Instead of returning an `Optional<Boolean>`, would it be better to return the current commit status? We would also need to update the Javadoc for `checkCommitStatus` to state that it never returns `CommitStatus.FAILURE`.
Yeah, but I would like `checkCommitStatusStrict` and `checkCommitStatus` to share the same underlying check logic. The difference is that `checkCommitStatusStrict` treats false as `CommitStatus.FAILURE`, while `checkCommitStatus` treats false as `CommitStatus.UNKNOWN`. So I think we can keep the method returning `Optional<Boolean>`, but make it private?
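
For illustration, the shared-helper option described in this comment could look roughly like the sketch below. This is not the PR's code: the class name `SharedCheckSketch`, the helper `commitFound`, and the `Supplier<Boolean>` lookup are hypothetical stand-ins, and mapping an undetermined result (empty `Optional`) to `UNKNOWN` in both methods is an assumption.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch; only CommitStatus, checkCommitStatus and
// checkCommitStatusStrict are names taken from the discussion.
class SharedCheckSketch {
  enum CommitStatus { SUCCESS, FAILURE, UNKNOWN }

  // Shared private check. Optional.empty() means the status could not be
  // determined (assumption), true means the commit was found, false means it was not.
  private static Optional<Boolean> commitFound(Supplier<Boolean> lookup) {
    try {
      return Optional.of(lookup.get());
    } catch (RuntimeException e) {
      return Optional.empty();
    }
  }

  // Lenient variant: false is reported as UNKNOWN, never FAILURE.
  static CommitStatus checkCommitStatus(Supplier<Boolean> lookup) {
    return commitFound(lookup)
        .map(found -> found ? CommitStatus.SUCCESS : CommitStatus.UNKNOWN)
        .orElse(CommitStatus.UNKNOWN);
  }

  // Strict variant: false is reported as FAILURE.
  static CommitStatus checkCommitStatusStrict(Supplier<Boolean> lookup) {
    return commitFound(lookup)
        .map(found -> found ? CommitStatus.SUCCESS : CommitStatus.FAILURE)
        .orElse(CommitStatus.UNKNOWN);
  }
}
```

Both public methods differ only in how they interpret `false`, which is the point made in the comment.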
Or could we just call `checkCommitStatusStrict` inside `checkCommitStatus` and reinterpret the result?
Oh yes, good idea!
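
A minimal sketch of the approach agreed on here, with `checkCommitStatus` delegating to `checkCommitStatusStrict` and reinterpreting `FAILURE` as `UNKNOWN`. The class name `DelegatingCheckSketch` and the `BooleanSupplier` parameter are hypothetical and stand in for the real metadata lookup; treating a thrown exception as `UNKNOWN` is also an assumption, not something stated in the thread.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch; not the actual signatures used in the PR.
class DelegatingCheckSketch {
  enum CommitStatus { SUCCESS, FAILURE, UNKNOWN }

  // Strict variant: a negative lookup result is a definite FAILURE.
  static CommitStatus checkCommitStatusStrict(BooleanSupplier commitFound) {
    try {
      return commitFound.getAsBoolean() ? CommitStatus.SUCCESS : CommitStatus.FAILURE;
    } catch (RuntimeException e) {
      // Assumption: the lookup itself failed, so nothing can be concluded.
      return CommitStatus.UNKNOWN;
    }
  }

  // Lenient variant: reuses the strict check and downgrades FAILURE to UNKNOWN,
  // so this method never returns CommitStatus.FAILURE.
  static CommitStatus checkCommitStatus(BooleanSupplier commitFound) {
    CommitStatus strict = checkCommitStatusStrict(commitFound);
    return strict == CommitStatus.FAILURE ? CommitStatus.UNKNOWN : strict;
  }
}
```

With this shape there is a single place that performs the check, and the lenient method only remaps one enum value.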
Why do we need this weird casting?
I believe it was introduced in #10001 when we had two enums, `BaseMetastoreTableOperations::CommitStatus` and `BaseMetastoreOperations::CommitStatus`, and is not needed anymore. I'd be happy to remove it in this PR if you think that's better.
Why is this change needed?
`METASTORE_TRY_DIRECT_SQL` is hard-coded to false in `TestHiveMetastore::initConf`. This change makes it possible to override the config in our test.
Could you please try to set this globally, and see if there is an issue with it? This is just test code, and I would like to try to keep it as simple as possible.
OK, I'll try setting the default value to true. If everything is fine, I'll revert this change.
Enabling direct SQL globally seems to fix the weird test case. I did some debugging and believe it's because direct SQL gets the partition filter pushdown right and retrieves the corresponding partitions. Should I update the test cases here, or leave that to another PR?
I think the Spark issue still persists when direct SQL is turned off, so we don't want to change that before talking to the owners of the test.
Just create this method, and the old methods should use this:
Do we want to change the other `CLOB`s in this file as well?
Added more changes to be consistent with HIVE-16667 and HIVE-25574.
FYI CLOB handling was fixed in https://github.com/apache/hive/pull/5386/files#diff-bcca13f6cc251df321e8fe80568ef0334a1d44f7e5e7ff2fcaa06ab4f05bbdf9R3387
Good to know that direct SQL has been made more robust. Maybe we can revisit this change when we upgrade our Hive dependency.