@Aggarwal-Raghav
Contributor

What changes were proposed in this pull request?

Check HIVE-29375 for repro and stacktrace

Why are the changes needed?

To support DATE-typed join keys in a full outer join that gets converted to a map join because hive.optimize.dynamic.partition.hashjoin=true.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Added a new q file and will check the CI output.

mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=vector_full_outer_join_date.q -Drat.skip -Dtest.output.overwrite -Pitests -pl itests/qtest

@Aggarwal-Raghav
Contributor Author

Explanation:

In Vectorizer.java, hashTableKeyType is set to DATE for a DATE column type:

hashTableKeyType = HashTableKeyType.DATE;

But the DATE type is not present in VectorMapJoinOuterGenerateResultOperator.

Double and Timestamp columns work even without the patch because the default hashTableKeyType is MULTI_KEY:

hashTableKeyType = HashTableKeyType.MULTI_KEY;
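
The gap described above can be illustrated with a small, self-contained sketch. The class, methods, and string type names below (`KeyTypeGapSketch`, `chooseKeyType`, `isHandled`) are hypothetical and are not Hive's actual code; only the `hashTableKeyType` variable and the `DATE` and `MULTI_KEY` constants come from this discussion. It shows how a planner-side switch can produce a key type that the operator side does not handle, while column types without a dedicated case fall back to MULTI_KEY and keep working.

```java
import java.util.EnumSet;
import java.util.Set;

public class KeyTypeGapSketch {
    // Simplified stand-in for Hive's HashTableKeyType (assumption: reduced set of constants).
    public enum HashTableKeyType { LONG, DATE, STRING, MULTI_KEY }

    // Planner side (hypothetical): column types with a dedicated case get a
    // specialized key type; anything else keeps the MULTI_KEY default.
    public static HashTableKeyType chooseKeyType(String columnType) {
        HashTableKeyType hashTableKeyType = HashTableKeyType.MULTI_KEY;
        switch (columnType) {
            case "bigint":
                hashTableKeyType = HashTableKeyType.LONG;
                break;
            case "date":
                hashTableKeyType = HashTableKeyType.DATE;
                break;
            case "string":
                hashTableKeyType = HashTableKeyType.STRING;
                break;
            default:
                break; // e.g. double, timestamp: stays MULTI_KEY
        }
        return hashTableKeyType;
    }

    // Operator side (hypothetical): key types handled before the fix; DATE is absent.
    public static final Set<HashTableKeyType> HANDLED_BEFORE_FIX =
        EnumSet.of(HashTableKeyType.LONG, HashTableKeyType.STRING, HashTableKeyType.MULTI_KEY);

    // True when the key type chosen for this column type is one the operator handles.
    public static boolean isHandled(Set<HashTableKeyType> handled, String columnType) {
        return handled.contains(chooseKeyType(columnType));
    }

    public static void main(String[] args) {
        Set<HashTableKeyType> afterFix = EnumSet.copyOf(HANDLED_BEFORE_FIX);
        afterFix.add(HashTableKeyType.DATE);

        System.out.println("date before fix: " + isHandled(HANDLED_BEFORE_FIX, "date"));
        System.out.println("timestamp before fix: " + isHandled(HANDLED_BEFORE_FIX, "timestamp"));
        System.out.println("date after fix: " + isHandled(afterFix, "date"));
    }
}
```

In this sketch only the DATE column is unhandled before the fix; the timestamp column never reaches a specialized path, which matches the observation that Double and Timestamp keys were unaffected.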

@Aggarwal-Raghav
Contributor Author

I will address the Sonar issues after the review comments. I'm willing to replace the if-else pattern and old-style switch statements with JDK 21 switch expressions; reviewers can let me know. I will file a separate Jira for that migration, especially in vectorization.
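
For reference, here is a minimal sketch of the switch-expression style mentioned above (standard Java since JDK 14, so available on JDK 21). The enum, method names, and returned labels are hypothetical and not taken from Hive; grouping DATE with LONG is an assumption made only for illustration. One property relevant to this PR: a switch expression over an enum with no `default` branch must cover every constant, so a newly added constant that is left unhandled becomes a compile-time error instead of a runtime failure.

```java
public class SwitchExpressionSketch {
    // Hypothetical, reduced key-type enum used only for this illustration.
    public enum KeyType { LONG, DATE, STRING, MULTI_KEY }

    // Old style: statement switch with fall-through, break, and a default branch.
    public static String describeOld(KeyType type) {
        String result;
        switch (type) {
            case LONG:
            case DATE:
                result = "long-backed";
                break;
            case STRING:
                result = "bytes-backed";
                break;
            default:
                result = "serialized multi-key";
                break;
        }
        return result;
    }

    // New style: switch expression with arrow labels and no default.
    // Removing any case below makes this method fail to compile.
    public static String describeNew(KeyType type) {
        return switch (type) {
            case LONG, DATE -> "long-backed";
            case STRING -> "bytes-backed";
            case MULTI_KEY -> "serialized multi-key";
        };
    }

    public static void main(String[] args) {
        for (KeyType type : KeyType.values()) {
            System.out.println(type + ": " + describeOld(type) + " / " + describeNew(type));
        }
    }
}
```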

@Aggarwal-Raghav
Contributor Author

CC @zabetak , can you please help with the review?

@zabetak
Member

zabetak commented Dec 19, 2025

Hey @Aggarwal-Raghav, I am on holiday until Dec 29, with an intermittent and unstable internet connection. I am not sure I will find time to check this before then.

@Aggarwal-Raghav
Contributor Author

No worries. Enjoy !! 😅

@mdayakar
Contributor

mdayakar commented Dec 19, 2025

Hi @Aggarwal-Raghav ,
I am not a vectorization expert, but based on the code changes I think you missed adding the DATE HashTableKeyType in the places below. Please check.

  1. CheckFastRowHashMap.java
  2. VectorMapJoinOptimizedLongHashMap.java
  3. VectorMapJoinOptimizedLongHashMap.java

Also, I can see there are many unit test cases; maybe you can add more test cases for the DATE HashTableKeyType, for example in MapJoinTestConfig.java.

@Aggarwal-Raghav
Contributor Author

Thanks for the thorough review @mdayakar. I checked the places you mentioned. [2] and [3] are already handled in this PR. For [1] and MapJoinTestConfig.java, which are test classes, I'll evaluate adding more test cases for the DATE type and update in some time.

Member

@zabetak zabetak left a comment

Changes LGTM! I left some comments for minor improvements and potentially a few more unit tests.

(The Sonar issues are not worth fixing; at least not now)

Comment on lines +397 to +399
case DATE:
hashTableKeyType = HashTableKeyType.DATE;
break;
Member

Do we have unit tests exploiting this config? Do we need to add something in TestMapJoinOperator?

Contributor Author

Will add a test case testDate0 in TestMapJoinOperator. testString0 makes use of the DATE type, but as a value column, not as a join key.

Member

Well, if it's not a join key then the tests are not directly targeting the fix, so I'm not sure how much they are needed. Do we have unit tests for join keys with different types somewhere in the repo?

Contributor Author

In TestMapJoinOperator.java, the following are the join keys used by the unit tests. The setting bigTableKeyColumnNums = new int[] {0}; determines the join column.

- testLong0: long
- testLong0_NoRegularKeys: long
- testLong1: int
- testLong2: short
- testLong3: int
- testLong3_NoRegularKeys: int
- testLong4: int
- testLong5: long
- testLong6: long
- testDate0: date
- testMultiKey0: short, int
- testMultiKey1: timestamp, short, string
- testMultiKey2: long, short, string
- testMultiKey3: date, byte
- testString0: string
- testString1: binary
- testString2: string
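
As a rough illustration of how a key-column index array picks the join key types out of the big-table columns, here is a minimal sketch. Only the name `bigTableKeyColumnNums` comes from the comment above; the class, the `keyTypes` helper, and the sample column types are hypothetical and do not reproduce the real test configuration.

```java
import java.util.ArrayList;
import java.util.List;

public class JoinKeySelectionSketch {
    // Returns the types of the columns selected by the key column indices.
    public static List<String> keyTypes(String[] bigTableTypes, int[] bigTableKeyColumnNums) {
        List<String> result = new ArrayList<>();
        for (int columnNum : bigTableKeyColumnNums) {
            result.add(bigTableTypes[columnNum]);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical big-table layout: column 0 is a date, column 1 a string, column 2 a byte.
        String[] bigTableTypes = {"date", "string", "tinyint"};

        // Single-column key on column 0: the join key is the date column.
        System.out.println(keyTypes(bigTableTypes, new int[] {0}));

        // Two-column key on columns 0 and 2: a multi-column key.
        System.out.println(keyTypes(bigTableTypes, new int[] {0, 2}));
    }
}
```

With a layout like this, a test such as testString0 could carry a DATE column without ever using it as a key, which is the distinction raised in the review.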

@Aggarwal-Raghav
Contributor Author

Thanks for the review @zabetak, I'll accommodate the suggestions.

@Aggarwal-Raghav
Contributor Author

Aggarwal-Raghav commented Jan 4, 2026

Will push changes based on your input on #6239 (comment)

@sonarqubecloud

sonarqubecloud bot commented Jan 5, 2026

Member

@zabetak zabetak left a comment

@Aggarwal-Raghav Can you please clarify what kind of coverage we are adding with the unit tests:

  • TestVectorMapJoinFastRowHashMap#testDateRowsExact
  • TestMapJoinOperator#testDate0

It seems that none of the above goes through the problematic code.

@Aggarwal-Raghav
Contributor Author

That's strange, it was failing for me. Let me re-check and confirm on that.
Screenshot 2026-01-06 at 8 42 36 PM
Screenshot 2026-01-06 at 8 34 36 PM

@Aggarwal-Raghav
Contributor Author

Please test with just these files from this PR:

 M ql/src/test/org/apache/hadoop/hive/ql/exec/vector/mapjoin/MapJoinTestConfig.java
 M ql/src/test/org/apache/hadoop/hive/ql/exec/vector/mapjoin/TestMapJoinOperator.java
 M ql/src/test/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/TestVectorMapJoinFastRowHashMap.java
?? ql/src/test/queries/clientpositive/vector_full_outer_join_date.q
?? ql/src/test/results/clientpositive/llap/vector_full_outer_join_date.q.out

The MapJoinTestConfig changes are required; otherwise testDate0 uses MULTI_KEY as the default.

For TestVectorMapJoinFastRowHashMap, ensure you have not taken the CheckFastRowHashMap changes.
Both MapJoinTestConfig and CheckFastRowHashMap are supporting test classes; they don't contain tests as such.

@zabetak
Member

zabetak commented Jan 6, 2026

Thanks for the detailed explanation @Aggarwal-Raghav! My workspace was not clean and I didn't notice that I was missing MapJoinTestConfig. Sorry for the confusion.

Member

@zabetak zabetak left a comment

Thanks for your patience. I will merge the PR soon!

@Aggarwal-Raghav
Contributor Author

No worries. Thanks for the thorough review.

@zabetak zabetak merged commit 0978d70 into apache:master Jan 7, 2026
2 checks passed
@zabetak
Member

zabetak commented Jan 7, 2026

Many thanks for the PR @Aggarwal-Raghav and @mdayakar for the review!
