[PoC] Test Flink and Spark 4.0 using Hive 4 metastore #13262

Open: 12 commits to be merged into base: main

wypoon (Contributor) commented Jun 6, 2025

Continuation of #12721.

wypoon changed the title from "[PoC] Remove use of HiveConf.ConfVars" to "[PoC] Test Flink and Spark 4.0 using Hive 4 metastore" on Jun 7, 2025

pan3793 (Member) commented Jun 9, 2025

Judging from the changes in hive-metastore/src/main, I suppose the iceberg-hive-metastore.jar compiled against Hive 2.3.10 is also binary compatible with the Hive 3.x and 4.x runtime jars?

wypoon (Contributor, Author) commented Jun 9, 2025

> Judging from the changes in hive-metastore/src/main, I suppose the iceberg-hive-metastore.jar compiled against Hive 2.3.10 is also binary compatible with the Hive 3.x and 4.x runtime jars?

@pan3793 I don't believe so. According to @danielcweeks in #12721 (comment), there is a binary incompatibility. We cannot use iceberg-hive-metastore.jar built against Hive 2 with a Hive 4 HMS or vice versa. We discussed this in the last community sync. As I understand it, the way forward is to build the iceberg-hive-metastore against separate versions of Hive, producing separate artifacts, and to remove the bundled classes from the runtime jars of the engines. Users of the engines can then choose which version of iceberg-hive-metastore.jar they wish to use, according to the version of HMS they use. However, removal of the bundled classes is a breaking change and won't be done until Iceberg 2.0.

I know you have raised the point that Spark's IsolatedClientLoader is only used by its HiveExternalCatalog. However, in this PoC we put the Hive 4 iceberg-hive-metastore classes, as well as the Hive 4 classes, on the Spark classpath for the tests, so Spark loads Hive 4 metastore client classes to talk to the Iceberg TestHiveMetastore. For Spark 4.0, this appears to work.

... and not iceberg-hive4-metastore classes.
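
To make the classpath point concrete, here is a minimal sketch of how a test could check which Hive generation its classloader actually sees. It is not part of this PR. The class name `HiveClasspathProbe` and its methods are hypothetical, and the heuristic rests on an assumption: that `org.apache.hadoop.hive.metastore.conf.MetastoreConf` is only present from Hive 3 onward (the standalone metastore module). It cannot tell Hive 3 from Hive 4.

```java
// Hypothetical helper, not an Iceberg or Hive API: probes the classpath
// without initializing any of the classes it looks for.
final class HiveClasspathProbe {
    // Present in all Hive versions discussed in this thread.
    static final String HIVE_CONF = "org.apache.hadoop.hive.conf.HiveConf";
    // Assumption: only present in Hive 3+ (standalone metastore module).
    static final String METASTORE_CONF =
        "org.apache.hadoop.hive.metastore.conf.MetastoreConf";

    private HiveClasspathProbe() {}

    /** Returns true if the named class can be loaded by this class's loader. */
    static boolean isPresent(String className) {
        try {
            // initialize=false: do not run static initializers of the probed class.
            Class.forName(className, false, HiveClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Coarse description of the Hive generation visible on the classpath. */
    static String describe() {
        if (!isPresent(HIVE_CONF)) {
            return "no Hive classes found";
        }
        return isPresent(METASTORE_CONF)
            ? "Hive 3 or later (MetastoreConf present)"
            : "Hive 2.x or earlier (MetastoreConf absent)";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

A check like this only reports which classes are visible. It says nothing about whether a jar compiled against one Hive version links correctly against another, which is the binary incompatibility discussed above.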
pan3793 (Member) commented Jun 10, 2025

@wypoon Thanks for your reply and input. I may be missing some context on this thread, so let me go through the original PR discussion.
