-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Spark: Add some tests for variant fixup #12497
base: main
Are you sure you want to change the base?
Conversation
api/src/main/java/org/apache/iceberg/variants/VariantMetadata.java
Outdated
Show resolved
Hide resolved
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestSparkFixupTypes.java
Outdated
Show resolved
Hide resolved
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestSparkFixupTypes.java
Outdated
Show resolved
Hide resolved
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestSparkFixupTypes.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rebase. Otherwise, LGTM.
0bc41e3
to
74c5118
Compare
api/src/test/java/org/apache/iceberg/types/TestReadabilityChecks.java
Outdated
Show resolved
Hide resolved
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/TestSparkFixupTypes.java
Outdated
Show resolved
Hide resolved
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/data/RandomData.java
Outdated
Show resolved
Hide resolved
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/PruneColumnsWithoutReordering.java
Show resolved
Hide resolved
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/TestSparkFixupTypes.java
Outdated
Show resolved
Hide resolved
…thod and enhance variant fixup tests
} | ||
|
||
@Test | ||
void fixupShouldCorrectlyFixVariant() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this behavior is correct. SparkFixupTypes
should only fix the type when there is an explicit fix. But SparkFixupTypes
doesn't have one so all types are converted to variant. I would expect this to only allow binary, which is probably what we will use to pass this value into Spark 3.5.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please further explain this? Do you mean that we need to restrict VariantType
to only convert with BinaryType
in SparkFixupTypes
? Or are you just suggesting that the name of my test should not be "Correct"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cases that are fixed up are in SparkFixupTypes
:
@Override
protected boolean fixupPrimitive(Type.PrimitiveType type, Type source) {
switch (type.typeId()) {
case STRING:
if (source.typeId() == Type.TypeID.UUID) {
return true;
}
break;
case BINARY:
if (source.typeId() == Type.TypeID.FIXED) {
return true;
}
break;
case TIMESTAMP:
if (source.typeId() == Type.TypeID.TIMESTAMP) {
return true;
}
break;
default:
}
return false;
}
Currently, the behavior is to always use the type from schema and never override with the reference schema's type. The test here validates that the type is not fixed, but what we care about are cases where a type does get fixed.
I think the case where a type does get fixed is when we are exposing variant as binary. So I would expect an update to SparkFixupTypes
and the base FixupTypes
to override binary with variant when variant is in the reference schema. Otherwise, the variant type is never "fixed" and we don't really need a new test suite.
close #12473
The implementation of variant in various visitors has been basically completed in #11831. Since Spark3 does not support the variant type, this PR only adds preliminary work and unit tests.