feat: Variant Support#2188
Conversation
| let table_creation = TableCreation::builder() | ||
| .name(name.clone()) | ||
| .schema(iceberg_schema) | ||
| .format_version(format_version) |
There was a problem hiding this comment.
Before this change existing tests where rightfully failing as we used to create a V2 table with a NS Timestamp column:
https://github.com/apache/iceberg-rust/actions/runs/22522306915/job/65248930667
This new logic determines the min format version required and uses that - but at least V2. Thus we switch now to V3 for ns timestamps.
|
@CTTY @liurenjie1024 @Xuanwo this would be ready for review! |
CTTY
left a comment
There was a problem hiding this comment.
Thanks for the feature! Just took a look.
Also the test seems to be failing
|
@CTTY ready for another round! |
CTTY
left a comment
There was a problem hiding this comment.
Mostly LGTM! Left some minor comments
Co-authored-by: Shawn Chang <yxchang@amazon.com>
|
@CTTY ready for another round! |
| } | ||
|
|
||
| Ok(()) | ||
| fn variant(&mut self, _v: &VariantType) -> Result<Self::T> { |
There was a problem hiding this comment.
Not really a comment on this line, but: iceberg-java's TypeToMessageType#variant writes the Parquet group with LogicalTypeAnnotation.variantType(VARIANT_SPEC_VERSION). The Rust write path here doesn't add that annotation, so files written by iceberg-rust would carry a plain Struct(Binary, Binary) without the variant logical type marker. The integration tests are read-only against Spark-written data, so it isn't caught. Worth a tracking issue, or already on the roadmap?
There was a problem hiding this comment.
this PR is unshredded variants only, with two follow-ups:
- Write annotation (your comment): we emit a plain Struct(Binary,Binary) without variantType(...) since
variant_experimentalis off. Doesn't break Java read-back — it resolves variant by field-id, not the annotation — but I'll track adding it. - Shredded reads: a
typed_valuesub-field was being silently dropped (corrupt data). Added a guard that returnsFeatureUnsupported+ a test; full shredding reconstruction is a follow-up.
Guard added here:
b702f5a
There was a problem hiding this comment.
Regarding 1)
iceberg-rust writes via AsyncArrowWriter, which derives the Parquet schema from the Arrow schema. In parquet 58.1.0, that path only emits the VARIANT annotation when the field carries the parquet_variant_compute::VariantType extension type and variant_experimental is enabled (otherwise logical_type_for_struct is a stub returning None). I couldn't find a public per-field hook to inject the annotation onto a plain Struct(Binary,Binary).
So the real cost is: enable variant_experimental + attach the extension type to the field. Two risksthat I se:
- Turning on the feature may change how the reader decodes a VARIANT-annotated group (native VariantArray instead of Struct{metadata,value}) — could break the current read path that expects the struct.
- New experimental dep surface.
Not sure how we should proceed. I think this maybe should be a separate issue?
I created this for now:
#2546
# Conflicts: # crates/iceberg/public-api.txt
mbutrovich
left a comment
There was a problem hiding this comment.
Thanks for addressing the feedback @c-thiel! This LGTM! Looking forward to putting some queries through this.
| /// Returns the minimum [`FormatVersion`] required to represent all types in this schema. | ||
| /// | ||
| /// Defaults to `FormatVersion::V1` if all types are universally supported. | ||
| pub fn min_format_version(&self) -> FormatVersion { |
There was a problem hiding this comment.
This api looks odd to me, it makes me feel like reading a field, which it's not that cheap. I think a better solution is to add a SchemaVisitor implementation, which check if it's compatible with some FormatVersion.
There was a problem hiding this comment.
Agreed on the visitor — I'll move the computation into a SchemaVisitor (T = FormatVersion), which also lets me drop the recursive Type::min_format_version you flagged below. I'd keep a public method returning the value though: datafusion uses it to derive the table's format version (min_format_version().max(V2)), not to check a known one, so a pure is_compatible(fv) predicate wouldn't cover that use. So: visitor under the hood, thin query on top.
There was a problem hiding this comment.
Rolling back:
I tried the SchemaVisitor approach first but moved to a small leaf_min_format_version(&Type) helper + iteration over the flattened id_to_field. Three reasons it's cleaner here:
- Infallibility vs. the trait.
SchemaVisitorreturns Result on every method, but this logic can't fail — so it needed .expect("never fails") at the call sites. - Wrong shape for the check:
check_format_compatibilitywants a shallow, per-field test (each flattened field judged by its own type, so blame lands on the leaf, not its container). The visitor is a recursive tree-folder; reusing it meant callingvisit_typeon single leaves behind an!is_nested()guard — indirect, and it silently couples that guard to which visitor arms count as "leaf." - Cost for no benefit. The visitor allocated a
Vec<FormatVersion>per struct level inmin_format_version(the flat fold allocates nothing) and was ~40 lines of 7-method boilerplate for a 5-line rule.
The flat form is also closer to Java, which keeps this as a static MIN_FORMAT_VERSIONS map in Schema iterated over lazyIdToField() — not a visitor. So: one match as the single source of truth, both min_format_version and check_format_compatibility route through it, and the per-field iteration mirrors checkCompatibility directly.
Fixed in ef25e2c
| /// which older readers can't honor. `write_default` only affects newly | ||
| /// written rows (physically materialized, read the same at any version), so | ||
| /// it is not checked. | ||
| pub fn check_format_compatibility(&self, format_version: FormatVersion) -> Result<()> { |
There was a problem hiding this comment.
See the comment below, we should use a visitor for this.
There was a problem hiding this comment.
Addressed together with the visitor discussion above — #2188 (comment)
| .name_by_field_id(field.id) | ||
| .unwrap_or(field.name.as_str()); | ||
|
|
||
| let min_version = field.field_type.min_format_version(); |
There was a problem hiding this comment.
This is error prone. Java's approach uses TypeID, but this method will calls recursively again and again for nested data types.
There was a problem hiding this comment.
Switched to Java's shape: a shallow leaf_min_format_version(&Type) (a match, like Java's MIN_FORMAT_VERSIONS TypeID lookup) applied per field over the flattened id_to_field (like lazyIdToField()) — no recursion. This also fixed a real bug the recursion caused: it blamed the container for a nested v3 type.
We have a test for it now.
There was a problem hiding this comment.
When you change these implementations, I think we need to add some ut the verify that these changes are correct.
There was a problem hiding this comment.
Added test_prune_columns_variant — a variant prunes like a primitive leaf: selecting it keeps it (same for full and non-full projection), selecting a sibling drops it. I also mirrored Java's TestTypeUtil variant coverage for the other arms this PR touched: test_reassign_ids_variant (id_reassigner) and test_assign_fresh_ids_variant (schema-evolution id assignment).
Added in 1673a2b
Do you think anything else is missing?
The index_by_id/index_by_name arms are no-ops for variant, so I left those.
There was a problem hiding this comment.
I'm not talking about prune_columns only, I mean all other affects parts.
There was a problem hiding this comment.
More tests added in 16c81e0 - let me know if you feel something is missing!
There was a problem hiding this comment.
I don't understand why you put these tests in integration tests. We should gradually remove this integration tests, so we should not do this unnecessary we have no other choice. These tests are almost all about arrow readers, why not put them in arrow module?
There was a problem hiding this comment.
Moved in fd6a91b
I had put these in integration_tests because they were genuinely Spark integration tests (reading real Spark-written variant data end-to-end through the REST catalog), and I wasn't aware we're trying to phase that suite out. Now that I know, I've moved them.
Mitigation: the coverage is now in arrow/reader/projection.rs as self-contained unit tests over synthetic variant Parquet (full scan, variant-only, sibling-only, nested-in-struct) — they drive ArrowReader end-to-end (projection mask + decode), assert the id values and the variant metadata/value bytes round-trip exactly. read_variant.rs is deleted and the variant tables are removed from dev/spark/provision.py, so the PR no longer touches Spark provisioning at all.
The one thing we lose is the real cross-engine interop check. If we want that back later it's a single-commit revert, but I think the unit tests cover the reader logic that actually mattered here.
Longer term a cleaner interop source could be apache/parquet-testing (engine-agnostic, spec-canonical variant vectors) rather than Spark — I'd wire that in when we add variant value decoding / take on the #2546 annotation+shredding work, since that's what the corpus actually exercises.
|
@blackmwk ready for another round! |
| pub const SCHEMA_NAME_DELIMITER: &str = "."; | ||
| /// Minimum format version that allows non-null field default values. | ||
| /// Mirrors Java's `Schema.DEFAULT_VALUES_MIN_FORMAT_VERSION`. | ||
| pub const MIN_FORMAT_VERSION_DEFAULT_VALUES: FormatVersion = FormatVersion::V3; |
There was a problem hiding this comment.
| pub const MIN_FORMAT_VERSION_DEFAULT_VALUES: FormatVersion = FormatVersion::V3; | |
| pub(crate) const DEFAULT_VALUES_MIN_FORMAT_VERSION: FormatVersion = FormatVersion::V3; |
This is not a public api.
There was a problem hiding this comment.
Applied. That said, I'd lean toward keeping spec-defined version floors like this pub. iceberg-rust is an SDK for empowered users — I don't think we should be overly protective with visibility. Downstream catalogs/engines (Lakekeeper, for us) that gate default-value writes on format version otherwise have to re-declare this constant locally.
| /// `TimestampNs` / `TimestamptzNs` / `Variant` require v3; everything else (including | ||
| /// nested types, validated per-leaf elsewhere) is valid from v1. Single source of truth | ||
| /// for the type version rules, mirroring Java's `Schema.MIN_FORMAT_VERSIONS`. | ||
| fn leaf_min_format_version(field_type: &Type) -> FormatVersion { |
There was a problem hiding this comment.
This should be part of Type.
There was a problem hiding this comment.
15884b9 moved the shallow per-type rule onto Type::min_format_version - just like it was originally, just with the fixed recursion from the last review.
| /// Minimum [`FormatVersion`] required to represent all *types* in this schema. | ||
| /// | ||
| /// Types only; for initial-default version floors see [`Schema::check_format_compatibility`]. | ||
| pub fn min_format_version(&self) -> FormatVersion { |
There was a problem hiding this comment.
| pub fn min_format_version(&self) -> FormatVersion { | |
| pub fn calc_min_compatible_format(&self) -> FormatVersion { |
Also please change comments to clarify that this will visit whole schema to get this, or how about we store it a lazy field in Schema?
There was a problem hiding this comment.
15884b9 Renamed to calc_min_compatible_format and doc'd that it walks every field. Skipped the lazy field because: the only caller is datafusion's register_table today — once per CREATE TABLE, never a hot path. Caching a once-per-table O(fields) call would add a field to Schema (and to its Clone/serde/Eq surface) for not much win. Easy to add later, non-breaking, if a hot consumer ever appears.
| /// Returns an error listing every field incompatible with `format_version`. | ||
| /// Mirrors Java's `Schema.checkCompatibility()`. Two checks per field: | ||
| /// | ||
| /// - **Type** — per `leaf_min_format_version`. |
There was a problem hiding this comment.
| /// - **Type** — per `leaf_min_format_version`. | |
| /// - **Type** — Minimum format version required to support that type, without taking nested filed types into account. |
The leaf_min_format_version is implementation detail, and we should not show it in comments.
| if format_version < min_version { | ||
| let name = self | ||
| .name_by_field_id(field.id) | ||
| .unwrap_or(field.name.as_str()); |
There was a problem hiding this comment.
This is a bug, we should return error.
| { | ||
| let name = self | ||
| .name_by_field_id(field.id) | ||
| .unwrap_or(field.name.as_str()); |
There was a problem hiding this comment.
I'm not talking about prune_columns only, I mean all other affects parts.
- Move the shallow per-type version rule onto `Type::min_format_version` (pub(crate), non-recursive — mirrors Java's MIN_FORMAT_VERSIONS type-id lookup); drop the free `leaf_min_format_version` helper. - Rename `Schema::min_format_version` -> `calc_min_compatible_format` and doc that it walks every field (it is O(fields), not a cheap getter). - Make `DEFAULT_VALUES_MIN_FORMAT_VERSION` pub(crate) and rename to match Java; it has no external consumer. - Error (Unexpected) instead of silently falling back to an unqualified field name when the id is missing from the name index. - Regenerate iceberg public-api baseline for the above.
Add variant unit tests for the visitor arms the PR touched that produce observable output: iceberg->arrow type, arrow->iceberg value (unsupported), parquet path indexing, schema index-by-id/name, and the Glue/Hive type mappings. Avro was already covered; pure no-op arms are skipped. Also apply rustfmt fixups (import order, format! wrap) from the prior commit.
Which issue does this PR close?
Variant Support.
Arrow value support is currently missing as I am unsure how we want to extend
LiteralWhat changes are included in this PR?
Core: Variant Type
crates/iceberg/src/spec/datatypes.rs— newVarianttypecrates/iceberg/src/spec/values/literal.rs—Variantliteral valuecrates/iceberg/src/spec/schema/— visitor, index, pruning, mod, id reassigner all handleVariantcrates/iceberg/src/spec/table_metadata.rs— metadata supportAvro
crates/iceberg/src/avro/schema.rs— read/writeVariantin AvroArrow
crates/iceberg/src/arrow/schema.rs— mapVariantto Arrow typecrates/iceberg/src/arrow/reader.rs— readVariantfrom Arrowcrates/iceberg/src/arrow/value.rs— Arrow value conversioncaching_delete_file_loader.rsandnan_val_cnt_visitor.rsParquet
crates/iceberg/src/writer/file_writer/parquet_writer.rs— writeVariantcolumnsTests & Dev
crates/integration_tests/tests/read_variant.rs— new integration test for reading Variant datadev/spark/provision.py— Spark provisioning to generate Variant test dataAre these changes tested?
Sure! Even integration tested :)