[lake/iceberg] Support tier array type for iceberg #2266
+784
−1
Purpose
Linked issue: close #2252
This PR adds support for array type conversion between Fluss and Iceberg, enabling tiering of tables with array columns to Iceberg lakehouse storage.
Brief change log
- FlussDataTypeToIcebergDataType: convert Fluss ARRAY type to Iceberg LIST type instead of throwing UnsupportedOperationException
- FlussArrayAsIcebergList: adapter class to wrap Fluss InternalArray as Java List for Iceberg
- FlussRowAsIcebergRecord: handle array field conversion using the new adapter
- IcebergConversions: support bidirectional type conversion (Iceberg LIST ↔ Fluss ARRAY)
- ARRAY → LIST in the Iceberg data type mapping table
Tests
Unit Tests (FlussRowAsIcebergRecordTest - 9 test cases):
- testArrayWithIntElements - Array of integers
- testArrayWithStringElements - Array of strings
- testNestedArrayType - Nested arrays (array of arrays)
- testArrayWithAllPrimitiveTypes - Arrays of all primitive types
- testArrayWithDecimalElements - Array of decimal values
- testArrayWithTimestampElements - Arrays of TIMESTAMP and TIMESTAMP_LTZ
- testArrayWithNullElements - Arrays with null elements
- testNullArray - Null array handling
- testArrayWithBinaryElements - Array of binary data
Integration Tests (IcebergTieringTest):
- testTieringWriteTableWithArrayType - Parameterized test (4 cases)
Test Results: All 92 tests pass (increased from 88 tests before this PR)
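The adapter idea from the change log (exposing an index-based internal array as a Java List) can be illustrated with a minimal, self-contained sketch. This is not the PR's FlussArrayAsIcebergList: the SimpleInternalArray interface, the IntArrayAsList class, and the of/asList helpers are hypothetical stand-ins, and only int elements are handled here.

```java
import java.util.AbstractList;
import java.util.List;

public class ArrayAsListSketch {

    /** Hypothetical stand-in for an index-based internal array; NOT the Fluss InternalArray API. */
    interface SimpleInternalArray {
        int size();
        boolean isNullAt(int pos);
        int getInt(int pos);
    }

    /** Read-only List view: no copy is made, each element is read on get(). */
    static final class IntArrayAsList extends AbstractList<Integer> {
        private final SimpleInternalArray array;

        IntArrayAsList(SimpleInternalArray array) {
            this.array = array;
        }

        @Override
        public Integer get(int index) {
            if (index < 0 || index >= array.size()) {
                throw new IndexOutOfBoundsException("Index: " + index + ", size: " + array.size());
            }
            // Null elements must be checked before reading the primitive value.
            return array.isNullAt(index) ? null : array.getInt(index);
        }

        @Override
        public int size() {
            return array.size();
        }
    }

    /** Builds a stand-in array from boxed values; a null entry marks a null element. */
    static SimpleInternalArray of(Integer... values) {
        return new SimpleInternalArray() {
            @Override public int size() { return values.length; }
            @Override public boolean isNullAt(int pos) { return values[pos] == null; }
            @Override public int getInt(int pos) { return values[pos]; }
        };
    }

    /** A null array maps to a null list, mirroring the null-array case in the tests. */
    static List<Integer> asList(SimpleInternalArray array) {
        return array == null ? null : new IntArrayAsList(array);
    }

    public static void main(String[] args) {
        List<Integer> list = asList(of(1, null, 3));
        System.out.println(list); // prints [1, null, 3]
    }
}
```

Extending AbstractList keeps the view read-only by default (set and add throw UnsupportedOperationException), which suits a wrapper that is only read by a writer.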
API and Format
API Changes: None. No breaking changes to public APIs.
Storage Format: No changes to Fluss storage format. This only affects the conversion layer between Fluss and Iceberg during tiering operations.
Type Mapping:
- ARRAY<T> → Iceberg LIST<T> (element type T is converted recursively)
Documentation
New Feature: Yes, this introduces array type support for Iceberg tiering.
Documentation Changes:
- website/docs/streaming-lakehouse/integrate-data-lakes/iceberg.md: adds the ARRAY → LIST mapping to the data type compatibility table
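The recursive element conversion noted under Type Mapping (ARRAY<T> becomes LIST<T>, with T converted by the same rule) can be sketched as follows. This is an illustration under stated assumptions, not the PR's FlussDataTypeToIcebergDataType: the SrcType classes are a hypothetical miniature type model, and target types are rendered as plain strings instead of Iceberg type objects.

```java
import java.util.Objects;

public class ArrayToListMappingSketch {

    /** Hypothetical minimal source type model; NOT the Fluss DataType API. */
    static abstract class SrcType {}

    static final class SrcInt extends SrcType {}

    static final class SrcString extends SrcType {}

    static final class SrcArray extends SrcType {
        final SrcType element;

        SrcArray(SrcType element) {
            this.element = Objects.requireNonNull(element, "element");
        }
    }

    /** Converts a source type to a textual target type, recursing into array element types. */
    static String toTargetType(SrcType type) {
        if (type instanceof SrcInt) {
            return "int";
        }
        if (type instanceof SrcString) {
            return "string";
        }
        if (type instanceof SrcArray) {
            // The element type goes through the same conversion, so nested arrays work.
            return "list<" + toTargetType(((SrcArray) type).element) + ">";
        }
        throw new UnsupportedOperationException(
                "Unsupported type: " + type.getClass().getSimpleName());
    }

    public static void main(String[] args) {
        SrcType nested = new SrcArray(new SrcArray(new SrcInt()));
        System.out.println(toTargetType(nested)); // prints list<list<int>>
    }
}
```

Because the array branch delegates to the same method for its element, arbitrarily nested arrays need no special handling, which matches the nested-array case listed in the unit tests.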