Support Iceberg Identifier fields #23563

Open
@lozbrown

The Iceberg spec defines identifier fields as follows:

Identifier Field IDs
A schema can optionally track the set of primitive fields that identify rows in a table, using the property identifier-field-ids (see JSON encoding in Appendix C).

Two rows are the "same"---that is, the rows represent the same entity---if the identifier fields are equal. However, uniqueness of rows by this identifier is not guaranteed or required by Iceberg and it is the responsibility of processing engines or data providers to enforce.

Identifier fields may be nested in structs but cannot be nested within maps or lists. Float, double, and optional fields cannot be used as identifier fields and a nested field cannot be used as an identifier field if it is nested in an optional struct, to avoid null values in identifiers.
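
To make the quoted rules concrete, here is a minimal sketch in Python that checks a schema's `identifier-field-ids` against them. It assumes the schema is already loaded as a dict in the shape of the spec's JSON encoding; the function names and the sample schema are hypothetical, not part of Iceberg or Trino.

```python
# Sketch only: validates identifier-field-ids against the rules quoted above.
# Assumes a schema dict shaped like the Iceberg JSON encoding (Appendix C).

FLOATING_TYPES = {"float", "double"}


def eligible_identifier_ids(schema):
    """Return the IDs of fields that the quoted rules allow as identifier fields."""
    eligible = set()

    def walk(struct, inside_optional_struct):
        for field in struct["fields"]:
            field_type = field["type"]
            optional = not field["required"]
            if isinstance(field_type, dict):
                # Nested type: descend into structs only; maps and lists are skipped,
                # so nothing nested inside them can become an identifier field.
                if field_type["type"] == "struct":
                    walk(field_type, inside_optional_struct or optional)
                continue
            if optional or inside_optional_struct or field_type in FLOATING_TYPES:
                continue
            eligible.add(field["id"])

    walk(schema, False)
    return eligible


def invalid_identifier_ids(schema):
    """Return declared identifier field IDs that break the quoted rules."""
    declared = set(schema.get("identifier-field-ids", []))
    return declared - eligible_identifier_ids(schema)


# Hypothetical schema: field 4 sits inside an optional struct, so it is rejected.
schema = {
    "type": "struct",
    "schema-id": 0,
    "identifier-field-ids": [1, 4],
    "fields": [
        {"id": 1, "name": "order_id", "required": True, "type": "long"},
        {"id": 2, "name": "amount", "required": True, "type": "double"},
        {
            "id": 3,
            "name": "customer",
            "required": False,
            "type": {
                "type": "struct",
                "fields": [
                    {"id": 4, "name": "customer_id", "required": True, "type": "long"}
                ],
            },
        },
    ],
}

print(sorted(invalid_identifier_ids(schema)))  # [4]
```

The check only covers eligibility. As the spec text says, uniqueness of rows by these fields is not guaranteed by Iceberg itself.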

Please support defining identifier field IDs when creating tables, and mark those columns as identifier fields in the output of "show create table".

As the spec notes above, Trino would not be expected to enforce uniqueness on these fields, but being able to create, store, and read this metadata would help anyone building automated processes to merge tables.
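
As an illustration of the automated merge use case, here is a rough sketch of how a script could build a `MERGE` statement once it knows a table's identifier columns. How those column names would be read from Trino is exactly what this issue asks for, so the sketch takes them as a plain list. The helper name and the table and column names are hypothetical, and identifiers are not quoted or escaped.

```python
# Sketch only: builds a MERGE statement keyed on a table's identifier columns.
# The identifier column list is passed in; reading it from Trino is not shown.

def build_merge_sql(target, source, columns, identifier_columns):
    """Return a MERGE statement that matches rows on the identifier columns."""
    if not identifier_columns:
        raise ValueError("at least one identifier column is required")

    # Rows are treated as the same entity when all identifier columns are equal.
    on_clause = " AND ".join(f"t.{c} = s.{c}" for c in identifier_columns)
    non_key_columns = [c for c in columns if c not in identifier_columns]

    parts = [f"MERGE INTO {target} t USING {source} s ON ({on_clause})"]
    if non_key_columns:
        set_clause = ", ".join(f"{c} = s.{c}" for c in non_key_columns)
        parts.append(f"WHEN MATCHED THEN UPDATE SET {set_clause}")

    insert_columns = ", ".join(columns)
    insert_values = ", ".join(f"s.{c}" for c in columns)
    parts.append(
        f"WHEN NOT MATCHED THEN INSERT ({insert_columns}) VALUES ({insert_values})"
    )
    return "\n".join(parts)


sql = build_merge_sql(
    target="orders",
    source="orders_staging",
    columns=["order_id", "status", "amount"],
    identifier_columns=["order_id"],
)
print(sql)
```

With identifier fields stored in the table metadata, the `identifier_columns` argument would no longer need to be maintained by hand for each table.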

Labels: enhancement (New feature or request), iceberg (Iceberg connector)
