[Kernel][icebergWriterCompatV1] Add a check that map struct keys don't evolve #4525

Merged
nicklan merged 4 commits into delta-io:master on May 15, 2025

nicklan
Collaborator

@nicklan nicklan commented May 9, 2025

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

Add a check to forbid changes in struct types that are map keys. See docs which state:

Altering a map 'key' column by adding columns is not allowed. Only map values can be updated.

We have empirically determined that drops also trigger an error, so we forbid any changes to structs that are used as map keys.

This check only triggers if IcebergWriterCompatV1 is enabled on the table. This is required by the spec (see here).

How was this patch tested?

Unit Tests

Does this PR introduce any user-facing changes?

No

@nicklan nicklan requested review from allisonport-db, scottsand-db and vkorukanti and removed request for allisonport-db and scottsand-db May 9, 2025 23:49
@nicklan nicklan self-assigned this May 9, 2025
* If IcebergWriterCompatV1 is enabled, we need to ensure that map struct keys don't change. This
* validates that
*/
private static void validateNoMapStructKeyChanges(
Collaborator

I wonder if this check should be nested deeper within the schema evolution checks? Then we can use the SchemaChanges class @amogh-jahagirdar added to just check the "updatedColumns"

But I'm also okay with this if it's not any easier/better. Maybe it's fewer schema traversals however? Since I think this method alone involves at least 2
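
A rough sketch of the idea in this comment follows. It is not Kernel code: `DataType`, `StructType`, `MapType`, and `UpdatedColumn` below are simplified stand-ins invented for illustration, and the real SchemaChanges class may expose its updated columns differently. The sketch assumes each entry pairs one field's type before and after the update, so the check can run over that list without walking the full schema again.

```java
import java.util.List;

// Stand-in type model for illustration only; these are NOT the Delta Kernel classes.
interface DataType {}
record PrimitiveType(String name) implements DataType {}
record StructField(String name, DataType type) {}
record StructType(List<StructField> fields) implements DataType {}
record MapType(DataType keyType, DataType valueType) implements DataType {}

// Hypothetical pairing of one field's type before and after a schema update.
record UpdatedColumn(String name, DataType before, DataType after) {}

final class MapStructKeyCheck {
  /** Returns the names of updated map columns whose struct key type changed. */
  static List<String> findViolations(
      List<UpdatedColumn> updatedColumns, boolean icebergWriterCompatV1Enabled) {
    // The restriction only applies when the table feature is enabled.
    if (!icebergWriterCompatV1Enabled) {
      return List.of();
    }
    return updatedColumns.stream()
        .filter(c -> c.before() instanceof MapType && c.after() instanceof MapType)
        .filter(c -> {
          MapType before = (MapType) c.before();
          MapType after = (MapType) c.after();
          // Any difference in a struct key (added or dropped fields) is a violation.
          return before.keyType() instanceof StructType
              && after.keyType() instanceof StructType
              && !before.keyType().equals(after.keyType());
        })
        .map(UpdatedColumn::name)
        .toList();
  }
}
```

Record equality is structural, so comparing the two key types with `equals` flags both added and dropped key fields, while changes confined to the map value are ignored.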

Collaborator Author

Actually that's much nicer, thanks for the suggestion. I didn't see the SchemaChanges stuff.

Collaborator

I was actually picturing nesting this even further in the existing code. I think it can just be an extra check in validateFieldCompatibility. There we already have a branch for a field update that goes from map type -> map type (which skips a lot of the code here), so all we need to add there is a check for when icebergWriterCompatV1 is enabled and the key type is a struct.

Collaborator

I think this should be super concise and simple there, just something like:

    } else if (existingField.getDataType() instanceof MapType
        && newField.getDataType() instanceof MapType) {
      MapType existingMapType = (MapType) existingField.getDataType();
      MapType newMapType = (MapType) newField.getDataType();

      if (icebergWriterCompatV1Enabled
          && existingMapType.getKeyType() instanceof StructType
          && newMapType.getKeyType() instanceof StructType) {
        // Require existingMapType.getKeyType() equals newMapType.getKeyType()
      }

      validateFieldCompatibility(existingMapType.getKeyField(), newMapType.getKeyField());
      validateFieldCompatibility(existingMapType.getValueField(), newMapType.getValueField());
    }
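
For readers who want to try the suggestion, here is a self-contained sketch of how the extra check could sit inside a recursive validator. Everything in it is an assumption made for illustration: the record types are simplified stand-ins for the Kernel classes, struct fields are matched by name, the flag is passed as a parameter, and `IllegalArgumentException` stands in for whatever exception the real code throws.

```java
import java.util.List;

// Stand-in type model for illustration only; these are NOT the Delta Kernel classes.
interface DataType {}
record PrimitiveType(String name) implements DataType {}
record StructField(String name, DataType type) {}
record StructType(List<StructField> fields) implements DataType {}
record MapType(StructField keyField, StructField valueField) implements DataType {
  DataType keyType() {
    return keyField.type();
  }
}

final class SchemaValidation {
  /** Recursively compares a field before and after an update (heavily simplified). */
  static void validateFieldCompatibility(
      StructField existingField, StructField newField, boolean icebergWriterCompatV1Enabled) {
    DataType existingType = existingField.type();
    DataType newType = newField.type();

    if (existingType instanceof StructType && newType instanceof StructType) {
      // Assumption: children are matched by name; unmatched fields are adds or drops.
      for (StructField oldChild : ((StructType) existingType).fields()) {
        for (StructField newChild : ((StructType) newType).fields()) {
          if (oldChild.name().equals(newChild.name())) {
            validateFieldCompatibility(oldChild, newChild, icebergWriterCompatV1Enabled);
          }
        }
      }
    } else if (existingType instanceof MapType && newType instanceof MapType) {
      MapType existingMapType = (MapType) existingType;
      MapType newMapType = (MapType) newType;

      // The extra check: a struct used as a map key must stay exactly the same.
      if (icebergWriterCompatV1Enabled
          && existingMapType.keyType() instanceof StructType
          && newMapType.keyType() instanceof StructType
          && !existingMapType.keyType().equals(newMapType.keyType())) {
        throw new IllegalArgumentException(
            "Cannot change the struct key of map field '" + existingField.name() + "'");
      }

      validateFieldCompatibility(
          existingMapType.keyField(), newMapType.keyField(), icebergWriterCompatV1Enabled);
      validateFieldCompatibility(
          existingMapType.valueField(), newMapType.valueField(), icebergWriterCompatV1Enabled);
    }
    // Other compatibility checks (primitive type changes, nullability, arrays) are omitted.
  }
}
```

Because the check lives in the map -> map branch, it also applies to maps nested inside structs or inside other map values, with no separate traversal.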

Collaborator Author

Ohh man, can't believe I missed that those checks already exist. Thanks! That's obviously much better.

@nicklan nicklan requested a review from allisonport-db May 14, 2025 22:43
@nicklan nicklan requested a review from allisonport-db May 14, 2025 23:51
Collaborator

@allisonport-db allisonport-db left a comment

LGTM

@allisonport-db allisonport-db changed the title Add a check that map struct keys don't evolve [Kernel][icebergWriterCompatV1] Add a check that map struct keys don't evolve May 15, 2025
@nicklan nicklan merged commit 1cbeda4 into delta-io:master May 15, 2025
22 checks passed