@thisisArjit
Contributor

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

JIRA

Description

  • Here are some details about my PR, including screenshots (if applicable):

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Commits

  • My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

@thisisArjit force-pushed the schema-evolution-iceberg-partition branch from abcef82 to 6325883 on October 15, 2025 09:00
Comment on lines +324 to +342
   */
  public void updateSchema(Schema updatedSchema) throws TableNotFoundException {
    TableMetadata currentTableMetadata = accessTableMetadata();
    Schema currentSchema = currentTableMetadata.schema();

    if (currentSchema.sameSchema(updatedSchema)) {
      log.info("~{}~ schema is already up-to-date", tableId);
      return;
    }

    log.info("~{}~ updating schema from {} to {}", tableId, currentSchema, updatedSchema);

    TableMetadata updatedTableMetadata =
        currentTableMetadata.updateSchema(updatedSchema, updatedSchema.highestFieldId());
    Preconditions.checkArgument(updatedTableMetadata.schema().sameSchema(updatedSchema),
        "Schema mismatch after update, please check destination table");

    tableOps.commit(currentTableMetadata, updatedTableMetadata);
    tableOps.refresh();

    log.info("~{}~ schema updated successfully", tableId);
Member

Let's not update the schema here. We haven't copied the files yet, so updating the schema at this point will leave the table in an unwanted state if copying the files fails for any reason.

Member

Also, the schema update and the data files commit should be done in one transaction.
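
To make the two review comments concrete, here is a minimal, self-contained sketch of the requested ordering: copy the files first, stage the schema change alongside them, and publish both in a single commit. Every type and method in it (`TableState`, `Table`, `Transaction`, `copyAndCommit`, the `failCopy` flag) is a hypothetical in-memory stand-in invented for illustration; none of it is Gobblin or Iceberg code. In the actual PR the counterpart would presumably be Iceberg's own transaction support, which this sketch does not use.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical in-memory model; none of these types are Gobblin or Iceberg APIs.
class AtomicSchemaAndFilesSketch {

  /** Immutable table metadata: a schema label plus the committed data files. */
  static final class TableState {
    final String schema;
    final List<String> dataFiles;

    TableState(String schema, List<String> dataFiles) {
      this.schema = schema;
      this.dataFiles = Collections.unmodifiableList(new ArrayList<>(dataFiles));
    }
  }

  /** Holds the current metadata; it changes only inside Transaction.commit(). */
  static final class Table {
    private TableState current;

    Table(TableState initial) {
      this.current = initial;
    }

    TableState state() {
      return current;
    }
  }

  /** Stages a schema change and new files, then publishes both in one swap. */
  static final class Transaction {
    private final Table table;
    private final TableState base;
    private String stagedSchema;
    private final List<String> stagedFiles = new ArrayList<>();

    Transaction(Table table) {
      this.table = table;
      this.base = table.state();
      this.stagedSchema = base.schema;
    }

    void updateSchema(String newSchema) {
      stagedSchema = newSchema;
    }

    void addFile(String path) {
      stagedFiles.add(path);
    }

    void commit() {
      // Refuse to commit on top of metadata that changed after staging began.
      if (table.current != base) {
        throw new IllegalStateException("table changed since the transaction started");
      }
      List<String> allFiles = new ArrayList<>(base.dataFiles);
      allFiles.addAll(stagedFiles);
      // Single assignment: schema and files become visible together.
      table.current = new TableState(stagedSchema, allFiles);
    }
  }

  /** Copies files first; the schema change is only published with the files. */
  static void copyAndCommit(Table table, String newSchema, List<String> files, boolean failCopy) {
    Transaction txn = new Transaction(table);
    for (String file : files) {
      if (failCopy) {
        // Simulated copy failure: nothing has been committed, so the table is untouched.
        throw new UncheckedIOException(new IOException("copy failed: " + file));
      }
      txn.addFile(file);
    }
    txn.updateSchema(newSchema);
    txn.commit();
  }

  public static void main(String[] args) {
    Table table = new Table(new TableState("v1", Arrays.asList("a.parquet")));
    try {
      copyAndCommit(table, "v2", Arrays.asList("b.parquet"), true);
    } catch (UncheckedIOException e) {
      System.out.println("after failed copy: " + table.state().schema + " " + table.state().dataFiles);
    }
    copyAndCommit(table, "v2", Arrays.asList("b.parquet"), false);
    System.out.println("after successful copy: " + table.state().schema + " " + table.state().dataFiles);
  }
}
```

The point of the design is that a failed copy throws before `commit()` runs, so the table keeps its old schema and old file list, whereas the `updateSchema` method under review commits the new schema before any file has been copied.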
