Spec: Allow the use of source-id in V3 #12644

Merged: 11 commits, Apr 22, 2025

Conversation

Fokko
Contributor

@Fokko Fokko commented Mar 25, 2025

Changes around the multi-argument transforms, mainly two things:

  • Up for debate. The spec does not point to any actual implementation of a transform that accepts multiple arguments. Among the existing transforms, the only contender is the bucket transform. Should we include this in the V3 spec? It would only allow pruning metadata when the query has an equality predicate on every field that is part of the transform.
  • Along the way, we removed something that we did not intend to. Originally, we allowed writing either source-id or source-ids, depending on the number of arguments. This was changed to only allow source-ids for V3 in a PR that introduced backward compatibility. I think this makes the JSON parsers/producers more complex than needed (specifically in PyIceberg). Also, in Java we would need to plumb the table version down to PartitionSpecParser.java. I think it would be great to simplify this.
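
To illustrate the second point, here is a minimal sketch of a reader and writer that do not need to know the table version. It assumes the behaviour described above: a field carries `source-id` when the transform has one source column and `source-ids` when it has several. The function names `read_source_ids` and `write_source_ids` are hypothetical and are not part of PyIceberg or the Java `PartitionSpecParser`.

```python
from typing import Any, Dict, List


def read_source_ids(field: Dict[str, Any]) -> List[int]:
    """Return the source column ids of a partition field (hypothetical helper).

    Accepts either key, so the caller does not need the table version.
    """
    if "source-ids" in field:
        return list(field["source-ids"])
    if "source-id" in field:
        return [field["source-id"]]
    raise ValueError("partition field has neither 'source-id' nor 'source-ids'")


def write_source_ids(source_ids: List[int]) -> Dict[str, Any]:
    """Pick the JSON key from the number of source columns (assumed rule)."""
    if not source_ids:
        raise ValueError("a partition field needs at least one source column")
    if len(source_ids) == 1:
        # Single-argument transform: keep the key that older readers understand.
        return {"source-id": source_ids[0]}
    # Multi-argument transform: emit the list form.
    return {"source-ids": list(source_ids)}
```

With this shape, the key is chosen from the field itself, so nothing has to be passed down from the table metadata to the parser.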

@github-actions github-actions bot added the Specification Issues that may introduce spec changes. label Mar 25, 2025
@szehon-ho
Collaborator

Hi @Fokko, I haven't taken a look at the spec change yet, but we had some discussions about multi-bucket last year. For reference, the PR is #8259, with more discussion in #8579.

Fokko and others added 2 commits March 27, 2025 11:25
@szehon-ho
Collaborator

I wonder whether we should move multi-bucket into a separate PR, to allow the source-id part to get in?

@Fokko
Contributor Author

Fokko commented Apr 3, 2025

@szehon-ho I think that's a good idea. Let me rework the PR 👍

@Fokko Fokko force-pushed the fd-moar-spec-changes branch from 81711a0 to c103bab Compare April 3, 2025 11:22
Contributor

@rdblue rdblue left a comment

I think there's a typo, but overall I think this is the right direction. Thanks, @Fokko!

@RussellSpitzer
Copy link
Member

Looks good to me, thanks @Fokko !

@Fokko Fokko merged commit 12b1f52 into apache:main Apr 22, 2025
2 checks passed
@Fokko Fokko deleted the fd-moar-spec-changes branch April 22, 2025 08:29
Fokko added a commit that referenced this pull request Apr 22, 2025
I merged a [spec-change earlier today](#12644), but noticed that it was not live on the website. I think it would be good to get these changes out right away.
@Fokko Fokko mentioned this pull request Apr 22, 2025