-
Notifications
You must be signed in to change notification settings - Fork 458
[lake] Support auto snapshot expiration for paimon lake table #2184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
949a29b to
22a0e69
Compare
5e99f29 to
4108c7c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds support for automatic snapshot expiration for lake tables in Fluss. The feature allows snapshots to be automatically expired when the tiering service commits data to the datalake, helping to manage storage costs and maintain cleaner snapshot histories.
- Introduces two new configuration options:
table.datalake.auto-expire-snapshot(table-level) andlake.tiering.auto-expire-snapshot(tiering service-level) - Updates the
CommitterInitContextinterface to provide table info and lake tiering configuration to committers - Modifies Paimon implementation to control the
WRITE_ONLYoption based on auto-expiration settings, enabling snapshot expiration when needed
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
website/docs/maintenance/tiered-storage/lakehouse-storage.md |
Adds documentation for the new lake.tiering.auto-expire-snapshot configuration option |
website/docs/engine-flink/options.md |
Adds documentation for the new table.datalake.auto-expire-snapshot table-level option |
fluss-common/src/main/java/org/apache/fluss/config/ConfigOptions.java |
Defines the two new configuration options for snapshot expiration |
fluss-common/src/main/java/org/apache/fluss/config/TableConfig.java |
Adds accessor method for the table-level auto-expire-snapshot setting |
fluss-common/src/main/java/org/apache/fluss/lake/committer/CommitterInitContext.java |
Extends interface to include table info and lake tiering config |
fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/tiering/committer/TieringCommitterInitContext.java |
Implements the extended CommitterInitContext interface |
fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/tiering/committer/TieringCommitOperator.java |
Passes lake tiering config and table info to committer initialization |
fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/tiering/committer/TieringCommitOperatorFactory.java |
Accepts and passes through lake tiering configuration |
fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/tiering/LakeTieringJobBuilder.java |
Adds lake tiering configuration parameter to job builder |
fluss-flink/fluss-flink-tiering/src/main/java/org/apache/fluss/flink/tiering/FlussLakeTieringEntrypoint.java |
Extracts lake tiering config from command-line parameters |
fluss-lake/fluss-lake-paimon/src/main/java/org/apache/fluss/lake/paimon/tiering/PaimonLakeCommitter.java |
Updates to use TableCommitImpl and conditionally enable snapshot expiration based on config |
fluss-lake/fluss-lake-paimon/src/main/java/org/apache/fluss/lake/paimon/tiering/PaimonLakeTieringFactory.java |
Passes full CommitterInitContext to committer instead of just table path |
fluss-lake/fluss-lake-paimon/src/test/java/org/apache/fluss/lake/paimon/tiering/PaimonTieringTest.java |
Adds comprehensive tests for snapshot expiration with various configuration combinations |
fluss-lake/fluss-lake-iceberg/src/test/java/org/apache/fluss/lake/iceberg/tiering/IcebergTieringTest.java |
Updates test to use new CommitterInitContext interface |
fluss-lake/fluss-lake-lance/src/test/java/org/apache/fluss/lake/lance/tiering/LanceTieringTest.java |
Updates test to use new CommitterInitContext interface |
fluss-test-coverage/pom.xml |
Excludes TieringCommitterInitContext from coverage requirements |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| public class FlussLakeTieringEntrypoint { | ||
|
|
||
| private static final String FLUSS_CONF_PREFIX = "fluss."; | ||
| private static final String LAKE_TIERING_CONFIG_PREFIX = "lake.teiring."; |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in the word "tiering": the prefix is misspelled as "lake.teiring." but should be "lake.tiering." to match the actual config option key defined in ConfigOptions.LAKE_TIERING_AUTO_EXPIRE_SNAPSHOT.
| private static final String LAKE_TIERING_CONFIG_PREFIX = "lake.teiring."; | |
| private static final String LAKE_TIERING_CONFIG_PREFIX = "lake.tiering."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
luoyuxia
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LiebingYu Thanks for the pr. Left minor comments. Otherwise, LGTM
Btw, coud you please create a issue for iceberg to respect the snapshot expiration option?
| public class FlussLakeTieringEntrypoint { | ||
|
|
||
| private static final String FLUSS_CONF_PREFIX = "fluss."; | ||
| private static final String LAKE_TIERING_CONFIG_PREFIX = "lake.teiring."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
4108c7c to
6d72ff6
Compare
Sure, created at #2213. |
6d72ff6 to
3884be2
Compare
luoyuxia
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Purpose
Linked issue: close #2182
Brief change log
Tests
API and Format
Documentation