@LiebingYu
Contributor

Purpose

Linked issue: close #2182

Brief change log

Tests

API and Format

Documentation

@LiebingYu LiebingYu force-pushed the paimon-expire-snapshot branch from 949a29b to 22a0e69 Compare December 18, 2025 06:32
@LiebingYu LiebingYu changed the title [lake/paimon] Support snapshot expiration for tables that are not write-only [lake] Support auto snapshot expiration for lake table Dec 18, 2025
@LiebingYu LiebingYu force-pushed the paimon-expire-snapshot branch 8 times, most recently from 5e99f29 to 4108c7c Compare December 19, 2025 06:00
@luoyuxia luoyuxia requested a review from Copilot December 19, 2025 09:16

Copilot AI left a comment

Pull request overview

This PR adds support for automatic snapshot expiration for lake tables in Fluss. The feature allows snapshots to be automatically expired when the tiering service commits data to the datalake, helping to manage storage costs and maintain cleaner snapshot histories.

  • Introduces two new configuration options: table.datalake.auto-expire-snapshot (table-level) and lake.tiering.auto-expire-snapshot (tiering service-level)
  • Updates the CommitterInitContext interface to provide table info and lake tiering configuration to committers
  • Modifies Paimon implementation to control the WRITE_ONLY option based on auto-expiration settings, enabling snapshot expiration when needed
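To make the interaction between the two switches and Paimon's `write-only` option concrete, here is a minimal sketch. It is not the actual `PaimonLakeCommitter` code: the class and method names are hypothetical, and the rule that expiration runs when either switch is enabled is an assumption made for illustration, since the overview does not spell out how the two options combine. The only part taken from Paimon itself is the `write-only` key, which skips compaction and snapshot expiration when set to `true`.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; names here are hypothetical, not Fluss APIs. */
final class AutoExpireSketch {

    // Paimon table option: when "true", commits skip compaction and snapshot expiration.
    static final String PAIMON_WRITE_ONLY_KEY = "write-only";

    /**
     * ASSUMPTION: expiration is enabled when either the table-level switch
     * (table.datalake.auto-expire-snapshot) or the tiering-service switch
     * (lake.tiering.auto-expire-snapshot) is on. The real precedence may differ.
     */
    static boolean shouldExpire(boolean tableAutoExpire, boolean tieringAutoExpire) {
        return tableAutoExpire || tieringAutoExpire;
    }

    /** Builds the dynamic options a committer could apply before committing. */
    static Map<String, String> dynamicOptions(boolean tableAutoExpire, boolean tieringAutoExpire) {
        Map<String, String> options = new HashMap<>();
        boolean expire = shouldExpire(tableAutoExpire, tieringAutoExpire);
        // Expiration only runs when the table is NOT write-only.
        options.put(PAIMON_WRITE_ONLY_KEY, String.valueOf(!expire));
        return options;
    }
}
```

Under this assumed rule, the table stays write-only (no expiration on commit) only when both switches are off.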

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| website/docs/maintenance/tiered-storage/lakehouse-storage.md | Adds documentation for the new lake.tiering.auto-expire-snapshot configuration option |
| website/docs/engine-flink/options.md | Adds documentation for the new table.datalake.auto-expire-snapshot table-level option |
| fluss-common/src/main/java/org/apache/fluss/config/ConfigOptions.java | Defines the two new configuration options for snapshot expiration |
| fluss-common/src/main/java/org/apache/fluss/config/TableConfig.java | Adds accessor method for the table-level auto-expire-snapshot setting |
| fluss-common/src/main/java/org/apache/fluss/lake/committer/CommitterInitContext.java | Extends interface to include table info and lake tiering config |
| fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/tiering/committer/TieringCommitterInitContext.java | Implements the extended CommitterInitContext interface |
| fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/tiering/committer/TieringCommitOperator.java | Passes lake tiering config and table info to committer initialization |
| fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/tiering/committer/TieringCommitOperatorFactory.java | Accepts and passes through lake tiering configuration |
| fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/tiering/LakeTieringJobBuilder.java | Adds lake tiering configuration parameter to job builder |
| fluss-flink/fluss-flink-tiering/src/main/java/org/apache/fluss/flink/tiering/FlussLakeTieringEntrypoint.java | Extracts lake tiering config from command-line parameters |
| fluss-lake/fluss-lake-paimon/src/main/java/org/apache/fluss/lake/paimon/tiering/PaimonLakeCommitter.java | Updates to use TableCommitImpl and conditionally enable snapshot expiration based on config |
| fluss-lake/fluss-lake-paimon/src/main/java/org/apache/fluss/lake/paimon/tiering/PaimonLakeTieringFactory.java | Passes full CommitterInitContext to committer instead of just table path |
| fluss-lake/fluss-lake-paimon/src/test/java/org/apache/fluss/lake/paimon/tiering/PaimonTieringTest.java | Adds comprehensive tests for snapshot expiration with various configuration combinations |
| fluss-lake/fluss-lake-iceberg/src/test/java/org/apache/fluss/lake/iceberg/tiering/IcebergTieringTest.java | Updates test to use new CommitterInitContext interface |
| fluss-lake/fluss-lake-lance/src/test/java/org/apache/fluss/lake/lance/tiering/LanceTieringTest.java | Updates test to use new CommitterInitContext interface |
| fluss-test-coverage/pom.xml | Excludes TieringCommitterInitContext from coverage requirements |

public class FlussLakeTieringEntrypoint {

private static final String FLUSS_CONF_PREFIX = "fluss.";
private static final String LAKE_TIERING_CONFIG_PREFIX = "lake.teiring.";

Copilot AI Dec 19, 2025

Typo in the word "tiering": the prefix is misspelled as "lake.teiring." but should be "lake.tiering." to match the actual config option key defined in ConfigOptions.LAKE_TIERING_AUTO_EXPIRE_SNAPSHOT.

Suggested change
- private static final String LAKE_TIERING_CONFIG_PREFIX = "lake.teiring.";
+ private static final String LAKE_TIERING_CONFIG_PREFIX = "lake.tiering.";
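
A short sketch of why this typo matters: if the entrypoint selects command-line parameters by prefix, a misspelled prefix matches nothing, so the option is dropped silently instead of raising an error. The helper below is hypothetical and is not the actual `FlussLakeTieringEntrypoint` code. It assumes matching keys keep their prefix, so that they line up with the full option key `lake.tiering.auto-expire-snapshot`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch only; extractByPrefix is a hypothetical helper. */
final class PrefixFilterSketch {

    /**
     * Returns the entries whose key starts with the given prefix.
     * ASSUMPTION: keys are kept as-is (prefix not stripped).
     */
    static Map<String, String> extractByPrefix(Map<String, String> params, String prefix) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (entry.getKey().startsWith(prefix)) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }
}
```

With a parameter map containing `lake.tiering.auto-expire-snapshot`, filtering on `"lake.teiring."` yields an empty map, while filtering on `"lake.tiering."` returns the option as expected.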

Contributor

+1

@luoyuxia luoyuxia changed the title [lake] Support auto snapshot expiration for lake table [lake] Support auto snapshot expiration for paimon lake table Dec 19, 2025
Contributor

@luoyuxia luoyuxia left a comment

@LiebingYu Thanks for the PR. Left minor comments; otherwise, LGTM.
Btw, could you please create an issue for Iceberg to respect the snapshot expiration option?

public class FlussLakeTieringEntrypoint {

private static final String FLUSS_CONF_PREFIX = "fluss.";
private static final String LAKE_TIERING_CONFIG_PREFIX = "lake.teiring.";
Contributor

+1

@LiebingYu
Contributor Author

@LiebingYu Thanks for the PR. Left minor comments; otherwise, LGTM. Btw, could you please create an issue for Iceberg to respect the snapshot expiration option?

Sure, created at #2213.

@LiebingYu LiebingYu force-pushed the paimon-expire-snapshot branch from 6d72ff6 to 3884be2 Compare December 20, 2025 08:11
Contributor

@luoyuxia luoyuxia left a comment

+1

@luoyuxia luoyuxia merged commit 58957fd into apache:main Dec 22, 2025
6 checks passed

Development

Successfully merging this pull request may close these issues.

[lake/paimon] Support snapshot expiration for lake tables

2 participants