
Core: Support incremental compute for partition stats #12629

Open

wants to merge 4 commits into base: main
Conversation

ajantha-bhat
Member

@ajantha-bhat ajantha-bhat commented Mar 24, 2025

If a previous stats file exists, there is no need to compute the stats from scratch.

Identify the latest snapshot for which a partition stats file exists. Read the previous stats, incrementally compute stats for the new snapshots, merge the stats, and write them to a new file.
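The flow described above can be sketched with a toy model (a minimal sketch: `IncrementalStatsSketch`, the map-based stats representation, and the counts are illustrative, not the actual Iceberg API):

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the incremental flow: previous stats are read from the latest stats
// file, stats for newer snapshots are computed as a delta, and the two are merged.
public class IncrementalStatsSketch {

  // partition key -> {dataRecordCount, dataFileCount}
  public static Map<String, long[]> merge(
      Map<String, long[]> previousStats, Map<String, long[]> deltaStats) {
    Map<String, long[]> merged = new HashMap<>();
    previousStats.forEach((key, counts) -> merged.put(key, counts.clone()));
    // add the incrementally computed counts, creating entries for new partitions
    deltaStats.forEach(
        (key, counts) ->
            merged.merge(
                key, counts.clone(), (a, b) -> new long[] {a[0] + b[0], a[1] + b[1]}));
    return merged;
  }
}
```

Merging only the delta is what lets the writer avoid re-reading every manifest in the table on each stats refresh.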

@ajantha-bhat
Member Author

Some engines want to write the partition stats synchronously (similar to how Trino synchronously writes the Puffin files during insert). Reading all the manifests in the table to compute partition stats can be avoided if we compute the stats incrementally and merge them with the previous stats.

@aokolnychyi, @pvary, @deniskuzZ, @gaborkaszab : Let me know what you guys think.

PartitionMap<PartitionStats> statsMap = PartitionMap.create(table.specs());
// read previous stats
try (CloseableIterable<PartitionStats> oldStats =
    readPartitionStatsFile(statsFileSchema, Files.localInput(statisticsFile.path()))) {
Member Author

Since the new unified tuple is used for reading the old stats file, schema evolution is handled automatically.

@ajantha-bhat ajantha-bhat force-pushed the incremental branch 2 times, most recently from 5446ee0 to 9b9f5ad Compare March 24, 2025 15:59
@ajantha-bhat ajantha-bhat added this to the Iceberg 1.9.0 milestone Mar 27, 2025
@ajantha-bhat
Copy link
Member Author

I have added the 1.9.0 milestone for this PR, as it is a small change (excluding refactoring) and we still have some time before 1.9.0 due to open issues in the milestone.

manifestFilePredicate =
    manifestFile ->
        snapshotIdsRange.contains(manifestFile.snapshotId())
            && !manifestFile.hasExistingFiles();
Contributor

Don't we want this as a default predicate?

manifestFile -> !manifestFile.hasExistingFiles()

Member

@deniskuzZ deniskuzZ Mar 27, 2025

we could add it as default filter:

if (fromSnapshot != null) {
  manifestFilePredicate =
      manifestFile -> snapshotIdsRange.contains(manifestFile.snapshotId());
}

List<ManifestFile> manifests =
    currentSnapshot.allManifests(table.io()).stream()
        .filter(manifestFilePredicate)
        .filter(manifestFile -> !manifestFile.hasExistingFiles())
        .collect(Collectors.toList());
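The shape of that refactor can be illustrated with a self-contained toy model (`ManifestInfo` and `select` are illustrative stand-ins, not Iceberg's `ManifestFile` API):

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Toy model: the snapshot-range predicate applies only to incremental compute, while
// the EXISTING-files filter is applied unconditionally as a default filter.
public class ManifestFilterSketch {
  public record ManifestInfo(long snapshotId, boolean hasExistingFiles) {}

  public static List<ManifestInfo> select(
      List<ManifestInfo> manifests, Long fromSnapshotId, long toSnapshotId) {
    Predicate<ManifestInfo> predicate = m -> true; // full compute keeps all manifests
    if (fromSnapshotId != null) {
      // incremental compute: only manifests committed within the snapshot range
      predicate = m -> m.snapshotId() > fromSnapshotId && m.snapshotId() <= toSnapshotId;
    }

    return manifests.stream()
        .filter(predicate)
        .filter(m -> !m.hasExistingFiles()) // default filter in both modes
        .collect(Collectors.toList());
  }
}
```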

Member Author

Good point.

While computing incrementally, I observed that it could produce duplicate counts, so I added this filter. I still have some gaps; I need to fully understand all the cases where a manifest entry is marked as existing. Is there any scenario where we need to consider "existing" entries, or is "added" enough?

There is another check down below (added long back) that considers both added and existing entries.

I will update the code to keep only the added entries, and also add a testcase with rewrite data files to ensure the stats are the same after the rewrite.

Member Author

Also, it looks like a ManifestFile can have both added and existing entries together. So, instead of filtering here, I will keep the filtering at the entry level down below in collectStatsForManifest.

Member

What if we have compaction and expire snapshots? Wouldn't the new manifests have EXISTING entries?

Contributor

@pvary pvary Mar 27, 2025

What do we do with the stats of the removed files?

Let's say:

  • S1 adds data
  • Execute the stats collection
  • S2 adds more data
  • S3 compacts data from S1 and S2, removing the files created by S1 and S2 and creating new files
  • Execute incremental/normal stats collection

What happens with the stats in this case?
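For intuition, the invariant at stake in this scenario can be checked with a toy simulation (a sketch; `DataFileInfo` and the counts are illustrative): compaction rewrites the live rows into fewer files, so per-partition record counts computed over live files should be unchanged.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Toy simulation: partition stats computed over live data files before and after a
// compaction-style rewrite, where the same rows end up in fewer files.
public class CompactionSketch {
  public record DataFileInfo(String partition, long recordCount) {}

  // partition -> total record count over the live files
  public static Map<String, Long> statsOverLiveFiles(List<DataFileInfo> liveFiles) {
    return liveFiles.stream()
        .collect(
            Collectors.groupingBy(
                DataFileInfo::partition,
                Collectors.summingLong(DataFileInfo::recordCount)));
  }
}
```

File-level stats such as the data file count do change under compaction, which is part of why the discussion turns to treating REPLACE snapshots specially instead of incrementally adding the rewritten files.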

Member

@deniskuzZ deniskuzZ Mar 27, 2025

Compaction doesn't remove the data. If we expire S1 and S2, we don't have the previous snapshots/stats and start fresh (i.e., a full compute).

Contributor

If we don't expire data, could we detect that S3 is only a compaction commit, and the stats don't need to be changed?

What if S3 instead is a MoW commit? Can we detect the changes and calculate stats incrementally?

Member Author

  1. Compaction will have the snapshot operation REPLACE, and we can reuse the old stats for that scenario. But we need to write the new stats file with the same data to handle clean GC of snapshot files.

Compaction will be tested end to end while adding the Spark procedure.

  2. About the live entries (existing + added):

For a full compute, the old manifest files will be marked as deleted and their entries will be reused as existing entries in the new manifest files, possibly along with additional added entries. So, a full compute needs to consider both existing and added entries.

For an incremental compute, the old stats file already accounts for some entries which are now existing. So, we should not count the existing entries again.

This all leads to the next question: what happens when a manifest is deleted? In that case we just update the snapshot entry (last modified) and do not decrement the stats. Hence, we should skip deleted manifests for the incremental compute as well.

All of this logic is present in collectStatsForManifest, and the existing testcases (full compute and incremental) cover it, since they use mergeAppend, which produces manifests with a mix of added and existing entries.
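A toy model of that entry-level filtering (a sketch; `Entry`, `Status`, and `collect` are illustrative stand-ins for the logic in collectStatsForManifest, not the actual implementation):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Toy model: a full compute counts ADDED and EXISTING entries (deleted manifests are
// discarded), while an incremental compute counts only ADDED entries, since EXISTING
// entries were already accounted for in the previous stats file and DELETED entries
// only update the last-modified snapshot without decrementing counts.
public class EntryStatusSketch {
  public enum Status { ADDED, EXISTING, DELETED }

  public record Entry(String partition, Status status, long recordCount) {}

  public static Map<String, Long> collect(List<Entry> entries, boolean incremental) {
    return entries.stream()
        .filter(
            e -> incremental ? e.status() == Status.ADDED : e.status() != Status.DELETED)
        .collect(
            Collectors.groupingBy(
                Entry::partition, Collectors.summingLong(Entry::recordCount)));
  }
}
```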

Member Author

We didn't need to decrement stats for the full compute because we were discarding the deleted manifests and only considering live manifests.

Now, I am not really sure the current code will work for compaction. We may need to decrement stats just for the incremental compute. I will test the compaction scenario tomorrow and handle this.

Member

@deniskuzZ deniskuzZ left a comment

LGTM +1

Comment on lines 193 to 197

PartitionStatisticsFile statisticsFile = latestStatsFile(table, snapshot.snapshotId());
if (statisticsFile == null) {
  LOG.info("Previous stats not found. Computing the stats for whole table.");
  return PartitionStatsUtil.computeStats(table, null, snapshot);
}
Contributor

Could this throw an error instead?

Member

@deniskuzZ deniskuzZ Mar 27, 2025

Why? That handles the case when no stats file existed before and we need to execute a full computation.
We enter this branch when computing stats for the first time.

Contributor

If I understand correctly, the user requested an incremental stats compute, but with wrong parameters. In this case we could either "correct" the mistake or throw an error.

The question is how frequent the problem is, and how easy it is to detect from the user side.

Member

What do you mean by wrong parameters? A non-existing snapshotId?

Contributor

What do you mean by wrong parameters? A non-existing snapshotId?

Exactly.

Member Author

Throwing an error now, and added a testcase.

Member

I don't agree with that design, see #12629 (comment)

@ajantha-bhat
Member Author

ajantha-bhat commented Mar 27, 2025

@deniskuzZ, @pvary: Thanks guys for the review. I have addressed all the comments. You can take a fresh look again tomorrow :D (after some break :D)

Table table, Snapshot snapshot, StructType partitionType) throws IOException {
  PartitionStatisticsFile statisticsFile = latestStatsFile(table, snapshot.snapshotId());
  if (statisticsFile == null) {
    throw new RuntimeException(
Member

@deniskuzZ deniskuzZ Mar 27, 2025

I don't think it's user-friendly, plus the recompute flag loses its purpose (you can call computeAndWriteStats directly).
Now every client needs to implement either the same previous-stats-file check or a try-catch:

try {
  computeAndWriteStatsFileIncremental();
} catch (RuntimeException e) {
  if (e.getMessage().equals("bla-bla")) {
    computeAndWriteStats();
  }
}

I would expect computeAndWriteStatsFileIncremental to do what's needed instead of throwing a "Previous stats not found" exception.

A non-existent snapshotId is a different situation. We should validate whether snapshot == null and throw "Snapshot doesn't exist".

Member Author

@ajantha-bhat ajantha-bhat Mar 27, 2025

recompute flag loses its purpose

There is no recompute flag exposed to the user. The private method (incrementalComputeAndMerge) which throws this exception is also always computing incrementally.

I would expect computeAndWriteStatsFileIncremental to do what's needed instead of throwing a "Previous stats not found" exception.

computeAndWriteStatsFileIncremental says incremental compute. Forcefully recomputing when there is an error is not a good idea, as the method's responsibility is just to try the incremental compute.

Maybe I can expose another method called computeAndWriteStatsWithFallback(), which will internally call it?

public void computeAndWriteStatsIncrementalWithFallback() {
   try {
       computeAndWriteStatsFileIncremental();
   } catch (RuntimeException e) {
       if ("bla-bla".equals(e.getMessage())) {
           computeAndWriteStats(); // Fallback in case of a specific error
       } else {
           throw e; // Re-throw unexpected errors
       }
   }
}

Member

@deniskuzZ deniskuzZ Mar 27, 2025

I liked how you did it initially. Please disregard the recompute flag comment; it has nothing to do with the incremental workflow.

Think about what changes are needed on the client side. I was planning just to replace the existing call with the incremental one, unless it's ANALYZE TABLE (force recompute).

What are the use-cases where we would benefit from the "previous stats file missing" exception?

Member Author

Let's see what @pvary thinks.
