[Enhancement] Repair cloud native table with missing data files #67229
base: main
34c8399 to 8be6314
Signed-off-by: wyb <[email protected]>
[Java-Extensions Incremental Coverage Report] ✅ pass : 0 / 0 (0%)
[FE Incremental Coverage Report] ✅ pass : 89 / 93 (95.70%)
[BE Incremental Coverage Report] ✅ pass : 45 / 55 (81.82%)
Pull request overview
This PR enhances cloud-native table repair functionality to detect and handle missing data files (segments, delete vectors, primary key index sst, cols files). When metadata files exist but their referenced data files are missing, the system can now identify these issues and roll back to valid previous versions during repair operations.
Key Changes:
- Introduced `TabletMetadataEntry` structure to wrap metadata with missing file information
- Refactored `TabletMetadatas` to `TabletResult` to better represent tablet-level results with status
- Added backend file existence checking functionality that's optionally enabled via the `check_missing_files` flag
- Implemented validation logic to determine if metadata with missing files can be recovered (e.g., missing only SST files that can be rebuilt)
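
To make the "roll back to a valid previous version" idea concrete, here is a minimal sketch rather than the PR's actual code. `VersionEntry`, `RepairVersionSketch`, and the `.sst` suffix test are assumptions made for illustration; the real logic lives in `TabletRepairHelper` and may apply additional rules, such as keeping versions consistent across tablets.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for one metadata version plus the files reported missing for it.
final class VersionEntry {
    final long version;
    final List<String> missingFiles;

    VersionEntry(long version, List<String> missingFiles) {
        this.version = version;
        this.missingFiles = missingFiles;
    }
}

final class RepairVersionSketch {
    // Assumption: primary key index sstable files can be recognized by a ".sst" suffix.
    static boolean onlySstFilesMissing(List<String> missingFiles) {
        return !missingFiles.isEmpty()
                && missingFiles.stream().allMatch(name -> name.endsWith(".sst"));
    }

    // A version is usable when nothing is missing, or when only rebuildable sst files are missing.
    static boolean isUsable(VersionEntry entry) {
        return entry.missingFiles.isEmpty() || onlySstFilesMissing(entry.missingFiles);
    }

    // Picks the highest usable version, or empty when no version can be used for repair.
    static Optional<VersionEntry> newestUsable(List<VersionEntry> entries) {
        return entries.stream()
                .filter(RepairVersionSketch::isUsable)
                .max(Comparator.comparingLong(e -> e.version));
    }

    public static void main(String[] args) {
        List<VersionEntry> entries = List.of(
                new VersionEntry(5, List.of("a.dat")),   // segment missing: not recoverable
                new VersionEntry(4, List.of("b.sst")),   // only sst missing: recoverable
                new VersionEntry(3, List.of()));         // nothing missing
        System.out.println(newestUsable(entries).map(e -> e.version).orElse(-1L));
    }
}
```

In this sketch a version whose segment file is gone is skipped, and the next older version that is either complete or missing only sst files is chosen.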
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| gensrc/proto/lake_service.proto | Refactored protobuf schema: renamed TabletMetadatas to TabletResult, introduced TabletMetadataEntry with missing_files field, and added check_missing_files option to request |
| be/src/service/service_be/lake_service.cpp | Implemented check_missing_files() function to validate file existence for segments, delete vectors, pk index sst, and cols files; integrated checking into get_tablet_metadatas() workflow |
| be/test/service/lake_service_test.cpp | Updated tests to use new TabletResult and TabletMetadataEntry structures; added test cases for missing file detection scenarios |
| fe/fe-core/src/main/java/com/starrocks/lake/TabletRepairHelper.java | Added validation methods checkTabletMetadataValid() and getValidTabletMetadata() to determine if metadata with missing files can be used for repair; integrated missing file detection into repair workflow |
| fe/fe-core/src/test/java/com/starrocks/lake/TabletRepairHelperTest.java | Added comprehensive test coverage for missing file scenarios including cases where only SST files are missing (recoverable) vs. data files missing (non-recoverable) |

    private static TabletMetadataPB getValidTabletMetadata(TabletMetadataEntry metadataEntry) {
        TabletMetadataPB metadata = metadataEntry.metadata;
        List<String> missingFiles = metadataEntry.missingFiles;
        if (missingFiles == null || missingFiles.isEmpty()) {
            // no missing files, metadata is valid
            return metadata;
        }

        // only missing pk index sst files, clear sstableMeta
        if (checkOnlySstFilesMissing(missingFiles)) {
            metadata.sstableMeta = null;
            return metadata;
        }

        Preconditions.checkState(false, "should not reach here");
        return null;

Copilot AI, Dec 26, 2025
The method mutates the original metadata object by setting metadata.sstableMeta = null on line 339. Since the metadata object comes from the metadataEntry.metadata field, this directly modifies the metadata stored in the entry, which could have unintended side effects if the entry is accessed later. Consider creating a copy of the metadata before modification to avoid mutating shared state.
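
One way to act on this suggestion is to clear the field on a copy instead of on the entry's own object. The sketch below uses hypothetical stand-in classes (`MetadataSketch`, `EntrySketch`, `CopyBeforeClearSketch`) with only two fields; the real `TabletMetadataPB` has more fields, and how it should be copied is an assumption here, not something the PR specifies.

```java
import java.util.List;

// Hypothetical stand-in for TabletMetadataPB, reduced to two fields.
final class MetadataSketch {
    long version;
    Object sstableMeta;

    // Shallow copy: nested objects are shared, which is sufficient when a field is only set to null.
    MetadataSketch copy() {
        MetadataSketch c = new MetadataSketch();
        c.version = version;
        c.sstableMeta = sstableMeta;
        return c;
    }
}

// Hypothetical stand-in for TabletMetadataEntry.
final class EntrySketch {
    MetadataSketch metadata;
    List<String> missingFiles;
}

final class CopyBeforeClearSketch {
    // Returns metadata without sstableMeta while leaving entry.metadata untouched.
    static MetadataSketch withoutSstableMeta(EntrySketch entry) {
        MetadataSketch result = entry.metadata.copy();
        result.sstableMeta = null;
        return result;
    }

    public static void main(String[] args) {
        EntrySketch entry = new EntrySketch();
        entry.metadata = new MetadataSketch();
        entry.metadata.version = 7;
        entry.metadata.sstableMeta = new Object();
        entry.missingFiles = List.of("0001.sst");

        MetadataSketch repaired = withoutSstableMeta(entry);
        System.out.println(repaired.sstableMeta == null);        // the copy has been cleared
        System.out.println(entry.metadata.sstableMeta != null);  // the original is unchanged
    }
}
```

A shallow copy is enough for this particular change; a deep copy would only matter if nested objects were modified in place.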
@cursor review
✅ Bugbot reviewed your changes and found no bugs!



Why I'm doing:
What I'm doing:
This PR introduces functionality to identify and handle missing data files (segments, delete vectors, primary key index sst, cols files) when repairing cloud-native tables.
- Renamed `TabletMetadatas` to `TabletResult` and introduced `TabletMetadataEntry` to include `missing_files` information.
- Updated `LakeServiceImpl::get_tablet_metadatas` to optionally perform missing file checks.
- Updated `TabletRepairHelper.java` to leverage the new structures and integrate the missing file detection and repair logic.

#66015
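
The optional missing-file check can be pictured as follows. This is an illustrative Java sketch only: the actual check is implemented in the C++ backend, and `MissingFileCheckSketch`, `findMissing`, the `exists` predicate, and the sample file names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

final class MissingFileCheckSketch {
    // Returns the referenced files that `exists` reports as absent, in input order.
    // When checkMissingFiles is false the check is skipped and nothing is reported.
    static List<String> findMissing(List<String> referencedFiles,
                                    Predicate<String> exists,
                                    boolean checkMissingFiles) {
        List<String> missing = new ArrayList<>();
        if (!checkMissingFiles) {
            return missing;
        }
        for (String file : referencedFiles) {
            if (!exists.test(file)) {
                missing.add(file);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical names standing in for a segment, a delete vector, an sst, and a cols file.
        List<String> referenced = List.of("0001.dat", "0001.delvec", "0001.sst", "0001.cols");
        Set<String> present = Set.of("0001.dat", "0001.delvec", "0001.cols");
        System.out.println(findMissing(referenced, present::contains, true));
    }
}
```

Passing the existence test in as a predicate keeps the sketch independent of any particular storage backend.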
Note
Introduces per-version tablet metadata results with optional missing-file detection to improve repair workflows.
- Add `check_missing_files` to `GetTabletMetadatasRequest`; replace `TabletMetadatas` with `TabletResult` and `TabletMetadataEntry` (with `missing_files`); response now returns `tablet_results`.
- `get_tablet_metadatas` now emits `tablet_results` and, when enabled, checks existence of `segments`, `delete vectors`, `pk index sst`, and `cols` files; new helper `check_missing_files`; updated status handling and logs; corresponding tests added/updated.
- `TabletRepairHelper` updated to consume `tablet_results`, request missing-file checks, select valid metadata (accepts only sst-missing, clears `sstableMeta`), and proceed with repair; comprehensive unit tests adjusted and expanded.

Written by Cursor Bugbot for commit 3ab362e.