Skip to content

Conversation

@luohaha
Copy link
Contributor

@luohaha luohaha commented Dec 26, 2025

Why I'm doing:

This PR fixes metadata access errors in Primary Key tables when the same tablet is duplicated across different disks on the same BE node.

Problem

For Primary Key tables, rowset data includes not only data files but also metadata in KVStore (Delete Vectors and Delta Column Groups). Previously, KVStore references were passed as parameters through multiple method calls, which could lead to accessing the wrong KVStore when:

  • A tablet exists on multiple disks within the same BE node
  • Different data directories have their own KVStore instances
  • Methods were called without the correct KVStore reference

This resulted in:

  • Delete Vector not found errors
  • Metadata corruption
  • Incorrect query results

What I'm doing:

1. Store KVStore reference in Rowset

  • Added _kvstore member variable to the Rowset class
  • Modified Rowset constructor to accept and store KVStore pointer
  • Ensures each Rowset always uses the correct KVStore for its tablet

2. Simplify method signatures
Removed KVStore parameters from methods that now use the internal reference:

  • remove_delta_column_group() - no longer needs KVStore parameter
  • link_files_to() - simplified signature
  • copy_files_to() - simplified signature
  • get_segment_iterators2() - replaced separate meta/dcg_meta params with MetaLoadMode

3. Introduce MetaLoadMode enum
Added explicit control for metadata loading:

  • NONE: No metadata loaded
  • DELETE_VEC_ONLY: Load only Delete Vectors
  • DCG_ONLY: Load only Delta Column Groups
  • ALL: Load both Delete Vectors and DCGs

This makes the API clearer and prevents ambiguity about which metadata to load.

Changes

Core files:

  • be/src/storage/rowset/rowset.h - Added _kvstore member and MetaLoadMode enum
  • be/src/storage/rowset/rowset.cpp - Updated methods to use internal _kvstore
  • be/src/storage/rowset/segment_iterator.cpp - Simplified DCG loading logic
  • be/src/storage/rowset/segment_options.h - Removed redundant flag

Callers updated:

  • tablet_updates.cpp - Use MetaLoadMode::ALL for schema conversion
  • primary_index.cpp - Use MetaLoadMode::ALL for index construction
  • local_primary_key_recover.cpp - Use MetaLoadMode::NONE for recovery
  • primary_key_dump.cpp - Use MetaLoadMode::NONE for key dumping
  • And other related files

Benefits

  1. Correctness: Prevents metadata access errors when tablets are duplicated across disks
  2. Simplicity: Cleaner API with fewer parameters to pass around
  3. Clarity: Explicit MetaLoadMode makes intent clear at call sites
  4. Maintainability: Centralized KVStore management in Rowset

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
    • This pr needs auto generate documentation
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 4.0
    • 3.5
    • 3.4
    • 3.3

Note

Addresses PK-table metadata correctness by centralizing metadata access in Rowset and making loading explicit.

  • Add _kvstore to Rowset; constructor/factory updated and used across tablet load, writer, clone, snapshot, migration, and tests
  • Replace KVStore params in get_segment_iterators2, link_files_to, copy_files_to, remove_delta_column_group with internal _kvstore
  • Introduce MetaLoadMode {NONE, DELETE_VEC_ONLY, DCG_ONLY, ALL} and wire through segment iteration; remove read_by_generated_column_adding
  • Ensure delvec/DCG loaders and DCG file ops (scan/link/copy/delete) use _kvstore to avoid cross-disk metadata mix-ups
  • Update call sites to choose appropriate mode (e.g., ALL for index/compaction/schema-change; NONE for dumps/recovery)
  • Simplify DCG loading logic in segment_iterator.cpp; tidy related options

Written by Cursor Bugbot for commit df26c5e. This will update automatically on new commits. Configure here.

@github-actions
Copy link

[BE Incremental Coverage Report]

fail : 50 / 66 (75.76%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 src/storage/task/engine_clone_task.cpp 0 1 00.00% [937]
🔵 src/storage/data_dir.cpp 0 2 00.00% [370, 371]
🔵 src/storage/primary_index.cpp 0 1 00.00% [1241]
🔵 src/storage/tablet.cpp 1 2 50.00% [238]
🔵 src/storage/tablet_updates.cpp 6 11 54.55% [4035, 4036, 4255, 4256, 4257]
🔵 src/storage/rowset/rowset.h 3 4 75.00% [206]
🔵 src/storage/snapshot_manager.cpp 3 4 75.00% [511]
🔵 src/storage/rowset/rowset.cpp 24 28 85.71% [403, 404, 435, 669]
🔵 src/storage/schema_change.cpp 2 2 100.00% []
🔵 src/storage/rowset_merger.cpp 4 4 100.00% []
🔵 src/storage/rowset/rowset_factory.cpp 1 1 100.00% []
🔵 src/storage/rowset/rowset_writer.cpp 2 2 100.00% []
🔵 src/storage/rowset_update_state.cpp 1 1 100.00% []
🔵 src/storage/update_compaction_state.cpp 1 1 100.00% []
🔵 src/storage/rowset/segment_iterator.cpp 2 2 100.00% []

Signed-off-by: luohaha <[email protected]>
@github-actions
Copy link

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)

@github-actions
Copy link

[FE Incremental Coverage Report]

pass : 0 / 0 (0%)

@alvin-celerdata
Copy link
Contributor

@cursor review

RETURN_IF_ERROR(
RowsetFactory::create_rowset(_context.tablet_schema, _context.rowset_path_prefix, rowset_meta, &rowset));
TabletSharedPtr tablet = StorageEngine::instance()->tablet_manager()->get_tablet(_context.tablet_id);
RETURN_IF(tablet == nullptr, Status::InvalidArgument(fmt::format("Not Found tablet:{}", _context.tablet_id)));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rowset build fails when tablet doesn't exist during restore

The RowsetWriter::build() method now requires the tablet to exist by looking it up via tablet_manager()->get_tablet() and returning an error if null. Previously, the rowset was created without requiring the tablet to exist. This breaks snapshot restore flows where _rename_rowset_id calls rs_writer->build() but the tablet hasn't been created yet. In SnapshotManager::convert_rowset_ids, rowsets are renamed and rebuilt before the tablet exists, causing the new null check to fail with "Not Found tablet" error.

Fix in Cursor Fix in Web

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants