feature(nemesis): kill mv building coordinator during mv building #10632

Draft: wants to merge 1 commit into master


@aleksbykov aleksbykov commented Apr 12, 2025

Planned work:

  • simplify the code
  • move the base logic out of the nemesis.py file
  • move the create-MV functionality to a separate file
  • run the restart and the wait for MV building in parallel
  • prepare new validation steps for MV building with tablets

Useful links:

Additional info:

yes, view_building_coordinator is started and stopped by topology_coordinator
well, it depends
tables:
currently there are 2 tables: system.view_building_coordinator_tasks and system.view_building_coordinator_staging_sstables, but they will be merged into the single table described in the doc.
There is also the existing table system.view_build_status_v2, which shows whether a host has started or finished building a view. VBC uses this table too.
system.built_views is another existing table: if (ks_name, view_name) is in the table, the view is built.
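As a rough illustration of the last point, a validation step could treat a view as built once its (ks_name, view_name) pair shows up in system.built_views. The sketch below is only an assumption about how such a check might look: the column names `keyspace_name` and `view_name`, the helper names, and the `fetch_rows` callable (standing in for whatever runs CQL in the test) are hypothetical and not taken from this PR.

```python
import time
from typing import Callable, Iterable, Tuple

# Assumed column names; the description above only says the key is (ks_name, view_name).
BUILT_VIEWS_QUERY = "SELECT keyspace_name, view_name FROM system.built_views"

Row = Tuple[str, str]


def is_view_built(rows: Iterable[Row], ks_name: str, view_name: str) -> bool:
    """Return True if (ks_name, view_name) is among the system.built_views rows."""
    return (ks_name, view_name) in {tuple(row) for row in rows}


def wait_for_view_built(fetch_rows: Callable[[str], Iterable[Row]],
                        ks_name: str, view_name: str,
                        timeout: float = 600.0, interval: float = 10.0) -> bool:
    """Poll until the view is listed as built or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        # fetch_rows is a hypothetical hook that executes the query and yields rows.
        if is_view_built(fetch_rows(BUILT_VIEWS_QUERY), ks_name, view_name):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Passing the query runner in as a callable keeps the check independent of any particular driver or cluster helper, so it can be exercised with a fake before being wired into a nemesis.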
tbh I'm not happy with how logging looks in the current implementation; I want to improve it
3.
no - if the raft leader changes, view building is not aborted. VBC aborts a task if the corresponding tablet is migrated or resized
it won't be paused. VBC sets the task's state to STARTED, and the new coordinator will simply attach to the existing task if there is one
VBC only works for views on tablets; views on vnodes still use the old path (node-local view_builder)
3.a) currently tasks don't have a state in the system table, but I'm going to add it. Attaching to existing tasks is already implemented
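
The behaviour described in these answers can be written down as a small toy model, which may help when deciding what the nemesis should assert after killing the coordinator. This is a sketch for reasoning about expected outcomes only, not Scylla code: the `Event`, `Task` and `apply_event` names are hypothetical, and STARTED is the only task state mentioned above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Event(Enum):
    RAFT_LEADER_CHANGED = "raft_leader_changed"
    TABLET_MIGRATED = "tablet_migrated"
    TABLET_RESIZED = "tablet_resized"


@dataclass
class Task:
    view: str
    state: str = "STARTED"            # the only state named in the comment above
    coordinator: Optional[str] = None
    aborted: bool = False


def apply_event(task: Task, event: Event,
                new_coordinator: Optional[str] = None) -> Task:
    """Update a toy task according to the rules stated in the answers above."""
    if event is Event.RAFT_LEADER_CHANGED:
        # Not aborted and not paused: the new coordinator attaches to the task.
        task.coordinator = new_coordinator
    else:
        # Tablet migration or resize aborts the corresponding task.
        task.aborted = True
    return task
```

Under this model, a coordinator kill (a raft leader change) should leave the task in STARTED with a new owner, so the view is still expected to finish building.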

Testing

  • [ ]

PR pre-checks (self review)

  • I added the relevant backport labels
  • I didn't leave commented-out/debugging code

Reminders

  • Add new configuration options and document them (in sdcm/sct_config.py)
  • Add unit tests to cover my changes (under unit-test/ folder)
  • Update the Readme/doc folder relevant to this change (if needed)
