fix(limited_voters): validate non-voters as failed for scylla <=2025.1 #10636
57a934f
to
8b91539
Compare
@scylladb/qa-maintainers, could you please take a look?
Who decides which nodes are voters and which aren't? Is this fixed, or does it change over time or with some action (e.g. restart)?
@soyacz, the topology coordinator automatically decides which nodes will be voters, and automatically chooses new voters on any topology or restart action.
It looks like there's a minimal number of voters; can we validate at least this?
@temichus can you verify the testing logic behind the feature? I saw a long document describing it and I don't feel competent enough to fully verify it.
I am not sure we need it, because it will depend on cluster size and topology (DCs, racks, number of nodes). Raft topology chooses, monitors and reconfigures voters only for raft quorum needs, and it will be difficult to predict whether the voters will change. The maximum number of voters is 5 (and it could change in the future). If a cluster has 3 nodes, all nodes will be voters; if a cluster has 6 nodes, only five nodes will be voters, and raft decides on the fly exactly which ones. Voters are also spread between racks and DCs. So this feature is mostly hidden, and if something goes wrong, we will detect it through the other health validator mechanisms and schema/topology operations.
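For illustration only, the upper bound described in the comment above can be sketched as follows. This assumes the limit of 5 voters quoted there (which may change) and ignores rack/DC placement, which the topology coordinator decides on its own; `expected_voter_count` and `MAX_VOTERS` are hypothetical names, not part of SCT or Scylla.

```python
# Hypothetical helper: names and the constant are assumptions for illustration.
MAX_VOTERS = 5  # limit quoted in the comment above; it could change in the future


def expected_voter_count(node_count: int, max_voters: int = MAX_VOTERS) -> int:
    """Return how many nodes are expected to be raft voters.

    Small clusters have every node as a voter; larger clusters are capped
    at max_voters. Which nodes are picked is decided by raft at runtime
    and is not modelled here.
    """
    if node_count < 0:
        raise ValueError("node_count must be non-negative")
    return min(node_count, max_voters)


if __name__ == "__main__":
    for nodes in (3, 6):
        print(nodes, "nodes ->", expected_voter_count(nodes), "voters")
```

The sketch only captures the count; it says nothing about which nodes become voters, which is why a check on specific nodes would be hard to predict.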
8b91539
to
d286871
Compare
7738629
to
60ea0b9
Compare
Looks good now. Please retest and ask a developer to check the logic.
60ea0b9
to
ba31adc
Compare
@emdroid, can you take a look please?
ba31adc
to
54f7ad5
Compare
The new Limited raft voters feature was merged. Starting from 2025.2, not all nodes are raft voters. Update the raft consistency check to report errors for alive non-voter nodes only on Scylla <=2025.1.
54f7ad5
to
5bd919e
Compare
LGTM
In PR scylladb#10636, after switching to the feature name instead of the version, the wrong logic was used in the if statement. The non-voters check should be done only if the GROUP0_LIMITED_VOTERS feature is not enabled.
In PR scylladb#10636, after switching to the feature name instead of the version, the wrong logic was used in the if statement. The non-voters check should be done only if the GROUP0_LIMITED_VOTERS feature is not enabled. (cherry picked from commit d16d124)
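A minimal sketch of the corrected condition, under stated assumptions: `Group0Member`, `find_non_voter_errors` and the `enabled_features` set are hypothetical names for illustration and are not the actual SCT health validator code. Only the feature name GROUP0_LIMITED_VOTERS comes from the commit message.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set

LIMITED_VOTERS_FEATURE = "GROUP0_LIMITED_VOTERS"  # feature name from the commit message


@dataclass(frozen=True)
class Group0Member:
    """Hypothetical view of one raft group0 member."""
    host_id: str
    is_alive: bool
    can_vote: bool


def find_non_voter_errors(members: Iterable[Group0Member],
                          enabled_features: Set[str]) -> List[str]:
    """Return error messages for alive non-voter nodes.

    The check runs only when the limited voters feature is NOT enabled:
    with the feature enabled, alive non-voters are expected and valid.
    """
    if LIMITED_VOTERS_FEATURE in enabled_features:
        return []
    return [
        f"Node {member.host_id} is alive but is not a raft voter"
        for member in members
        if member.is_alive and not member.can_vote
    ]
```

The bug described above corresponds to inverting the feature test, which would flag healthy non-voters on clusters where the feature is enabled and skip the check on older clusters.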
@aleksbykov we need to backport this fix to |
If performance tests use the health validator, then this should also be backported to those branches, along with #10651.
@aleksbykov |
The new Limited raft voters feature was merged. Starting from 2025.2, not all nodes are raft voters. Update the raft consistency check so that it does not report errors for alive non-voter nodes.
In the latest SCT jobs, the health validator reported the following errors:
Testing
PASSED 4h
PASSED 3h gce
The error did not appear.
Running multidc