Add test for RequiredHostAbsent error #1475
base: main
Pull Request Overview
This PR adds comprehensive test coverage for the RequiredHostAbsent error condition and refactors the schema agreement logic to better handle transient failures. The changes are based on semantic improvements from PR #1473 that allow detecting when a required coordinator node becomes unavailable during schema agreement checks.
Key Changes
- Refactored `await_schema_agreement_with_required_node` to track the last attempt result and return more informative errors on timeout
- Added `test_schema_await_with_transient_failure` to verify schema agreement succeeds even when initial checks fail
- Added `test_schema_await_required_host_absent` to test the `RequiredHostAbsent` error condition by simulating connection failures to the coordinator node
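The "track the last attempt result" idea from the list above can be sketched roughly as follows. This is a simplified, synchronous illustration using only the standard library, not the driver's code: `CheckError`, `AwaitError`, and `await_agreement` are hypothetical names, and the real implementation is async and works on schema version UUIDs.

```rust
use std::time::{Duration, Instant};

/// Hypothetical outcome of one failed schema-agreement check (not the driver's real type).
#[derive(Debug, Clone, PartialEq)]
enum CheckError {
    /// The node that coordinated the DDL could not be queried.
    RequiredHostAbsent,
    /// Some other failure that may go away on retry.
    Transient(String),
}

/// Hypothetical error returned by the awaiting loop.
#[derive(Debug, PartialEq)]
enum AwaitError {
    /// The deadline passed; carries the error from the last attempt, if it failed.
    Timeout { last_error: Option<CheckError> },
}

/// Repeatedly runs `check` until it reports agreement or `timeout` elapses.
/// `check` returns Ok(true) on agreement and Ok(false) when versions still differ.
fn await_agreement<F>(
    mut check: F,
    timeout: Duration,
    interval: Duration,
) -> Result<(), AwaitError>
where
    F: FnMut() -> Result<bool, CheckError>,
{
    let deadline = Instant::now() + timeout;
    let mut last_error = None;
    loop {
        match check() {
            Ok(true) => return Ok(()),
            // No agreement yet, but the attempt itself succeeded.
            Ok(false) => last_error = None,
            // Remember why the most recent attempt failed.
            Err(e) => last_error = Some(e),
        }
        if Instant::now() >= deadline {
            return Err(AwaitError::Timeout { last_error });
        }
        std::thread::sleep(interval);
    }
}
```

With this shape, a check that fails a couple of times and then succeeds still yields `Ok(())`, while a check that keeps failing with `RequiredHostAbsent` surfaces that cause only once the timeout is reached.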
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| scylla/tests/integration/session/schema_agreement.rs | Adds two new integration tests that verify schema agreement behavior under failure conditions using proxy rules to simulate network issues |
| scylla/src/client/session.rs | Refactors schema agreement implementation to eliminate the helper method and improve error reporting by tracking last attempt results |
4a1e59e to 18debc9

Rebased on main
18debc9 to fc92c9b

Rebased on current version of base PR, addressed comments.
42cc396 to 4bb38fa

Rebased on main
Pull Request Overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.
    "CREATE KEYSPACE {ks} WITH
    REPLICATION = {{'class' : 'NetworkTopologyStrategy', 'replication_factor' : 1}}"
Copilot
AI
Nov 17, 2025
The CREATE KEYSPACE statement is split across multiple lines without using a line continuation backslash (\). This will include a newline and indentation whitespace in the SQL string, which is inconsistent with the rest of the codebase.
Consider reformatting to a single line like other tests in this file (e.g., line 43), or use a backslash for line continuation like in simple_strategy.rs:
    let mut request = Statement::new(format!(
        "CREATE KEYSPACE {ks} WITH REPLICATION = {{'class' : 'NetworkTopologyStrategy', 'replication_factor' : 1}}"
    ));

Suggested change:

    - "CREATE KEYSPACE {ks} WITH
    - REPLICATION = {{'class' : 'NetworkTopologyStrategy', 'replication_factor' : 1}}"
    + "CREATE KEYSPACE {ks} WITH REPLICATION = {{'class' : 'NetworkTopologyStrategy', 'replication_factor' : 1}}"
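To illustrate the point of this review comment, here is a small standalone sketch of how Rust treats a string literal that spans two lines with and without a trailing backslash. The function names are made up for the example; only the keyspace statement is taken from the PR.

```rust
/// Without a backslash, the newline and the next line's indentation
/// become part of the resulting string.
fn without_continuation(ks: &str) -> String {
    format!(
        "CREATE KEYSPACE {ks} WITH
        REPLICATION = {{'class' : 'NetworkTopologyStrategy', 'replication_factor' : 1}}"
    )
}

/// A trailing backslash drops the newline and all leading whitespace
/// on the following line, so the result is a single line.
fn with_continuation(ks: &str) -> String {
    format!(
        "CREATE KEYSPACE {ks} WITH \
        REPLICATION = {{'class' : 'NetworkTopologyStrategy', 'replication_factor' : 1}}"
    )
}
```

Both variants should be accepted by the database, since the extra whitespace sits between tokens; the difference is mostly one of consistency and of how the statement looks in logs.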

Doesn't this close #1349?

I don't think so. It doesn't test the things that were changed in that PR (fetching only from one connection).

The test failed on Cassandra, not in the main part of the test but while dropping the keyspace at the end. Weird. The code is:

    // Cleanup
    running_proxy.running_nodes[1].change_request_rules(Some(vec![]));
    session.await_schema_agreement().await.unwrap();
    session
        .query_unpaged(format!("DROP KEYSPACE {ks}"), &[])
        .await
        .unwrap();

What is weird is that the error mentions SchemaAgreementError alone, not as part of ExecutionError.
4bb38fa to 7761943

Rebased on main. Added a check that uuid in error is correct.

Now the test passed. I'll rerun it a few times.

Cassandra failed again, but on a different test :D

SyntaxError will cause the schema agreement process to fail immediately after the next commits (which will add error classification there).
This changes the logic of schema agreement to allow some errors to end the process immediately, without waiting until the timeout. For now the error classification is not done with a lot of effort, just enough to have something that doesn't treat transient errors as non-transient, but also classifies some errors as non-transient.
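A rough sketch of what such a classification could look like is given below. All names (`AttemptError`, `Verdict`, `classify`, `run_attempts`) are hypothetical and chosen for illustration; the only detail taken from the commit messages is that a syntax error ends the process immediately while transient errors do not. Running out of attempts stands in for reaching the timeout.

```rust
/// Hypothetical errors a single schema-agreement attempt might produce.
#[derive(Debug, Clone, PartialEq)]
enum AttemptError {
    SyntaxError,
    ConnectionBroken,
    RequestTimeout,
}

/// Whether the awaiting loop should keep going after an error.
#[derive(Debug, PartialEq)]
enum Verdict {
    Retry,
    FailNow,
}

fn classify(err: &AttemptError) -> Verdict {
    match err {
        // A malformed query will not fix itself, so retrying is pointless.
        AttemptError::SyntaxError => Verdict::FailNow,
        // Network-level problems may disappear on the next attempt.
        AttemptError::ConnectionBroken | AttemptError::RequestTimeout => Verdict::Retry,
    }
}

/// Walks through attempt results in order and returns the number of attempts
/// made together with the final outcome.
fn run_attempts(
    attempts: &[Result<bool, AttemptError>],
) -> (usize, Result<(), Option<AttemptError>>) {
    let mut last_error = None;
    for (i, attempt) in attempts.iter().enumerate() {
        match attempt {
            Ok(true) => return (i + 1, Ok(())),
            Ok(false) => last_error = None,
            // Non-transient error: stop right away instead of waiting.
            Err(e) if classify(e) == Verdict::FailNow => {
                return (i + 1, Err(Some(e.clone())));
            }
            // Transient error: remember it and try again.
            Err(e) => last_error = Some(e.clone()),
        }
    }
    (attempts.len(), Err(last_error))
}
```

The design choice here is to default to `Retry` for anything network-related, which matches the stated goal of never treating a transient error as non-transient.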
7761943 to 858f4af
After changing schema agreement logic, those tests became slow, as they had to wait until schema agreement timeout in each case, which is by default 60s.
858f4af to 8a8625c
Based on #1473 (which was merged) and #1479
The above PR changes the semantics in a way that allows us to write the test for RequiredHostAbsent error.
The test adds proxy rules that drop the connection when a query to system.local is received on node 1 (which is the coordinator for DDL) and also prevent opening new connections to this node.
Now schema awaiting will perform check twice:
The test needs to wait until the schema agreement timeout: the semantics change means that it won't return an error earlier.
If we need to make this test faster, then we need to make `check_schema_agreement_with_required_node` public and call it directly.

Fixes: #1362
Pre-review checklist
- I have provided docstrings for the public items that I want to introduce.
- I have adjusted the documentation in ./docs/source/.
- `Fixes:` annotations to PR description.