Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

integration: direct node connectivity test #1189

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

wprzytula
Copy link
Collaborator

Problem

I noticed that we lacked a test that requires direct connectivity to all nodes in the cluster in order to pass. Other tests would often succeed even if only one node (the initial contact point) was directly reachable.

The issue has manifested in testing serverless cloud why working on rustls support: tests would pass even with address translation disabled...

Side note

An example of a test which has exercised direct connectivity to all nodes is tablets.rs, yet it could be enabled only for non-cloud case, as it uses the proxy.

Solution

I wrote a test that iterates though all targets (pairs (node, shard)) and sends a request directly to them (using a load balancing policy that produces singleton query plans).

Bonus

I extracted and refactored some utils from tablets.rs to utils.rs, so that they can be used by other tests.

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass test.
  • PR description sums up the changes and reasons why they should be introduced.
  • [ ] I have provided docstrings for the public items that I want to introduce.
  • [ ] I have adjusted the documentation in ./docs/source/.
  • [ ] I added appropriate Fixes: annotations to PR description.

The following helpers are all usable universally, not only for tablets:
- SingleTargetLBP,
- send_statement_everywhere,
- send_unprepared_query_everywhere,

so it makes sense to extract them to utils.rs.
The following helpers were renamed to better convey their semantics:
- `send_statement_everywhere` -> `execute_prepared_statement_everywhere`,
- `send_unprepared_query_everywhere` ->
  `execute_unprepared_statement_everywhere`.
…y values

`execute_unprepared_statement_everywhere` was limited to empty values;
now any value list is supported.
`execute_(un)prepared_statement_everywhere` both have similar logic that
can be extracted to a generic function. `for_each_target_execute`
extraction will be even more important in the next commit, where we make
its logic a bit more complex not to `unwrap()` the Sharder.
Before, `for_each_target_execute` would panic on absence of Sharder
for a Node. This could be caused by two main reasons:
1) the node was unsharded (e.g., a Cassandra node),
2) there were no open connections to the node, e.g. due to the node
   being down.

In both cases, we prefer not to panic in `for_each_target_execute` but
rather assume the node is unsharded and proceed. In the case 1),
execution of the request should succeed, whereas in 2) we should get
the ConnectionPoolError.
I noticed that we lacked a test that requires direct connectivity to all
nodes in the cluster in order to pass. Other tests would often succeed
even if only one node (the initial contact point) was directly
reachable.

An example of a test which has exercised direct connectivity to all
nodes is tablets.rs, yet it could be enabled only for non-cloud case,
as it uses the proxy.
@wprzytula wprzytula added the area/testing Related to tests - unit, integration, or even out of repo label Feb 1, 2025
@wprzytula wprzytula self-assigned this Feb 1, 2025
Copy link

github-actions bot commented Feb 1, 2025

cargo semver-checks found no API-breaking changes in this PR! 🎉🥳
Checked commit: fdaa97e

Comment on lines 366 to 372

// I expect Scylla to not send feedback for unprepared queries,
// as such queries cannot be token-aware anyway
send_unprepared_query_everywhere(
execute_unprepared_statement_everywhere(
&session,
session.get_cluster_data().as_ref(),
&Query::new(format!("INSERT INTO {ks}.t (a, b, c) VALUES (1, 1, 'abc')")),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 I don't really like "execute" used with unprepared statements, because EXECUTE is a CQL command to execute prepared statements. Would "send_unprepared_statement_everywhere" be an acceptable name?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been thinking that once we do the execution API refactor for 2.0, there will be only one method for all kinds of statements: execute. Do you agree? If so, then why not use it here?

Copy link
Collaborator

@Lorak-mmk Lorak-mmk Feb 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been thinking about it lately, and I came to the conclusion that even in new execution API we should keep separate methods for unprepared statements, prepared statements, and batches.
Why? The choice of the query type should be more concious, a single method makes this less obvious.
Also the interface would be simpler to learn: user would still have methods of the struct that accept simple types, and not traits which then user needs to research and learn what actually implements them.

What I think request execution refactor should be mostly about is enabling configuration of the request, meaning we can do session.execute(something).with_timestamp(....).paging_iter().await instead of having an exponential amount of methods.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, makes sense.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OTOH, the fact that the CQL protocol has some specific names for execution of prepared and unprepared statements does not imply that those names are a good fit for the names of high-level user-facing API functions. It's not intuitive at all for anyone not well-versed in the CQL protocol that "query" is related to unprepared statements and "execute" to prepared statements.

If in the future we were to have distinct names for execution of different types of statements, then I'd go for:

  • execute_unprepared(),
  • execute_prepared(),
  • execute_batch().

The names then make it perfectly clear that the action is execution and the object is a particular kind of a statement.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the names do not need to be the same (but in that case the documentation should clearly state which CQL command it corresponds to).
The proposed names are a bit too long for my taste.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/testing Related to tests - unit, integration, or even out of repo
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants