-
Notifications
You must be signed in to change notification settings - Fork 49
Explicitly disable tablets for SimpleStrategy replication #4555
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
@paszkow I will try to take a look at it this week - I would expect that this change mostly affects keyspace creation in SM tests. |
|
So all tests executed against scylla 2024.1.15 failed because tablets weren't a valid option back then. Not sure if it's possible to check scylla version via cql session. Worst case we can try to use the tablets syntax and retry without it if it fails. |
fe91679 to
b626cbd
Compare
I followed what was done in |
7dff6a0 to
250e60b
Compare
250e60b to
313fd63
Compare
|
@paszkow I guess that it's annoying for you to work on this item. How urgent is it to make those changes on SM side? If it can wait for late december / early january, I could take care of that. |
In fact, I was fixing the remaining failing dtests and unit tests and forgot about it for a while. It would be good to move it forward, otherwise, it will root and without it I cannot merge the PR to scylla (it will also root). It's part of getting rid of SimpleStrategy. The sooner is done, the better. It's been hanging for a while already. |
|
This branch has conflicts that must be resolved |
|
@paszkow please rebase and resolve conflicts |
Currently, tablets enabled by default are not enforced. Thus, creating
keyspaces that do not support tablets, silently disable them and
use vnodes.
This patch adds `AND tablets = {'enabled': false}` to explicitly
disable tablets for keyspaces using SimpleStrategy replication.
This is a preparation step for scylladb/scylladb#25342
that changes the current behaviour and forces users to explicitly
disable tablets when enabled by default.
Refs: scylladb/scylladb#25340
313fd63 to
cdc2453
Compare
Done. But I need it be be reviewed and merged. It's the last step that is needed. |
|
Looks fine to me. @Michal-Leszczynski @karol-kokoszka please review and consider merging |
SimpleStrategy is not recommended for the production environments. It also does not support tablets, which are the way towards scylla is moving (in some distant time, vnodes might even be deprecated). Because of that, it also runs into unexpected problems like #4555. Because of those reasons, we should switch from keeping SM data in SimpleStrategy to NetworkTopologyStrategy keyspace by default. For the default, single local node SM DB cluster, both strategies result in the single node containing all of SM data. For single DC SM DB cluster they are also equivalent. On the other hand, for multi DC SM DB cluster (really not recommended...), NetworkTopologyStrategy results in keeping replication_factor replicas per DC and not per cluster. This is theoretically problematic, but such setup is also problematic for SimpleStrategy, which would still replicate the data in all DCs, so it's not a new problem. Moreover, this change only defines the default SM keyspace creation. In case user has some non-default (perhaps not recommended) SM DB setup, they can always create SM keyspace manually before starting SM server to configure it to their liking. Refs #4555
…n tests Because of #4555, but also because it's better to test recommended keyspace replication strategy, we need to replace SimpleStrategy with NetworkTopologyStrategy in tests wherever possible. In case test is aiming to run against SimpleStrategy specifically, it needs to be adjusted so that it ensures that keyspace with SimpleStrategy does not use tablets. Refs #4555
| const createKeyspaceStmt = "CREATE KEYSPACE {{.Keyspace}} WITH replication = {'class': 'SimpleStrategy', 'replication_factor': {{.ReplicationFactor}}}" + | ||
| " AND tablets = {'enabled': false}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This unfortunately won't work with 2024.1, which is currently still supported by SM.
For this scylla version, we can't use the tablets = {'enabled': false} syntax.
Tests are passing because we don't have real e2e tests in SM repo, and those tests use different functions for creating SM keyspace. If we were to execute SCT/DTests with this change and scylla 2024.1, we would fail on SM startup.
| tablets: ${{ env.tablets }} | ||
| - name: Run tests | ||
| run: make pkg-integration-test IP_FAMILY=${{ env.ip-family }} SSL_ENABLED=${{ env.ssl-enabled}} BACKUP_METHOD=${{ env.backup-method }} RESTORE_METHOD=${{ env.restore-method }} PKG=./pkg/service/backup | ||
| run: make pkg-integration-test TABLETS=${{ env.tablets }} IP_FAMILY=${{ env.ip-family }} SSL_ENABLED=${{ env.ssl-enabled}} BACKUP_METHOD=${{ env.backup-method }} RESTORE_METHOD=${{ env.restore-method }} PKG=./pkg/service/backup |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, we have a simple script for generating workflows with tests against different scylla versions. In order to modify them, .github/cfg/integration-test-core.yaml should be adjusted and go run .github/cfg/main.go should be executed.
| ksStmt := fmt.Sprintf("CREATE KEYSPACE %s WITH replication = %s", tc.name, tc.replicationStmt) | ||
| if tablets := os.Getenv("TABLETS"); tablets == "enabled" && tc.replication == scyllaclient.SimpleStrategy { | ||
| ksStmt += " AND tablets = {'enabled': false}" | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I don't really like relying on checking env variables in tests.
We do it in many places for convenience and out of laziness, because we don't really have vnode or tablet specific tests - we just run all the tests twice, once with tablets as default replication type, once with vnodes, but this is dirty and results in rerunning many tests for no reason at all (as they don't depend on tablet vs vnode).
If possible, it should be replaced with checking scylla version / default replication type.
|
@paszkow I left 3 comments, 2 are more or less nits, but #4555 (comment) is a real one. After spending some time on understanding the issue on the scylla side and the proposed implementation, I came to the conclusion that it would be better to move away from SimpleStrategy in favor of NetworkTopologyStrategy. I created PR with this approach and described my reasoning with more details there. Please take a look, and again, sorry for not reviewing/working on this item for a long time. |
|
In terms of testing/releasing this change, I understand that this PR was created because of a failing DTest executed as a part of scylla CI, is that right? If so, then it's a similar situation to #4578. I can try to rekick failed DTest with custom SM build and custom scylla version, as it was done then. After that, do you need some proper SM release (e.g. SM 3.8.1) containing this change, so that the scylla CI can be executed against it and pass? |
SimpleStrategy is not recommended for the production environments. It also does not support tablets, which are the way towards scylla is moving (in some distant time, vnodes might even be deprecated). Because of that, it also runs into unexpected problems like #4555. Because of those reasons, we should switch from keeping SM data in SimpleStrategy to NetworkTopologyStrategy keyspace by default. For the default, single local node SM DB cluster, both strategies result in the single node containing all of SM data. For single DC SM DB cluster they are also equivalent. On the other hand, for multi DC SM DB cluster (really not recommended...), NetworkTopologyStrategy results in keeping replication_factor replicas per DC and not per cluster. This is theoretically problematic, but such setup is also problematic for SimpleStrategy, which would still replicate the data in all DCs, so it's not a new problem. Moreover, this change only defines the default SM keyspace creation. In case user has some non-default (perhaps not recommended) SM DB setup, they can always create SM keyspace manually before starting SM server to configure it to their liking. Refs #4555
…n tests Because of #4555, but also because it's better to test recommended keyspace replication strategy, we need to replace SimpleStrategy with NetworkTopologyStrategy in tests wherever possible. In case test is aiming to run against SimpleStrategy specifically, it needs to be adjusted so that it ensures that keyspace with SimpleStrategy does not use tablets. Refs #4555
Currently, tablets enabled by default are not enforced. Thus, creating keyspaces that do not support tablets, silently disable them and use vnodes.
This patch adds
AND tablets = {'enabled': false}to explicitly disable tablets for keyspaces using SimpleStrategy replication.This is a preparation step for scylladb/scylladb#25342 that changes the current behaviour and forces users to explicitly disable tablets when enabled by default.
Refs: scylladb/scylladb#25340