Skip to content

Conversation

@paszkow
Copy link

@paszkow paszkow commented Aug 22, 2025

Currently, tablets enabled by default are not enforced. Thus, creating keyspaces that do not support tablets, silently disable them and use vnodes.

This patch adds AND tablets = {'enabled': false} to explicitly disable tablets for keyspaces using SimpleStrategy replication.

This is a preparation step for scylladb/scylladb#25342 that changes the current behaviour and forces users to explicitly disable tablets when enabled by default.

Refs: scylladb/scylladb#25340

@Michal-Leszczynski
Copy link
Collaborator

@paszkow I will try to take a look at it this week - I would expect that this change mostly affects keyspace creation in SM tests.

@Michal-Leszczynski
Copy link
Collaborator

So all tests executed against scylla 2024.1.15 failed because tablets weren't a valid option back then. Not sure if it's possible to check scylla version via cql session. Worst case we can try to use the tablets syntax and retry without it if it fails.

@paszkow paszkow force-pushed the disable_tablets_explicitly branch from fe91679 to b626cbd Compare September 23, 2025 11:23
@paszkow
Copy link
Author

paszkow commented Sep 23, 2025

So all tests executed against scylla 2024.1.15 failed because tablets weren't a valid option back then. Not sure if it's possible to check scylla version via cql session. Worst case we can try to use the tablets syntax and retry without it if it fails.

I followed what was done in worker_validate_integration_test.go and used if tablets := os.Getenv("TABLETS"); tablets == "enabled" to use tablet syntax.

@paszkow paszkow force-pushed the disable_tablets_explicitly branch 2 times, most recently from 7dff6a0 to 250e60b Compare September 24, 2025 07:24
@paszkow paszkow force-pushed the disable_tablets_explicitly branch from 250e60b to 313fd63 Compare November 24, 2025 12:06
@Michal-Leszczynski
Copy link
Collaborator

@paszkow I guess that it's annoying for you to work on this item. How urgent is it to make those changes on SM side? If it can wait for late december / early january, I could take care of that.

@paszkow
Copy link
Author

paszkow commented Nov 24, 2025

@paszkow I guess that it's annoying for you to work on this item. How urgent is it to make those changes on SM side? If it can wait for late december / early january, I could take care of that.

In fact, I was fixing the remaining failing dtests and unit tests and forgot about it for a while. It would be good to move it forward, otherwise, it will root and without it I cannot merge the PR to scylla (it will also root). It's part of getting rid of SimpleStrategy.

The sooner is done, the better. It's been hanging for a while already.

@bhalevy
Copy link
Member

bhalevy commented Dec 21, 2025

This branch has conflicts that must be resolved

@bhalevy
Copy link
Member

bhalevy commented Dec 23, 2025

@paszkow please rebase and resolve conflicts

Currently, tablets enabled by default are not enforced. Thus, creating
keyspaces that do not support tablets, silently disable them and
use vnodes.

This patch adds `AND tablets = {'enabled': false}` to explicitly
disable tablets for keyspaces using SimpleStrategy replication.

This is a preparation step for scylladb/scylladb#25342
that changes the current behaviour and forces users to explicitly
disable tablets when enabled by default.

Refs: scylladb/scylladb#25340
@paszkow paszkow force-pushed the disable_tablets_explicitly branch from 313fd63 to cdc2453 Compare December 23, 2025 10:40
@paszkow
Copy link
Author

paszkow commented Dec 23, 2025

@paszkow please rebase and resolve conflicts

Done. But I need it be be reviewed and merged. It's the last step that is needed.

@bhalevy
Copy link
Member

bhalevy commented Dec 23, 2025

Looks fine to me.

@Michal-Leszczynski @karol-kokoszka please review and consider merging

Michal-Leszczynski added a commit that referenced this pull request Dec 29, 2025
SimpleStrategy is not recommended for the production environments.
It also does not support tablets, which are the way towards scylla
is moving (in some distant time, vnodes might even be deprecated).
Because of that, it also runs into unexpected problems like
#4555.

Because of those reasons, we should switch from keeping SM data in
SimpleStrategy to NetworkTopologyStrategy keyspace by default.

For the default, single local node SM DB cluster, both strategies
result in the single node containing all of SM data. For single DC
SM DB cluster they are also equivalent. On the other hand, for multi
DC SM DB cluster (really not recommended...), NetworkTopologyStrategy
results in keeping replication_factor replicas per DC and not per cluster.
This is theoretically problematic, but such setup is also problematic for
SimpleStrategy, which would still replicate the data in all DCs, so it's
not a new problem. Moreover, this change only defines the default SM
keyspace creation. In case user has some non-default (perhaps not recommended)
SM DB setup, they can always create SM keyspace manually before starting
SM server to configure it to their liking.

Refs #4555
Michal-Leszczynski added a commit that referenced this pull request Dec 29, 2025
…n tests

Because of #4555, but also because
it's better to test recommended keyspace replication strategy, we need to replace
SimpleStrategy with NetworkTopologyStrategy in tests wherever possible.
In case test is aiming to run against SimpleStrategy specifically, it needs to be
adjusted so that it ensures that keyspace with SimpleStrategy does not use tablets.

Refs #4555
Comment on lines +61 to +62
const createKeyspaceStmt = "CREATE KEYSPACE {{.Keyspace}} WITH replication = {'class': 'SimpleStrategy', 'replication_factor': {{.ReplicationFactor}}}" +
" AND tablets = {'enabled': false}"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This unfortunately won't work with 2024.1, which is currently still supported by SM.
For this scylla version, we can't use the tablets = {'enabled': false} syntax.
Tests are passing because we don't have real e2e tests in SM repo, and those tests use different functions for creating SM keyspace. If we were to execute SCT/DTests with this change and scylla 2024.1, we would fail on SM startup.

tablets: ${{ env.tablets }}
- name: Run tests
run: make pkg-integration-test IP_FAMILY=${{ env.ip-family }} SSL_ENABLED=${{ env.ssl-enabled}} BACKUP_METHOD=${{ env.backup-method }} RESTORE_METHOD=${{ env.restore-method }} PKG=./pkg/service/backup
run: make pkg-integration-test TABLETS=${{ env.tablets }} IP_FAMILY=${{ env.ip-family }} SSL_ENABLED=${{ env.ssl-enabled}} BACKUP_METHOD=${{ env.backup-method }} RESTORE_METHOD=${{ env.restore-method }} PKG=./pkg/service/backup
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, we have a simple script for generating workflows with tests against different scylla versions. In order to modify them, .github/cfg/integration-test-core.yaml should be adjusted and go run .github/cfg/main.go should be executed.

Comment on lines +133 to +136
ksStmt := fmt.Sprintf("CREATE KEYSPACE %s WITH replication = %s", tc.name, tc.replicationStmt)
if tablets := os.Getenv("TABLETS"); tablets == "enabled" && tc.replication == scyllaclient.SimpleStrategy {
ksStmt += " AND tablets = {'enabled': false}"
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I don't really like relying on checking env variables in tests.
We do it in many places for convenience and out of laziness, because we don't really have vnode or tablet specific tests - we just run all the tests twice, once with tablets as default replication type, once with vnodes, but this is dirty and results in rerunning many tests for no reason at all (as they don't depend on tablet vs vnode).
If possible, it should be replaced with checking scylla version / default replication type.

@Michal-Leszczynski
Copy link
Collaborator

@paszkow I left 3 comments, 2 are more or less nits, but #4555 (comment) is a real one.

After spending some time on understanding the issue on the scylla side and the proposed implementation, I came to the conclusion that it would be better to move away from SimpleStrategy in favor of NetworkTopologyStrategy. I created PR with this approach and described my reasoning with more details there. Please take a look, and again, sorry for not reviewing/working on this item for a long time.

@Michal-Leszczynski
Copy link
Collaborator

In terms of testing/releasing this change, I understand that this PR was created because of a failing DTest executed as a part of scylla CI, is that right? If so, then it's a similar situation to #4578. I can try to rekick failed DTest with custom SM build and custom scylla version, as it was done then. After that, do you need some proper SM release (e.g. SM 3.8.1) containing this change, so that the scylla CI can be executed against it and pass?

Michal-Leszczynski added a commit that referenced this pull request Dec 30, 2025
SimpleStrategy is not recommended for the production environments.
It also does not support tablets, which are the way towards scylla
is moving (in some distant time, vnodes might even be deprecated).
Because of that, it also runs into unexpected problems like
#4555.

Because of those reasons, we should switch from keeping SM data in
SimpleStrategy to NetworkTopologyStrategy keyspace by default.

For the default, single local node SM DB cluster, both strategies
result in the single node containing all of SM data. For single DC
SM DB cluster they are also equivalent. On the other hand, for multi
DC SM DB cluster (really not recommended...), NetworkTopologyStrategy
results in keeping replication_factor replicas per DC and not per cluster.
This is theoretically problematic, but such setup is also problematic for
SimpleStrategy, which would still replicate the data in all DCs, so it's
not a new problem. Moreover, this change only defines the default SM
keyspace creation. In case user has some non-default (perhaps not recommended)
SM DB setup, they can always create SM keyspace manually before starting
SM server to configure it to their liking.

Refs #4555
Michal-Leszczynski added a commit that referenced this pull request Dec 30, 2025
…n tests

Because of #4555, but also because
it's better to test recommended keyspace replication strategy, we need to replace
SimpleStrategy with NetworkTopologyStrategy in tests wherever possible.
In case test is aiming to run against SimpleStrategy specifically, it needs to be
adjusted so that it ensures that keyspace with SimpleStrategy does not use tablets.

Refs #4555
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants