Automated backups are broken with ScyllaDB OS >=6.0 and Enterprise >=2024.2 when ScyllaClusters are set up with AuthN #2548

Open
@rzetelskik

What happened?

As per https://manager.docs.scylladb.com/stable/sctool/backup.html#skip-schema:
For ScyllaDB versions starting at 6.0 and 2024.2, SM requires CQL credentials to back up the schema. This is the case since SM 3.3.2: scylladb/scylla-manager#4008.

We don't set up AuthN or AuthZ in our ScyllaDB Manager integration tests. When set up with AuthN and AuthZ, backup doesn't succeed: see e.g. https://prow.scylla-operator.scylladb.com/view/gs/scylla-operator-prow/pr-logs/pull/scylladb_scylla-operator/2544/pull-scylla-operator-master-e2e-gke-parallel/1899420143805534208#1:test-build-log.txt%3A2435.

What did you expect to happen?

Automated backup procedure should work when AuthN and AuthZ are configured.

How can we reproduce it (as minimally and precisely as possible)?

See https://github.com/rzetelskik/scylla-operator/compare/b3c2a3871da6095c64bacb7986f960e10af042e1..45bd57c6e7c36ba475604c1c8c80d4d71071269e.

Scylla Operator version

master

Kubernetes platform name and version

n/a

Please attach the must-gather archive.

https://gcsweb.scylla-operator.scylladb.com/gcs/scylla-operator-prow/pr-logs/pull/scylladb_scylla-operator/2544/pull-scylla-operator-master-e2e-gke-parallel/1899420143805534208/ (e2e collection failed here as CI is broken now)

Anything else we need to know?

Backups can be configured to skip schema backup. However, in that case, the restore procedure is impossible without an annoying workaround.

We have a few ways to approach this:

  1. Update the documentation with information about providing credentials to the SM server before scheduling a backup. We can't claim support for automated backups in that case though.
  2. On cluster creation, create a dedicated user for operator/manager use through the maintenance socket and provide its credentials to SM on cluster registration. This would require providing a client certificate and key to SM so that SM can use CQL over SSL (ref. "Configure scylla-manager with CQL over TLS" #2398). This initially seemed most likely impossible with unmanaged SM. Edit: it is possible, as the SM API takes the cert/key pair in bytes despite the name indicating a filepath. In this case we should also switch to using HTTPS for manager-controller to SM API communication.
  3. SM could provide a way for the agent to use the maintenance socket to describe the schema. This would require some effort from SM developers. In that case, the SM agent would effectively serve as a reverse proxy for the maintenance socket, which in the current state would be insecure with just the auth token. Edit: see Michal's comment below.

Labels: kind/bug (categorizes issue or PR as related to a bug), lifecycle/frozen (indicates that an issue or PR should not be auto-closed due to staleness).
