Description
When Azure is used as the cloud provider for backup storage, the account name is used to construct the endpoint URL. For example, if the account name were foo, the host in the endpoint URL would be foo.blob.core.windows.net. If the account name is not provided, Scylla Manager cannot construct the endpoint URL. Therefore, the account parameter is effectively mandatory.
However, if the account parameter is not set, the SM agent does not fail gracefully with a clear error message. Instead, it tries to connect to an invalid endpoint, such as https://.blob.core.windows.net.
Steps to reproduce
- On each node, ensure that the agent configuration file (/etc/scylla-manager-agent/scylla-manager-agent.yaml) does not have the account parameter configured for the Azure provider.
- Execute a backup command using an Azure bucket, like the following:
sctool backup --cluster prod-cluster --cron '35 5 * * *' --name 'daily' --location azure:prodbucket --retention 10 --dry-run
The backup should fail.
- On each node, check the systemd journal for the agent:
sudo journalctl -u scylla-manager-agent
The reason for the failure is buried in the log output and is not obvious. See below for details.
- On each node, fix the agent configuration file, so that the account parameter is configured:
azure:
  account: foo
- On each node, restart the agent:
sudo systemctl restart scylla-manager-agent
- Try the backup again:
sctool backup --cluster prod-cluster --cron '35 5 * * *' --name 'daily' --location azure:prodbucket --retention 10 --dry-run
This time, the backup should succeed.
Expected results
The agent should provide a clear error message that the account name has not been configured, so the endpoint URL can't be constructed. For example, something like this:
The Azure account parameter is empty or not set, so the endpoint URL can't be constructed. Set it in /etc/scylla-manager-agent/scylla-manager-agent.yaml and restart the agent.
Actual results
Instead of failing with a clear error message, the backup says that the location is inaccessible:
$ sctool backup --cluster prod-cluster --cron '35 5 * * *' --name 'daily' --location azure:prodbucket --retention 10 --dry-run
NOTICE: this may take a while, we are performing disk size calculations on the nodes
Error: get backup target: location is not accessible
10.175.0.4: giving up after 2 attempts: after 30s: context deadline exceeded
Trace ID: 2_bjbc54Rm6BBvzDIWvV9w (grep in scylla-manager logs)
And the systemd journal on the agent shows a long stack trace with the reason for the failure hidden in it:
{"L":"ERROR","T":"2024-12-14T18:26:29.444Z","N":"rclone","M":"scylla-manager-agent-3178303808: Failed to Mkdir: -> github.com/Azure/azure-pipeline-go/pipeline.NewError, github.com/Azure/[email protected]/pipeline/error.go:157\nHTTP request failed\n\nPut \"https://.blob.core.windows.net/prodbucket?restype=container&timeout=31536001\": context canceled\n","S":"github.com/scylladb/go-log.Logger.log\n\tgithub.com/scylladb/[email protected]/logger.go:101\ngithub.com/scylladb/go-log.Logger.Error\n\tgithub.com/scylladb/[email protected]/logger.go:84\nmain.(*server).init.RedirectLogPrint.func1\n\tgithub.com/scylladb/scylla-manager/v3/pkg/rclone/logger.go:19\ngithub.com/rclone/rclone/fs.LogPrintf\n\tgithub.com/rclone/[email protected]/fs/log.go:152\ngithub.com/rclone/rclone/fs.Errorf\n\tgithub.com/rclone/[email protected]/fs/log.go:167\ngithub.com/rclone/rclone/fs/sync.copyEmptyDirectories\n\tgithub.com/rclone/[email protected]/fs/sync/sync.go:623\ngithub.com/rclone/rclone/fs/sync.(*syncCopyMove).run\n\tgithub.com/rclone/[email protected]/fs/sync/sync.go:903\ngithub.com/rclone/rclone/fs/sync.runSyncCopyMove\n\tgithub.com/rclone/[email protected]/fs/sync/sync.go:1115\ngithub.com/rclone/rclone/fs/sync.CopyDir\n\tgithub.com/rclone/[email protected]/fs/sync/sync.go:1126\ngithub.com/scylladb/scylla-manager/v3/pkg/rclone/operations.CheckPermissions\n\tgithub.com/scylladb/scylla-manager/v3/pkg/rclone/operations/operations.go:92\ngithub.com/scylladb/scylla-manager/v3/pkg/rclone/rcserver.rcCheckPermissions\n\tgithub.com/scylladb/scylla-manager/v3/pkg/rclone/rcserver/rc.go:414\ngithub.com/scylladb/scylla-manager/v3/pkg/rclone/rcserver.Server.ServeHTTP\n\tgithub.com/scylladb/scylla-manager/v3/pkg/rclone/rcserver/rcserver.go:260\nmain.newAgentHandler.StripPrefix.func7\n\tnet/http/server.go:2214\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2171\ngithub.com/go-chi/chi/v5.(*Mux).Mount.func1\n\tgithub.com/go-chi/chi/[email 
protected]/mux.go:327\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2171\ngithub.com/go-chi/chi/v5.(*Mux).routeHTTP\n\tgithub.com/go-chi/chi/[email protected]/mux.go:459\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2171\ngithub.com/go-chi/chi/v5.(*Mux).ServeHTTP\n\tgithub.com/go-chi/chi/[email protected]/mux.go:73\ngithub.com/go-chi/chi/v5.(*Mux).Mount.func1\n\tgithub.com/go-chi/chi/[email protected]/mux.go:327\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2171\nmain.newRouter.ValidateToken.func3.1\n\tgithub.com/scylladb/scylla-manager/v3/pkg/auth/auth.go:50\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2171\ngithub.com/go-chi/chi/v5.(*ChainHandler).ServeHTTP\n\tgithub.com/go-chi/chi/[email protected]/chain.go:31\ngithub.com/go-chi/chi/v5.(*Mux).routeHTTP\n\tgithub.com/go-chi/chi/[email protected]/mux.go:459\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2171\nmain.newRouter.RequestLogger.RequestLogger.func5.1\n\tgithub.com/go-chi/chi/[email protected]/middleware/logger.go:55\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2171\ngithub.com/go-chi/chi/v5.(*Mux).ServeHTTP\n\tgithub.com/go-chi/chi/[email protected]/mux.go:90\nnet/http.serverHandler.ServeHTTP\n\tnet/http/server.go:3142\nnet/http.(*conn).serve\n\tnet/http/server.go:2044"}
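Until the agent reports the problem directly, the symptom can be spotted mechanically. The following Go sketch decodes a JSON log line of the shape shown above (using only its "L" and "M" fields) and reports whether the message contains a URL whose host starts with a dot, which is what an empty account name produces. The helper name `emptyAccountHost` is hypothetical, and `sampleLine` is a shortened stand-in for the real log line, not actual agent output.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strings"
)

// logEntry maps the two fields of the agent's JSON log line used here.
type logEntry struct {
	Level   string `json:"L"`
	Message string `json:"M"`
}

// urlHost captures the host part of the first https URL in a message.
var urlHost = regexp.MustCompile(`https://([^/\s"]*)`)

// sampleLine is a shortened, illustrative version of the log line above.
const sampleLine = `{"L":"ERROR","N":"rclone","M":"Failed to Mkdir: HTTP request failed\n\nPut \"https://.blob.core.windows.net/prodbucket?restype=container\": context canceled\n"}`

// emptyAccountHost (hypothetical helper) returns the host and true when the
// logged URL has a host beginning with ".", meaning the account was empty.
func emptyAccountHost(line string) (string, bool) {
	var e logEntry
	if err := json.Unmarshal([]byte(line), &e); err != nil {
		return "", false
	}
	m := urlHost.FindStringSubmatch(e.Message)
	if m == nil || !strings.HasPrefix(m[1], ".") {
		return "", false
	}
	return m[1], true
}

func main() {
	if host, ok := emptyAccountHost(sampleLine); ok {
		fmt.Printf("endpoint host %q has no account name; check the azure account setting\n", host)
	}
}
```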