Skip to content

Conversation

@Michal-Leszczynski
Copy link
Collaborator

This PR contains complete implementation of sctool start --enable --soft features (#4666).
Alongside adjusting managerclient and sctool commands, it also adds more test coverage (api integration tests are not a part of our gh CI, but they pass locally).
We need to merge those changes in a few PRs (changes to different SM submodules) - this PR uses go work as a workaround.
This PR serves as the overview of all changes and as a way of executing tests early.

Up to this point, it was possible to set no continue only as a part
of starting the task. In some cases, when we know, that the task can't
be continued, it's also useful to set no continue for the next task
activation without starting it. One example is when we stopped the
backup task and deleted all snapshots from node's disks. Since current
no continue implementation relies on timestamp comparison, I added
additional 'force' no continue param which allows to skip it if needed.

Refs #4647
It is supposed to mirror the functionality of disabling the task.
Currently, we disable the task with 'sctool stop --disable', but
we enable it with `sctool <task-type> update --enable`.
With this param, it will be possible to enable task with
'sctool start --enable' as well.

Refs #4647
It allows for starting the task only when it skipped
its activation. It is useful for disabling and enabling
tasks without worrying about missed activations.

Refs #4647
There are multiple approaches to soft start implementation.
Initially I wanted to base the timestamp comparison on the
LastError and LastSuccess timestamps, but it seemed less clear
since those values don't take activations ended with a pause or
running out of maintenance window into consideration.
Instead, I focused on analyzing just the last task run (if any).

If the last task run finished successfully, we check if we missed
any activations since that time. Otherwise, we always start it.

Those rules are rather simple and easy to understand,
but note that we will always start new (without any previous executions) tasks.
We lack sufficient information to check if new task missed its activation.

We could check it by fetching its `WRITETIME(enabled)` from SM DB, but this
just adds more lousy timestamps checks and runs into problems with driver/server
side timestamps and the fact that we might "touch" this column when updating others
ones by accident. Because of that, I abandoned this idea.

In the end, starting new task even if it hasn't missed its activation shouldn't
be a big deal (it does not break backup retention) and it's a one time per task thing.
Moreover, it's a safer option than never soft starting it, because we might run into
a situation where the chain of `sctool stop --disable` and `sctool start --enable --no-continue --soft`
results in never starting this task at all.

Refs #4647
This method can be used for starting the task with the new
'enabled' and 'soft' params. Since we expose managerclient
pkg to public, we can't simply add those params as StartTask
method arguments, because this would break backward compatibility.

Refs #4647
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants