Don't skip cron activations #4309

Open
@Michal-Leszczynski

Currently, when a task finishes running, its next activation is calculated relative to the time the task ended, not the time it started.
This results in rare but problematic scenarios such as:

  • a repair is scheduled to run once a week
  • the repair starts and runs for a week + 1h
  • the next repair activation takes place a week - 1h after the run ends, meaning that the activation which fell due during the run is skipped and one week's repair is lost

This scenario was observed in https://github.com/scylladb/scylla-enterprise/issues/5281

These are the places in the code where the next task activation is calculated:

func (s *Scheduler[K]) asyncRun(ctx *RunContext[K]) {
	s.listener.OnRunStart(ctx)
	s.wg.Add(1)
	go func(ctx *RunContext[K]) {
		defer s.wg.Done()
		ctx.err = s.run(*ctx)
		s.onRunEnd(ctx)
		s.reschedule(ctx)
	}(ctx)
}

now := s.now()
if d.Location != nil {
	now = now.In(d.Location)
}
next := d.Trigger.Next(now)
var (
	retno int8
	p     Properties
)
switch {
case shouldContinue(ctx):
	next = now
	retno = ctx.Retry
	p = ctx.Properties
case shouldRetry(ctx, ctx.err):
	if d.Backoff != nil {
		if b := d.Backoff.NextBackOff(); b != retry.Stop {
			next = now.Add(b)
			retno = ctx.Retry + 1
			p = ctx.Properties
			s.listener.OnRetryBackoff(ctx, key, b, retno)
		}
	}
default:
	if d.Backoff != nil {
		d.Backoff.Reset()
	}
}
s.scheduleLocked(ctx, key, next, retno, p, d.Window)

To fix this, SM should calculate the next task activation relative to the task start time.
