fix: Set first interval based on last result time #1528

Open
PythonGermany wants to merge 1 commit into TwiN:master from PythonGermany:fix-heartbeat-check-scheduling

Conversation

PythonGermany commented Feb 3, 2026

Summary

Closes #1237 by adding a check after startup that adjusts the wait before the first heartbeat check according to the time elapsed since the last received result.

Checklist

  • Tested and/or added tests to validate that the changes work as intended, if applicable.
  • Updated documentation in README.md, if applicable.

github-actions bot added the bug label Feb 3, 2026
	return
case <-ticker.C:
	executeExternalEndpointHeartbeat(ee, cfg, extraLabels)
	ticker.Reset(ee.Heartbeat.Interval)
Contributor Author

I'm not sure whether resetting the ticker like this counts as a "hack". I'd love to learn a cleaner way to do it.
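One possible cleaner shape, offered as a sketch and not as Gatus code: use a single `time.Timer` that is armed with the first delay and re-armed with the regular interval after every check, so no ticker has to be created with one period and then reset to another. The names `runHeartbeatLoop` and `countChecks` are hypothetical, and the `check` callback stands in for `executeExternalEndpointHeartbeat`.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runHeartbeatLoop is a hypothetical stand-in for the monitoring loop in this PR.
// It waits firstDelay before the first check and interval before each later
// check, using one time.Timer that is re-armed after every check.
func runHeartbeatLoop(ctx context.Context, firstDelay, interval time.Duration, check func()) {
	timer := time.NewTimer(firstDelay)
	defer timer.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-timer.C:
			check()
			// The timer has fired and its channel was just drained,
			// so calling Reset here is safe.
			timer.Reset(interval)
		}
	}
}

// countChecks runs the loop for runFor and reports how many checks executed.
// It exists only to make the sketch easy to try out.
func countChecks(firstDelay, interval, runFor time.Duration) int {
	ctx, cancel := context.WithTimeout(context.Background(), runFor)
	defer cancel()
	n := 0
	runHeartbeatLoop(ctx, firstDelay, interval, func() { n++ })
	return n
}

func main() {
	// Short durations so the example finishes quickly.
	n := countChecks(5*time.Millisecond, 20*time.Millisecond, 60*time.Millisecond)
	fmt.Println("checks executed:", n)
}
```

Because the timer is re-armed after `check` returns, the interval is measured from the end of one check to the start of the next. That matches the behaviour of calling `ticker.Reset` after the check, but it differs from a plain ticker with a fixed period.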

ticker := time.NewTicker(ee.Heartbeat.Interval)
timeToNextCheck := ee.Heartbeat.Interval

lastStatus, err := store.Get().GetEndpointStatusByKey(ee.Key(), paging.NewEndpointStatusParams().WithResults(0, 1))
Contributor Author

Is this an appropriate way to fetch the last result here, using this store method with the results limited to 1?

if results := lastStatus.Results; len(results) > 0 {
	timeSinceLastResult := time.Since(results[0].Timestamp)
	logr.Debugf("[watchdog.monitorExternalEndpointHeartbeat] Time since last result: '%s'", timeSinceLastResult)
	if timeSinceLastResult < timeToNextCheck {
Contributor Author

One question that is still open for me is what should happen when the time since the last result is longer than the configured heartbeat interval. Currently, the first check then waits for the full heartbeat interval.

Alternatively, a failure could be created immediately. However, I suspect this would often lead to false positives, especially for very short intervals.

I think I prefer the currently implemented behaviour. I just wanted to lay out the possible approaches!
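The two cases discussed above can be sketched as a small pure function. This is a minimal illustration under assumptions, not the code in this PR: the helper name `firstCheckDelay` is hypothetical, and the arithmetic (waiting for the remainder of the interval when the last result is recent) is inferred from the summary, since the body of the `if` is not shown in the diff excerpt.

```go
package main

import (
	"fmt"
	"time"
)

// firstCheckDelay returns how long to wait before the first heartbeat check.
// Hypothetical helper: the name and the exact arithmetic are assumptions.
//   - Last result is more recent than one interval: wait for the remainder.
//   - Last result is older than the interval (or its timestamp lies in the
//     future, giving a negative duration): wait a full interval, which is
//     the behaviour the comment above describes as currently implemented.
func firstCheckDelay(interval, sinceLast time.Duration) time.Duration {
	if sinceLast >= 0 && sinceLast < interval {
		return interval - sinceLast
	}
	return interval
}

func main() {
	interval := time.Minute
	fmt.Println(firstCheckDelay(interval, 20*time.Second)) // last result 20s ago
	fmt.Println(firstCheckDelay(interval, 5*time.Minute))  // last result is stale
}
```

With a one-minute interval this prints `40s` and then `1m0s`. The alternative raised above, creating a failure immediately for a stale result, would correspond to returning a zero delay in the second branch.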

PythonGermany marked this pull request as ready for review February 13, 2026 08:19

Labels

bug Something isn't working

Development

Successfully merging this pull request may close these issues.

Schedule heartbeat checks after startup sooner than the interval

1 participant