Runner heartbeat and monitoring #144
Merged
danielholanda merged 5 commits into main on Mar 16, 2026
… runner names is approved by IT
… to monitor the results of the heartbeat workflow
danielholanda approved these changes on Mar 16, 2026
This PR adds two workflows to the repo:
- runner-heartbeat.yml: pings each self-hosted runner to check that 1) the runner is still alive and 2) a workflow has run on it within the last 14 days. GitHub automatically removes a self-hosted runner that has not connected to GitHub Actions for more than 14 days. At the end of the workflow, an artifact is uploaded for each runner that responded successfully.
- monitor-runners.yml: looks for the uploaded artifacts. If any runner's artifact is missing, a Teams message is sent to notify the team that a runner needs to be checked.

The list of runners must be updated manually whenever a new runner is added.
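The check that monitor-runners.yml performs amounts to a set difference between the expected runners and the runners that uploaded an artifact. A minimal Python sketch of that comparison follows. The `heartbeat-` artifact prefix, the runner names, and the `{"text": ...}` message shape are assumptions made for illustration and are not taken from this PR.

```python
from typing import Iterable, List, Optional

# Hypothetical naming convention: one artifact per runner, "heartbeat-<runner name>".
ARTIFACT_PREFIX = "heartbeat-"


def missing_runners(expected: Iterable[str], artifact_names: Iterable[str]) -> List[str]:
    """Return the expected runners that have no heartbeat artifact, in input order."""
    seen = {
        name[len(ARTIFACT_PREFIX):]
        for name in artifact_names
        if name.startswith(ARTIFACT_PREFIX)
    }
    return [runner for runner in expected if runner not in seen]


def build_alert(missing: List[str]) -> Optional[dict]:
    """Build a simple text payload for a chat notification, or None if all runners reported."""
    if not missing:
        return None
    return {"text": "Runner heartbeat missing for: " + ", ".join(missing)}


if __name__ == "__main__":
    # Hypothetical runner list; in this PR the list is maintained by hand.
    expected = ["runner-a", "runner-b", "runner-c"]
    artifacts = ["heartbeat-runner-a", "heartbeat-runner-c"]
    print(build_alert(missing_runners(expected, artifacts)))
```

Keeping the comparison separate from the notification step makes it easy to check the logic locally without sending a message.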
Future work: once the IT/AMD GitHub admin approves the token, the workflows should be updated to automatically pull the list of available self-hosted runners to test.
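One possible shape for that future step, sketched in Python under stated assumptions: the runners are registered at the repository level, and the approved token is allowed to call GitHub's "list self-hosted runners for a repository" REST endpoint. If the runners are registered at the organization level, the organization endpoint would be needed instead. The function names and the `owner`, `repo`, and `token` parameters are hypothetical.

```python
import json
import urllib.request
from typing import List


def fetch_runners(owner: str, repo: str, token: str) -> dict:
    """Fetch the first page (up to 100) of self-hosted runners registered for a repository."""
    url = f"https://api.github.com/repos/{owner}/{repo}/actions/runners?per_page=100"
    request = urllib.request.Request(
        url,
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def runner_names(payload: dict) -> List[str]:
    """Extract runner names from the API response, sorted for a stable order."""
    return sorted(runner["name"] for runner in payload.get("runners", []))
```

Offline runners are deliberately kept in the result, since those are the ones the heartbeat is meant to catch. A repository with more than 100 runners would also need pagination, which this sketch omits.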