
ci: auto-rerun daily pipeline on runner allocation failure #183

Merged
costajohnt merged 1 commit into main from auto-rerun-runner-allocation-failure
May 7, 2026


@costajohnt
Owner

Summary

  • Adds .github/workflows/auto-rerun-allocation-failure.yml which listens for completed runs of the Daily Trading Pipeline and reruns failed jobs once if the failure annotation matches GitHub's runner-pool allocation message ("not acquired by Runner of type hosted...").
  • Other failure modes (test failures, script errors, real bugs) are left alone so they continue to surface normally.
  • Guarded to run_attempt == 1 to prevent rerun loops if the rerun itself fails.
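
The three conditions above can be sketched as a single decision function. This is a minimal illustration in Python, not the workflow's actual implementation: `CompletedRun` and `should_rerun` are hypothetical names, and it assumes the run's conclusion, attempt number, and job annotation messages have already been fetched.

```python
from dataclasses import dataclass

# Substring of the annotation quoted in this PR description.
ALLOCATION_FAILURE_MARKER = "not acquired by Runner of type hosted"


@dataclass(frozen=True)
class CompletedRun:
    """Hypothetical container for the fields the description relies on."""
    conclusion: str               # e.g. "failure" or "success"
    run_attempt: int              # 1 for the first attempt
    annotations: tuple[str, ...]  # annotation messages from the run's jobs


def should_rerun(run: CompletedRun) -> bool:
    """Return True only for a first-attempt runner allocation failure."""
    # Only failed runs are candidates.
    if run.conclusion != "failure":
        return False
    # Guard against rerun loops: never rerun a run that is already a rerun.
    if run.run_attempt != 1:
        return False
    # Rerun only when an annotation matches the allocation message;
    # test failures and script errors carry other messages and are skipped.
    return any(ALLOCATION_FAILURE_MARKER in msg for msg in run.annotations)
```

With this shape, a failed first attempt carrying the allocation annotation is rerun, while the same failure on attempt 2, or a first-attempt failure with an unrelated annotation, is left alone.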

Context

Run #25385462949 failed at 15m4s with the annotation "The job was not acquired by Runner of type hosted even after multiple attempts". This indicates a transient GitHub infrastructure incident in which the hosted runner pool never picked up the job. The pipeline self-recovered on the next scheduled tick, but the run for the missed slot was lost. This workflow handles that class of failure automatically.

Verified the detection logic against run #25385462949: workflow_run conclusion is failure, run_attempt is 1, and the matching annotation is present on the job.

Test plan

  • Confirm workflow appears under Actions tab after merge
  • On next runner allocation failure (rare), verify a rerun is triggered automatically
  • Verify normal pipeline failures (e.g. a test failure) do NOT trigger a rerun

When GitHub's hosted runner pool fails to acquire a runner (job ends
with the "not acquired by Runner of type hosted" annotation), retry
the failed jobs once. Other failure modes are left alone so real
breakage still surfaces.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
@costajohnt force-pushed the auto-rerun-runner-allocation-failure branch from 6a7c7b3 to 7e059d1 May 7, 2026 01:03
@costajohnt merged commit a90ae33 into main May 7, 2026
7 checks passed
@costajohnt deleted the auto-rerun-runner-allocation-failure branch May 7, 2026 01:05