Ephemeral GPU runners on Modal for GitHub Actions. Launches a Modal Sandbox with the GitHub Actions runner binary, using JIT (just-in-time) runner config for zero-config ephemeral registration.
How it works:
- `launch.py` generates a unique runner label, requests a JIT config from the GitHub API, builds a Modal image with the runner binary baked in, and creates a Sandbox with the requested GPU
- The Sandbox runs `run.sh --jitconfig ...`, which connects to GitHub and picks up jobs matching the label
- Downstream GHA jobs use `runs-on: ${{ needs.modal.outputs.id }}` to target the runner
- The runner self-terminates after the job completes (JIT runners are inherently ephemeral)
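The registration step above can be sketched in plain Python. This is a minimal illustration, not the repo's actual code: the helper names (`make_runner_label`, `make_jit_payload`) are invented here, and the payload mirrors the request body of GitHub's `POST /repos/{owner}/{repo}/actions/runners/generate-jitconfig` endpoint, whose response carries the `encoded_jit_config` handed to `run.sh --jitconfig ...`.

```python
import secrets

def make_runner_label() -> str:
    # Unique label so only the intended downstream job targets this runner
    return f"modal-{secrets.token_hex(4)}"

def make_jit_payload(label: str) -> dict:
    # Request body for POST /repos/{owner}/{repo}/actions/runners/generate-jitconfig
    return {
        "name": label,
        "runner_group_id": 1,   # the default self-hosted runner group
        "labels": [label],
        "work_folder": "_work",
    }

label = make_runner_label()
payload = make_jit_payload(label)
# POSTing `payload` (authenticated with the PAT) returns `encoded_jit_config`,
# which the Sandbox passes to `run.sh --jitconfig ...`
```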
Files:
- `src/modal_gha/launch.py` — main entrypoint (run via `modal run`)
- `.github/workflows/runner.yml` — reusable workflow (callers use `workflow_call`)
- `.github/workflows/e2e-test.yml` — self-test: launches a T4 runner, runs `nvidia-smi`
Supported GPUs: T4, L4, A10G, L40S, A100, H100
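Since the GPU type arrives as a workflow input string, it is worth failing fast on unsupported values before spending a Sandbox launch. A minimal sketch (the `normalize_gpu` helper is illustrative, not from the repo):

```python
# GPU types the reusable workflow accepts, per the list above
SUPPORTED_GPUS = {"T4", "L4", "A10G", "L40S", "A100", "H100"}

def normalize_gpu(gpu: str) -> str:
    # Accept case-insensitive, padded input; reject unknown GPU types early
    g = gpu.strip().upper()
    if g not in SUPPORTED_GPUS:
        raise ValueError(f"unsupported GPU {gpu!r}; choose one of {sorted(SUPPORTED_GPUS)}")
    return g
```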
Required secrets:
- `GH_SA_TOKEN` — GitHub PAT with repo admin scope (for JIT runner registration)
- `MODAL_TOKEN_ID` / `MODAL_TOKEN_SECRET` — Modal API credentials
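A missing credential otherwise surfaces as an opaque API error mid-launch, so checking all three up front gives a clearer failure. A small sketch (the `missing_secrets` helper is hypothetical, not part of the repo):

```python
import os

# The three credentials the workflow expects in the environment
REQUIRED = ("GH_SA_TOKEN", "MODAL_TOKEN_ID", "MODAL_TOKEN_SECRET")

def missing_secrets(env=os.environ) -> list[str]:
    # Report every absent or empty credential so the error is actionable
    return [k for k in REQUIRED if not env.get(k)]
```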
Example usage:

```yaml
jobs:
  modal:
    uses: Open-Athena/modal-gha/.github/workflows/runner.yml@main
    secrets:
      GH_SA_TOKEN: ${{ secrets.GH_SA_TOKEN }}
      MODAL_TOKEN_ID: ${{ secrets.MODAL_TOKEN_ID }}
      MODAL_TOKEN_SECRET: ${{ secrets.MODAL_TOKEN_SECRET }}
    with:
      gpu: "T4"
      timeout: "30"

  my-job:
    needs: modal
    runs-on: ${{ needs.modal.outputs.id }}
    steps:
      - run: nvidia-smi
```

Related projects:
- ec2-gha — Same pattern on AWS EC2
- lambda-gha — Same pattern on Lambda Labs
- cloud-gha — Unified dispatch layer (WIP) across providers