Closed
Description
Checks
- I've already read https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners-with-actions-runner-controller/troubleshooting-actions-runner-controller-errors and I'm sure my issue is not covered in the troubleshooting guide.
- I am using charts that are officially provided
Controller Version
0.10.1
Deployment Method
Helm
Checks
- This isn't a question or user support case (For Q&A and community support, go to Discussions).
- I've read the Changelog before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes
To Reproduce
Start workflow jobs that run on the runner scale set.
Describe the bug
Sometimes, the runner pods continue running in zombie mode after completing their jobs.
Describe the expected behavior
Runner pods should be terminated after job completion.
Additional Context
gha-runner-scale-set-controller:
  enabled: true
  flags:
    logLevel: "warn"
  podLabels:
    finops.company.net/cloud_provider: gcp
    finops.company.net/cost_center: compute
    finops.company.net/product: tools
    finops.company.net/service: actions-runner-controller
    finops.company.net/region: europe-west1
  replicaCount: 3
  podAnnotations:
    ad.datadoghq.com/manager.checks: |
      {
        "openmetrics": {
          "instances": [
            {
              "openmetrics_endpoint": "http://%%host%%:8080/metrics",
              "histogram_buckets_as_distributions": true,
              "namespace": "actions-runner-system",
              "metrics": [".*"]
            }
          ]
        }
      }
  metrics:
    controllerManagerAddr: ":8080"
    listenerAddr: ":8080"
    listenerEndpoint: "/metrics"
gha-runner-scale-set:
  enabled: true
  githubConfigUrl: https://github.com/company
  githubConfigSecret:
    github_token: <path:secret/github_token/actions_runner_controller#token>
  maxRunners: 100
  minRunners: 1
  containerMode:
    type: "dind"  ## type can be set to dind or kubernetes
  listenerTemplate:
    metadata:
      labels:
        finops.company.net/cloud_provider: gcp
        finops.company.net/cost_center: compute
        finops.company.net/product: tools
        finops.company.net/service: actions-runner-controller
        finops.company.net/region: europe-west1
      annotations:
        ad.datadoghq.com/listener.checks: |
          {
            "openmetrics": {
              "instances": [
                {
                  "openmetrics_endpoint": "http://%%host%%:8080/metrics",
                  "histogram_buckets_as_distributions": true,
                  "namespace": "actions-runner-system",
                  "max_returned_metrics": 6000,
                  "metrics": [".*"],
                  "exclude_metrics": [
                    "gha_job_startup_duration_seconds",
                    "gha_job_execution_duration_seconds"
                  ],
                  "exclude_labels": [
                    "enterprise",
                    "event_name",
                    "job_name",
                    "job_result",
                    "job_workflow_ref",
                    "organization",
                    "repository",
                    "runner_name"
                  ]
                }
              ]
            }
          }
    spec:
      containers:
        - name: listener
          securityContext:
            runAsUser: 1000
  template:
    metadata:
      labels:
        finops.company.net/cloud_provider: gcp
        finops.company.net/cost_center: compute
        finops.company.net/product: tools
        finops.company.net/service: actions-runner-controller
        finops.company.net/region: europe-west1
    spec:
      restartPolicy: OnFailure
      imagePullSecrets:
        - name: company-prod-registry
      containers:
        - name: runner
          image: eu.gcr.io/company-production/devex/gha-runners:v1.0.0-snapshot5
          command: ["/home/runner/run.sh"]
  controllerServiceAccount:
    namespace: actions-runner-system
    name: actions-runner-controller-gha-rs-controller
Controller Logs
Date,Host,Service,Message
"2025-01-29T15:16:06.017Z","""node_name""","""manager""","Ephemeral runner container is still running"
"2025-01-29T15:15:52.677Z","""node_name""","""manager""","Ephemeral runner container is still running"
"2025-01-29T15:15:52.671Z","""node_name""","""manager""","Updated ephemeral runner status with pod phase"
"2025-01-29T15:15:52.657Z","""node_name""","""manager""","Updating ephemeral runner status with pod phase"
"2025-01-29T15:15:52.657Z","""node_name""","""manager""","Ephemeral runner container is still running"
"2025-01-29T15:15:51.652Z","""node_name""","""manager""","Ephemeral runner container is still running"
"2025-01-29T15:15:49.690Z","""node_name""","""manager""","Ephemeral runner container is still running"
"2025-01-29T15:15:48.461Z","""node_name""","""manager""","Ephemeral runner container is still running"
"2025-01-29T15:15:48.456Z","""node_name""","""manager""","Updated ephemeral runner status with pod phase"
"2025-01-29T15:15:48.440Z","""node_name""","""manager""","Updating ephemeral runner status with pod phase"
"2025-01-29T15:15:48.440Z","""node_name""","""manager""","Ephemeral runner container is still running"
"2025-01-29T15:15:48.424Z","""node_name""","""manager""","Waiting for runner container status to be available"
"2025-01-29T15:15:48.399Z","""node_name""","""manager""","Created ephemeral runner pod"
"2025-01-29T15:15:48.367Z","""node_name""","""manager""","Created new pod spec for ephemeral runner"
"2025-01-29T15:15:48.366Z","""node_name""","""manager""","Creating new pod for ephemeral runner"
"2025-01-29T15:15:48.366Z","""node_name""","""manager""","Creating new EphemeralRunner pod."
"2025-01-29T15:15:48.361Z","""node_name""","""manager""","Created ephemeral runner secret"
"2025-01-29T15:15:48.313Z","""node_name""","""manager""","Created new secret spec for ephemeral runner"
"2025-01-29T15:15:48.313Z","""node_name""","""manager""","Creating new secret for ephemeral runner"
"2025-01-29T15:15:48.313Z","""node_name""","""manager""","Creating new ephemeral runner secret for jitconfig."
"2025-01-29T15:15:48.308Z","""node_name""","""manager""","Updated ephemeral runner status with runnerId and runnerJITConfig"
"2025-01-29T15:15:48.294Z","""node_name""","""manager""","Updating ephemeral runner status with runnerId and runnerJITConfig"
"2025-01-29T15:15:48.294Z","""node_name""","""manager""","Created ephemeral runner JIT config"
"2025-01-29T15:15:48.093Z","""node_name""","""manager""","Creating ephemeral runner JIT config"
"2025-01-29T15:15:48.093Z","""node_name""","""manager""","Creating new ephemeral runner registration and updating status with runner config"
"2025-01-29T15:15:48.093Z","""node_name""","""manager""","Successfully added runner registration finalizer"
"2025-01-29T15:15:48.076Z","""node_name""","""manager""","Adding runner registration finalizer"
"2025-01-29T15:15:48.076Z","""node_name""","""manager""","Successfully added finalizer"
"2025-01-29T15:15:48.059Z","""node_name""","""manager""","Adding finalizer"
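Note that the export above is newest first, so the reconcile sequence reads bottom to top. For anyone triaging from it, here is a minimal Python sketch that reorders the rows chronologically and counts how often each message repeats. It assumes the export keeps the Date,Host,Service,Message header and the timestamp format shown above. The names `summarize_controller_logs` and `SAMPLE` are hypothetical helpers for illustration, not part of ARC.

```python
import csv
import io
from collections import Counter
from datetime import datetime


def summarize_controller_logs(csv_text):
    """Parse a Date,Host,Service,Message export.

    Returns (rows sorted oldest first, Counter of messages).
    Hypothetical helper for triage; not part of ARC itself.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        # Timestamps in the export look like 2025-01-29T15:16:06.017Z
        ts = datetime.strptime(row["Date"], "%Y-%m-%dT%H:%M:%S.%fZ")
        rows.append((ts, row["Message"]))
    # The export is newest first; sort so the sequence reads top to bottom.
    rows.sort(key=lambda r: r[0])
    counts = Counter(msg for _, msg in rows)
    return rows, counts


# A three-row excerpt of the export above, used only as sample input.
SAMPLE = '''Date,Host,Service,Message
"2025-01-29T15:16:06.017Z","""node_name""","""manager""","Ephemeral runner container is still running"
"2025-01-29T15:15:52.677Z","""node_name""","""manager""","Ephemeral runner container is still running"
"2025-01-29T15:15:48.399Z","""node_name""","""manager""","Created ephemeral runner pod"
'''

if __name__ == "__main__":
    rows, counts = summarize_controller_logs(SAMPLE)
    for ts, msg in rows:
        print(ts.isoformat(timespec="milliseconds"), msg)
    for msg, n in counts.most_common():
        print(n, msg)
```

Running this against the full export should make it easier to see how long the controller keeps reporting "Ephemeral runner container is still running" after the pod was created.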
Runner Pod Logs
https://gist.github.com/julien-michaud/ce2a1e5c5d494d89e09453f0b270a26f