Description
Jenkins and plugins versions report
Environment
Jenkins: 2.528.3.35200
OS: Linux - 6.14.0-37-generic
Java: 21.0.9 - Red Hat, Inc. (OpenJDK 64-Bit Server VM)
---
kubernetes:4392.v19cea_fdb_5913
kubernetes-client-api:7.3.1-256.v788a_0b_787114
kubernetes-credentials:206.vde31a_b_0f71a_c
What Operating System are you using (both controller, and any agents involved in the problem)?
OS: Linux - 6.14.0-37-generic
Reproduction steps
- Configure a Kubernetes cloud with containerCap: 5
- Create a pipeline using podTemplate with idleMinutes: 5

    podTemplate(
        cloud: 'k8',
        idleMinutes: 5,
        containers: [
            containerTemplate(
                name: 'jnlp', image: 'jenkins/inbound-agent:latest-jdk17'
            )
        ]
    ) {
        node(POD_LABEL) {
            stage('Test Job') {
                sleep 5
            }
        }
    }

- Run the pipeline 5 times, waiting for each run to complete
- Restart Jenkins during the idleMinutes timeout (before the agents are deleted)
- Run the pipeline again after the restart
Expected Results
- A new agent can be provisioned
- Agents are always removed after an idle timeout, reducing the current total and making room for new agents to be created.
Actual Results
- The new agent cannot be provisioned because the current limit is reached
- After Jenkins restarts:
  - 5 nodes (jobname-uuid) exist in manage/computer/
  - the limit counter is 0
  - on the first pipeline run, the limit counter is updated to 5
  - after the idle timeout, the nodes are removed, but the limit counter is not decremented
  - new pipeline builds cannot run because the limit counter stays at 5, even though no computer nodes remain
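The sequence above can be replayed with a small toy model. This is only a sketch of the observed symptom, not the plugin's code: `CapModel`, `CapModelDemo`, and every method name are hypothetical, and the `unregister` flag simply stands in for whether the idle-timeout cleanup decrements the counter.

```java
import java.util.HashSet;
import java.util.Set;

// Replays the reported sequence against the toy model below.
class CapModelDemo {
    static String replay(boolean unregisterOnTimeout) {
        CapModel m = new CapModel(5);                      // containerCap: 5
        for (int i = 0; i < 5; i++) m.provision("agent-" + i);
        m.restart();                                       // counter 0, 5 nodes still listed
        m.recount();                                       // first run after restart: counter 5
        for (int i = 0; i < 5; i++) m.idleTimeout("agent-" + i, unregisterOnTimeout);
        boolean ok = m.provision("agent-new");
        return "nodes=" + m.nodes.size() + " counter=" + m.limitCounter + " provisioned=" + ok;
    }

    public static void main(String[] args) {
        System.out.println(replay(false)); // nodes=0 counter=5 provisioned=false
        System.out.println(replay(true));  // nodes=1 counter=1 provisioned=true
    }
}

// Toy model only: every name is hypothetical and does not mirror the plugin's classes.
class CapModel {
    final int containerCap;
    int limitCounter = 0;                      // in-memory count charged against the cap
    final Set<String> nodes = new HashSet<>(); // stands in for manage/computer/

    CapModel(int containerCap) { this.containerCap = containerCap; }

    // Refuses to provision once the counter has reached the cap.
    boolean provision(String name) {
        if (limitCounter >= containerCap) return false;
        limitCounter++;
        nodes.add(name);
        return true;
    }

    // Restart: persisted nodes survive, the in-memory counter does not.
    void restart() { limitCounter = 0; }

    // Assumed behaviour: existing nodes are re-counted on the first run after restart.
    void recount() { limitCounter = nodes.size(); }

    // Idle timeout: the node is always removed; the counter only drops if unregistration runs.
    void idleTimeout(String name, boolean unregister) {
        if (nodes.remove(name) && unregister) limitCounter--;
    }
}
```

With `unregisterOnTimeout` set to `false` the model ends in the reported state: no nodes left, counter stuck at the cap, and further provisioning refused.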
Anything else?
Possible root cause: agents lose their transient template references during reload, which causes the reaper cleanup process to skip node unregistration.
There may be other ways to reproduce this issue besides a manual restart. Other plugins or processes could also serialize and restore agents, resulting in the same problem.
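To illustrate the suspected mechanism, here is a minimal, self-contained sketch of how a `transient` field is lost in a save and restore round trip, and how a cleanup step guarded by that field can then be skipped. `AgentRecord`, `TemplateRef`, and `cleanup` are hypothetical stand-ins, not the plugin's classes. Jenkins persists nodes through XStream rather than `ObjectOutputStream`; standard Java serialization is used here only to keep the example runnable, on the assumption that transient fields are dropped in a comparable way.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {
    // Serializes and restores an agent, mimicking a save/reload cycle.
    static AgentRecord roundTrip(AgentRecord agent) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(agent);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
                return (AgentRecord) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical cleanup: returns true only if the agent was unregistered from the limit.
    static boolean cleanup(AgentRecord agent) {
        if (agent.template == null) {
            return false; // no template reference, so unregistration is skipped
        }
        return true;
    }

    public static void main(String[] args) {
        AgentRecord before = new AgentRecord("jobname-uuid", new TemplateRef("example-template"));
        AgentRecord after = roundTrip(before);
        System.out.println("before reload: unregistered=" + cleanup(before));
        System.out.println("after reload:  unregistered=" + cleanup(after));
    }
}

// Hypothetical stand-in for a pod template; deliberately not Serializable.
class TemplateRef {
    final String name;
    TemplateRef(String name) { this.name = name; }
}

// Hypothetical stand-in for a persisted agent.
class AgentRecord implements Serializable {
    private static final long serialVersionUID = 1L;
    final String nodeName;
    transient TemplateRef template; // not written out, so it is null after restore

    AgentRecord(String nodeName, TemplateRef template) {
        this.nodeName = nodeName;
        this.template = template;
    }
}
```

In this sketch the restored agent keeps its name but comes back with a null template, so `cleanup` returns `false` for it. A fix along these lines would presumably need to either re-resolve the reference after loading or unregister the node without depending on it.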
Are you interested in contributing a fix?
No response