Commit 82cd4c5

feat: add job timeout enforcer policy
This policy ensures that Kubernetes Jobs have appropriate timeout values set through activeDeadlineSeconds to prevent indefinite resource consumption.

Key features:
- Enforces activeDeadlineSeconds between 1 hour and 24 hours
- Prevents Jobs from running indefinitely
- Includes comprehensive Chainsaw tests
- Helps with resource management and cost optimization

Signed-off-by: Karthik babu Manam <[email protected]>
1 parent 775f2ff commit 82cd4c5
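The bounds in the commit message correspond to 3600 seconds (1 hour) and 86400 seconds (24 hours). The following is a minimal sketch, not the committed policy, of how a Kyverno validate rule could express that range. It reuses the rule name and message from the removed ClusterPolicy in this diff. One assumption worth checking against the Kyverno documentation: the documented pattern operators use a single `&` for logical AND, so the `&&` form in the removed pattern may not evaluate as intended.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: job-timeout-enforcer
spec:
  validationFailureAction: Audit
  background: true
  rules:
  - name: validate-job-timeout
    match:
      any:
      - resources:
          kinds:
          - Job
    validate:
      message: "Jobs must specify activeDeadlineSeconds between 3600 (1 hour) and 86400 (24 hours)"
      pattern:
        spec:
          # Assumption: single "&" is Kyverno's logical AND in patterns.
          # The field has no anchor, so it must be present and within range.
          activeDeadlineSeconds: ">=3600 & <=86400"
```

With `validationFailureAction: Audit`, violations are reported rather than blocked; rejecting non-compliant Jobs at admission would require `Enforce`.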

5 files changed: 81 additions, 66 deletions

Lines changed: 12 additions & 14 deletions
@@ -1,30 +1,28 @@
+---
 name: job-timeout-enforcer
 version: 1.0.0
 displayName: Enforce Job Timeouts
-createdAt: "2024-03-19T00:00:00.000Z"
+createdAt: "2024-03-20T00:00:00.000Z"
 description: >-
   Jobs without timeouts can run indefinitely, consuming cluster resources and potentially
   indicating stuck workloads. This policy ensures all Jobs have an activeDeadlineSeconds
-  set with a reasonable timeout value between 1 hour and 24 hours. This helps prevent
-  resource leaks and identifies stuck Jobs early.
+  set with a reasonable timeout value between 1 hour and 24 hours.
 install: |-
-  ```shell
-  kubectl apply -f https://raw.githubusercontent.com/kyverno/policies/main/resource-lifecycle/job-timeout-enforcer/job-timeout-enforcer.yaml
+  ```sh
+  kubectl apply -f https://raw.githubusercontent.com/kyverno/policies/main/job-timeout-enforcer/job-timeout-enforcer.yaml
   ```
 keywords:
-  - kyverno
-  - resource lifecycle
   - job
   - timeout
+  - resource management
 readme: |
+  # Enforce Job Timeouts
+
   Jobs without timeouts can run indefinitely, consuming cluster resources and potentially
   indicating stuck workloads. This policy ensures all Jobs have an activeDeadlineSeconds
-  set with a reasonable timeout value between 1 hour and 24 hours. This helps prevent
-  resource leaks and identifies stuck Jobs early.
-
-  Refer to the documentation for more details on Kyverno annotations: https://artifacthub.io/docs/topics/annotations/kyverno/
+  set with a reasonable timeout value between 1 hour and 24 hours.
 annotations:
-  kyverno/category: "Resource Lifecycle"
+  kyverno/category: Resource Management
+  kyverno/severity: medium
+  kyverno/subject: Job
   kyverno/kubernetesVersion: "1.23-1.28"
-  kyverno/subject: "Job"
-digest: b247e22ba7353f3e2fcdc09cdbf158fcb5bb92bd897cefb006b781593a9fd337 # sha256sum job-timeout-enforcer.yaml
Lines changed: 26 additions & 30 deletions
@@ -1,32 +1,28 @@
-apiVersion: kyverno.io/v1
-kind: ClusterPolicy
+apiVersion: batch/v1
+kind: Job
 metadata:
-  name: job-timeout-enforcer
-  annotations:
-    policies.kyverno.io/title: Enforce Job Timeouts
-    policies.kyverno.io/category: Resource Lifecycle
-    policies.kyverno.io/severity: medium
-    policies.kyverno.io/subject: Job
-    policies.kyverno.io/description: >-
-      Jobs without timeouts can run indefinitely, consuming cluster resources and potentially
-      indicating stuck workloads. This policy ensures all Jobs have an activeDeadlineSeconds
-      set with a reasonable timeout value between 1 hour and 24 hours. This helps prevent
-      resource leaks and identifies stuck Jobs early.
-    kyverno.io/kyverno-version: 1.6.0
-    policies.kyverno.io/minversion: 1.6.0
-    kyverno.io/kubernetes-version: "1.23-1.28"
+  name: invalid-job-no-timeout
+  namespace: default
 spec:
-  validationFailureAction: Audit
-  background: true
-  rules:
-  - name: validate-job-timeout
-    match:
-      any:
-      - resources:
-          kinds:
-          - Job
-    validate:
-      message: "Jobs must specify activeDeadlineSeconds between 3600 (1 hour) and 86400 (24 hours)"
-      pattern:
-        spec:
-          activeDeadlineSeconds: ">= 3600 && <= 86400"
+  template:
+    spec:
+      containers:
+      - name: pi
+        image: perl:5.34.0
+        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
+      restartPolicy: Never
+---
+apiVersion: batch/v1
+kind: Job
+metadata:
+  name: invalid-job-too-short
+  namespace: default
+spec:
+  template:
+    spec:
+      containers:
+      - name: pi
+        image: perl:5.34.0
+        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
+      restartPolicy: Never
+  activeDeadlineSeconds: 1800

job-timeout-enforcer/test.yaml renamed to job-timeout-enforcer/test/resources/invalid-job.yaml

Lines changed: 5 additions & 22 deletions
@@ -1,24 +1,8 @@
-# Valid job
-apiVersion: batch/v1
-kind: Job
-metadata:
-  name: valid-job
-spec:
-  activeDeadlineSeconds: 7200 # 2 hours
-  template:
-    spec:
-      containers:
-      - name: pi
-        image: perl:5.34.0
-        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
-      restartPolicy: Never
-
----
-# Invalid job (no timeout)
 apiVersion: batch/v1
 kind: Job
 metadata:
   name: invalid-job-no-timeout
+  namespace: default
 spec:
   template:
     spec:
@@ -27,19 +11,18 @@ spec:
         image: perl:5.34.0
         command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
       restartPolicy: Never
-
 ---
-# Invalid job (timeout too long)
 apiVersion: batch/v1
 kind: Job
 metadata:
-  name: invalid-job-long-timeout
+  name: invalid-job-too-short
+  namespace: default
 spec:
-  activeDeadlineSeconds: 100000
   template:
     spec:
       containers:
       - name: pi
         image: perl:5.34.0
         command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
-      restartPolicy: Never
+      restartPolicy: Never
+  activeDeadlineSeconds: 1800
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+apiVersion: batch/v1
+kind: Job
+metadata:
+  name: valid-job
+  namespace: default
+spec:
+  template:
+    spec:
+      containers:
+      - name: pi
+        image: perl:5.34.0
+        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
+      restartPolicy: Never
+  activeDeadlineSeconds: 3600
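Because the policy's pattern inspects `spec.activeDeadlineSeconds` on the Job itself, the indentation of that field matters: Kubernetes also accepts an `activeDeadlineSeconds` field on the pod spec (`spec.template.spec`), which is a separate setting. The manifest below is an illustrative sketch with a hypothetical name, showing the Job-level placement that a pattern on `spec.activeDeadlineSeconds` would match.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job-with-deadline   # hypothetical name, not part of the commit
  namespace: default
spec:
  # Job-level deadline: the field a pattern on spec.activeDeadlineSeconds inspects.
  activeDeadlineSeconds: 7200       # 2 hours, inside the 3600 to 86400 range
  template:
    spec:
      # An activeDeadlineSeconds placed at this level would be the pod-level field,
      # which a pattern on the Job's spec.activeDeadlineSeconds does not look at.
      containers:
      - name: pi
        image: perl:5.34.0
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
```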
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+apiVersion: chainsaw.kyverno.io/v1alpha1
+kind: Test
+metadata:
+  name: test-job-timeout-enforcer
+spec:
+  steps:
+  - name: 01-apply-policy
+    try:
+    - apiVersion: kyverno.io/v1
+      kind: ClusterPolicy
+      file: job-timeout-enforcer.yaml
+
+  - name: 02-test-valid-job
+    try:
+    - file: resources/valid-job.yaml
+
+  - name: 03-test-invalid-job
+    try:
+    - file: resources/invalid-job.yaml
+      expect:
+        violation:
+          count: 2
+          match:
+          - message: "Jobs must specify activeDeadlineSeconds between 3600 (1 hour) and 86400 (24 hours)"
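For comparison, Chainsaw's documented `Test` schema expresses each entry under `try` as a named operation such as `apply` or `assert`, and checks expected failures with `expect` and a `check` expression. The sketch below restates the three steps in that form. It is not the committed test: the relative path to the policy file is an assumption, and the final step only passes if the policy rejects resources at admission (`validationFailureAction: Enforce`). With `Audit`, the apply would succeed and the test would need to assert on the resulting policy report instead.

```yaml
apiVersion: chainsaw.kyverno.io/v1alpha1
kind: Test
metadata:
  name: test-job-timeout-enforcer
spec:
  steps:
  - name: 01-apply-policy
    try:
    - apply:
        file: ../job-timeout-enforcer.yaml   # assumed location of the ClusterPolicy
  - name: 02-test-valid-job
    try:
    - apply:
        file: resources/valid-job.yaml       # expected to be admitted
  - name: 03-test-invalid-job
    try:
    - apply:
        file: resources/invalid-job.yaml
        expect:
        - check:
            # Passes only when the apply operation returns an error,
            # i.e. when the policy blocks the Jobs at admission.
            ($error != null): true
```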
