Commit 9c69b72

anishasthana authored and Chad Roberts committed
Update Metrics endpoints for ODH operator (#349)
* Fix ODH and Argo monitoring
  Signed-off-by: Anish Asthana <anishasthana1@gmail.com>
* Increase replica count to 2 for HA
  Signed-off-by: Anish Asthana <anishasthana1@gmail.com>
* Update Prometheus name and corresponding test
  Signed-off-by: Anish Asthana <anishasthana1@gmail.com>
* Restructure Service Monitors
  This separates the ODH operator and ODH application monitoring into two separate Service Monitors.
  Signed-off-by: Anish Asthana <anishasthana1@gmail.com>
1 parent dd65b6a commit 9c69b72

9 files changed

Lines changed: 85 additions & 9 deletions

prometheus/operator/base/kustomization.yaml

Lines changed: 16 additions & 1 deletion
@@ -4,10 +4,25 @@ resources:
   - kafka-podmonitors.yaml
   - prometheus.yaml
   - route.yaml
-  - servicemonitor.yaml
+  - service-monitors
+  - prometheus-monitoring-role.yaml
+  - prometheus-monitoring-role-binding.yaml
+
 namespace: opendatahub
 commonLabels:
   opendatahub.io/component: "true"
   component.opendatahub.io/name: prometheus
 generatorOptions:
   disableNameSuffixHash: true
+
+vars:
+  - name: namespace
+    objref:
+      kind: Prometheus
+      name: odh-monitoring
+      apiVersion: monitoring.coreos.com/v1
+    fieldref:
+      fieldpath: metadata.namespace
+
+configurations:
+  - params.yaml
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+varReference:
+  - path: subjects/namespace
+    kind: ClusterRoleBinding
+    apiVersion: rbac.authorization.k8s.io/v1
Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
---
2+
kind: ClusterRoleBinding
3+
apiVersion: rbac.authorization.k8s.io/v1
4+
metadata:
5+
name: odh-prometheus-monitoring-rb
6+
subjects:
7+
- kind: ServiceAccount
8+
name: prometheus-k8s
9+
namespace: $(namespace)
10+
roleRef:
11+
apiGroup: rbac.authorization.k8s.io
12+
kind: ClusterRole
13+
name: odh-prometheus-monitoring
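
The `$(namespace)` placeholder in the ClusterRoleBinding above is a kustomize var: the `vars` entry reads `metadata.namespace` from the `odh-monitoring` Prometheus object, and `params.yaml` tells kustomize to substitute it at `subjects/namespace`. The sketch below only emulates that substitution with `sed` so the effect is easy to see; it is not how kustomize implements it. The helper name `resolve_namespace_var` is hypothetical, and the value `opendatahub` is an assumption based on the `namespace:` field in the kustomization.

```shell
#!/usr/bin/env bash

# Hypothetical helper: replace the literal $(namespace) placeholder read
# from stdin with the namespace passed as the first argument.
resolve_namespace_var() {
    local ns="$1"
    # \$ matches a literal dollar sign; ( and ) are literal in basic regex.
    sed 's/\$(namespace)/'"$ns"'/g'
}

# Subject block as it appears in the ClusterRoleBinding before substitution.
binding_subject='  - kind: ServiceAccount
    name: prometheus-k8s
    namespace: $(namespace)'

# Assumption: the var resolves to the kustomization namespace, opendatahub.
printf '%s\n' "$binding_subject" | resolve_namespace_var opendatahub
```

To see what kustomize itself produces, rendering the base with `kustomize build` and inspecting the ClusterRoleBinding subject is the authoritative check.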
Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+---
+kind: ClusterRole
+apiVersion: rbac.authorization.k8s.io/v1
+metadata:
+  name: odh-prometheus-monitoring
+  namespace: opendatahub
+rules:
+  - verbs:
+      - get
+      - list
+      - watch
+    apiGroups:
+      - ''
+    resources:
+      - services
+      - endpoints
+      - pods
+  - verbs:
+      - get
+    apiGroups:
+      - ''
+    resources:
+      - configmaps

prometheus/operator/base/prometheus.yaml

Lines changed: 3 additions & 3 deletions
@@ -1,12 +1,12 @@
 apiVersion: monitoring.coreos.com/v1
 kind: Prometheus
 metadata:
-  name: prometheus
+  name: odh-monitoring
   labels:
-    prometheus: k8s
+    app: odh-monitoring
   namespace: prometheus
 spec:
-  replicas: 1
+  replicas: 2
   serviceAccountName: prometheus-k8s
   securityContext: {}
   serviceMonitorSelector:

prometheus/operator/base/servicemonitor.yaml renamed to prometheus/operator/base/service-monitors/application-service-monitor.yaml

Lines changed: 2 additions & 2 deletions
@@ -3,10 +3,10 @@ kind: ServiceMonitor
 metadata:
   labels:
     team: opendatahub
-  name: odhservicemonitor
+  name: odh-application-servicemonitor
 spec:
   endpoints:
-    - port: web # odh-operator, Argo
+    - port: metrics # Argo
     - bearerTokenSecret:
         key: PROMETHEUS_API_TOKEN
         name: jupyterhub
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+apiVersion: kustomize.config.k8s.io/v1beta1
+kind: Kustomization
+resources:
+  - application-service-monitor.yaml
+  - operator-service-monitor.yaml
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+apiVersion: monitoring.coreos.com/v1
+kind: ServiceMonitor
+metadata:
+  labels:
+    team: opendatahub
+  name: odh-operator-servicemonitor
+spec:
+  endpoints:
+    - port: http-metrics # Open Data Hub Operator
+    - port: cr-metrics # Open Data Hub Operator
+  selector:
+    matchLabels:
+      name: opendatahub-operator
+  namespaceSelector:
+    matchNames:
+      - openshift-operators

tests/basictests/prometheus.sh

Lines changed: 3 additions & 3 deletions
@@ -20,9 +20,9 @@ function test_prometheus() {
     os::cmd::try_until_text "oc get pods -l k8s-app=prometheus-operator --field-selector='status.phase=Running' -o jsonpath='{$.items[*].metadata.name}'" "prometheus-operator" $odhdefaulttimeout $odhdefaultinterval
     runningbuspods=($(oc get pods -l k8s-app=prometheus-operator --field-selector="status.phase=Running" -o jsonpath="{$.items[*].metadata.name}"))
     os::cmd::expect_success_and_text "echo ${#runningbuspods[@]}" "1"
-    os::cmd::try_until_text "oc get pods -l app=prometheus --field-selector='status.phase=Running' -o jsonpath='{$.items[*].metadata.name}'" "prometheus-prometheus" $odhdefaulttimeout $odhdefaultinterval
-    runningbuspods=($(oc get pods -l app=prometheus --field-selector="status.phase=Running" -o jsonpath="{$.items[*].metadata.name}"))
-    os::cmd::expect_success_and_text "echo ${#runningbuspods[@]}" "1"
+    os::cmd::try_until_text "oc get pods -l prometheus=odh-monitoring --field-selector='status.phase=Running' -o jsonpath='{$.items[*].metadata.name}'" "prometheus-odh-monitoring" $odhdefaulttimeout $odhdefaultinterval
+    runningbuspods=($(oc get pods -l prometheus=odh-monitoring --field-selector="status.phase=Running" -o jsonpath="{$.items[*].metadata.name}"))
+    os::cmd::expect_success_and_text "echo ${#runningbuspods[@]}" "2"
     test_promportal
 }
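
The updated test expects "2" because it counts the names returned by the jsonpath query, and the Prometheus resource now requests two replicas. A minimal sketch of that counting idiom, runnable without a cluster: `count_names` is a hypothetical helper, and the sample pod names are stand-ins for what `oc get pods -l prometheus=odh-monitoring` might return, not captured output.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print how many whitespace-separated names are in $1,
# mirroring the ${#runningbuspods[@]} check in test_prometheus.
count_names() {
    local -a names
    # read -a splits the input on whitespace into an array.
    read -r -a names <<< "$1"
    echo "${#names[@]}"
}

# Assumed sample: one name per replica of the odh-monitoring Prometheus.
sample="prometheus-odh-monitoring-0 prometheus-odh-monitoring-1"

count_names "$sample"
```

With `replicas: 2`, a healthy deployment should yield two names; an empty query result yields a count of 0, which is why the test first waits with `os::cmd::try_until_text` before counting.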
