Commit ef1ed82

Add an alert for high heap memory usage (#503)
* Add an alert for high heap memory usage

- Fix application filtering for existing queries

* Reduce the threshold for "High number of errors"

Based on observation, it is unlikely the threshold of 10 would be reached, even under serious problems.
1 parent a127515 commit ef1ed82

1 file changed, +16 -3 lines changed

.nais/test/klass-api-alerts.yaml

Lines changed: 16 additions & 3 deletions
@@ -9,8 +9,21 @@ spec:
   groups:
     - name: dapla-metadata
       rules:
+        - alert: High heap memory usage
+          expr: (100 * sum by (instance) (jvm_memory_used_bytes{application="klass", area="heap"})) / (sum by (instance) (jvm_memory_max_bytes{application="klass", area="heap"}) ) > 70.0
+          for: 3m
+          annotations:
+            title: "High heap memory usage"
+            consequence: "If this increase continues then the app could run out of memory and either lock or crash."
+            action: "Immediate: Restart the app from the Nais console\nShort term: Investigate the cause of high heap usage and either fix the bug or "
+          labels:
+            service: klass-api
+            namespace: dapla-metadata
+            severity: critical
+            environment: test
+
         - alert: High number of errors
-          expr: (100 * sum by (app, namespace) (rate(logback_events_total{app="klass-api",level="error"}[3m])) / sum by (app, namespace) (rate(logback_events_total{app="klass-api"}[3m]))) > 10
+          expr: (100 * sum by (app, namespace) (rate(logback_events_total{application="klass",level="error"}[3m])) / sum by (app, namespace) (rate(logback_events_total{application="klass"}[3m]))) > 1
           for: 3m
           annotations:
             title: "High number of errors logged"
@@ -23,7 +36,7 @@ spec:
             environment: test
 
         - alert: A Klass-api client is unavailable
-          expr: rate(http_client_requests_seconds_count{app="klass-api", status!="200"}[1m]) > 0
+          expr: rate(http_client_requests_seconds_count{application="klass", status!="200"}[1m]) > 0
           for: 1m
           annotations:
             title: "A Klass-api client is unavailable "
@@ -45,4 +58,4 @@ spec:
             service: klass-api
             namespace: dapla-metadata
             severity: critical
-            environment: test
+            environment: test
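
The arithmetic behind the new "High heap memory usage" expression can be sketched in Python as a rough illustration. This is not how Prometheus evaluates the rule, and it ignores the `for: 3m` hold period. The function names, the `(instance, pool, bytes)` sample tuples, and the pod names are all hypothetical; only the per-instance sum, the percentage, and the 70.0 threshold come from the diff.

```python
from collections import defaultdict

# Threshold taken from the alert expression (`> 70.0`).
HEAP_THRESHOLD_PERCENT = 70.0


def heap_usage_percent(used_samples, max_samples):
    """Return {instance: percent of max heap currently used}.

    Each sample is a hypothetical (instance, pool, bytes) tuple standing in
    for one heap series of jvm_memory_used_bytes / jvm_memory_max_bytes.
    Values are summed per instance, mirroring `sum by (instance)`.
    """
    used = defaultdict(float)
    maximum = defaultdict(float)
    for instance, _pool, value in used_samples:
        used[instance] += value
    for instance, _pool, value in max_samples:
        maximum[instance] += value
    # Only instances present on both sides produce a result, like PromQL
    # label matching. Unlike PromQL, this sketch simply skips a zero or
    # negative maximum instead of returning Inf/NaN.
    return {
        instance: 100.0 * used[instance] / maximum[instance]
        for instance in used
        if maximum.get(instance, 0) > 0
    }


def instances_over_threshold(percentages, threshold=HEAP_THRESHOLD_PERCENT):
    """Return the sorted instances whose heap usage exceeds the threshold."""
    return sorted(i for i, pct in percentages.items() if pct > threshold)


# Made-up sample data for two instances with two heap pools each.
used = [("pod-a", "eden", 300), ("pod-a", "old", 500),
        ("pod-b", "eden", 100), ("pod-b", "old", 200)]
maxes = [("pod-a", "eden", 400), ("pod-a", "old", 600),
         ("pod-b", "eden", 400), ("pod-b", "old", 600)]

percentages = heap_usage_percent(used, maxes)
print(percentages)                            # {'pod-a': 80.0, 'pod-b': 30.0}
print(instances_over_threshold(percentages))  # ['pod-a']
```

In this made-up data, `pod-a` uses 800 of 1000 bytes (80%) and would satisfy the expression, while `pod-b` at 30% would not. In the real rule the condition must also hold continuously for three minutes before the alert fires.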
