Fix: Alarm for notification queue overpassing a given threshold (#4113) #4375
Anjali-NEC wants to merge 13 commits into
I have provided some extra comments that I hope may help. In addition, note that for a PR to be ready for merging, all the tests should pass (at the present moment, they don't).
@fgalan All test cases pass except
The massive failures in CI are due to the changes made to the docker CI image (see #4417 (comment)). Once PR #4417 is merged, this PR should be updated with master and the tests will pass again.
PR #4417 has been merged. @Anjali-NEC please update this PR's branch with master.
std::string details = ("notification queue reached maximum threshold");

long unsigned int threshold = queueSize(service)*notifAlarmThreshold/100;

if (threshold >= queueSize(service))
{
  alarmMgr.notificationQueue(queueName.c_str(), details.c_str());
}
}
Code can be simplified this way:
- std::string details = ("notification queue reached maximum threshold");
- long unsigned int threshold = queueSize(service)*notifAlarmThreshold/100;
- if (threshold >= queueSize(service))
- {
-   alarmMgr.notificationQueue(queueName.c_str(), details.c_str());
- }
- }
+ long unsigned int threshold = queueSize(service)*notifAlarmThreshold/100;
+ if (threshold >= queueSize(service))
+ {
+   alarmMgr.notificationQueue(queueName.c_str(), "notification queue reached maximum threshold");
+ }
+ }
Moreover, the detail message could provide information about the particular threshold for this case. For instance, if we have a queue of size 6 and the threshold is 50%, something like this:
notification queue reached maximum threshold (3)
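A minimal sketch of how such a detail message could be built. The helper names `alarmThreshold` and `alarmDetails` are hypothetical and not part of the PR; the sketch assumes the threshold is a percentage (0 to 100) of the queue size, following the `queueSize(service)*notifAlarmThreshold/100` expression in the reviewed code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: absolute threshold derived from the queue size and a
// percentage in the 0-100 range (integer division, as in the reviewed code).
std::size_t alarmThreshold(std::size_t queueSize, unsigned int thresholdPercent)
{
  return queueSize * thresholdPercent / 100;
}

// Hypothetical helper: alarm detail text that includes the computed threshold.
std::string alarmDetails(std::size_t queueSize, unsigned int thresholdPercent)
{
  return "notification queue reached maximum threshold (" +
         std::to_string(alarmThreshold(queueSize, thresholdPercent)) + ")";
}
```

With a queue of size 6 and a 50% threshold, `alarmDetails(6, 50)` yields the message shown above, ending in `(3)`.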
# VALGRIND_READY - to mark the test ready for valgrindTestSuite.sh

--NAME--
alarm for notification queue overpassing a given threshold
- alarm for notification queue overpassing a given threshold
+ alarm for notification queue overpassing a given threshold (relog variant)
Raising alarm NotificationQueue serv1: notification queue reached maximum threshold
Repeated NotificationQueue serv1: notification queue reached maximum threshold
Raising alarm NotificationQueue serv2: notification queue reached maximum threshold
Repeated NotificationQueue serv2: notification queue reached maximum threshold
Raising alarm NotificationQueue default: notification queue reached maximum threshold
Repeated NotificationQueue default: notification queue reached maximum threshold
Note this result doesn't follow expectations.
According to:
# 04. Create/update entity in serv1 5 times (update #3 raises alarm, update #4 and #5 cause repeated log)
# 05. Create/update entity in serv2 3 times (update #2 raises alarm, update #3 cause repeated log)
# 06. Create/update entity in serv3 (default) 7 times (update #4 raises alarm, updates #5, #6 and #7 cause repeated log)
We should see 2 repeated logs for serv1, 1 repeated log for serv2 (which is OK in the above output) and 3 repeated logs for the default queue. Something like this:
Raising alarm NotificationQueue serv1: notification queue reached maximum threshold
Repeated NotificationQueue serv1: notification queue reached maximum threshold
Repeated NotificationQueue serv1: notification queue reached maximum threshold
Raising alarm NotificationQueue serv2: notification queue reached maximum threshold
Repeated NotificationQueue serv2: notification queue reached maximum threshold
Raising alarm NotificationQueue default: notification queue reached maximum threshold
Repeated NotificationQueue default: notification queue reached maximum threshold
Repeated NotificationQueue default: notification queue reached maximum threshold
Repeated NotificationQueue default: notification queue reached maximum threshold
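The expected counts can be derived mechanically from the step comments above. The following is only a sketch of that reasoning, not the broker's implementation: `expectedAlarmLog` is a hypothetical function that models one "Raising alarm" line at the update that crosses the threshold and one "Repeated" line for every later update.

```cpp
#include <string>
#include <vector>

// Hypothetical model of the expected log: the update numbered raiseAtUpdate
// raises the alarm, and every later update produces a repeated log line.
std::vector<std::string> expectedAlarmLog(const std::string& queueName, int updates, int raiseAtUpdate)
{
  const std::string details = "notification queue reached maximum threshold";
  std::vector<std::string> lines;

  for (int update = 1; update <= updates; ++update)
  {
    if (update == raiseAtUpdate)
    {
      lines.push_back("Raising alarm NotificationQueue " + queueName + ": " + details);
    }
    else if (update > raiseAtUpdate)
    {
      lines.push_back("Repeated NotificationQueue " + queueName + ": " + details);
    }
  }

  return lines;
}
```

Calling it with the values from steps 04 to 06, that is `("serv1", 5, 3)`, `("serv2", 3, 2)` and `("default", 7, 4)`, gives 3, 2 and 4 lines respectively, matching the expected output listed above.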
In addition to the current two .test files (which are nice :), I'd suggest adding one to check the release of the new alarm. Something like this:
The endpoint that responds in 10 seconds is
As a result of step 08, we should see some releasing alarm messages.
After the merging of PR #4332, this PR needs to be updated with
Fix issue #4113