-
It might be cheaper and more robust if we put "last_seen" directly on Source: on every received event (or heartbeat, if that is not an event) we also update "last_seen" with the timestamp of the event. Such a field is useful and cheap to show in columns even if the source does not support heartbeats, but we'd need a row lock (select for update) when writing it. For historical trends we could log every update of the field (at INFO level). We could have a table HeartbeatConfiguration to control how long we wait before raising a "missing incident" event, but should we hang it on the source system type, the source, or both?
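A minimal sketch of the "last_seen on Source" idea, written as plain Python rather than as a Django model so it runs on its own. The `Source` dataclass, the `record_activity` helper and the logger name are hypothetical stand-ins, not existing Argus code. In the real model the same update would presumably happen inside a transaction that locks the row.

```python
import logging
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

logger = logging.getLogger("heartbeat_sketch")  # hypothetical logger name


@dataclass
class Source:
    """Stand-in for the real Source model; only the fields needed here."""
    name: str
    last_seen: Optional[datetime] = None


def record_activity(source: Source, timestamp: datetime) -> bool:
    """Move last_seen forward to `timestamp`.

    Returns True if the field changed. Older or equal timestamps are
    ignored so that out-of-order events cannot move last_seen backwards.
    """
    if source.last_seen is not None and timestamp <= source.last_seen:
        return False
    # Logging every update at INFO gives a cheap history for spotting trends.
    logger.info("source=%s last_seen=%s", source.name, timestamp.isoformat())
    source.last_seen = timestamp
    return True


# Usage: an event arrives, then a delayed older event arrives.
example = Source(name="example-glue-service")
record_activity(example, datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc))
record_activity(example, datetime(2024, 1, 1, 11, 0, tzinfo=timezone.utc))
```

Ignoring older timestamps is an assumption of this sketch. If events can legitimately arrive late, the alternative is to store the time of receipt instead of the event's own timestamp.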
-
API-wise, the heartbeat setup should be configurable via the API so the glue service can configure it itself. It should be viewable and editable in the admin, and maybe viewable in the frontend.
-
Isn't this just a duplicate of #1160?
-
Absolute minimum change:
Bonus
Next step:
Mark the source system type to show whether the glue service produces explicit heartbeats. Two global settings: one to toggle heartbeat support on or off, and a global fallback period for checking heartbeats for source system types that support it. A background task/cron job checks for heartbeats for enabled types and creates an incident if too much time has passed since the last one.
OPEN QUESTIONS:
Last step?
The source system type is marked with how frequently to check for heartbeats. An API for the glue service to configure heartbeats. In the docs, mark which glue services produce heartbeats.
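To make the interplay between the global toggle, the global fallback period and a per-type period concrete, here is a small sketch. The setting names `HEARTBEAT_ENABLED` and `HEARTBEAT_FALLBACK_PERIOD`, the 15-minute value and the `effective_period` function are all hypothetical. They only illustrate one possible precedence order.

```python
from datetime import timedelta
from typing import Optional

# Hypothetical global settings; names and values are placeholders.
HEARTBEAT_ENABLED = True
HEARTBEAT_FALLBACK_PERIOD = timedelta(minutes=15)


def effective_period(
    type_supports_heartbeat: bool,
    type_period: Optional[timedelta] = None,
    *,
    enabled: bool = HEARTBEAT_ENABLED,
    fallback: timedelta = HEARTBEAT_FALLBACK_PERIOD,
) -> Optional[timedelta]:
    """Return the period to check with, or None if no check should run.

    Precedence assumed here: the global toggle wins, then the per-type
    marker, then a per-type period if one is set, else the global fallback.
    """
    if not enabled or not type_supports_heartbeat:
        return None
    if type_period is not None:
        return type_period
    return fallback
```

With this shape, the "next step" only needs the boolean marker and the two global settings, and the "last step" adds the per-type period without changing the callers.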
-
There is currently no way for Argus to detect that a source is not capable of sending incidents.
See #1160 for the META-issue; whatever we decide on should be turned into sub-issues of that one.
We could have a model
and alter SourceSystemType:
We would need a background worker that would, for all `SourceSystemType`s with `heartbeat` set to True, check (at an interval set in settings? cron?), look up `Heartbeat`s with that source, check the `last_seen` and raise an incident if it's been too long. We could even have heartbeat be a one-to-one key to SourceSystem and UPDATE instead of INSERT, but using a foreign key makes it possible to spot trends.
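The check the background worker would perform can be sketched independently of how it is scheduled. The code below is plain Python with hypothetical names (`SourceStatus`, `find_silent_sources`), not the models from this proposal. It also assumes that a source that has never been seen counts as silent, which is a choice the thread has not settled.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class SourceStatus:
    """Flattened view of a source and the heartbeat flag of its type."""
    name: str
    type_has_heartbeat: bool
    last_seen: Optional[datetime]


def find_silent_sources(
    sources: Iterable[SourceStatus],
    now: datetime,
    max_silence: timedelta,
) -> List[str]:
    """Return names of heartbeat-enabled sources that have been quiet too long."""
    silent = []
    for source in sources:
        if not source.type_has_heartbeat:
            continue  # types without heartbeats are never checked
        # Assumption: a source with no recorded heartbeat is treated as silent.
        if source.last_seen is None or now - source.last_seen > max_silence:
            silent.append(source.name)
    return silent


# Usage: one healthy source, one stale source, one type without heartbeats.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
statuses = [
    SourceStatus("healthy", True, now - timedelta(minutes=2)),
    SourceStatus("stale", True, now - timedelta(hours=3)),
    SourceStatus("no-heartbeat-type", False, None),
]
silent_names = find_silent_sources(statuses, now, timedelta(minutes=30))
```

The worker would then raise one incident per returned name. Keeping the check a pure function of `now` makes it easy to test regardless of whether it ends up in a cron job or a task queue.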