Open
Labels
Infrastructure (Infrastructure as Code, Orchestrator, Network), Observability (Logs, metrics, traces, alerts, telemetry)
Description
Consider setting up a monitor that periodically makes a simple request to the public cluster (for example, every minute or every 5 minutes). If the request fails, a text message or email is sent to the sliderule developers.
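A minimal sketch of what such a monitor could look like, to make the proposal concrete. It is not an existing sliderule component: the names `probe` and `monitor`, the `failures_before_alert` threshold, and the `notify` callback (which would wrap whatever email or SMS service is chosen) are all assumptions made for illustration.

```python
import http.client
import time
import urllib.request
from typing import Callable, Optional


def probe(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False


def monitor(
    check: Callable[[], bool],
    notify: Callable[[str], None],
    interval_s: float = 60.0,
    failures_before_alert: int = 2,   # hypothetical default, tune as needed
    max_checks: Optional[int] = None,  # None means run forever
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run `check` every `interval_s` seconds and return the number of alerts sent.

    An alert fires once when the consecutive failure count reaches
    `failures_before_alert`, and cannot fire again until a check succeeds.
    """
    consecutive_failures = 0
    alerts_sent = 0
    checks_done = 0
    while max_checks is None or checks_done < max_checks:
        if check():
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures == failures_before_alert:
                notify(f"public cluster failed {consecutive_failures} checks in a row")
                alerts_sent += 1
        checks_done += 1
        if max_checks is None or checks_done < max_checks:
            sleep(interval_s)
    return alerts_sent
```

Usage would be along the lines of `monitor(lambda: probe(CLUSTER_URL), send_email)`, where `CLUSTER_URL` and `send_email` are placeholders to be supplied. Requiring more than one consecutive failure before alerting is a design choice meant to avoid paging developers on a single transient network error.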
This could also be combined with (or implemented as) the monitoring functionality in Grafana. We have metrics on the number of container restarts, nodes registered, and discovery failures. Could we put together some heuristics on those metrics that determine when to alert the developers?
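As a starting point for discussion, the heuristics might compare two successive readings of those three metrics. The sketch below is only illustrative: the `Snapshot` type, the `alert_reasons` function, and every threshold value are assumptions, and in practice the same conditions could be expressed as Grafana alert rules rather than in Python.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Snapshot:
    """One reading of the cluster metrics (field names are hypothetical)."""
    container_restarts: int   # assumed to be a cumulative counter
    nodes_registered: int     # assumed to be a current-value gauge
    discovery_failures: int   # assumed to be a cumulative counter


def alert_reasons(
    previous: Snapshot,
    current: Snapshot,
    max_new_restarts: int = 3,            # hypothetical threshold
    min_nodes: int = 1,                   # hypothetical threshold
    max_new_discovery_failures: int = 5,  # hypothetical threshold
) -> List[str]:
    """Return human-readable reasons to alert; an empty list means all clear."""
    reasons = []

    # Counters can reset (e.g. after a redeploy), so clamp deltas at zero.
    new_restarts = max(0, current.container_restarts - previous.container_restarts)
    if new_restarts > max_new_restarts:
        reasons.append(f"{new_restarts} container restarts since last check")

    if current.nodes_registered < min_nodes:
        reasons.append(f"only {current.nodes_registered} nodes registered")

    new_failures = max(0, current.discovery_failures - previous.discovery_failures)
    if new_failures > max_new_discovery_failures:
        reasons.append(f"{new_failures} discovery failures since last check")

    return reasons
```

Returning a list of reasons rather than a single boolean lets the notification say which condition tripped, which should make the text message or email more actionable.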