Add Monitors in Datadog for PagerDuty #10735

Open
@nfrgosselin

Description

Acceptance Criteria
Add Monitors for the following (this list is WIP):

  • High number of HTTP 500 responses

  • High number of instances (indicates autoscaling is working but consuming too many resources)

  • High latency on page load (indicates overall site performance degradation)

  • High number of jobs enqueued in Redis (indicates Celery workers aren't keeping up with demand)

  • Synthetic pageload tests failing (canary uptime test)

  • Ensure Read Replica DBs are also monitored

  • High memory usage on DBs

  • High CPU on DBs (DB load is a known bottleneck; we may need to find and kill long-running queries)

  • High latency on DB queries (indicates inefficient queries or high DB load)

  • High CPU on cluster (indicates autoscaling is lagging behind demand)
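
As a starting point for the monitors listed above, here is a minimal sketch of how one of them ("High CPU on DBs") could be expressed as a request body for Datadog's create-monitor endpoint (`POST /api/v1/monitor`). The metric name `example.db.cpu_percent`, the thresholds, the tag, and the PagerDuty handle `@pagerduty-example-service` are all placeholders, since this issue does not specify them. The helper `build_metric_monitor` is hypothetical and only builds the payload; it does not call the API.

```python
import json

# Hypothetical PagerDuty handle. Datadog routes a notification to PagerDuty
# when the monitor message mentions "@pagerduty-<service name>"; the real
# service name is not given in this issue.
PAGERDUTY_HANDLE = "@pagerduty-example-service"


def build_metric_monitor(name, metric_query, critical, warning=None, tags=None):
    """Build a request body for a Datadog metric monitor.

    metric_query is the part of the query before the comparison, for example
    "avg(last_10m):avg:some.metric{*}". The critical threshold is appended to
    the query so it always matches options.thresholds.critical.
    """
    thresholds = {"critical": critical}
    if warning is not None:
        thresholds["warning"] = warning
    return {
        "name": name,
        "type": "metric alert",
        "query": f"{metric_query} > {critical}",
        "message": f"{name} triggered. {PAGERDUTY_HANDLE}",
        "tags": tags or [],
        "options": {"thresholds": thresholds},
    }


# Placeholder metric, window, and thresholds: assumptions for illustration only.
db_cpu_monitor = build_metric_monitor(
    name="High CPU on DBs",
    metric_query="avg(last_10m):avg:example.db.cpu_percent{*}",
    critical=90,
    warning=80,
    tags=["team:example"],
)

print(json.dumps(db_cpu_monitor, indent=2))
```

The same helper could be reused for the other metric-based items (instance count, Redis queue depth, cluster CPU) by swapping the query and thresholds. Synthetic page-load tests are configured through Datadog Synthetics rather than a metric monitor, so they would need a separate definition.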

XD Links:

Tech Details:

Open Questions:

Notes/Assumptions:
