Health Check for Prometheus #1965

Closed
@patrickbrophy

Description

Following the recent Prometheus outage on the OSDF director, implementing a health check for the Prometheus server would be highly beneficial. Querying `process_start_time_seconds` should provide a simple way to verify that the server is responding with valid data.
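A minimal sketch of such a check, assuming the server exposes the standard Prometheus HTTP API at `/api/v1/query`. The function names, the `base_url` parameter, and the "finite and positive" validity rule are illustrative assumptions, not Pelican's implementation, and any authentication the endpoint may require is omitted:

```python
import json
import math
import urllib.parse
import urllib.request


def is_valid_start_time(payload) -> bool:
    """Return True if an instant-query response carries a usable start time.

    Hypothetical helper: "usable" here means at least one sample whose
    value parses as a finite, positive number.
    """
    if not isinstance(payload, dict) or payload.get("status") != "success":
        return False
    data = payload.get("data") or {}
    for sample in data.get("result") or []:
        try:
            # Instant-vector samples look like [unix_timestamp, "value"].
            value = float(sample["value"][1])
        except (KeyError, IndexError, TypeError, ValueError):
            continue
        if math.isfinite(value) and value > 0:
            return True
    return False


def prometheus_is_healthy(base_url: str, timeout: float = 5.0) -> bool:
    """Query process_start_time_seconds and report whether the reply is valid."""
    query = urllib.parse.urlencode({"query": "process_start_time_seconds"})
    url = f"{base_url.rstrip('/')}/api/v1/query?{query}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            payload = json.load(resp)
    except (OSError, ValueError):
        # Connection failures, HTTP errors, timeouts, and malformed JSON
        # all count as unhealthy.
        return False
    return is_valid_start_time(payload)
```

Keeping the response validation separate from the HTTP call lets the parsing logic be tested without a running server.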

Currently, the `pelican_component_health_status` metric tracks the health of various Pelican services using a `component` label. Adding `prometheus` as a value of that label would be the most effective way to integrate Prometheus health monitoring.
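As a rough illustration of what the extra label value could look like once exposed, the sketch below records a status for `component="prometheus"` and renders it in the Prometheus text exposition format. The numeric status encoding, the helper names, and the `director` component used in the usage example are hypothetical; the values Pelican actually assigns may differ:

```python
# Hypothetical numeric encoding; Pelican's real status values may differ.
STATUS_OK = 1
STATUS_CRITICAL = 0


def record_prometheus_health(statuses: dict, healthy: bool) -> None:
    """Store the probe result under the proposed `prometheus` component."""
    statuses["prometheus"] = STATUS_OK if healthy else STATUS_CRITICAL


def render_health_samples(statuses: dict) -> str:
    """Render component statuses in Prometheus text exposition format.

    Assumes component names are plain identifiers that need no escaping.
    """
    lines = ["# TYPE pelican_component_health_status gauge"]
    for component in sorted(statuses):
        lines.append(
            f'pelican_component_health_status{{component="{component}"}} '
            f"{statuses[component]}"
        )
    return "\n".join(lines) + "\n"


# Usage: an existing component plus the new Prometheus entry.
statuses = {"director": STATUS_OK}
record_prometheus_health(statuses, healthy=False)
print(render_health_samples(statuses), end="")
```

Because the new entry reuses the existing metric and label, dashboards and alerts that already select on `component` would pick it up with only a new label value.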
