Cluster Health Analyzer

An analyzer for OpenShift cluster health data.

Overview

The Cluster Health Analyzer processes the incoming stream of health signals from an OpenShift cluster and enriches them to give clearer views of the data and a better troubleshooting experience.

It provides:

Incident detection: heuristics that group individual alerts together, making it easier to reason about the root cause of an issue.
Component mapping and ranking: an opinionated way to assign alerts to high-level components and rank them by the importance of each component to overall cluster health.

Install

Log in to a cluster using the oc login command, then apply the backend manifests:

oc apply -f manifests/backend

Usage

The Cluster Health Analyzer is a backend that exposes the results via Prometheus metrics:

# Mapping of source signal to components and incident groups.
cluster:health:components:map
{
  # Identifier of the source signal type
  type="alert",

  # Matchers against the source labels
  src_alertname="KubeNodeNotReady",
  src_namespace="openshift-monitoring",
  src_severity="warning",

  # Identifier of the mapped component.
  layer="compute",
  component="compute",

  # Incident group id
  group_id="b8d9df3f-8245-4f5a-825d-15578a6c8397",

# Value represents the impact on the component severity
} -> 1

# Metadata about the components in the system
cluster:health:components
{
  # Identifier of the component
  component="compute", layer="compute"

# The value represents the ranking of the component: the lower the number,
# the higher the importance of the component.
} -> 1
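
As an illustration of how the two metrics fit together, the following Go sketch groups cluster:health:components:map samples by group_id and picks, for each incident, the affected component with the lowest ranking value. It is a minimal example under stated assumptions, not code from this project: the MapSample and Incident types, the Summarize function, and the second alert in the sample data are hypothetical, and fetching the samples from Prometheus is left out.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// MapSample is a hypothetical in-memory view of one series of
// cluster:health:components:map, keeping only the labels used below.
type MapSample struct {
	AlertName string  // src_alertname label
	Component string  // component label
	GroupID   string  // group_id label
	Impact    float64 // sample value: impact on the component severity
}

// Incident is a hypothetical summary of all samples sharing a group_id.
type Incident struct {
	GroupID   string
	Alerts    []string
	Component string  // most important affected component
	Rank      float64 // ranking value of that component
	MaxImpact float64 // highest impact value seen in the group
}

// rankOf returns the ranking of a component, or +Inf when it is unknown,
// so that unknown components always sort last.
func rankOf(ranks map[string]float64, component string) float64 {
	if r, ok := ranks[component]; ok {
		return r
	}
	return math.Inf(1)
}

// Summarize groups samples by group_id. ranks maps a component name to the
// value of cluster:health:components (lower value means more important).
func Summarize(samples []MapSample, ranks map[string]float64) []Incident {
	byGroup := map[string]*Incident{}
	for _, s := range samples {
		r := rankOf(ranks, s.Component)
		inc, ok := byGroup[s.GroupID]
		if !ok {
			inc = &Incident{GroupID: s.GroupID, Component: s.Component, Rank: r}
			byGroup[s.GroupID] = inc
		}
		inc.Alerts = append(inc.Alerts, s.AlertName)
		if r < inc.Rank {
			inc.Component, inc.Rank = s.Component, r
		}
		if s.Impact > inc.MaxImpact {
			inc.MaxImpact = s.Impact
		}
	}

	// Order incidents by component importance, then by group id for stability.
	out := make([]Incident, 0, len(byGroup))
	for _, inc := range byGroup {
		out = append(out, *inc)
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Rank != out[j].Rank {
			return out[i].Rank < out[j].Rank
		}
		return out[i].GroupID < out[j].GroupID
	})
	return out
}

func main() {
	group := "b8d9df3f-8245-4f5a-825d-15578a6c8397"
	samples := []MapSample{
		{AlertName: "KubeNodeNotReady", Component: "compute", GroupID: group, Impact: 1},
		// Hypothetical second alert in the same incident group.
		{AlertName: "ExampleNetworkAlert", Component: "network", GroupID: group, Impact: 1},
	}
	// Hypothetical ranking for "network"; "compute" uses the value shown above.
	ranks := map[string]float64{"compute": 1, "network": 5}

	for _, inc := range Summarize(samples, ranks) {
		fmt.Printf("incident %s: component=%s alerts=%v\n", inc.GroupID, inc.Component, inc.Alerts)
	}
}
```

Treating an unknown component as ranked last is a design choice of this sketch; a consumer may prefer to report such samples separately.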

Development and testing

If you want to contribute to the project, head over to development.md.
