Prometheus Monitoring Mixin for Kubernetes Autoscaling

A set of Grafana dashboards and Prometheus alerts for Kubernetes autoscaling, built on metrics from kube-state-metrics, Karpenter, and Cluster Autoscaler.

This serves as an extension to the kubernetes-mixin and adds monitoring for components that aren't deployed by default in a Kubernetes cluster (VPA, Karpenter, Cluster Autoscaler).

Dashboards

The mixin provides the following dashboards:

- Kubernetes Autoscaling
  - Pod Disruption Budgets
  - Horizontal Pod Autoscalers
  - Vertical Pod Autoscalers
- Cluster Autoscaler
- Karpenter
  - Overview
  - Activity
  - Performance
- KEDA
  - Scaled Objects
  - Scaled Jobs

Generated dashboards also exist in the ./dashboards_out directory.

Alerts are currently created for the following components:

- Karpenter
- KEDA
- Cluster Autoscaler

VPA, Karpenter, KEDA, and Cluster Autoscaler are configurable in the config.libsonnet file. Each can be turned off by setting its enabled field to false.

How to use

This mixin is designed to be vendored into the repo with your infrastructure config. To do this, use jsonnet-bundler:
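
A minimal sketch of that step, assuming the repository path matches the clone URL used under Generate config files and that jb is already on your PATH. The MIXIN_REPO variable is only an illustrative name, not something the mixin defines:

```shell
# Repository path inferred from the clone URL in this README (an assumption).
MIXIN_REPO="github.com/adinhodovic/kubernetes-autoscaling-mixin"

if command -v jb >/dev/null 2>&1; then
  # jb needs a jsonnetfile.json; create one if the repo has none yet.
  [ -f jsonnetfile.json ] || jb init
  # Download the mixin and its dependencies into ./vendor.
  jb install "$MIXIN_REPO"
else
  echo "jb not found; install jsonnet-bundler first" >&2
fi
```

Run this from the root of your infrastructure repo so that the vendor directory and jsonnetfile.json land next to your existing jsonnet code.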

You then have three options for deploying your dashboards:

- Generate the config files and deploy them yourself
- Use jsonnet to deploy this mixin along with Prometheus and Grafana
- Use prometheus-operator to deploy this mixin

Alternatively, import the dashboard JSON files from ./dashboards_out, or import the dashboards from the Grafana.com dashboard page.

Generate config files

You can manually generate the alerts, dashboards, and rules files, but first you must install some tools:

go install github.com/jsonnet-bundler/jsonnet-bundler/cmd/jb@latest
brew install jsonnet

Then, grab the mixin and its dependencies:

git clone https://github.com/adinhodovic/kubernetes-autoscaling-mixin
cd kubernetes-autoscaling-mixin
jb install

Finally, build the mixin:

make prometheus_alerts.yaml
make dashboards_out

The prometheus_alerts.yaml file then needs to be passed to your Prometheus server, and the files in dashboards_out need to be imported into your Grafana server. The exact details depend on how you deploy your monitoring stack.
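
Before loading the generated rules, it can help to validate them. The sketch below assumes promtool (shipped with Prometheus) is on your PATH; check_alerts is a hypothetical helper name, not part of this repository:

```shell
# Validate a generated rules file before handing it to Prometheus.
# Defaults to prometheus_alerts.yaml in the current directory.
check_alerts() {
  file="${1:-prometheus_alerts.yaml}"
  if [ ! -f "$file" ]; then
    echo "missing $file; run 'make prometheus_alerts.yaml' first" >&2
    return 1
  fi
  # promtool exits non-zero if the rules file does not parse.
  promtool check rules "$file"
}
```

Call it as `check_alerts prometheus_alerts.yaml` after the build step. A non-zero exit status means the file is missing or failed validation.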

Configuration

This mixin has its configuration in the config.libsonnet file. You can turn off the alerts for VPA, Karpenter, KEDA, and Cluster Autoscaler by setting the enabled field to false.

{
  _config+:: {
    vpa+: {
      enabled: false,
    },
    keda+: {
      enabled: false,
    },
    karpenter+: {
      enabled: false,
    },
    clusterAutoscaler+: {
      enabled: false,
    },
  },
}

The mixin has all components enabled by default and all the dashboards are generated in the dashboards_out directory. You can import them into Grafana.

VPA Requirements

Kube-state-metrics doesn't ship with VPA metrics by default. You need to deploy a custom kube-state-metrics with the following configuration:

Adjust the ClusterRole kube-state-metrics to include the following rules:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/part-of: kube-prometheus
  name: kube-state-metrics
rules:
    # ... other rules
    - apiGroups:
      - autoscaling.k8s.io
      resources:
      - verticalpodautoscalers
      verbs:
      - list
      - watch
    - apiGroups:
      - apiextensions.k8s.io
      resources:
      - customresourcedefinitions
      verbs:
      - list
      - watch
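
To confirm the new rules took effect, you can ask the API server what the kube-state-metrics service account is allowed to do. This is a sketch under assumptions: the service account is taken to be named kube-state-metrics in the monitoring namespace, and the variable and function names are illustrative:

```shell
# Assumed service account location; override via environment if yours differs.
KSM_NAMESPACE="${KSM_NAMESPACE:-monitoring}"
KSM_SERVICE_ACCOUNT="${KSM_SERVICE_ACCOUNT:-kube-state-metrics}"
KSM_SUBJECT="system:serviceaccount:${KSM_NAMESPACE}:${KSM_SERVICE_ACCOUNT}"

# Check list/watch on VPAs and CRDs while impersonating the service account.
check_vpa_rbac() {
  for verb in list watch; do
    kubectl auth can-i "$verb" verticalpodautoscalers.autoscaling.k8s.io \
      --as="$KSM_SUBJECT" || return 1
    kubectl auth can-i "$verb" customresourcedefinitions.apiextensions.k8s.io \
      --as="$KSM_SUBJECT" || return 1
  done
}
```

Running `check_vpa_rbac` prints yes or no for each check and stops at the first denied permission. Your own user needs impersonation rights for the `--as` flag to work.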

Adjust the Deployment kube-state-metrics to include the following extra arguments:

kind: Deployment
metadata:
  labels:
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 2.13.0
  name: kube-state-metrics
  namespace: monitoring
spec:
    ...
      containers:
      - args:
        ...
        - --custom-resource-state-config
        - |
          kind: CustomResourceStateMetrics
          spec:
            resources:
              - groupVersionKind:
                  group: autoscaling.k8s.io
                  kind: "VerticalPodAutoscaler"
                  version: "v1"
                labelsFromPath:
                  verticalpodautoscaler: [metadata, name]
                  namespace: [metadata, namespace]
                  target_api_version: [spec, targetRef, apiVersion]
                  target_kind: [spec, targetRef, kind]
                  target_name: [spec, targetRef, name]
                metrics:
                  # Labels
                  - name: "verticalpodautoscaler_labels"
                    help: "VPA container recommendations. Kubernetes labels converted to Prometheus labels"
                    each:
                      type: Info
                      info:
                        labelsFromPath:
                          name: [metadata, name]
                  # Memory Information
                  - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_target"
                    help: "VPA container recommendations for memory. Target resources the VerticalPodAutoscaler recommends for the container."
                    each:
                      type: Gauge
                      gauge:
                        path: [status, recommendation, containerRecommendations]
                        valueFrom: [target, memory]
                        labelsFromPath:
                          container: [containerName]
                    commonLabels:
                      resource: "memory"
                      unit: "byte"
                  - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_lowerbound"
                    help: "VPA container recommendations for memory. Minimum resources the container can use before the VerticalPodAutoscaler updater evicts it"
                    each:
                      type: Gauge
                      gauge:
                        path: [status, recommendation, containerRecommendations]
                        valueFrom: [lowerBound, memory]
                        labelsFromPath:
                          container: [containerName]
                    commonLabels:
                      resource: "memory"
                      unit: "byte"
                  - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_upperbound"
                    help: "VPA container recommendations for memory. Maximum resources the container can use before the VerticalPodAutoscaler updater evicts it"
                    each:
                      type: Gauge
                      gauge:
                        path: [status, recommendation, containerRecommendations]
                        valueFrom: [upperBound, memory]
                        labelsFromPath:
                          container: [containerName]
                    commonLabels:
                      resource: "memory"
                      unit: "byte"
                  - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_uncappedtarget"
                    help: "VPA container recommendations for memory. Target resources the VerticalPodAutoscaler recommends for the container ignoring bounds"
                    each:
                      type: Gauge
                      gauge:
                        path: [status, recommendation, containerRecommendations]
                        valueFrom: [uncappedTarget, memory]
                        labelsFromPath:
                          container: [containerName]
                    commonLabels:
                      resource: "memory"
                      unit: "byte"
                  # CPU Information
                  - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_target"
                    help: "VPA container recommendations for cpu. Target resources the VerticalPodAutoscaler recommends for the container."
                    each:
                      type: Gauge
                      gauge:
                        path: [status, recommendation, containerRecommendations]
                        valueFrom: [target, cpu]
                        labelsFromPath:
                          container: [containerName]
                    commonLabels:
                      resource: "cpu"
                      unit: "core"
                  - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_lowerbound"
                    help: "VPA container recommendations for cpu. Minimum resources the container can use before the VerticalPodAutoscaler updater evicts it"
                    each:
                      type: Gauge
                      gauge:
                        path: [status, recommendation, containerRecommendations]
                        valueFrom: [lowerBound, cpu]
                        labelsFromPath:
                          container: [containerName]
                    commonLabels:
                      resource: "cpu"
                      unit: "core"
                  - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_upperbound"
                    help: "VPA container recommendations for cpu. Maximum resources the container can use before the VerticalPodAutoscaler updater evicts it"
                    each:
                      type: Gauge
                      gauge:
                        path: [status, recommendation, containerRecommendations]
                        valueFrom: [upperBound, cpu]
                        labelsFromPath:
                          container: [containerName]
                    commonLabels:
                      resource: "cpu"
                      unit: "core"
                  - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_uncappedtarget"
                    help: "VPA container recommendations for cpu. Target resources the VerticalPodAutoscaler recommends for the container ignoring bounds"
                    each:
                      type: Gauge
                      gauge:
                        path: [status, recommendation, containerRecommendations]
                        valueFrom: [uncappedTarget, cpu]
                        labelsFromPath:
                          container: [containerName]
                    commonLabels:
                      resource: "cpu"
                      unit: "core"
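
Once kube-state-metrics restarts with this configuration, you can check that the VPA series reach Prometheus. The sketch below assumes Prometheus is reachable at http://localhost:9090 (for example through a port-forward) and that kube-state-metrics applies its default kube_customresource prefix to custom resource metrics. Both are assumptions to verify against your setup, and the names PROM_URL, VPA_METRIC, and query_vpa_targets are illustrative:

```shell
# Assumed Prometheus address; override via environment if yours differs.
PROM_URL="${PROM_URL:-http://localhost:9090}"
# Metric name from the config above, with the assumed default prefix.
VPA_METRIC="kube_customresource_verticalpodautoscaler_status_recommendation_containerrecommendations_target"

# Query the current memory target recommendations through the HTTP API.
query_vpa_targets() {
  curl -sG "${PROM_URL}/api/v1/query" \
    --data-urlencode "query=${VPA_METRIC}{resource=\"memory\"}"
}
```

Calling `query_vpa_targets` returns a JSON document. An empty result list suggests the custom resource config or the RBAC rules are not yet in effect.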

Alerts

The mixin follows the monitoring-mixins guidelines for alerts.
