
Commit 4ba97d8

doddstr13, doddpfef, and geoffcline authored
Updates metric references for modern Kubernetes versions (#695)
* Updates to metrics and API versions based on modern Kubernetes versions
  - Standardizes representation for histogram metric references
  - Removes metric names for unsupported versions of Kubernetes
  - Updates client.authentication.k8s.io version references
* Replaces deprecated metric reference
* Adds GitHub sync script
* Addresses references to deprecated control plane metrics

---------

Co-authored-by: doddpfef <[email protected]>
Co-authored-by: Geoffrey Cline <[email protected]>
1 parent 5f71571 commit 4ba97d8

File tree

4 files changed: +27 -27 lines changed

latest/bpg/reliability/controlplane.adoc
latest/bpg/scalability/control-plane.adoc
latest/bpg/scalability/kcp_monitoring.adoc
latest/bpg/scalability/quotas.adoc

latest/bpg/reliability/controlplane.adoc

Lines changed: 9 additions & 6 deletions
@@ -113,11 +113,11 @@ Consider monitoring these control plane metrics:
for each verb, dry run value, group, version, resource, scope,
component, and HTTP response code.

-|`apiserver_request_duration_seconds*` |Response latency distribution
+|`apiserver_request_duration_seconds*` |Response latency histogram
in seconds for each verb, dry run value, group, version, resource,
subresource, scope, and component.

-|`apiserver_admission_controller_admission_duration_seconds`
+|`apiserver_admission_controller_admission_duration_seconds*`
|Admission controller latency histogram in seconds, identified by name
and broken out for each operation and API resource and type (validate or
admit).
@@ -127,32 +127,35 @@ webhook rejections. Identified by name, operation, rejection_code, type
(validating or admit), error_type (calling_webhook_error,
apiserver_internal_error, no_error)

-|`rest_client_request_duration_seconds` |Request latency in seconds.
+|`rest_client_request_duration_seconds*` |Request latency histogram in seconds.
Broken down by verb and URL.

|`rest_client_requests_total` |Number of HTTP requests, partitioned by
status code, method, and host.
|===

+* Histogram metrics include _bucket, _sum, and _count suffixes.
+
=== etcd

[width="100%",cols="<99%,<1%",options="header",]
|===
|Metric |Description
-|`etcd_request_duration_seconds` |Etcd request latency in seconds for
+|`etcd_request_duration_seconds*` |Etcd request latency histogram in seconds for
each operation and object type.

|`apiserver_storage_db_total_size_in_bytes`
or `apiserver_storage_size_bytes` (starting with EKS v1.28) |Etcd
database size.
|===

+* Histogram metrics include _bucket, _sum, and _count suffixes.
+
Consider using the
https://grafana.com/grafana/dashboards/14623[Kubernetes Monitoring
Overview Dashboard] to visualize and monitor Kubernetes API server
requests and latency and etcd latency metrics.

-
[IMPORTANT]
====
When the database size limit is exceeded, etcd emits a no space alarm and stops taking further write requests. In other words, the cluster becomes read-only, and all requests to mutate objects such as creating new pods, scaling deployments, etc., will be rejected by the cluster's API server.
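
For reference, the `_bucket`, `_sum`, and `_count` series behind a starred histogram reference such as `etcd_request_duration_seconds*` can be spot-checked directly from the API server's `/metrics` endpoint. A minimal sketch, assuming `kubectl` access to the cluster (the `grep` patterns are illustrative, not part of the change):

----
# Count the per-bucket series exposed for the etcd request latency histogram
kubectl get --raw /metrics | grep -c 'etcd_request_duration_seconds_bucket'

# Show the matching _sum and _count series, which back averages and rates
kubectl get --raw /metrics | grep -E 'etcd_request_duration_seconds_(sum|count)' | head -n 4
----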
@@ -241,7 +244,7 @@ users:
#- name: arn:aws:eks:us-west-2:<account number>:cluster/<cluster name>
# user:
# exec:
-# apiVersion: client.authentication.k8s.io/v1alpha1
+# apiVersion: client.authentication.k8s.io/v1beta1
# args:
# - --region
# - us-west-2
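
For reference, a kubeconfig exec stanza like the commented one above does not need to be hand-edited to pick up the `v1beta1` credential API; it can be regenerated. A minimal sketch, assuming AWS CLI v2 and an existing cluster (the placeholder mirrors the one in the snippet):

----
# Rewrites the kubeconfig user entry, including the client.authentication.k8s.io
# exec credential apiVersion, for the named cluster
aws eks update-kubeconfig --name <cluster name> --region us-west-2
----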

latest/bpg/scalability/control-plane.adoc

Lines changed: 16 additions & 19 deletions
@@ -7,10 +7,6 @@

The Kubernetes control plane consists of the Kubernetes API Server, Kubernetes Controller Manager, Scheduler and other components that are required for Kubernetes to function. Scalability limits of these components are different depending on what you're running in the cluster, but the areas with the biggest impact to scaling include the Kubernetes version, utilization, and individual Node scaling.

-== Use EKS 1.24 or above
-
-EKS 1.24 introduced a number of changes and switches the container runtime to https://containerd.io/[containerd] instead of docker. Containerd helps clusters scale by increasing individual node performance by limiting container runtime features to closely align with Kubernetes`' needs. Containerd is available in every supported version of EKS and if you would like to switch to containerd in versions prior to 1.24 please use the https://docs.aws.amazon.com/eks/latest/userguide/eks-optimized-ami.html#containerd-bootstrap[`--container-runtime` bootstrap flag].
-
== Limit workload and node bursting

[IMPORTANT]
@@ -115,16 +111,17 @@ To protect itself from being overloaded during periods of increased requests, th

The mechanism used by Kubernetes to configure how these inflights requests are divided among different request types is called https://kubernetes.io/docs/concepts/cluster-administration/flow-control/[API Priority and Fairness]. The API Server configures the total number of inflight requests it can accept by summing together the values specified by the `--max-requests-inflight` and `--max-mutating-requests-inflight` flags. EKS uses the default values of 400 and 200 requests for these flags, allowing a total of 600 requests to be dispatched at a given time. However, as it scales the control-plane to larger sizes in response to increased utilization and workload churn, it correspondingly increases the inflight request quota all the way till 2000 (subject to change). APF specifies how these inflight request quota is further sub-divided among different request types. Note that EKS control planes are highly available with at least 2 API Servers registered to each cluster. This means the total number of inflight requests your cluster can handle is twice (or higher if horizontally scaled out further) the inflight quota set per kube-apiserver. This amounts to several thousands of requests/second on the largest EKS clusters.

-Two kinds of Kubernetes objects, called PriorityLevelConfigurations and FlowSchemas, configure how the total number of requests is divided between different request types. These objects are maintained by the API Server automatically and EKS uses the default configuration of these objects for the given Kubernetes minor version. PriorityLevelConfigurations represent a fraction of the total number of allowed requests. For example, the workload-high PriorityLevelConfiguration is allocated 98 out of the total of 600 requests. The sum of requests allocated to all PriorityLevelConfigurations will equal 600 (or slightly above 600 because the API Server will round up if a given level is granted a fraction of a request). To check the PriorityLevelConfigurations in your cluster and the number of requests allocated to each, you can run the following command. These are the defaults on EKS 1.24:
+Two kinds of Kubernetes objects, called PriorityLevelConfigurations and FlowSchemas, configure how the total number of requests is divided between different request types. These objects are maintained by the API Server automatically and EKS uses the default configuration of these objects for the given Kubernetes minor version. PriorityLevelConfigurations represent a fraction of the total number of allowed requests. For example, the workload-high PriorityLevelConfiguration is allocated 98 out of the total of 600 requests. The sum of requests allocated to all PriorityLevelConfigurations will equal 600 (or slightly above 600 because the API Server will round up if a given level is granted a fraction of a request). To check the PriorityLevelConfigurations in your cluster and the number of requests allocated to each, you can run the following command. These are the defaults on EKS 1.32:

-$ kubectl get --raw /metrics | grep apiserver_flowcontrol_request_concurrency_limit
-apiserver_flowcontrol_request_concurrency_limit{priority_level="catch-all"} 13
-apiserver_flowcontrol_request_concurrency_limit{priority_level="global-default"} 49
-apiserver_flowcontrol_request_concurrency_limit{priority_level="leader-election"} 25
-apiserver_flowcontrol_request_concurrency_limit{priority_level="node-high"} 98
-apiserver_flowcontrol_request_concurrency_limit{priority_level="system"} 74
-apiserver_flowcontrol_request_concurrency_limit{priority_level="workload-high"} 98
-apiserver_flowcontrol_request_concurrency_limit{priority_level="workload-low"} 245
+$ kubectl get --raw /metrics | grep apiserver_flowcontrol_nominal_limit_seats
+apiserver_flowcontrol_nominal_limit_seats{priority_level="catch-all"} 13
+apiserver_flowcontrol_nominal_limit_seats{priority_level="exempt"} 0
+apiserver_flowcontrol_nominal_limit_seats{priority_level="global-default"} 49
+apiserver_flowcontrol_nominal_limit_seats{priority_level="leader-election"} 25
+apiserver_flowcontrol_nominal_limit_seats{priority_level="node-high"} 98
+apiserver_flowcontrol_nominal_limit_seats{priority_level="system"} 74
+apiserver_flowcontrol_nominal_limit_seats{priority_level="workload-high"} 98
+apiserver_flowcontrol_nominal_limit_seats{priority_level="workload-low"} 245

The second type of object are FlowSchemas. API Server requests with a given set of properties are classified under the same FlowSchema. These properties include either the authenticated user or attributes of the request, such as the API group, namespace, or resource. A FlowSchema also specifies which PriorityLevelConfiguration this type of request should map to. The two objects together say, "I want this type of request to count towards this share of inflight requests." When a request hits the API Server, it will check each of its FlowSchemas until it finds one that matches all the required properties. If multiple FlowSchemas match a request, the API Server will choose the FlowSchema with the smallest matching precedence which is specified as a property in the object.
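
For reference, the FlowSchemas described above, including the PriorityLevelConfiguration each one maps to and its matching precedence, can be listed directly. A minimal sketch, assuming cluster access with `kubectl`:

----
# Lists FlowSchemas with the priority level each maps to and its matching precedence
kubectl get flowschemas

# Same list, ordered by matching precedence (lowest value matches first)
kubectl get flowschemas --sort-by=.spec.matchingPrecedence
----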

@@ -164,11 +161,11 @@ apiserver_flowcontrol_rejected_requests_total{flow_schema="service-accounts",pri
To check how close a given PriorityLevelConfiguration is to receiving 429s or experiencing increased latency due to queuing, you can compare the difference between the concurrency limit and the concurrency in use. In this example, we have a buffer of 100 requests.

----
-% kubectl get --raw /metrics | grep 'apiserver_flowcontrol_request_concurrency_limit.*workload-low'
-apiserver_flowcontrol_request_concurrency_limit{priority_level="workload-low"} 245
+% kubectl get --raw /metrics | grep 'apiserver_flowcontrol_nominal_limit_seats.*workload-low'
+apiserver_flowcontrol_nominal_limit_seats{priority_level="workload-low"} 245

-% kubectl get --raw /metrics | grep 'apiserver_flowcontrol_request_concurrency_in_use.*workload-low'
-apiserver_flowcontrol_request_concurrency_in_use{flow_schema="service-accounts",priority_level="workload-low"} 145
+% kubectl get --raw /metrics | grep 'apiserver_flowcontrol_current_executing_seats.*workload-low'
+apiserver_flowcontrol_current_executing_seats{flow_schema="service-accounts",priority_level="workload-low"} 145
----

To check if a given PriorityLevelConfiguration is experiencing queuing but not necessarily dropped requests, the metric for `apiserver_flowcontrol_current_inqueue_requests` can be referenced:
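
A minimal sketch of that queuing check, in the same style as the `/metrics` queries above (the `workload-low` filter is illustrative; any priority level can be substituted):

----
% kubectl get --raw /metrics | grep 'apiserver_flowcontrol_current_inqueue_requests.*workload-low'
----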
@@ -219,14 +216,14 @@ Alternatively, new FlowSchema and PriorityLevelConfigurations objects can be cre
When making changes to APF defaults, these metrics should be monitored on a non-production cluster to ensure changing the settings do not cause unintended 429s:

. The metric for `apiserver_flowcontrol_rejected_requests_total` should be monitored for all FlowSchemas to ensure that no buckets start to drop requests.
-. The values for `apiserver_flowcontrol_request_concurrency_limit` and `apiserver_flowcontrol_request_concurrency_in_use` should be compared to ensure that the concurrency in use is not at risk for breaching the limit for that priority level.
+. The values for `apiserver_flowcontrol_nominal_limit_seats` and `apiserver_flowcontrol_current_executing_seats` should be compared to ensure that the concurrency in use is not at risk for breaching the limit for that priority level.

One common use-case for defining a new FlowSchema and PriorityLevelConfiguration is for isolation. Suppose we want to isolate long-running list event calls from pods to their own share of requests. This will prevent important requests from pods using the existing service-accounts FlowSchema from receiving 429s and being starved of request capacity. Recall that the total number of inflight requests is finite, however, this example shows APF settings can be modified to better divide request capacity for the given workload:

Example FlowSchema object to isolate list event requests:

----
-apiVersion: flowcontrol.apiserver.k8s.io/v1beta1
+apiVersion: flowcontrol.apiserver.k8s.io/v1
kind: FlowSchema
metadata:
  name: list-events-default-service-accounts
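
For reference, once the manifest targets `flowcontrol.apiserver.k8s.io/v1`, the served API versions can be confirmed and the object applied. A minimal sketch; `list-events-flowschema.yaml` is a hypothetical file holding the manifest above:

----
# Confirms which flowcontrol API versions this cluster serves
kubectl api-versions | grep flowcontrol.apiserver.k8s.io

# Applies the updated FlowSchema manifest (hypothetical filename)
kubectl apply -f list-events-flowschema.yaml
----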

latest/bpg/scalability/kcp_monitoring.adoc

Lines changed: 1 addition & 1 deletion
@@ -49,7 +49,7 @@ With the move to API Priority and Fairness the total number of requests on the s
Let's look at these queues with the following query:

----
-max without(instance)(apiserver_flowcontrol_request_concurrency_limit{})
+max without(instance)(apiserver_flowcontrol_nominal_limit_seats{})
----

[NOTE]

latest/bpg/scalability/quotas.adoc

Lines changed: 1 addition & 1 deletion
@@ -249,5 +249,5 @@ You can review the EC2 rate limit defaults and the steps to request a rate limit
** https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/DNSLimitations.html#limits-api-requests[Route 53 also has a fairly low rate limit of 5 requests per second to the Route 53 API]. If you have a large number of domains to update with a project like External DNS you may see rate throttling and delays in updating domains.
* Some https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/volume_limits.html#instance-type-volume-limits[Nitro instance types have a volume attachment limit of 28] that is shared between Amazon EBS volumes, network interfaces, and NVMe instance store volumes. If your workloads are mounting numerous EBS volumes you may encounter limits to the pod density you can achieve with these instance types
* There is a maximum number of connections that can be tracked per Ec2 instance. https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/security-group-connection-tracking.html#connection-tracking-throttling[If your workloads are handling a large number of connections you may see communication failures or errors because this maximum has been hit.] You can use the `conntrack_allowance_available` and `conntrack_allowance_exceeded` https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/monitoring-network-performance-ena.html[network performance metrics to monitor the number of tracked connections on your EKS worker nodes].
-* In EKS environment, etcd storage limit is *8 GiB* as per https://etcd.io/docs/v3.5/dev-guide/limit/#storage-size-limit[upstream guidance]. Please monitor metric `etcd_db_total_size_in_bytes` to track etcd db size. You can refer to https://github.com/etcd-io/etcd/blob/main/contrib/mixin/mixin.libsonnet#L213-L240[alert rules] `etcdBackendQuotaLowSpace` and `etcdExcessiveDatabaseGrowth` to setup this monitoring.
+* In EKS environment, etcd storage limit is *8 GiB* as per https://etcd.io/docs/v3.5/dev-guide/limit/#storage-size-limit[upstream guidance]. Please monitor metric `apiserver_storage_size_bytes` to track etcd db size. You can refer to https://github.com/etcd-io/etcd/blob/main/contrib/mixin/mixin.libsonnet#L213-L240[alert rules] `etcdBackendQuotaLowSpace` and `etcdExcessiveDatabaseGrowth` to setup this monitoring.
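
A minimal sketch of tracking that size from the API server's `/metrics` endpoint, assuming EKS 1.28 or later where `apiserver_storage_size_bytes` is exposed (earlier versions expose `apiserver_storage_db_total_size_in_bytes` instead):

----
# Reports the current etcd database size in bytes
kubectl get --raw /metrics | grep '^apiserver_storage_size_bytes'
----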
253253
