diff --git a/.github/ISSUE_TEMPLATE/vpa_release.md b/.github/ISSUE_TEMPLATE/vpa_release.md index c2adb862ac2e..351acd21d7c1 100644 --- a/.github/ISSUE_TEMPLATE/vpa_release.md +++ b/.github/ISSUE_TEMPLATE/vpa_release.md @@ -14,4 +14,4 @@ completed step on this issue. Please provide any information that is related to the release: - When we plan to do the release? -- Are there any issues / PRs blocking the release? \ No newline at end of file +- Are there any issues / PRs blocking the release? diff --git a/.github/workflows/ca-benchmark.yaml b/.github/workflows/ca-benchmark.yaml index aeb7b6a8cb70..3811d0a9dd14 100644 --- a/.github/workflows/ca-benchmark.yaml +++ b/.github/workflows/ca-benchmark.yaml @@ -56,12 +56,12 @@ jobs: echo "### Cluster Autoscaler Benchmark Results" >> $GITHUB_STEP_SUMMARY echo "Comparing PR branch against \`${{ github.event.pull_request.base.ref }}\`" >> $GITHUB_STEP_SUMMARY echo "" >> $GITHUB_STEP_SUMMARY - + if [ -s base.txt ] && [ -s pr.txt ]; then echo '```' >> $GITHUB_STEP_SUMMARY $(go env GOPATH)/bin/benchstat base.txt pr.txt | tee benchstat.txt >> $GITHUB_STEP_SUMMARY echo '```' >> $GITHUB_STEP_SUMMARY - + # Fail if any regression is > 10% grep -qE '\+[1-9][0-9].*%' benchstat.txt && { echo "Regression detected > 10%"; exit 1; } || true else diff --git a/.github/workflows/precommit.yaml b/.github/workflows/precommit.yaml new file mode 100644 index 000000000000..d7a4f853fc5f --- /dev/null +++ b/.github/workflows/precommit.yaml @@ -0,0 +1,16 @@ +name: pre-commit + +on: + - push + - pull_request + +permissions: + contents: read + +jobs: + pre-commit: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v3 + - uses: actions/setup-python@v3 + - uses: pre-commit/action@v3.0.1 diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index eb6183a7e2ac..f0debd4e8103 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -1,22 +1,38 @@ +# Generated folders and files are supposed to be excluded from the +# checks, because pre-commit would otherwise introduce inconsistencies +# in generated content by modifying them. +exclude: | + (?x)^( + addon-resizer/vendor/ | + cluster-autoscaler/cloudprovider/oci/vendor-internal/ | + vertical-pod-autoscaler/ + ) + repos: - - hooks: + - repo: https://github.com/pre-commit/pre-commit-hooks + hooks: - id: end-of-file-fixer - id: trailing-whitespace - repo: https://github.com/pre-commit/pre-commit-hooks + - id: mixed-line-ending + - id: check-added-large-files rev: v3.1.0 - - hooks: + - repo: https://github.com/gruntwork-io/pre-commit + hooks: - id: helmlint - repo: https://github.com/gruntwork-io/pre-commit rev: v0.1.9 - - hooks: - - id: helm-docs + - repo: https://github.com/norwoodj/helm-docs + hooks: + - id: helm-docs-built files: (README\.md\.gotmpl|(Chart|requirements|values)\.yaml)$ - repo: https://github.com/norwoodj/helm-docs - rev: v1.3.0 - - hooks: + rev: v1.14.2 + - repo: local + hooks: - id : update-flags name: Update Cluster-Autoscaler Flags Table entry: bash cluster-autoscaler/hack/update-faq-flags.sh language: system files: cluster-autoscaler/config/flags/flags\.go - repo: local + - repo: https://github.com/TekWizely/pre-commit-golang + rev: v1.0.0-rc.4 + hooks: + - id: go-fmt diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 20f30f793f3f..6247a80736a9 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -5,10 +5,10 @@ ### Signing Contributor License Agreements(CLA) We'd love to accept your patches! Before we can take them, we have to jump a couple of legal hurdles. - + Please fill out either the individual or corporate Contributor License Agreement (CLA). - + * If you are an individual writing original source code and you're sure you own the intellectual property, then you'll need to sign an [individual CLA](https://identity.linuxfoundation.org/node/285/node/285/individual-signup). @@ -21,15 +21,15 @@ We'd love to accept your patches! Before we can take them, we have to jump a cou * Fork the desired repo, develop and test your code changes.
* Submit a pull request. -All changes must be code reviewed. Coding conventions and standards are explained in the official -[developer docs](https://github.com/kubernetes/community/tree/master/contributors/devel). Expect +All changes must be code reviewed. Coding conventions and standards are explained in the official +[developer docs](https://github.com/kubernetes/community/tree/master/contributors/devel). Expect reviewers to request that you avoid common [go style mistakes](https://go.dev/wiki/CodeReviewComments) in your PRs. ### Merge Approval -Autoscaler collaborators may add "LGTM" (Looks Good To Me) or an equivalent comment to indicate -that a PR is acceptable. Any change requires at least one LGTM. No pull requests can be merged +Autoscaler collaborators may add "LGTM" (Looks Good To Me) or an equivalent comment to indicate +that a PR is acceptable. Any change requires at least one LGTM. No pull requests can be merged until at least one Autoscaler collaborator signs off with an LGTM. ### Support Channels diff --git a/addon-resizer/README.md b/addon-resizer/README.md index 24151e0dacc6..448bcb84f9dc 100644 --- a/addon-resizer/README.md +++ b/addon-resizer/README.md @@ -77,7 +77,7 @@ parameters: *Note: Addon Resizer uses buckets of cluster sizes, so it will use n larger than the cluster size by up to 50% for clusters larger than 16 nodes. For smaller clusters, n = 16 will be used.* - + 2. Memory parameters: ``` --memory diff --git a/addon-resizer/enhancements/5546-scaling-based-on-container-count/README.md b/addon-resizer/enhancements/5546-scaling-based-on-container-count/README.md index d2de8c89f70a..8df46bc82284 100644 --- a/addon-resizer/enhancements/5546-scaling-based-on-container-count/README.md +++ b/addon-resizer/enhancements/5546-scaling-based-on-container-count/README.md @@ -15,7 +15,7 @@ Currently Addon Resizer supports scaling based on the number of nodes. Some workloads use resources proportionally to the number of containers in the cluster. 
Since number of containers per node is very different in different clusters -it's more resource-efficient to scale such workloads based directly on the container count. +it's more resource-efficient to scale such workloads based directly on the container count. ### Goals @@ -46,7 +46,7 @@ Addon Resizer 1.8 assumes in multiple places that it's scaling based on the numb to either node count or container count, depending on the value of the `--scaling-mode` flag. - Many variable names in code which now refer to node count will refer to cluster size and should be renamed accordingly. -In addition to implementing the feature we should also clean up the code and documentation. +In addition to implementing the feature we should also clean up the code and documentation. ### Risks and Mitigations @@ -59,7 +59,7 @@ all containers could result in higher load on the Cluster API server. Since Addo I don't expect this effect to be noticeable. Also I expect metrics-server to test for this before using the feature and any other users of Addon Resizer are likely -better off using metrics (which don't have this problem). +better off using metrics (which don't have this problem). ## Design Details @@ -120,4 +120,4 @@ Both tests should be performed with metrics- and API- based scaling. 
[`Status.Phase`]: https://github.com/kubernetes/api/blob/1528256abbdf8ff2510112b28a6aacd239789a36/core/v1/types.go#L4011 [selector excluding pods in terminal states in VPA]: https://github.com/kubernetes/autoscaler/blob/04e5bfc88363b4af9fdeb9dfd06c362ec5831f51/vertical-pod-autoscaler/e2e/v1beta2/common.go#L195 [`updateResources()`]: https://github.com/kubernetes/autoscaler/blob/da500188188d275a382be578ad3d0a758c3a170f/addon-resizer/nanny/nanny_lib.go#L126 -[`example.yaml`]: https://github.com/kubernetes/autoscaler/blob/c8d612725c4f186d5de205ed0114f21540a8ed39/addon-resizer/deploy/example.yaml \ No newline at end of file +[`example.yaml`]: https://github.com/kubernetes/autoscaler/blob/c8d612725c4f186d5de205ed0114f21540a8ed39/addon-resizer/deploy/example.yaml diff --git a/addon-resizer/enhancements/5700-nanny-configuration-reload/README.md b/addon-resizer/enhancements/5700-nanny-configuration-reload/README.md index c661c57f36a1..e55ab0241332 100644 --- a/addon-resizer/enhancements/5700-nanny-configuration-reload/README.md +++ b/addon-resizer/enhancements/5700-nanny-configuration-reload/README.md @@ -14,7 +14,7 @@ Sure, here's the enhancement proposal in the requested format: ## Summary -- **Goals:** The goal of this enhancement is to improve the user experience for applying nanny configuration changes in the addon-resizer 1.8 when used with the metrics server. The proposed solution involves automatically reloading the nanny configuration whenever changes occur, eliminating the need for manual intervention and sidecar containers. +- **Goals:** The goal of this enhancement is to improve the user experience for applying nanny configuration changes in the addon-resizer 1.8 when used with the metrics server. The proposed solution involves automatically reloading the nanny configuration whenever changes occur, eliminating the need for manual intervention and sidecar containers. 
- **Non-Goals:** This proposal does not aim to update the functional behavior of the addon-resizer. ## Proposal diff --git a/balancer/Makefile b/balancer/Makefile index 5a8a144c3fde..cf7dc93682a4 100644 --- a/balancer/Makefile +++ b/balancer/Makefile @@ -56,4 +56,3 @@ format: test -z "$$(find . -path ./vendor -prune -type f -o -name '*.go' -exec gofmt -s -w {} + | tee /dev/stderr)" .PHONY: all build test-unit clean format release - diff --git a/balancer/examples/nginx-priority.yaml b/balancer/examples/nginx-priority.yaml index 0fde69f35802..15b2de9b5def 100644 --- a/balancer/examples/nginx-priority.yaml +++ b/balancer/examples/nginx-priority.yaml @@ -1,4 +1,4 @@ -# +# # Balancer scaling 2 deployments using priority policy. # apiVersion: apps/v1 diff --git a/balancer/proposals/balancer.md b/balancer/proposals/balancer.md index 534eaa59f64f..afefd0eff43e 100644 --- a/balancer/proposals/balancer.md +++ b/balancer/proposals/balancer.md @@ -1,50 +1,50 @@ -# KEP - Balancer +# KEP - Balancer ## Introduction -One of the problems that the users are facing when running Kubernetes deployments is how to -deploy pods across several domains and keep them balanced and autoscaled at the same time. +One of the problems that the users are facing when running Kubernetes deployments is how to +deploy pods across several domains and keep them balanced and autoscaled at the same time. These domains may include: * Cloud provider zones inside a single region, to ensure that the application is still up and running, even if one of the zones has issues. -* Different types of Kubernetes nodes. These may involve nodes that are spot/preemptible, or of different machine families. +* Different types of Kubernetes nodes. These may involve nodes that are spot/preemptible, or of different machine families. 
-A single Kubernetes deployment may either leave the placement entirely up to the scheduler -(most likely leading to something not entirely desired, like all pods going to a single domain) or -focus on a single domain (thus not achieving the goal of being in two or more domains). +A single Kubernetes deployment may either leave the placement entirely up to the scheduler +(most likely leading to something not entirely desired, like all pods going to a single domain) or +focus on a single domain (thus not achieving the goal of being in two or more domains). -PodTopologySpreading solves the problem a bit, but not completely. It allows only even spreading -and once the deployment gets skewed it doesn’t do anything to rebalance. Pod topology spreading -(with skew and/or ScheduleAnyway flag) is also just a hint, if skewed placement is available and -allowed then Cluster Autoscaler is not triggered and the user ends up with a skewed deployment. +PodTopologySpreading solves the problem a bit, but not completely. It allows only even spreading +and once the deployment gets skewed it doesn’t do anything to rebalance. Pod topology spreading +(with skew and/or ScheduleAnyway flag) is also just a hint, if skewed placement is available and +allowed then Cluster Autoscaler is not triggered and the user ends up with a skewed deployment. A user could specify a strict pod topology spreading but then, in case of problems the deployment -would not move its pods to the domains that are available. The growth of the deployment would also +would not move its pods to the domains that are available. The growth of the deployment would also be totally blocked as the available domains would be too heavily skewed. -Thus, if full flexibility is needed, the only option is to have multiple deployments, targeting -different domains. This setup however creates one big problem. How to consistently autoscale multiple -deployments? 
The simplest idea - having multiple HPAs is not stable, due to different loads, race -conditions or so, some domains may grow while the others are shrunk. As HPAs and deployments are -not connected anyhow, the skewed setup will not fix itself automatically. It may eventually come to -a semi-balanced state but it is not guaranteed. +Thus, if full flexibility is needed, the only option is to have multiple deployments, targeting +different domains. This setup however creates one big problem. How to consistently autoscale multiple +deployments? The simplest idea - having multiple HPAs is not stable, due to different loads, race +conditions or so, some domains may grow while the others are shrunk. As HPAs and deployments are +not connected anyhow, the skewed setup will not fix itself automatically. It may eventually come to +a semi-balanced state but it is not guaranteed. Thus there is a need for some component that will: * Keep multiple deployments aligned. For example it may keep an equal ratio between the number of pods in one deployment and the other. Or put everything to the first and overflow to the second and so on. -* React to individual deployment problems should it be zone outage or lack of spot/preemptible vms. +* React to individual deployment problems should it be zone outage or lack of spot/preemptible vms. * Actively try to rebalance and get to the desired layout. * Allow to autoscale all deployments with a single target, while maintaining the placement policy. -## Balancer +## Balancer -Balancer is a stand-alone controller, living in userspace (or in control plane, if needed) exposing -a CRD API object, also called Balancer. Each balancer object has pointers to multiple deployments -or other pod-controlling objects that expose the Scale subresource. 
Balancer periodically checks -the number of running and problematic pods inside each of the targets, compares it with the desired -number of replicas, constraints and policies and adjusts the number of replicas on the targets, +Balancer is a stand-alone controller, living in userspace (or in control plane, if needed) exposing +a CRD API object, also called Balancer. Each balancer object has pointers to multiple deployments +or other pod-controlling objects that expose the Scale subresource. Balancer periodically checks +the number of running and problematic pods inside each of the targets, compares it with the desired +number of replicas, constraints and policies and adjusts the number of replicas on the targets, should some of them run too many or too few of them. To allow being an HPA target Balancer itself exposes the Scale subresource. @@ -66,7 +66,7 @@ type Balancer struct { // +optional Status BalancerStatus } - + // BalancerSpec is the specification of the Balancer behavior. type BalancerSpec struct { // Targets is a list of targets between which Balancer tries to distribute @@ -84,7 +84,7 @@ type BalancerSpec struct { // Policy defines how the balancer should distribute replicas among targets. Policy BalancerPolicy } - + // BalancerTarget is the declaration of one of the targets between which the balancer // tries to distribute replicas. type BalancerTarget struct { @@ -105,14 +105,14 @@ type BalancerTarget struct { // +optional MaxReplicas *int32 } - + // BalancerPolicyName is the name of the balancer Policy. type BalancerPolicyName string const ( PriorityPolicyName BalancerPolicyName = "priority" ProportionalPolicyName BalancerPolicyName = "proportional" ) - + // BalancerPolicy defines Balancer policy for replica distribution. type BalancerPolicy struct { // PolicyName decides how to balance replicas across the targets. 
@@ -131,7 +131,7 @@ type BalancerPolicy struct { // +optional Fallback *Fallback } - + // PriorityPolicy contains details for Priority-based policy for Balancer. type PriorityPolicy struct { // TargetOrder is the priority-based list of Balancer targets names. The first target @@ -141,7 +141,7 @@ type PriorityPolicy struct { // list, and/or total Balancer's replica count. TargetOrder []string } - + // ProportionalPolicy contains details for Proportion-based policy for Balancer. type ProportionalPolicy struct { // TargetProportions is a map from Balancer targets names to rates. Replicas are @@ -152,7 +152,7 @@ type ProportionalPolicy struct { // of the total Balancer's replica count, proportions or the presence in the map. TargetProportions map[string]int32 } - + // Fallback contains information how to recognize and handle replicas // that failed to start within the specified time period. type Fallback struct { @@ -162,7 +162,7 @@ type Fallback struct { // may be stopped. StartupTimeout metav1.Duration } - + // BalancerStatus describes the Balancer runtime state. type BalancerStatus struct { // Replicas is an actual number of observed pods matching Balancer selector. diff --git a/builder/README.md b/builder/README.md index adcc2615c528..b9b9ab369d36 100644 --- a/builder/README.md +++ b/builder/README.md @@ -1 +1 @@ -A Docker image that is used to build autoscaling-related binaries. \ No newline at end of file +A Docker image that is used to build autoscaling-related binaries. diff --git a/cluster-autoscaler/FAQ.md b/cluster-autoscaler/FAQ.md index 9a3aacab8fcf..6521a3f76088 100644 --- a/cluster-autoscaler/FAQ.md +++ b/cluster-autoscaler/FAQ.md @@ -981,6 +981,7 @@ The following startup parameters are supported for cluster autoscaler: | `address` | The address to expose prometheus metrics. 
| ":8085" | | `allowed-scheduler-names` | If set to non-empty value, CA will proceed only with pods targeting schedulers in the list, from the list of unschedulable and scheduler unprocessed pods | | | `alsologtostderr` | log to standard error as well as files (no effect when -logtostderr=true) | | +| `alsologtostderrthreshold` | logs at or above this threshold go to stderr when -alsologtostderr=true (no effect when -logtostderr=true) | | | `async-node-groups` | Whether clusterautoscaler creates and deletes node groups asynchronously. Experimental: requires cloud provider supporting async node group operations, enable at your own risk. | | | `aws-use-static-instance-list` | Should CA fetch instance types in runtime or use a static list. AWS only | | | `balance-similar-node-groups` | Detect similar node groups and balance the number of nodes between them | | @@ -1047,6 +1048,7 @@ The following startup parameters are supported for cluster autoscaler: | `leader-elect-resource-name` | The name of resource object that is used for locking during leader election. | "cluster-autoscaler" | | `leader-elect-resource-namespace` | The namespace of resource object that is used for locking during leader election. | | | `leader-elect-retry-period` | The duration the clients should wait between attempting acquisition and renewal of a leadership. This is only applicable if leader election is enabled. | 2s | +| `legacy-stderr-threshold-behavior` | If true, stderrthreshold is ignored when logtostderr=true (legacy behavior). 
If false, stderrthreshold is honored even when logtostderr=true | true | | `log-backtrace-at` | when logging hits line file:N, emit a stack trace | :0 | | `log-dir` | If non-empty, write log files in this directory (no effect when -logtostderr=true) | | | `log-file` | If non-empty, use this log file (no effect when -logtostderr=true) | | @@ -1108,7 +1110,6 @@ The following startup parameters are supported for cluster autoscaler: | `scale-down-delay-after-delete` | How long after node deletion that scale down evaluation resumes | 0s | | `scale-down-delay-after-failure` | How long after scale down failure that scale down evaluation resumes | 3m0s | | `scale-down-delay-type-local` | Should --scale-down-delay-after-* flags be applied locally per nodegroup or globally across all nodegroups | | -| `scale-down-enabled` | [Deprecated] Should CA scale down the cluster | true | | `scale-down-gpu-utilization-threshold` | Sum of gpu requests of all pods running on the node divided by node's allocatable resource, below which a node can be considered for scale down.Utilization calculation only cares about gpu resource for accelerator node. cpu and memory utilization will be ignored. | 0.5 | | `scale-down-non-empty-candidates-count` | Maximum number of non empty nodes considered in one iteration as candidates for scale down with drain.Lower value means better CA responsiveness but possible slower scale down latency.Higher value can affect CA performance with big clusters (hundreds of nodes).Set to non positive value to turn this heuristic off - CA will not limit the number of nodes it considers. | 30 | | `scale-down-simulation-timeout` | How long should we run scale down simulation. 
| 30s | @@ -1118,6 +1119,7 @@ The following startup parameters are supported for cluster autoscaler: | `scale-down-utilization-threshold` | The maximum value between the sum of cpu requests and sum of memory requests of all pods running on the node divided by node's corresponding allocatable resource, below which a node can be considered for scale down | 0.5 | | `scale-from-unschedulable` | Specifies that the CA should ignore a node's .spec.unschedulable field in node templates when considering to scale a node group. | | | `scale-up-from-zero` | Should CA scale up when there are 0 ready nodes. | true | +| `scaleup-simulation-for-skipped-node-groups-enabled` | Whether to enable the scale up simulation for skipped node groups. | | | `scan-interval` | How often cluster is reevaluated for scale up or down | 10s | | `scheduler-config-file` | scheduler-config allows changing configuration of in-tree scheduler plugins acting on PreFilter and Filter extension points | | | `skip-headers` | If true, avoid header prefixes in the log messages | | @@ -1129,7 +1131,7 @@ The following startup parameters are supported for cluster autoscaler: | `startup-taint-prefix` | Specifies a taint key prefix. Any taint whose key starts with this prefix will be treated as a startup taint (in addition to the built-in prefixes). Can be used multiple times. 
| [] | | `status-config-map-name` | Status configmap name | "cluster-autoscaler-status" | | `status-taint` | Specifies a taint to ignore in node templates when considering to scale a node group but nodes will not be treated as unready | [] | -| `stderrthreshold` | logs at or above this threshold go to stderr when writing to files and stderr (no effect when -logtostderr=true or -alsologtostderr=true) | 2 | +| `stderrthreshold` | logs at or above this threshold go to stderr when writing to files and stderr (no effect when -logtostderr=true or -alsologtostderr=true unless -legacy_stderr_threshold_behavior=false) | 2 | | `unremovable-node-recheck-timeout` | The timeout before we check again a node that couldn't be removed before | 5m0s | | `user-agent` | User agent used for HTTP calls. | "cluster-autoscaler" | | `v` | number for the log level verbosity | | diff --git a/cluster-autoscaler/apis/hack/update-codegen.sh b/cluster-autoscaler/apis/hack/update-codegen.sh index 0dbc9a933dfa..ce88df978b57 100755 --- a/cluster-autoscaler/apis/hack/update-codegen.sh +++ b/cluster-autoscaler/apis/hack/update-codegen.sh @@ -15,7 +15,7 @@ # limitations under the License. ### -# This script is to be used when updating the generated clients of +# This script is to be used when updating the generated clients of # the Provisioning Request CRD. 
### diff --git a/cluster-autoscaler/charts/cluster-autoscaler/Chart.yaml b/cluster-autoscaler/charts/cluster-autoscaler/Chart.yaml index 791aca6b1aa9..c241137f91bf 100644 --- a/cluster-autoscaler/charts/cluster-autoscaler/Chart.yaml +++ b/cluster-autoscaler/charts/cluster-autoscaler/Chart.yaml @@ -11,4 +11,4 @@ name: cluster-autoscaler sources: - https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler type: application -version: 9.56.0 +version: 9.56.1 diff --git a/cluster-autoscaler/charts/cluster-autoscaler/templates/configmap.yaml b/cluster-autoscaler/charts/cluster-autoscaler/templates/configmap.yaml index 6cd0c4064bfa..da475056b70f 100644 --- a/cluster-autoscaler/charts/cluster-autoscaler/templates/configmap.yaml +++ b/cluster-autoscaler/charts/cluster-autoscaler/templates/configmap.yaml @@ -4,7 +4,7 @@ kind: ConfigMap metadata: name: {{ .Values.kwokConfigMapName | default "kwok-provider-config" }} namespace: {{ .Release.Namespace }} -data: +data: config: |- # if you see '\n' everywhere, remove all the trailing spaces apiVersion: v1alpha1 @@ -38,13 +38,13 @@ data: # # you can also disable installing kwok in CA code (and install your own kwok release) # kwok: # install: false (true if not specified) ---- +--- apiVersion: v1 kind: ConfigMap metadata: name: kwok-provider-templates namespace: {{ .Release.Namespace }} -data: +data: templates: |- # if you see '\n' everywhere, remove all the trailing spaces apiVersion: v1 @@ -412,5 +412,5 @@ data: metadata: resourceVersion: "" - + {{- end }} diff --git a/cluster-autoscaler/cloudprovider/alicloud/README.md b/cluster-autoscaler/cloudprovider/alicloud/README.md index f67db568ef09..2f4d6ea75bee 100644 --- a/cluster-autoscaler/cloudprovider/alicloud/README.md +++ b/cluster-autoscaler/cloudprovider/alicloud/README.md @@ -4,19 +4,19 @@ The cluster autoscaler on AliCloud scales worker nodes within any specified auto ## Kubernetes Version Cluster autoscaler must run on v1.9.3 or greater. 
-## Instance Type Support +## Instance Type Support - **Standard Instance**: x86 architecture, suitable for common scenarios such as websites or API services. - **GPU/FPGA Instance**: heterogeneous computing, suitable for high-performance computing. - **Bare Metal Instance**: both the elasticity of a virtual server and the high-performance and comprehensive features of a physical server. - **Spot Instance**: spot instances are on-demand instances. They are designed to reduce your ECS costs in some cases. -## ACS Console Deployment +## ACS Console Deployment doc: https://www.alibabacloud.com/help/en/container-service-for-kubernetes/latest/auto-scaling-of-nodes ## Custom Deployment ### 1.Prepare Identity authentication -#### Use access-key-id and access-key-secret +#### Use access-key-id and access-key-secret ```yaml apiVersion: v1 kind: Secret @@ -30,8 +30,8 @@ data: access-key-secret: "" region-id: "" ``` -#### Use STS with RAM Role -```yaml +#### Use STS with RAM Role +```yaml { "Version": "1", "Statement": [ @@ -56,19 +56,19 @@ data: } ``` -### 2.ASG Setup +### 2.ASG Setup * create a Scaling Group in ESS (https://essnew.console.aliyun.com) with valid configurations. * create a Scaling Configuration for this Scaling Group with a valid instanceType and User Data. In User Data, you can specify the script to initialize the environment and join this node to the Kubernetes cluster. If your Kubernetes cluster is hosted by ACS, you can use an attach script like this. ```shell #!/bin/sh # The token is generated by ACS console. 
https://www.alibabacloud.com/help/doc-detail/64983.htm?spm=a2c63.l28256.b99.33.46395ad54ozJFq -curl http://aliacs-k8s-cn-hangzhou.oss-cn-hangzhou.aliyuncs.com/public/pkg/run/attach/[kubernetes_cluster_version]/attach_node.sh | bash -s -- --openapi-token [token] --ess true +curl http://aliacs-k8s-cn-hangzhou.oss-cn-hangzhou.aliyuncs.com/public/pkg/run/attach/[kubernetes_cluster_version]/attach_node.sh | bash -s -- --openapi-token [token] --ess true ``` -### 3.cluster-autoscaler deployment +### 3.cluster-autoscaler deployment -#### Use access-key-id and access-key-secret +#### Use access-key-id and access-key-secret ```yaml apiVersion: apps/v1 kind: Deployment @@ -133,7 +133,7 @@ spec: path: "/etc/ssl/certs/ca-certificates.crt" ``` -#### Use STS with RAM Role +#### Use STS with RAM Role ```yaml apiVersion: apps/v1 diff --git a/cluster-autoscaler/cloudprovider/aws/CA_with_AWS_IAM_OIDC.md b/cluster-autoscaler/cloudprovider/aws/CA_with_AWS_IAM_OIDC.md index 9c7f6c060a90..b3ad2c159357 100644 --- a/cluster-autoscaler/cloudprovider/aws/CA_with_AWS_IAM_OIDC.md +++ b/cluster-autoscaler/cloudprovider/aws/CA_with_AWS_IAM_OIDC.md @@ -1,12 +1,12 @@ -#### The following is an example to make use of the AWS IAM OIDC with the Cluster Autoscaler in an EKS cluster. +#### The following is an example to make use of the AWS IAM OIDC with the Cluster Autoscaler in an EKS cluster. -#### Prerequisites +#### Prerequisites - - An Active EKS cluster (1.14 preferred since it is the latest) against which the user is able to run kubectl commands. - - Cluster must consist of at least one worker node ASG. + - An Active EKS cluster (1.14 preferred since it is the latest) against which the user is able to run kubectl commands. + - Cluster must consist of at least one worker node ASG. -A) Create an IAM OIDC identity provider for your cluster with the AWS Management Console using the [documentation] . 
+A) Create an IAM OIDC identity provider for your cluster with the AWS Management Console using the [documentation]. B) Create a test [IAM policy] for your service accounts. @@ -28,21 +28,21 @@ B) Create a test [IAM policy] for your service accounts. ``` C) Create an IAM role for your service accounts in the console. -- Retrieve the OIDC issuer URL from the Amazon EKS console description of your cluster . It will look something identical to: +- Retrieve the OIDC issuer URL from the Amazon EKS console description of your cluster. It will look similar to: 'https://oidc.eks.us-east-1.amazonaws.com/id/xxxxxxxxxx' - While creating a new IAM role, in the "Select type of trusted entity" section, choose "Web identity". - In the "Choose a web identity provider" section: For Identity provider, choose the URL for your cluster. For Audience, type sts.amazonaws.com. -- In the "Attach Policy" section, select the policy to use for your service account, that you created in Section B above. +- In the "Attach Policy" section, select the policy to use for your service account, that you created in Section B above. - After the role is created, choose the role in the console to open it for editing. - Choose the "Trust relationships" tab, and then choose "Edit trust relationship". Edit the OIDC provider suffix and change it from :aud to :sub. Replace sts.amazonaws.com with your service account ID. -- Update trust policy to finish. +- Update trust policy to finish. -D) Set up [Cluster Autoscaler Auto-Discovery] using the [tutorial](README.md#auto-discovery-setup) . +D) Set up [Cluster Autoscaler Auto-Discovery] using the [tutorial](README.md#auto-discovery-setup). - Open the Amazon EC2 console, and then choose EKS worker node Auto Scaling Groups from the navigation pane. - In the "Add/Edit Auto Scaling Group Tags" window, please make sure you enter the following tags by replacing 'awsExampleClusterName' with the name of your EKS cluster. Then, choose "Save". 
@@ -92,11 +92,11 @@ __NOTE:__ Please see [the README](README.md#IAM-Policy) for more information on $ wget https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml ``` -- Open the downloaded YAML file in an editor. +- Open the downloaded YAML file in an editor. -##### Change 1: +##### Change 1: -Set the EKS cluster name (awsExampleClusterName) and environment variable (us-east-1) based on the following example. +Set the EKS cluster name (awsExampleClusterName) and environment variable (us-east-1) based on the following example. ```sh spec: @@ -124,9 +124,9 @@ Set the EKS cluster name (awsExampleClusterName) and environment variable (us-ea value: <> ``` -##### Change 2: +##### Change 2: -To use IAM with OIDC, you will have to make the below changes to the file as well. +To use IAM with OIDC, you will have to make the below changes to the file as well. ```sh apiVersion: v1 @@ -148,14 +148,14 @@ $ kubectl get pods -n kube-system $ kubectl exec -n kube-system cluster-autoscaler-xxxxxx-xxxxx env | grep AWS ``` -Output of the exec command should ideally display the values for AWS_REGION, AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE where the role arn must be the same as the role provided in the service account annotations. +Output of the exec command should ideally display the values for AWS_REGION, AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE where the role arn must be the same as the role provided in the service account annotations. 
-The cluster autoscaler scaling the worker nodes can also be tested: +The cluster autoscaler scaling the worker nodes can also be tested: ```sh $ kubectl scale deployment autoscaler-demo --replicas=50 deployment.extensions/autoscaler-demo scaled - + $ kubectl get deployment NAME READY UP-TO-DATE AVAILABLE AGE autoscaler-demo 55/55 55 55 143m @@ -168,13 +168,9 @@ I1025 13:48:42.975037 1 scale_up.go:529] Final scale-up plan: [{eksctl-xxx ``` -[//]: # +[//]: # [Cluster Autoscaler Auto-Discovery]: - [IAM OIDC]: + [IAM OIDC]: [IAM policy]: - [documentation]: - - - - + [documentation]: diff --git a/cluster-autoscaler/cloudprovider/aws/MixedInstancePolicy.md b/cluster-autoscaler/cloudprovider/aws/MixedInstancePolicy.md index 69b4cc98b636..af347f602786 100644 --- a/cluster-autoscaler/cloudprovider/aws/MixedInstancePolicy.md +++ b/cluster-autoscaler/cloudprovider/aws/MixedInstancePolicy.md @@ -64,6 +64,6 @@ The following is an excerpt from a CloudFormation template showing how a MixedIn } ``` -[r5.2xlarge](https://aws.amazon.com/ec2/instance-types/#Memory_Optimized) is the 'base' instance type, with overrides for r5d.2xlarge, i3.2xlarge, r5a.2xlarge and r5ad.2xlarge. +[r5.2xlarge](https://aws.amazon.com/ec2/instance-types/#Memory_Optimized) is the 'base' instance type, with overrides for r5d.2xlarge, i3.2xlarge, r5a.2xlarge and r5ad.2xlarge. Note how one Auto Scaling Group is created per Availability Zone, since CA does not currently support ASGs that span multiple Availability Zones. See [Common Notes and Gotchas](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/cloudprovider/aws#common-notes-and-gotchas). 
diff --git a/cluster-autoscaler/cloudprovider/aws/examples/values-cloudconfig-example.yaml b/cluster-autoscaler/cloudprovider/aws/examples/values-cloudconfig-example.yaml index 08586e09cc04..a10ce9184adf 100644 --- a/cluster-autoscaler/cloudprovider/aws/examples/values-cloudconfig-example.yaml +++ b/cluster-autoscaler/cloudprovider/aws/examples/values-cloudconfig-example.yaml @@ -19,4 +19,3 @@ extraVolumes: extraVolumeMounts: - name: cloud-config mountPath: config - diff --git a/cluster-autoscaler/cloudprovider/azure/examples/dev/aks-dev-deploy.sh b/cluster-autoscaler/cloudprovider/azure/examples/dev/aks-dev-deploy.sh index f66cec602163..e7d6ea35da8d 100755 --- a/cluster-autoscaler/cloudprovider/azure/examples/dev/aks-dev-deploy.sh +++ b/cluster-autoscaler/cloudprovider/azure/examples/dev/aks-dev-deploy.sh @@ -75,7 +75,7 @@ exit # To recover access after restarting codespace with existing AKS and ACR: # az login & az account set -n ... -# az aks get-credentials -n cas-test -g $CODESPACE_NAME +# az aks get-credentials -n cas-test -g $CODESPACE_NAME # ACR_NAME=$(echo "$CODESPACE_NAME" | tr -d -) # az acr login -n $ACR_NAME # skaffold config set default-repo "${ACR_NAME}.azurecr.io/cluster-autoscaler" diff --git a/cluster-autoscaler/cloudprovider/azure/examples/dev/aks-dev.bicep b/cluster-autoscaler/cloudprovider/azure/examples/dev/aks-dev.bicep index 0998e2725070..3550f651db16 100644 --- a/cluster-autoscaler/cloudprovider/azure/examples/dev/aks-dev.bicep +++ b/cluster-autoscaler/cloudprovider/azure/examples/dev/aks-dev.bicep @@ -16,8 +16,8 @@ resource aks 'Microsoft.ContainerService/managedClusters@2023-11-01' = { dnsPrefix: dnsPrefix oidcIssuerProfile: { enabled: true } // --enable-oidc-issuer securityProfile: { - workloadIdentity: { enabled: true } // --enable-workload-identity - } + workloadIdentity: { enabled: true } // --enable-workload-identity + } agentPoolProfiles: [ { count: 1 @@ -37,7 +37,7 @@ resource aks 
'Microsoft.ContainerService/managedClusters@2023-11-01' = { networkProfile: { networkPlugin: 'azure' networkPluginMode: 'overlay' - } + } } } diff --git a/cluster-autoscaler/cloudprovider/azure/examples/dev/cluster-autoscaler-vmss-wi-dynamic.yaml.tpl b/cluster-autoscaler/cloudprovider/azure/examples/dev/cluster-autoscaler-vmss-wi-dynamic.yaml.tpl index 277f2cfee0a0..cfd42db9b5a7 100644 --- a/cluster-autoscaler/cloudprovider/azure/examples/dev/cluster-autoscaler-vmss-wi-dynamic.yaml.tpl +++ b/cluster-autoscaler/cloudprovider/azure/examples/dev/cluster-autoscaler-vmss-wi-dynamic.yaml.tpl @@ -203,8 +203,8 @@ spec: operator: In values: ["system"] restartPolicy: Always - volumes: + volumes: - hostPath: path: /etc/ssl/certs/ca-certificates.crt type: "" - name: ssl-certs \ No newline at end of file + name: ssl-certs diff --git a/cluster-autoscaler/cloudprovider/azure/examples/dev/skaffold.yaml b/cluster-autoscaler/cloudprovider/azure/examples/dev/skaffold.yaml index d8a4f6d2b135..51368452d3bd 100644 --- a/cluster-autoscaler/cloudprovider/azure/examples/dev/skaffold.yaml +++ b/cluster-autoscaler/cloudprovider/azure/examples/dev/skaffold.yaml @@ -11,4 +11,4 @@ manifests: - "cloudprovider/azure/examples/dev/cluster-autoscaler-vmss-wi-dynamic.yaml" # include workload here to have it deployed _and removed_ with CAS; # comment out if this does not fit your workflow - - "cloudprovider/azure/examples/workloads/inflate.yaml" \ No newline at end of file + - "cloudprovider/azure/examples/workloads/inflate.yaml" diff --git a/cluster-autoscaler/cloudprovider/azure/test/go.sum b/cluster-autoscaler/cloudprovider/azure/test/go.sum index 259786c5f79e..a4bb955d9125 100644 --- a/cluster-autoscaler/cloudprovider/azure/test/go.sum +++ b/cluster-autoscaler/cloudprovider/azure/test/go.sum @@ -484,4 +484,4 @@ sigs.k8s.io/randfill v1.0.0/go.mod h1:XeLlZ/jmk4i1HRopwe7/aU3H5n1zNUcX6TM94b3QxO sigs.k8s.io/structured-merge-diff/v4 v4.7.0 h1:qPeWmscJcXP0snki5IYF79Z8xrl8ETFxgMd7wez1XkI= 
sigs.k8s.io/structured-merge-diff/v4 v4.7.0/go.mod h1:dDy58f92j70zLsuZVuUX5Wp9vtxXpaZnkPGWeqDfCps= sigs.k8s.io/yaml v1.4.0 h1:Mk1wCc2gy/F0THH0TAp1QYyJNzRm2KCLy3o5ASXVI5E= -sigs.k8s.io/yaml v1.4.0/go.mod h1:Ejl7/uTz7PSA4eKMyQCUTnhZYNmLIl+5c2lQPGR2BPY= \ No newline at end of file +sigs.k8s.io/yaml v1.4.0/go.mod h1:Ejl7/uTz7PSA4eKMyQCUTnhZYNmLIl+5c2lQPGR2BPY= diff --git a/cluster-autoscaler/cloudprovider/baiducloud/README.md b/cluster-autoscaler/cloudprovider/baiducloud/README.md index eaaaecdca5fd..33219da294aa 100644 --- a/cluster-autoscaler/cloudprovider/baiducloud/README.md +++ b/cluster-autoscaler/cloudprovider/baiducloud/README.md @@ -21,5 +21,5 @@ kubectl apply -f examples/cluster-autoscaler-multiple-asg.yaml - By default, cluster autoscaler will wait 10 minutes between scale down operations, you can adjust this using the `--scale-down-delay` flag. E.g. `--scale-down-delay=5m` to decrease the scale down delay to 5 minutes. ## Maintainer -* Hongbin Mao [@hello2mao](https://github.com/hello2mao) -* Ti Zhou [@tizhou86](https://github.com/tizhou86) \ No newline at end of file +* Hongbin Mao [@hello2mao](https://github.com/hello2mao) +* Ti Zhou [@tizhou86](https://github.com/tizhou86) diff --git a/cluster-autoscaler/cloudprovider/bizflycloud/gobizfly/Dockerfile b/cluster-autoscaler/cloudprovider/bizflycloud/gobizfly/Dockerfile index d69e3f108fcb..d8388e011fb1 100644 --- a/cluster-autoscaler/cloudprovider/bizflycloud/gobizfly/Dockerfile +++ b/cluster-autoscaler/cloudprovider/bizflycloud/gobizfly/Dockerfile @@ -25,5 +25,3 @@ COPY . . 
RUN CGO_ENABLED=1 GOOS=linux GOARCH=amd64 go build -gcflags="-N -l" -o /bin/gobizfly *.go ENTRYPOINT [ "/bin/gobizfly" ] - - diff --git a/cluster-autoscaler/cloudprovider/bizflycloud/gobizfly/Makefile b/cluster-autoscaler/cloudprovider/bizflycloud/gobizfly/Makefile index b4a556259179..ae776e13f531 100644 --- a/cluster-autoscaler/cloudprovider/bizflycloud/gobizfly/Makefile +++ b/cluster-autoscaler/cloudprovider/bizflycloud/gobizfly/Makefile @@ -2,9 +2,9 @@ PROJECT_NAME := "gobizfly" PKG := "github.com/bizflycloud/$(PROJECT_NAME)" PKG_LIST := $(shell go list ${PKG}/... | grep -v /vendor/) GO_FILES := $(shell find . -name '*.go' | grep -v /vendor/ | grep -v _test.go) - + .PHONY: all dep lint vet test test-coverage build clean - + all: build dep: ## Get the dependencies @@ -20,15 +20,15 @@ test: ## Run unittests @go test -short ${PKG_LIST} test-coverage: ## Run tests with coverage - @go test -short -coverprofile cover.out -covermode=atomic ${PKG_LIST} + @go test -short -coverprofile cover.out -covermode=atomic ${PKG_LIST} @cat cover.out >> coverage.txt build: dep ## Build the binary file @go build -i -o build/main *.go ## $(PKG) - + clean: ## Remove previous build @rm -f $(PROJECT_NAME)/build - + help: ## Display this help screen @grep -h -E '^[a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-30s\033[0m %s\n", $$1, $$2}' diff --git a/cluster-autoscaler/cloudprovider/bizflycloud/manifest/cluster-autoscaler.yaml b/cluster-autoscaler/cloudprovider/bizflycloud/manifest/cluster-autoscaler.yaml index f414ab5cfbba..e25c362063f0 100644 --- a/cluster-autoscaler/cloudprovider/bizflycloud/manifest/cluster-autoscaler.yaml +++ b/cluster-autoscaler/cloudprovider/bizflycloud/manifest/cluster-autoscaler.yaml @@ -33,7 +33,7 @@ spec: - --skip-nodes-with-local-storage=false - --leader-elect=true - --expander=least-waste - - --kubeconfig=/var/lib/kubernetes/clusterxxxx.kubeconfig + - --kubeconfig=/var/lib/kubernetes/clusterxxxx.kubeconfig env: - 
name: BIZFLYCLOUD_AUTH_METHOD value: password #application_credential @@ -57,10 +57,10 @@ spec: value: xxxxxxxxxxxxxxxxx volumeMounts: - name: kubeconfig - mountPath: /var/lib/kubernetes/clusterxxxx.kubeconfig + mountPath: /var/lib/kubernetes/clusterxxxx.kubeconfig readOnly: true imagePullPolicy: "Always" volumes: - name: ssl-cekubeconfigrts hostPath: - path: "/var/lib/kubernetes/clusterxxxx.kubeconfig" \ No newline at end of file + path: "/var/lib/kubernetes/clusterxxxx.kubeconfig" diff --git a/cluster-autoscaler/cloudprovider/bizflycloud/manifest/demo.yaml b/cluster-autoscaler/cloudprovider/bizflycloud/manifest/demo.yaml index fe6e8be85556..f3817081a310 100644 --- a/cluster-autoscaler/cloudprovider/bizflycloud/manifest/demo.yaml +++ b/cluster-autoscaler/cloudprovider/bizflycloud/manifest/demo.yaml @@ -24,4 +24,4 @@ spec: memory: 200Mi requests: cpu: 200m - memory: 200Mi \ No newline at end of file + memory: 200Mi diff --git a/cluster-autoscaler/cloudprovider/bizflycloud/manifest/rbac.yaml b/cluster-autoscaler/cloudprovider/bizflycloud/manifest/rbac.yaml index 13bfd3f5943b..8a3b25b97276 100644 --- a/cluster-autoscaler/cloudprovider/bizflycloud/manifest/rbac.yaml +++ b/cluster-autoscaler/cloudprovider/bizflycloud/manifest/rbac.yaml @@ -114,4 +114,4 @@ roleRef: subjects: - kind: ServiceAccount name: cluster-autoscaler - namespace: kube-system \ No newline at end of file + namespace: kube-system diff --git a/cluster-autoscaler/cloudprovider/brightbox/Makefile b/cluster-autoscaler/cloudprovider/brightbox/Makefile index d7391aad7e5d..a98bfe23a7f0 100644 --- a/cluster-autoscaler/cloudprovider/brightbox/Makefile +++ b/cluster-autoscaler/cloudprovider/brightbox/Makefile @@ -40,5 +40,5 @@ secret: ${HOME}/.docker/config.json build: ../../cluster-autoscaler .PHONY: clean -clean: +clean: $(MAKE) -C ../.. 
$@ diff --git a/cluster-autoscaler/cloudprovider/brightbox/examples/rebase.sh b/cluster-autoscaler/cloudprovider/brightbox/examples/rebase.sh index 0d751faddaa2..a7506bb81fbc 100644 --- a/cluster-autoscaler/cloudprovider/brightbox/examples/rebase.sh +++ b/cluster-autoscaler/cloudprovider/brightbox/examples/rebase.sh @@ -18,4 +18,3 @@ set -e git rebase --onto cluster-autoscaler-1.17.2 cluster-autoscaler-1.17.1 autoscaler-brightbox-cloudprovider-1.17 git rebase --onto cluster-autoscaler-1.18.1 cluster-autoscaler-1.18.0 autoscaler-brightbox-cloudprovider-1.18 - diff --git a/cluster-autoscaler/cloudprovider/brightbox/linkheader/README.mkd b/cluster-autoscaler/cloudprovider/brightbox/linkheader/README.mkd index 2a949cac2f72..ae0ed21f0b14 100644 --- a/cluster-autoscaler/cloudprovider/brightbox/linkheader/README.mkd +++ b/cluster-autoscaler/cloudprovider/brightbox/linkheader/README.mkd @@ -31,5 +31,3 @@ func main() { // URL: https://api.github.com/user/58276/repos?page=2; Rel: next // URL: https://api.github.com/user/58276/repos?page=2; Rel: last ``` - - diff --git a/cluster-autoscaler/cloudprovider/cherryservers/README.md b/cluster-autoscaler/cloudprovider/cherryservers/README.md index d4361a9f4cee..0e21c13a892c 100644 --- a/cluster-autoscaler/cloudprovider/cherryservers/README.md +++ b/cluster-autoscaler/cloudprovider/cherryservers/README.md @@ -101,7 +101,7 @@ affinity: By default, autoscaler assumes that you have a recent version of [Cherry Servers CCM](https://github.com/cherryservers/cloud-provider-cherry) installed in your -cluster. +cluster. 
## Notes @@ -145,7 +145,7 @@ To run the CherryServers cluster-autoscaler locally: The command-line format is: ``` -cluster-autoscaler --alsologtostderr --cluster-name=$CLUSTER_NAME --cloud-config=$CLOUD_CONFIG \ +cluster-autoscaler --alsologtostderr --cluster-name=$CLUSTER_NAME --cloud-config=$CLOUD_CONFIG \ --cloud-provider=cherryservers \ --nodes=0:10:pool1 \ --nodes=0:10:pool2 \ @@ -170,7 +170,7 @@ but this must be run from the `cluster-autoscaler` directory, i.e. not within th cloudprovider implementation: ``` -go run . --alsologtostderr --cluster-name=$CLUSTER_NAME --cloud-config=$CLOUD_CONFIG \ +go run . --alsologtostderr --cluster-name=$CLUSTER_NAME --cloud-config=$CLOUD_CONFIG \ --cloud-provider=cherryservers \ --nodes=0:10:pool1 \ --nodes=0:10:pool2 \ diff --git a/cluster-autoscaler/cloudprovider/coreweave/README.md b/cluster-autoscaler/cloudprovider/coreweave/README.md index 2250dd5c01be..1365a7bda57b 100644 --- a/cluster-autoscaler/cloudprovider/coreweave/README.md +++ b/cluster-autoscaler/cloudprovider/coreweave/README.md @@ -43,12 +43,12 @@ To enable the CoreWeave provider, set the following flag when running the autosc ``` ## Usage with Helm Charts -When deploying the Cluster Autoscaler for CoreWeave using the provided Helm chart, you can customize its behavior using the `extraArgs` section in your `values.yaml` file. +When deploying the Cluster Autoscaler for CoreWeave using the provided Helm chart, you can customize its behavior using the `extraArgs` section in your `values.yaml` file. These arguments are passed directly to the Cluster Autoscaler container. ## Helm Chart Deployment -You can deploy the Cluster Autoscaler for CoreWeave using the official Helm chart. +You can deploy the Cluster Autoscaler for CoreWeave using the official Helm chart. Below are the basic steps: 1. 
**Add the Helm repository (if not already added):** diff --git a/cluster-autoscaler/cloudprovider/digitalocean/testdata/whitespace_token b/cluster-autoscaler/cloudprovider/digitalocean/testdata/whitespace_token index 139597f9cb07..e69de29bb2d1 100644 --- a/cluster-autoscaler/cloudprovider/digitalocean/testdata/whitespace_token +++ b/cluster-autoscaler/cloudprovider/digitalocean/testdata/whitespace_token @@ -1,2 +0,0 @@ - - diff --git a/cluster-autoscaler/cloudprovider/exoscale/vendor_internal.sh b/cluster-autoscaler/cloudprovider/exoscale/vendor_internal.sh index b4651dab1f8d..55d5e9c9033b 100755 --- a/cluster-autoscaler/cloudprovider/exoscale/vendor_internal.sh +++ b/cluster-autoscaler/cloudprovider/exoscale/vendor_internal.sh @@ -8,7 +8,7 @@ # - github.com/deepmap/oapi-codegen # - k8s.io/klog -if [[ $# -ne 1 ]]; then +if [[ $# -ne 1 ]]; then echo "usage: $0 " exit 1 fi diff --git a/cluster-autoscaler/cloudprovider/gce/fixtures/diskTypes_list.json b/cluster-autoscaler/cloudprovider/gce/fixtures/diskTypes_list.json index 08eb170da8f8..02855489b143 100644 --- a/cluster-autoscaler/cloudprovider/gce/fixtures/diskTypes_list.json +++ b/cluster-autoscaler/cloudprovider/gce/fixtures/diskTypes_list.json @@ -92,4 +92,4 @@ } ], "selfLink": "https://www.googleapis.com/compute/v1/projects/project/zones/us-central1-b/diskTypes" -} \ No newline at end of file +} diff --git a/cluster-autoscaler/cloudprovider/hetzner/hack/update-vendor.sh b/cluster-autoscaler/cloudprovider/hetzner/hack/update-vendor.sh index c7ce398a2da9..8c990cd3acb1 100755 --- a/cluster-autoscaler/cloudprovider/hetzner/hack/update-vendor.sh +++ b/cluster-autoscaler/cloudprovider/hetzner/hack/update-vendor.sh @@ -25,4 +25,3 @@ find "$vendor_path" -type d -empty -delete echo "# Rewriting module path" find "$vendor_path" -type f -exec sed -i "s@${original_module_path}@${vendor_module_path}@g" {} + - diff --git a/cluster-autoscaler/cloudprovider/huaweicloud/README.md 
b/cluster-autoscaler/cloudprovider/huaweicloud/README.md index b90d76540f34..e21054d446dd 100644 --- a/cluster-autoscaler/cloudprovider/huaweicloud/README.md +++ b/cluster-autoscaler/cloudprovider/huaweicloud/README.md @@ -1,9 +1,9 @@ -# Cluster Autoscaler on Huawei Cloud +# Cluster Autoscaler on Huawei Cloud ## Overview The cluster autoscaler works with self-built Kubernetes cluster on [Huaweicloud ECS](https://www.huaweicloud.com/intl/en-us/product/ecs.html) and -specified [Huaweicloud Auto Scaling Groups](https://www.huaweicloud.com/intl/en-us/product/as.html) -It runs as a Deployment on a worker node in the cluster. This README will go over some of the necessary steps required +specified [Huaweicloud Auto Scaling Groups](https://www.huaweicloud.com/intl/en-us/product/as.html). +It runs as a Deployment on a worker node in the cluster. This README will go over some of the necessary steps required to get the cluster autoscaler up and running. ## Deployment Steps @@ -11,19 +11,19 @@ to get the cluster autoscaler up and running. #### Environment 1. Download Project - Get the latest `autoscaler` project and download it to `${GOPATH}/src/k8s.io`. - - This is used for building your image, so the machine you use here should be able to access GCR. Do not use a Huawei + Get the latest `autoscaler` project and download it to `${GOPATH}/src/k8s.io`. + + This is used for building your image, so the machine you use here should be able to access GCR. Do not use a Huawei Cloud ECS. 2. Go environment Make sure you have Go installed in the above machine. - + 3. Docker environment Make sure you have Docker installed in the above machine. - + #### Build and push the image Execute the following commands in the directory of `autoscaler/cluster-autoscaler` of the autoscaler project downloaded previously. The following steps use Huawei SoftWare Repository for Container (SWR) as an example registry.
@@ -42,37 +42,37 @@ The following steps use Huawei SoftWare Repository for Container (SWR) as an exa ``` Follow the `Pull/Push Image` section of `Interactive Walkthroughs` under the SWR console to find the image repository address and organization name, and also refer to `My Images` -> `Upload Through Docker Client` in SWR console. - + 3. Login to SWR: ``` docker login -u {Encoded username} -p {Encoded password} {SWR endpoint} ``` - + For example: ``` docker login -u cn-north-4@ABCD1EFGH2IJ34KLMN -p 1a23bc45678def9g01hi23jk4l56m789nop01q2r3s4t567u89v0w1x23y4z5678 swr.cn-north-4.myhuaweicloud.com ``` Follow the `Pull/Push Image` section of `Interactive Walkthroughs` under the SWR console to find the encoded username, encoded password and swr endpoint, and also refer to `My Images` -> `Upload Through Docker Client` in SWR console. - + 4. Push the docker image to SWR: ``` docker push {Image repository address}/{Organization name}/{Image name:tag} ``` - + For example: ``` docker push swr.cn-north-4.myhuaweicloud.com/{Organization name}/cluster-autoscaler:dev ``` - + 5. For the cluster autoscaler to function normally, make sure the `Sharing Type` of the image is `Public`. - If the cluster has trouble pulling the image, go to SWR console and check whether the `Sharing Type` of the image is - `Private`. If it is, click `Edit` button on top right and set the `Sharing Type` to `Public`. - + If the cluster has trouble pulling the image, go to SWR console and check whether the `Sharing Type` of the image is + `Private`. If it is, click `Edit` button on top right and set the `Sharing Type` to `Public`. + -## Build Kubernetes Cluster on ECS +## Build Kubernetes Cluster on ECS -### 1. Install kubelet, kubeadm and kubectl +### 1. Install kubelet, kubeadm and kubectl Please see installation [here](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/) @@ -190,7 +190,7 @@ sudo chown $(id -u):$(id -g) $HOME/.kube/config ``` ### 4. 
Install Flannel Network -```bash +```bash kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml ``` ### 5. Generate Token @@ -289,19 +289,19 @@ openssl x509 -in /etc/kubernetes/pki/ca.crt -noout -pubkey | openssl rsa -pubin ### Deploy Cluster Autoscaler #### Configure credentials -The autoscaler needs a `ServiceAccount` which is granted permissions to the cluster's resources and a `Secret` which +The autoscaler needs a `ServiceAccount` which is granted permissions to the cluster's resources and a `Secret` which stores credential (AK/SK in this case) information for authenticating with Huawei cloud. - + Examples of `ServiceAccount` and `Secret` are provided in [examples/cluster-autoscaler-svcaccount.yaml](examples/cluster-autoscaler-svcaccount.yaml) -and [examples/cluster-autoscaler-secret.yaml](examples/cluster-autoscaler-secret.yaml). Modify the Secret +and [examples/cluster-autoscaler-secret.yaml](examples/cluster-autoscaler-secret.yaml). Modify the Secret object yaml file with your credentials. 
The following parameters are required in the Secret object yaml file: - `as-endpoint` - Find the as endpoint for different regions [here](https://developer.huaweicloud.com/endpoint?AS), - + Find the as endpoint for different regions [here](https://developer.huaweicloud.com/endpoint?AS), + For example, for region `cn-north-4`, the endpoint is ``` as.cn-north-4.myhuaweicloud.com @@ -309,15 +309,15 @@ The following parameters are required in the Secret object yaml file: - `ecs-endpoint` - Find the ecs endpoint for different regions [here](https://developer.huaweicloud.com/endpoint?ECS), - - For example, for region `cn-north-4`, the endpoint is + Find the ecs endpoint for different regions [here](https://developer.huaweicloud.com/endpoint?ECS), + + For example, for region `cn-north-4`, the endpoint is ``` ecs.cn-north-4.myhuaweicloud.com ``` - `project-id` - + Follow this link to find the project-id: [Obtaining a Project ID](https://support.huaweicloud.com/en-us/api-servicestage/servicestage_api_0023.html) - `access-key` and `secret-key` @@ -328,14 +328,14 @@ and [My Credentials](https://support.huaweicloud.com/en-us/usermanual-ca/ca_01_0 #### Configure deployment - An example deployment file is provided at [examples/cluster-autoscaler-deployment.yaml](examples/cluster-autoscaler-deployment.yaml). + An example deployment file is provided at [examples/cluster-autoscaler-deployment.yaml](examples/cluster-autoscaler-deployment.yaml). Change the `image` to the image you just pushed, the `cluster-name` to the cluster's id and `nodes` to your own configurations of the node pool with format ``` {Minimum number of nodes}:{Maximum number of nodes}:{Node pool name} ``` The above parameters should match the parameters of the AS Group you created. - + More configuration options can be added to the cluster autoscaler, such as `scale-down-delay-after-add`, `scale-down-unneeded-time`, etc. 
See available configuration options [here](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-are-the-parameters-to-ca). @@ -376,33 +376,32 @@ A simple testing method is like this: by executing something like this: ``` kubectl autoscale deployment [Deployment name] --cpu-percent=10 --min=1 --max=20 - ``` - The above command creates an HPA policy on the deployment with target average cpu usage of 10%. The number of + ``` + The above command creates an HPA policy on the deployment with target average cpu usage of 10%. The number of pods will grow if average cpu usage is above 10%, and will shrink otherwise. The `min` and `max` parameters set the minimum and maximum number of pods of this deployment. - Generate load to the above service Example tools for generating workload to an http service are: - * [Use `hey` command](https://github.com/rakyll/hey) + * [Use `hey` command](https://github.com/rakyll/hey) * Use `busybox` image: ``` kubectl run --generator=run-pod/v1 -it --rm load-generator --image=busybox /bin/sh - + # send an infinite loop of queries to the service while true; do wget -q -O- {Service access address}; done ``` - + Feel free to use other tools which have a similar function. 
- + - Wait for pods to be added: as load increases, more pods will be added by HPA -- Wait for nodes to be added: when there's insufficient resource for additional pods, new nodes will be added to the +- Wait for nodes to be added: when there's insufficient resource for additional pods, new nodes will be added to the cluster by the cluster autoscaler - Stop the load - Wait for pods to be removed: as load decreases, pods will be removed by HPA -- Wait for nodes to be removed: as pods being removed from nodes, several nodes will become underutilized or empty, +- Wait for nodes to be removed: as pods being removed from nodes, several nodes will become underutilized or empty, and will be removed by the cluster autoscaler - diff --git a/cluster-autoscaler/cloudprovider/ionoscloud/ionos-cloud-sdk-go/LICENSE b/cluster-autoscaler/cloudprovider/ionoscloud/ionos-cloud-sdk-go/LICENSE index b9d5d805f754..eaf4d9b45b80 100644 --- a/cluster-autoscaler/cloudprovider/ionoscloud/ionos-cloud-sdk-go/LICENSE +++ b/cluster-autoscaler/cloudprovider/ionoscloud/ionos-cloud-sdk-go/LICENSE @@ -187,4 +187,4 @@ distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and - limitations under the License. \ No newline at end of file + limitations under the License. diff --git a/cluster-autoscaler/cloudprovider/kamatera/README.md b/cluster-autoscaler/cloudprovider/kamatera/README.md index c20098496690..ca110619a123 100644 --- a/cluster-autoscaler/cloudprovider/kamatera/README.md +++ b/cluster-autoscaler/cloudprovider/kamatera/README.md @@ -5,7 +5,7 @@ The cluster autoscaler for Kamatera scales nodes in a Kamatera cluster. 
## Kamatera Kubernetes [Kamatera](https://www.kamatera.com/express/compute/) supports Kubernetes clusters using our Rancher app -or by creating a self-managed cluster directly on Kamatera compute servers, the autoscaler supports +or by creating a self-managed cluster directly on Kamatera compute servers; the autoscaler supports both methods. ## Cluster Autoscaler Node Groups @@ -113,7 +113,7 @@ how you create and manage the cluster. See below for some common configurations, but the exact script may need to be modified depending on your requirements and server image. -The script needs to be provided as a base64 encoded string. You can encode your script using the following command: +The script needs to be provided as a base64 encoded string. You can encode your script using the following command: `cat script.sh | base64 -w0`. #### Kamatera Rancher Server Initialization Script diff --git a/cluster-autoscaler/cloudprovider/kwok/OWNERS b/cluster-autoscaler/cloudprovider/kwok/OWNERS index 585a63b17faa..6e63250c7ac1 100644 --- a/cluster-autoscaler/cloudprovider/kwok/OWNERS +++ b/cluster-autoscaler/cloudprovider/kwok/OWNERS @@ -4,4 +4,4 @@ reviewers: - vadasambar labels: -- area/provider/kwok \ No newline at end of file +- area/provider/kwok diff --git a/cluster-autoscaler/cloudprovider/linode/examples/cluster-autoscale-pdb.yaml b/cluster-autoscaler/cloudprovider/linode/examples/cluster-autoscale-pdb.yaml index e811124b651d..fbf3cf951d3e 100644 --- a/cluster-autoscaler/cloudprovider/linode/examples/cluster-autoscale-pdb.yaml +++ b/cluster-autoscaler/cloudprovider/linode/examples/cluster-autoscale-pdb.yaml @@ -6,4 +6,4 @@ spec: minAvailable: 1 selector: matchLabels: - app: cluster-autoscaler \ No newline at end of file + app: cluster-autoscaler diff --git a/cluster-autoscaler/cloudprovider/linode/examples/cluster-autoscaler-secret.yaml b/cluster-autoscaler/cloudprovider/linode/examples/cluster-autoscaler-secret.yaml index db7c8371ada2..cf378f040b5e 100644 ---
a/cluster-autoscaler/cloudprovider/linode/examples/cluster-autoscaler-secret.yaml +++ b/cluster-autoscaler/cloudprovider/linode/examples/cluster-autoscaler-secret.yaml @@ -23,4 +23,4 @@ stringData: max-size=4 [nodegroup "g6-standard-4"] - max-size=2 \ No newline at end of file + max-size=2 diff --git a/cluster-autoscaler/cloudprovider/linode/linodego/README.md b/cluster-autoscaler/cloudprovider/linode/linodego/README.md index 81fcf5de1cad..55ce32db468b 100644 --- a/cluster-autoscaler/cloudprovider/linode/linodego/README.md +++ b/cluster-autoscaler/cloudprovider/linode/linodego/README.md @@ -2,4 +2,4 @@ This package implements a minimal REST API client for the Linode REST v4 API. -This client only implements the endpoints usend on the Linode cloud provider, it is intended to be a drop-in replacement for the [linodego](https://github.com/linode/linodego) official go client. \ No newline at end of file +This client only implements the endpoints used by the Linode cloud provider; it is intended to be a drop-in replacement for the [linodego](https://github.com/linode/linodego) official go client. diff --git a/cluster-autoscaler/cloudprovider/magnum/gophercloud/CHANGELOG.md b/cluster-autoscaler/cloudprovider/magnum/gophercloud/CHANGELOG.md index faca6a8b8537..cc41928132f4 100644 --- a/cluster-autoscaler/cloudprovider/magnum/gophercloud/CHANGELOG.md +++ b/cluster-autoscaler/cloudprovider/magnum/gophercloud/CHANGELOG.md @@ -204,4 +204,4 @@ BUG FIXES ## 0.1.0 (May 27, 2019) -Initial tagged release. +Initial tagged release.
diff --git a/cluster-autoscaler/cloudprovider/magnum/gophercloud/LICENSE b/cluster-autoscaler/cloudprovider/magnum/gophercloud/LICENSE index fbbbc9e4cbad..f235f85d5519 100644 --- a/cluster-autoscaler/cloudprovider/magnum/gophercloud/LICENSE +++ b/cluster-autoscaler/cloudprovider/magnum/gophercloud/LICENSE @@ -9,10 +9,10 @@ License at Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the -specific language governing permissions and limitations under the License. +specific language governing permissions and limitations under the License. ------ - + Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ diff --git a/cluster-autoscaler/cloudprovider/oci/THIRD_PARTY_LICENSE.txt b/cluster-autoscaler/cloudprovider/oci/THIRD_PARTY_LICENSE.txt index ddb17fb57714..7a958487a8f0 100644 --- a/cluster-autoscaler/cloudprovider/oci/THIRD_PARTY_LICENSE.txt +++ b/cluster-autoscaler/cloudprovider/oci/THIRD_PARTY_LICENSE.txt @@ -1171,4 +1171,4 @@ k8s.io/kubernetes distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and - limitations under the License. \ No newline at end of file + limitations under the License. 
diff --git a/cluster-autoscaler/cloudprovider/oci/examples/instance-details.json b/cluster-autoscaler/cloudprovider/oci/examples/instance-details.json index fc4f25c93dc5..d501cb6077ba 100644 --- a/cluster-autoscaler/cloudprovider/oci/examples/instance-details.json +++ b/cluster-autoscaler/cloudprovider/oci/examples/instance-details.json @@ -16,4 +16,4 @@ "assignPublicIp": true } } -} \ No newline at end of file +} diff --git a/cluster-autoscaler/cloudprovider/oci/examples/oci-ip-cluster-autoscaler-w-principals.yaml b/cluster-autoscaler/cloudprovider/oci/examples/oci-ip-cluster-autoscaler-w-principals.yaml index e954b814c3a2..fad9cd2928d1 100644 --- a/cluster-autoscaler/cloudprovider/oci/examples/oci-ip-cluster-autoscaler-w-principals.yaml +++ b/cluster-autoscaler/cloudprovider/oci/examples/oci-ip-cluster-autoscaler-w-principals.yaml @@ -162,4 +162,4 @@ spec: - name: OCI_USE_INSTANCE_PRINCIPAL value: "true" - name: OCI_SDK_APPEND_USER_AGENT - value: "oci-oke-cluster-autoscaler" \ No newline at end of file + value: "oci-oke-cluster-autoscaler" diff --git a/cluster-autoscaler/cloudprovider/oci/examples/oci-nodepool-cluster-autoscaler-w-principals.yaml b/cluster-autoscaler/cloudprovider/oci/examples/oci-nodepool-cluster-autoscaler-w-principals.yaml index 86082cca8e19..67fefafb1cfc 100644 --- a/cluster-autoscaler/cloudprovider/oci/examples/oci-nodepool-cluster-autoscaler-w-principals.yaml +++ b/cluster-autoscaler/cloudprovider/oci/examples/oci-nodepool-cluster-autoscaler-w-principals.yaml @@ -171,4 +171,4 @@ spec: - name: OKE_USE_INSTANCE_PRINCIPAL value: "true" - name: OCI_SDK_APPEND_USER_AGENT - value: "oci-oke-cluster-autoscaler" \ No newline at end of file + value: "oci-oke-cluster-autoscaler" diff --git a/cluster-autoscaler/cloudprovider/oci/examples/placement-config.json b/cluster-autoscaler/cloudprovider/oci/examples/placement-config.json index b253ce4e27a8..7be3df24341f 100644 --- a/cluster-autoscaler/cloudprovider/oci/examples/placement-config.json +++ 
b/cluster-autoscaler/cloudprovider/oci/examples/placement-config.json @@ -3,4 +3,4 @@ "availabilityDomain": "hXgQ:PHX-AD-2", "primarySubnetId": "ocid1.subnet.oc1.phx.aaaaaaaaouihv645dp2xaee6w4uvx6emjwuscsrxcn3miwa6vmijtpdnqdeq" } -] \ No newline at end of file +] diff --git a/cluster-autoscaler/cloudprovider/rancher/examples/config.yaml b/cluster-autoscaler/cloudprovider/rancher/examples/config.yaml index 81d4eb58091e..796fa5841b84 100644 --- a/cluster-autoscaler/cloudprovider/rancher/examples/config.yaml +++ b/cluster-autoscaler/cloudprovider/rancher/examples/config.yaml @@ -6,4 +6,4 @@ token: clusterName: my-cluster clusterNamespace: fleet-default # optional, will be auto-discovered if not specified -#clusterAPIVersion: v1alpha4 \ No newline at end of file +#clusterAPIVersion: v1alpha4 diff --git a/cluster-autoscaler/cloudprovider/scaleway/README.md b/cluster-autoscaler/cloudprovider/scaleway/README.md index 5d446ef6910a..27e9d9f4d742 100644 --- a/cluster-autoscaler/cloudprovider/scaleway/README.md +++ b/cluster-autoscaler/cloudprovider/scaleway/README.md @@ -8,7 +8,7 @@ The cluster pools need to have the option `Autoscaling` set to true to be manage Cluster Autoscaler can be configured with 2 options ### Config file -a config file can be passed with the `--cloud-config` flag. +a config file can be passed with the `--cloud-config` flag. here is the corresponding JSON schema: * `cluster_id`: Kapsule Cluster Id * `secret_key`: Secret Key used to manage associated Kapsule resources diff --git a/cluster-autoscaler/cloudprovider/tencentcloud/README.md b/cluster-autoscaler/cloudprovider/tencentcloud/README.md index 793eaef6cbe3..2cb10e00198a 100644 --- a/cluster-autoscaler/cloudprovider/tencentcloud/README.md +++ b/cluster-autoscaler/cloudprovider/tencentcloud/README.md @@ -190,4 +190,4 @@ identical to the units used in the `resources` field of a Pod specification. 
Example tags: -- `k8s.io/cluster-autoscaler/node-template/resources/ephemeral-storage`: `100G` \ No newline at end of file +- `k8s.io/cluster-autoscaler/node-template/resources/ephemeral-storage`: `100G` diff --git a/cluster-autoscaler/cloudprovider/utho/examples/stress-test.yaml b/cluster-autoscaler/cloudprovider/utho/examples/stress-test.yaml index 705e1e04919a..ac0352ed5175 100644 --- a/cluster-autoscaler/cloudprovider/utho/examples/stress-test.yaml +++ b/cluster-autoscaler/cloudprovider/utho/examples/stress-test.yaml @@ -14,7 +14,7 @@ spec: spec: containers: - name: stress-container - image: nginx + image: nginx resources: requests: - cpu: "750m" # Request .75 CPU cores \ No newline at end of file + cpu: "750m" # Request .75 CPU cores diff --git a/cluster-autoscaler/cloudprovider/volcengine/README.md b/cluster-autoscaler/cloudprovider/volcengine/README.md index 9ee3f32e38bc..436ff141d993 100644 --- a/cluster-autoscaler/cloudprovider/volcengine/README.md +++ b/cluster-autoscaler/cloudprovider/volcengine/README.md @@ -215,4 +215,4 @@ spec: ## Auto-Discovery Setup -Auto Discovery is not currently supported in Volcengine. \ No newline at end of file +Auto Discovery is not currently supported in Volcengine. 
diff --git a/cluster-autoscaler/cloudprovider/volcengine/examples/cluster-autoscaler-deployment.yaml b/cluster-autoscaler/cloudprovider/volcengine/examples/cluster-autoscaler-deployment.yaml index 882413a70c41..e443dc1c02fe 100644 --- a/cluster-autoscaler/cloudprovider/volcengine/examples/cluster-autoscaler-deployment.yaml +++ b/cluster-autoscaler/cloudprovider/volcengine/examples/cluster-autoscaler-deployment.yaml @@ -49,4 +49,4 @@ spec: valueFrom: secretKeyRef: name: cloud-config - key: endpoint \ No newline at end of file + key: endpoint diff --git a/cluster-autoscaler/cloudprovider/volcengine/examples/cluster-autoscaler-secret.yaml b/cluster-autoscaler/cloudprovider/volcengine/examples/cluster-autoscaler-secret.yaml index 9be6ff38f0a0..3e0185eb87f8 100644 --- a/cluster-autoscaler/cloudprovider/volcengine/examples/cluster-autoscaler-secret.yaml +++ b/cluster-autoscaler/cloudprovider/volcengine/examples/cluster-autoscaler-secret.yaml @@ -8,4 +8,4 @@ data: access-key: [YOUR_BASE64_AK_ID] secret-key: [YOUR_BASE64_AK_SECRET] region-id: [YOUR_BASE64_REGION_ID] - endpoint: [YOUR_BASE64_ENDPOINT] \ No newline at end of file + endpoint: [YOUR_BASE64_ENDPOINT] diff --git a/cluster-autoscaler/cloudprovider/vultr/examples/cluster-autoscaler-deployment.yaml b/cluster-autoscaler/cloudprovider/vultr/examples/cluster-autoscaler-deployment.yaml index b0eef7dfb028..67c0402a2ee0 100644 --- a/cluster-autoscaler/cloudprovider/vultr/examples/cluster-autoscaler-deployment.yaml +++ b/cluster-autoscaler/cloudprovider/vultr/examples/cluster-autoscaler-deployment.yaml @@ -164,4 +164,4 @@ spec: path: "/etc/ssl/certs/ca-certificates.crt" - name: cloud-config secret: - secretName: cluster-autoscaler-cloud-config \ No newline at end of file + secretName: cluster-autoscaler-cloud-config diff --git a/cluster-autoscaler/cloudprovider/vultr/examples/cluster-autoscaler-secret.yaml b/cluster-autoscaler/cloudprovider/vultr/examples/cluster-autoscaler-secret.yaml index 55fb1278f285..712890c22468 
100644 --- a/cluster-autoscaler/cloudprovider/vultr/examples/cluster-autoscaler-secret.yaml +++ b/cluster-autoscaler/cloudprovider/vultr/examples/cluster-autoscaler-secret.yaml @@ -10,4 +10,4 @@ stringData: { "cluster_id": "6b8b7c8e-7314-4d17-bb0c-fcc9777230fa", "token": "IAOBTDG35OTGH2UKCC3S6CNMDUPCN3ZNVBASQ" - } \ No newline at end of file + } diff --git a/cluster-autoscaler/expander/priority/priority-expander-configmap.yaml b/cluster-autoscaler/expander/priority/priority-expander-configmap.yaml index cd4e0d3496a5..e10547fe4377 100644 --- a/cluster-autoscaler/expander/priority/priority-expander-configmap.yaml +++ b/cluster-autoscaler/expander/priority/priority-expander-configmap.yaml @@ -4,8 +4,8 @@ metadata: name: cluster-autoscaler-priority-expander data: priorities: |- - 10: + 10: - .*t2\.large.* - .*t3\.large.* - 50: - - .*m4\.4xlarge.* \ No newline at end of file + 50: + - .*m4\.4xlarge.* diff --git a/cluster-autoscaler/expander/priority/priority_test.go b/cluster-autoscaler/expander/priority/priority_test.go index 35515a16b1e4..586f0d5fe7a4 100644 --- a/cluster-autoscaler/expander/priority/priority_test.go +++ b/cluster-autoscaler/expander/priority/priority_test.go @@ -46,20 +46,20 @@ var ( config = ` 5: - ".*t2\\.micro.*" -10: +10: - ".*t2\\.large.*" - ".*t3\\.large.*" -50: +50: - ".*m4\\.4xlarge.*" ` oneEntryConfig = ` -10: +10: - ".*t2\\.large.*" ` notMatchingConfig = ` 5: - ".*t\\.micro.*" -10: +10: - ".*t\\.large.*" ` wildcardMatchConfig = ` diff --git a/cluster-autoscaler/expander/priority/readme.md b/cluster-autoscaler/expander/priority/readme.md index a11174af35a9..dde042a70361 100644 --- a/cluster-autoscaler/expander/priority/readme.md +++ b/cluster-autoscaler/expander/priority/readme.md @@ -23,10 +23,10 @@ metadata: namespace: kube-system data: priorities: |- - 10: + 10: - .*t2\.large.* - .*t3\.large.* - 50: + 50: - .*m4\.4xlarge.* ``` diff --git a/cluster-autoscaler/proposals/buffers.md b/cluster-autoscaler/proposals/buffers.md index 
511e9137e44b..1c9b3257eabe 100644 --- a/cluster-autoscaler/proposals/buffers.md +++ b/cluster-autoscaler/proposals/buffers.md @@ -17,8 +17,8 @@ - [ ] E2e test implemented and healthy - [ ] In beta for at least 1 full version -- [ ] Waiting up to k8s 1.37 (inclusive) in beta for second OSS implementation - (karpenter). In case of no implementation in order to avoid permanent beta +- [ ] Waiting up to k8s 1.37 (inclusive) in beta for second OSS implementation + (karpenter). In case of no implementation in order to avoid permanent beta (following the spirit of [guidance for k8s REST APIs](https://kubernetes.io/blog/2020/08/21/moving-forward-from-beta/#avoiding-permanent-beta)) we will reevaluate the graduation criteria with sig-autoscaling leads based on: - existing adoption and feedback @@ -43,9 +43,9 @@ drive scaling decisions for the cluster. # Motivation -While some use cases of buffers can be already accomplished using balloon -pods/deployments -([overprovisioning node capacity documentation](https://kubernetes.io/docs/tasks/administer-cluster/node-overprovisioning/)) +While some use cases of buffers can be already accomplished using balloon +pods/deployments +([overprovisioning node capacity documentation](https://kubernetes.io/docs/tasks/administer-cluster/node-overprovisioning/)) there are reasons to introduce Buffers as a separate API concept: * simplify configuration of additional capacity in the cluster. * Today it is possible to create balloon pods and deployments, but they @@ -111,7 +111,7 @@ In order to support buffers the cluster will need to run: * A controller that translates the buffer configuration into a number of pods specs that would represent the spare space in the cluster that is needed. 
The reference implementation would be provided within the cluster autoscaler - repository with the controller running as a subprocess of the cluster + repository with the controller running as a subprocess of the cluster autoscaler, but it is expected that other integrations will depend on it as a library rather than full implementation * An autoscaler should allow for processing additional pods that are @@ -199,7 +199,7 @@ the deployment size so that when the HPA scales my deployment the new pods are started faster. Additionally, this speeds up the feedback loop of the metrics allowing the HPA also to faster provide next scaling decisions. -Note: the size of the buffer will be calculated based on existing replicas to +Note: the size of the buffer will be calculated based on existing replicas to avoid situation when the buffer grows before all pods are there to be considered for a scale up. @@ -262,7 +262,7 @@ spec: ## Notes/Constraints/Caveats (Optional) -* The NodeAutoscalingBuffer implementation assumes that the pods run as a part +* The NodeAutoscalingBuffer implementation assumes that the pods run as a part of scale subresource are homogeneous. ## Risks and Mitigations @@ -321,13 +321,13 @@ type CapacityBufferSpec struct { // +optional Percentage *int - // If empty it will create additional nodes to provide capacity in the cluster. + // If empty it will create additional nodes to provide capacity in the cluster. // Cloud providers can offer their own buffering strategies. // +optional ProvisioningStrategy *string - // If specified it will limit the number of chunks created for this buffer. - // If there are no other limitations for the number of chunks it will be used to + // If specified it will limit the number of chunks created for this buffer. + // If there are no other limitations for the number of chunks it will be used to // create as many chunks as fit into these limits. 
// +optional Limits *ResourceList @@ -336,7 +336,7 @@ type CapacityBufferSpec struct { type CapacityBufferStatus struct { // If pod template, replicas and generation id are not set conditions will //provide details about the error state. - // +optional + // +optional PodTemplateRef *PodTemplateRef // Number of replicas calculated by the buffer controller that autoscaler // should act on. @@ -344,7 +344,7 @@ type CapacityBufferStatus struct { Replicas *int // Number of replicas from this buffer that have provisioned capacity in the // cluster that is ready to be used. - // +optional + // +optional ReadyReplicas *int // +optional PodTemplateGeneration *int @@ -383,12 +383,12 @@ library that would be used in autoscaler code: Base on the full Buffer spec writes the status including conditions (to mark it as ready for provisioning) and PodBufferCapacity. In the future the number of -chunks will take into account the k8s Resource Quotas. +chunks will take into account the k8s Resource Quotas. #### Autoscaler Autoscaler will use the buffer status field and buffer type to determine how -many and what spare capacity to deploy. +many and what spare capacity to deploy. Autoscaler can additionally set conditions on the buffer to communicate error states and unforeseen circumstances. @@ -402,7 +402,7 @@ than specified because of the quotas. The message field will contain info which quotas are blocking the buffer. - Provisioning - set by autoscaler once it processes the buffer for the first time. False in case the buffer is unhelpable (otherwise the scale up events will -be available for each scale up decision). +be available for each scale up decision). ### How to directly define cluster capacity? @@ -430,8 +430,8 @@ users can already set any scaling targets for HPA. 
However: ### Buffers and k8s resource quotas Today it is possible to set a resource quota per namespace that will prevent the -user who has access to the namespace from creating too many objects of a -given type or pods that use too many resources. +user who has access to the namespace from creating too many objects of a +given type or pods that use too many resources. Since buffers are not pods they will not be accounted by the quota system and so any user who is able to create buffers will be able to create a buffer of any @@ -440,18 +440,18 @@ size. For the initial implementation on buffers the users should use other tools to limit the size of the cluster (max size on the GKE node pool/CCC or karpenter pool, max total size). Once we launch we will reassess the need for other -mechanisms for buffers. +mechanisms for buffers. Two options that should be considered are: - [recommended at the moment, to reevaluate before implementation] integrating -with k8s resource quotas mechanism: make the buffer controller quota aware, -apply the quota when translating the buffer config to a pod spec and +with k8s resource quotas mechanism: make the buffer controller quota aware, +apply the quota when translating the buffer config to a pod spec and periodically check the total resources used by the namespace - create a separate API surface (like `BufferPolicy`) to manage what buffers can be created. This may be needed if we would like to offer different quotas depending on the `Buffer` type. 
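The buffers proposal quoted above says the controller translates the buffer spec (fixed replicas or a percentage of the scale target's existing replicas, bounded by limits) into a replica count in the status. A minimal Go sketch of that translation, using simplified stand-in types rather than the real API (`BufferSpec`, `MaxChunks`, and `desiredChunks` are illustrative names, not the autoscaler's code):

```go
package main

import "fmt"

// BufferSpec is a simplified stand-in for the CapacityBufferSpec fields
// discussed in the proposal; the real types live in the autoscaler repo.
type BufferSpec struct {
	Replicas   *int // fixed number of chunks
	Percentage *int // percent of the scale target's existing replicas
	MaxChunks  *int // simplified stand-in for the Limits field
}

// desiredChunks sketches how a buffer controller could compute the replica
// count written to the buffer status. Percentage is applied to the *existing*
// replicas, matching the note that the buffer must not grow before all pods
// of the target are present.
func desiredChunks(spec BufferSpec, existingReplicas int) int {
	n := 0
	if spec.Replicas != nil {
		n = *spec.Replicas
	} else if spec.Percentage != nil {
		n = existingReplicas * *spec.Percentage / 100
	}
	if spec.MaxChunks != nil && n > *spec.MaxChunks {
		n = *spec.MaxChunks // limits cap the number of chunks
	}
	return n
}

func main() {
	pct, limit := 20, 3
	fmt.Println(desiredChunks(BufferSpec{Percentage: &pct}, 10))
	fmt.Println(desiredChunks(BufferSpec{Percentage: &pct, MaxChunks: &limit}, 50))
}
```

A quota-aware variant, as the proposal suggests, would additionally clamp `n` against the namespace's remaining ResourceQuota before writing the status.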
-Other considered alternatives: +Other considered alternatives: [Buffers and k8s quotas](https://docs.google.com/document/d/1dNxpAb5_fSv4SOIOUYsBYKpE-o_qsCUiSUTS5wCXiT8) ## Dependencies diff --git a/cluster-autoscaler/proposals/circumvent-tag-limit-aws.md b/cluster-autoscaler/proposals/circumvent-tag-limit-aws.md index 09a0002f3d3a..bcb38f37ee39 100644 --- a/cluster-autoscaler/proposals/circumvent-tag-limit-aws.md +++ b/cluster-autoscaler/proposals/circumvent-tag-limit-aws.md @@ -39,15 +39,15 @@ This is what a DescribeNodegroup API response looks like (also see [here][]): HTTP/1.1 200 Content-type: application/json { - "nodegroup": { + "nodegroup": { "amiType": "*string*", "capacityType": "*string*", "clusterName": "*string*", "createdAt": *number*, "diskSize": *number*, - "health": { - "issues": [ - { + "health": { + "issues": [ + { "code": "*string*", "message": "*string*", "resourceIds": [ "*string*" ] @@ -55,10 +55,10 @@ Content-type: application/json ] }, "instanceTypes": [ "*string*" ], - "labels": { - "*string*" : "*string*" + "labels": { + "*string*" : "*string*" }, - "launchTemplate": { + "launchTemplate": { "id": "*string*", "name": "*string*", "version": "*string*" @@ -68,27 +68,27 @@ Content-type: application/json "nodegroupName": "*string*", "nodeRole": "*string*", "releaseVersion": "*string*", - "remoteAccess": { + "remoteAccess": { "ec2SshKey": "*string*", "sourceSecurityGroups": [ "*string*" ] }, - "resources": { - "autoScalingGroups": [ - { + "resources": { + "autoScalingGroups": [ + { "name": "*string*" } ], "remoteAccessSecurityGroup": "*string*" }, - "scalingConfig": { + "scalingConfig": { "desiredSize": *number*, "maxSize": *number*, "minSize": *number* }, "status": "*string*", "subnets": [ "*string*" ], - "tags": { - "*string*" : "*string*" + "tags": { + "*string*" : "*string*" }, "version": "*string*" } @@ -127,7 +127,7 @@ Content-type: application/json Latency and throttling: -By default, Cluster Autoscaler runs every 10 seconds. 
Our best practices documentation notes that this short interval can cause throttling because Cluster Autoscaler already makes AWS API calls during each loop. Our documentation recommends that customers increase the interval, so adding this API shouldn’t cause latency problems for customers. ([EKS Best Practices][]) Also, the API call will only happen for EKS ManagedNodegroups. If this increase in latency is too much for even one run of the loop, we will look into moving the API calls into parallel goroutines. +By default, Cluster Autoscaler runs every 10 seconds. Our best practices documentation notes that this short interval can cause throttling because Cluster Autoscaler already makes AWS API calls during each loop. Our documentation recommends that customers increase the interval, so adding this API shouldn’t cause latency problems for customers. ([EKS Best Practices][]) Also, the API call will only happen for EKS ManagedNodegroups. If this increase in latency is too much for even one run of the loop, we will look into moving the API calls into parallel goroutines. EKS also throttles describe API calls by default. To mitigate this issue, Cluster Autoscaler will keep a cache of DescribeNodegroup responses: ManagedNodegroupCache. Each cache bucket will have a TTL and, when the TTL expires, DescribeNodegroup will be called again. If there are errors during the call to DescribeNodegroup, Cluster Autoscaler will move on and just look at the ASG tags and existing default allocatable resource values. 
diff --git a/cluster-autoscaler/proposals/clusterstate.md b/cluster-autoscaler/proposals/clusterstate.md index 6c762c3d9b64..0ccb4fb987c5 100644 --- a/cluster-autoscaler/proposals/clusterstate.md +++ b/cluster-autoscaler/proposals/clusterstate.md @@ -1,5 +1,5 @@ ### Cluster State Registry -### Handling unready nodes +### Handling unready nodes ### Introduction @@ -9,7 +9,7 @@ Currently ClusterAutoscaler stops working when the number of nodes observed on t The number of ready nodes can be different than on the mig side in the following situations: -* [UC1] A new node is being added to the cluster. The node group has been increased but the node has not been created/started/registered in K8S yet. On GCP this usually takes couple minutes. +* [UC1] A new node is being added to the cluster. The node group has been increased but the node has not been created/started/registered in K8S yet. On GCP this usually takes couple minutes. Indicating factors: * There was a scale up in the last couple minutes. * The number of missing node is at most the size of executed scale-up. @@ -19,7 +19,7 @@ Suggested action: Continue operations, however include all yet-to-arrive nodes i * The unready node is new. CreateTime in the last couple minutes. Suggested action: Continue operations, however include all yet-to-arrive nodes in all scale-up considerations. -* [UC3] A new node was added to the cluster, it registered in K8S but failed to fully start within +* [UC3] A new node was added to the cluster, it registered in K8S but failed to fully start within the reasonable time. There is little chance that it will start anytime soon. Indicating factors: * Node is unready * CreateTime == unready NodeCondition.LastTransitionTime @@ -28,7 +28,7 @@ Suggested action: Continue operations, however do not expand this node pool. The * [UC4] A new node is being added to the cluster. 
However the cloud provider cannot provision the node within a reasonable time due to either no quota or technical problems. Indicating factors: * The target number of nodes on the cloud provider side is greater than the number of nodes in K8S for a prolonged time (more than a couple of minutes) and the difference doesn’t change. * There are no new nodes when listing nodes on the cloud provider side. -Suggested action: Reduce the target size of the problematic node group to the current size. +Suggested action: Reduce the target size of the problematic node group to the current size. * [UC5] A new node was provided by the cloud. However, it failed to register. Indicating factors: * There are no new nodes on the cloud provider side that have not appeared in K8S for a long time. @@ -36,7 +36,7 @@ Suggested action: Remove the unregistered nodes one by one. * [UC6] A node is in an unready state for quite a while (+20min) and the total number of unready/not-present nodes is low (less than XX%). Something crashed on the node and could not be recovered. Indicating factors: * Node condition is unready and last transition time is >= 20 min. - * The number of TOTAL nodes in K8S is equal to the target number of nodes on the cloud provider side. + * The number of TOTAL nodes in K8S is equal to the target number of nodes on the cloud provider side. Suggested action: Include the node in scale down, although with greater (configurable) unneeded time and only if node controller has already removed all of its pods. @@ -44,8 +44,8 @@ if node controller has already removed all of its pods. * Node is unready and has ToBeRemoved taint. Suggested action: Continue operations. Nodes should be removed soon. -* [UC8] The number of unjustified (not related to scale-up and scale-down) unready nodes is greater than XX%. Something is broken, possibly due to network partition or generic failure. 
Indicating factors: - * >XX% of nodes are unready +* [UC8] The number of unjustified (not related to scale-up and scale-down) unready nodes is greater than XX%. Something is broken, possibly due to network partition or generic failure. Indicating factors: + * >XX% of nodes are unready Suggested action: halt operations. ### Proposed solution @@ -55,11 +55,11 @@ Introduce a cluster state registry that provides the following information: * [S1] Is the cluster, in general, in a good enough shape for CA to operate. The cluster is in good shape if most of the nodes are in the ready state, and the number of nodes that are in the unjustified unready state (not related to scale down or scale up operations) is limited. CA should halt operations if the cluster is unhealthy and alert the system administrator. * [S2] Is the given Node group, in general, in a good enough shape for CA to operate on it. The NodeGroup is in good shape if the number of nodes that are unready (but not due to current scale-up/scale-down operations) or not present at all (not yet started by cloud provider) is limited. CA should take extra care about these unhealthy -groups and not scale up them further until the situation improves. +groups and not scale them up further until the situation improves. * [S3] What nodes should soon arrive in the cluster, so that the estimator takes them into account and doesn't ask again for resources for the already handled pods. Also, with that, the estimator won't need to wait for nodes to appear in the cluster. -* [S4] How long the given node group has been missing nodes. If a fixed number of nodes is missing for a long time this may indicate quota problems. Such node groups should be resized to the actual size. +* [S4] How long the given node group has been missing nodes. If a fixed number of nodes is missing for a long time this may indicate quota problems. Such node groups should be resized to the actual size. 
CA will operate with unready nodes possibly present in the cluster. Such nodes will be picked by scale down as K8S controller manager eventually removes all pods from unready nodes. As a result, all of the unready nodes, if not brought back into shape, will be removed after being unready for long enough (and possibly replaced by new nodes). @@ -73,10 +73,9 @@ The main loop algorithm will look as follows: [UC5]. 4. Check if any of the node groups has long-time missing nodes. If yes, reduce the size of the node group by the number of long-missing nodes. Skip the rest of the iteration. Helps with [UC4], uses [S4]. -5. Check if there are any pending pods. Skip pending pods that can be scheduled on the currently available ready nodes (not including nodes that are to be deleted soon [UC7]). -6. If there are still some pending pods, find which of the node group can be expanded to accommodate them. Skip node groups that are not healthy (contains many unready nodes or nodes that failed to start). Helps with [UC3] uses [S2]. +5. Check if there are any pending pods. Skip pending pods that can be scheduled on the currently available ready nodes (not including nodes that are to be deleted soon [UC7]). +6. If there are still some pending pods, find which of the node groups can be expanded to accommodate them. Skip node groups that are not healthy (they contain many unready nodes or nodes that failed to start). Helps with [UC3], uses [S2]. 7. Estimate the number of needed nodes, account yet-to-come nodes [UC1], [UC2], [S3]. Expand the chosen node group if needed. 8. Calculate the unneeded nodes in the whole cluster, including the unready nodes [UC6]. Unneeded nodes must be monitored every iteration to be sure that they have been unneeded for a prolonged time. 9. Try to remove some unneeded node, if there was no recent scale up and the node has been unneeded for more than 10 min. Use higher delay for unready nodes [UC6]. 
- diff --git a/cluster-autoscaler/proposals/expander-plugin-grpc.md b/cluster-autoscaler/proposals/expander-plugin-grpc.md index 2e593c84ced7..16caec2eba06 100644 --- a/cluster-autoscaler/proposals/expander-plugin-grpc.md +++ b/cluster-autoscaler/proposals/expander-plugin-grpc.md @@ -18,7 +18,7 @@ To do this, we propose a solution that adds a new expander to CA, but does not b ## Proposal We will extend CA to utilize a pluggable external expander. The design for this expander plugin is heavily based off of this [proposal](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/plugable-provider-grpc.md) to CA, for a pluggable cloud provider interface. - + The solution will include a server acting as an external expander alongside CA, and communicate via gRPC with TLS. This expander will run in another pod, as a separate service, deployed independently of CA. This is depicted below. diff --git a/cluster-autoscaler/proposals/metrics.md b/cluster-autoscaler/proposals/metrics.md index 28855b5e3598..87a19e9b7640 100644 --- a/cluster-autoscaler/proposals/metrics.md +++ b/cluster-autoscaler/proposals/metrics.md @@ -138,4 +138,3 @@ feature. | nap_enabled | Gauge | | Whether or not Node Autoprovisioning is enabled. 1 if it is, 0 otherwise. | | created_node_groups_total | Counter | | Number of node groups created by Node Autoprovisioning. | | deleted_node_groups_total | Counter | | Number of node groups deleted by Node Autoprovisioning. | - diff --git a/cluster-autoscaler/proposals/min_at_zero_gcp.md b/cluster-autoscaler/proposals/min_at_zero_gcp.md index b1c54437525f..6bb13c1b84a3 100644 --- a/cluster-autoscaler/proposals/min_at_zero_gcp.md +++ b/cluster-autoscaler/proposals/min_at_zero_gcp.md @@ -1,4 +1,4 @@ -# Cluster Autoscaler - min at 0 +# Cluster Autoscaler - min at 0 ### Design Document for Google Cloud Platform ##### Author: mwielgus @@ -115,7 +115,7 @@ So it is also quite easy to get all of the capacity information from it. 
### [1B] - Node allocatable -In GKE 1.5.6 allocatable for new nodes is equal to capacity, however on GCE there is allocatable memory is a bit smaller than capacity. +In GKE 1.5.6 allocatable for new nodes is equal to capacity; on GCE, however, the allocatable memory is a bit smaller than capacity. Initially, for simplicity, we can assume that the new node will have -0.1cpu/-200mb of capacity, but we will have to be more precise before the release. More details of how the allocatables are calculated are available here: https://github.com/kubernetes/kubernetes/blob/c20e63bfb98fecef7461dbaf8ed52e31fe12cd11/pkg/kubelet/cm/node_container_manager.go#L184. Being wrong or underestimating here is not fatal, most users will probably be OK with this. Once some nodes are present we will have more precise estimates. The worst thing that can happen is that the scale up may not be triggered if the request is exactly at the node capacity - system pods. @@ -154,7 +154,7 @@ NODE_LABELS: a=b,c=d,cloud.google.com/gke-nodepool=pool-3,cloud.google.com/gke-p NODE_LABELS: cloud.google.com/gke-local-ssd=true,cloud.google.com/gke-nodepool=pool-1 ``` -The kubelet code that populates labels not available in kube_env is here: +The kubelet code that populates labels not available in kube_env is here: https://github.com/kubernetes/kubernetes/blob/ceff8d8d4d7ac271cd03dcae73edde048a685df5/pkg/kubelet/kubelet_node_status.go#L196 The bottom line is that all of the labels can be easily obtained. @@ -164,14 +164,14 @@ The bottom line is that all of the labels can be easily obtained. In GKE (since 1.6) we run 1 type of pod by default on the node - Kube-proxy, which requires only cpu. Unfortunately the amount of cpu is not well-defined and is hidden inside the startup script. 
https://github.com/kubernetes/kubernetes/blob/6bf9f2f0bbf25c550e9dd93bfa0a3cda4feec954/cluster/gce/gci/configure-helper.sh#L797 The amount is fixed at 100m and I guess it is unlikely to change so it can be probably hardcoded in CA as well. - + ### [3] - There is no live example of what DaemonSets would be run on the new node. All daemon sets can be listed from apiserver and checked against the node with all the labels, capacities, allocatables and manifest-run pods obtained in previous steps. CA codebase already has the set of predicates imported so checking which pods should run on the node will be relatively easy. # Solution -Given all the information above it should be relatively simple to write a module that given the access to GCP Api +Given all the information above it should be relatively simple to write a module that given the access to GCP Api and Kubernetes API server. We will expand the NodeGroup interface (https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/cloud_provider.go#L40) with a method TemplateNodeInfo, taking no parameters and returning NodeInfo (containing api.Node and all pods running by default on the node) or error if unable to do so. diff --git a/cluster-autoscaler/proposals/node_autoprovisioning.md b/cluster-autoscaler/proposals/node_autoprovisioning.md index 0b33b7687886..20fa688aa39d 100644 --- a/cluster-autoscaler/proposals/node_autoprovisioning.md +++ b/cluster-autoscaler/proposals/node_autoprovisioning.md @@ -1,4 +1,4 @@ -# Node Auto-provisioning +# Node Auto-provisioning author: mwielgus # Introduction @@ -14,7 +14,7 @@ the configuration process. # Changes Allowing CA to create node pools at will requires multiple changes in various places in CA and around it. -## Configuration +## Configuration Previously, when the node groups were fixed, the user had to just specify the minimum and maximum size of each of the node groups to keep the cluster within the allowed budget. 
With NAP it is becoming more complex. Since GCP machine prices (and those of other cloud providers as well) linearly depend on the number of cores and GB of memory (and possibly GPUs in the future), we can just ask the user to set the min/max amount for each of the resources. So we will add the following flags to the CA: * `--node-autoprovisioning` - enables node autoprovisioning in CA @@ -28,32 +28,32 @@ Moreover, the users might want to keep some of the node pools always in the clus * `--nodes=min:max:id` -It is assumed that if there is any extra node group/node pool in the cluster, that hasn’t been mentioned in the command line should stay exactly “as-is”. +It is assumed that any extra node group/node pool in the cluster that hasn’t been mentioned in the command line should stay exactly “as-is”. While there are two options to express the boundaries for CA operations, the precedence order is as follows: * `--min-cpu`, `--max-cpu`, `--min-memory`, `--max-memory` go first. They are not enforcing. If there are more/fewer resources in the cluster than desired, CA will not immediately start/kill nodes. It will move only towards the expected boundaries when needed/appropriate. The difference between the expected cluster size and the current size will not grow. The flags are optional. If they are not specified then the assumption is that the limit is either 0 or +Inf. -* `--nodes=min:max:id` will come second, when applicable. Nodes in that group may go between min and max only if --min-cpu/max-cpu/min-memory/max-memory constraints are met. +* `--nodes=min:max:id` will come second, when applicable. Nodes in that group may go between min and max only if --min-cpu/max-cpu/min-memory/max-memory constraints are met. Example: There are 3 groups in the cluster: -* “main” - not autoprovisioned (no prefix), not autoscaled (not be mentioned in CA’s configuration flags). Contains 2 x n1-standard2. 
+* “main” - not autoprovisioned (no prefix), not autoscaled (not mentioned in CA’s configuration flags). Contains 2 x n1-standard2.
* “as” - not autoprovisioned (no prefix), autoscaled (mentioned in CA configuration flags) between 0 and 2 nodes. Contains 1 x n1-standard16.
-* “nodeautoprovisioning_n1-highmem4_1234129” - autoprovisioned (has prefix). Currently contains 2 x n1-highmem4.
+* “nodeautoprovisioning_n1-highmem4_1234129” - autoprovisioned (has prefix). Currently contains 2 x n1-highmem4.
* If `--max-cpu=5` then no node can be added to any of the groups. No new groups will be created.
* If `--max-cpu=32` then 1 node might be added to “nodeautoprovisioning_n1-highmem4...” or a new node group, with up to 4 n1-standard1 machines created.
* If `--max-cpu=80` then:
   * 1 node might be added to “as”,
   * “nodeautoprovisioning_n1-highmem4_1234129” may grow up to 15 nodes,
-   * Some other node groups might be created.
+   * Some other node groups might be created.
Similar logic applies to `--min-cpu`. It might be good to set this value relatively low so that CA is able to disband unneeded machines.
-To allow power users to have some control over what exactly can be The provided new methods in cloudprovider API autprovisioned there will be an semi-internal flag with a list of all machine types that CA can autoprovision:
+To allow power users to have some control over what exactly can be autoprovisioned, there will be a semi-internal flag with a list of all machine types that CA can autoprovision:
`--machine-types=n1-standard-1,n1-standard-2,n1-standard-4,n1-standard-8,n1-standard-16, n1-highmem-1,n1-highmem-2,...`
Also, for sanity, there will be a flag to limit the total number of node groups in a cluster, set to 50 or so.
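For illustration only, a CA invocation combining the flags above might look like this (the binary name and all values are assumptions; this proposal does not fix an exact syntax):

```shell
# Hypothetical example: enable NAP, bound the cluster to 8-80 cores and
# 30-300 GB of memory, keep the "as" group autoscaled between 0 and 2 nodes,
# and restrict autoprovisioning to a few machine types.
./cluster-autoscaler \
  --node-autoprovisioning \
  --min-cpu=8 --max-cpu=80 \
  --min-memory=30 --max-memory=300 \
  --nodes=0:2:as \
  --machine-types=n1-standard-1,n1-standard-2,n1-standard-4,n1-highmem-2
```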
@@ -72,9 +72,9 @@ Right now the scale up code assumes that all node groups are already known and s
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/core/scale_up.go#L77
-The list of current node groups to check will have to be expanded with, probably bigger, list of all potential node groups that could be added to the cluster. CA will analyze what labels and other resources are needed by the pods and calculate a set of all labels and additional resources that are useful (and commonly needed) for most of the pods.
+The list of current node groups to check will have to be expanded with a, probably bigger, list of all potential node groups that could be added to the cluster. CA will analyze what labels and other resources are needed by the pods and calculate a set of all labels and additional resources that are useful (and commonly needed) for most of the pods.
-By default CA will not set any taints on the nodes. Tolerations, set on a pod, are not requirements.
+By default CA will not set any taints on the nodes. Tolerations, set on a pod, are not requirements.
Then it will add these labels to all machine types available in the cloud provider and evaluate the theoretical node groups along with the node groups that are already in the cluster. If the picked node group doesn’t exist then CA should create it.
@@ -87,14 +87,14 @@ To allow this the following extensions will be made in CloudProvider interface:
Moreover, an extension will be made to the node group interface:
 * `Exists() (bool, error)` - checks if the node group really exists on the cloud provider side. Allows telling a theoretical node group from a real one.
- * `Create() error` - creates the node group on the cloud provider side.
+ * `Create() error` - creates the node group on the cloud provider side.
 * `Delete() error` - deletes the node group on the cloud provider side. This will be executed only for autoprovisioned node groups, once their size drops to 0.
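The lifecycle described above can be sketched as follows; `fakeNodeGroup` is a hypothetical in-memory stand-in used only to illustrate the Exists/Create/Delete contract, not a real cloud provider implementation:

```go
package main

import "fmt"

// NodeGroup is abbreviated here to the three methods this proposal adds.
// Autoprovisioned groups start as "theoretical" objects that do not yet
// exist on the cloud provider side.
type NodeGroup interface {
	Exists() (bool, error) // tells a theoretical node group from a real one
	Create() error         // materializes the group on the cloud provider side
	Delete() error         // removes the group once its size drops to 0
}

// fakeNodeGroup is an illustrative in-memory implementation.
type fakeNodeGroup struct {
	created bool
}

func (g *fakeNodeGroup) Exists() (bool, error) { return g.created, nil }
func (g *fakeNodeGroup) Create() error         { g.created = true; return nil }
func (g *fakeNodeGroup) Delete() error         { g.created = false; return nil }

func main() {
	var ng NodeGroup = &fakeNodeGroup{}
	ok, _ := ng.Exists() // theoretical group: not created yet
	fmt.Println(ok)      // false
	_ = ng.Create()      // scale-up picked this group, so materialize it
	ok, _ = ng.Exists()
	fmt.Println(ok) // true
}
```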
# Calculating best label set for nodes
-Assume that PS is a set of pods that would fit on a node of type NG if it had the labels matching to its selector. For each of the machine types we can build a node with no labels and for each pod set the labels according to the pod requirements. If the pod fits to the node it goes to PS.
+Assume that PS is a set of pods that would fit on a node of type NG if it had labels matching their selectors. For each of the machine types we can build a node with no labels and, for each pod, set the labels according to the pod’s requirements. If the pod fits on the node it goes to PS.
-Then we calculate the stats of all node selectors of the pods. For each significantly different node selector we calculate the number of pods that has this specific node selector. We pick the most popular one, and then check if this selector is “compatible” with the second most popular, third (and so on) as well as the selected machine type.
+Then we calculate the stats of all node selectors of the pods. For each significantly different node selector we calculate the number of pods that have this specific node selector. We pick the most popular one, and then check if this selector is “compatible” with the second most popular, the third (and so on), as well as the selected machine type.
Example:
@@ -108,7 +108,7 @@ S1 is compatible with S2 and S4. S2 is compatible with S1, S3 and S4. S3 is comp
The label selector that would come from S1, S2 and S4 would be x="a" and y="b" and machine_type = "n1-standard-2", however depending on popularity, the other option is S2, S3, S4 => x="c", y="b" and machine_type = "n1-standard-16".
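A minimal sketch of the popularity-based merging described above. The concrete selector shapes S1={x:a}, S2={y:b}, S3={x:c}, S4={} are assumptions chosen to match the stated compatibility relation; the proposal itself only gives the relation:

```go
package main

import (
	"fmt"
	"sort"
)

// Selector is a simplified node selector: label key -> required value.
type Selector map[string]string

// compatible reports whether two selectors agree on every shared key.
func compatible(a, b Selector) bool {
	for k, v := range a {
		if w, ok := b[k]; ok && w != v {
			return false
		}
	}
	return true
}

// bestLabelSet merges selectors greedily, most popular first, keeping only
// those compatible with everything merged so far. counts[i] is the number
// of pods using sels[i].
func bestLabelSet(sels []Selector, counts []int) Selector {
	order := make([]int, len(sels))
	for i := range order {
		order[i] = i
	}
	sort.Slice(order, func(a, b int) bool { return counts[order[a]] > counts[order[b]] })

	merged := Selector{}
	for _, i := range order {
		if compatible(sels[i], merged) {
			for k, v := range sels[i] {
				merged[k] = v
			}
		}
	}
	return merged
}

func main() {
	// Assumed shapes: S1={x:a}, S2={y:b}, S3={x:c}, S4={}.
	sels := []Selector{{"x": "a"}, {"y": "b"}, {"x": "c"}, {}}
	counts := []int{4, 3, 2, 1} // S1 most popular
	// S3 conflicts with the already-merged S1 on key x, so it is skipped.
	fmt.Println(bestLabelSet(sels, counts)) // map[x:a y:b]
}
```

With these popularity counts the greedy pass reproduces the first option from the example (x="a", y="b"); making S3 the most popular would yield the alternative x="c", y="b".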
-# Testing +# Testing The following e2e test scenarios will be created to check whether NAP works as expected: diff --git a/cluster-autoscaler/proposals/parallel_drain.md b/cluster-autoscaler/proposals/parallel_drain.md index 6394f55dff60..546271e29723 100644 --- a/cluster-autoscaler/proposals/parallel_drain.md +++ b/cluster-autoscaler/proposals/parallel_drain.md @@ -341,4 +341,3 @@ period, `--max-scale-down-parallelism` will default to the value of * node has no-scaledown annotation * node utilization is too high * node is already marked as destination - diff --git a/cluster-autoscaler/proposals/plugable-provider-grpc.md b/cluster-autoscaler/proposals/plugable-provider-grpc.md index 0758cdacab0f..d65e61e36419 100644 --- a/cluster-autoscaler/proposals/plugable-provider-grpc.md +++ b/cluster-autoscaler/proposals/plugable-provider-grpc.md @@ -121,7 +121,7 @@ service CloudProvider { // CloudProvider specific RPC functions rpc NodeGroups(NodeGroupsRequest) - returns (GetNameResponse) {} + returns (GetNameResponse) {} rpc NodeGroupForNode(NodeGroupForNodeRequest) returns (NodeGroupForNodeResponse) {} @@ -133,35 +133,35 @@ service CloudProvider { returns (PricingPodPriceResponse) rpc GPULabel(GPULabelRequest) - returns (GPULabelResponse) {} + returns (GPULabelResponse) {} rpc GetAvailableGPUTypes(GetAvailableGPUTypesRequest) - returns (GetAvailableGPUTypesResponse) {} + returns (GetAvailableGPUTypesResponse) {} rpc Cleanup(CleanupRequest) - returns (CleanupResponse) {} + returns (CleanupResponse) {} rpc Refresh(RefreshRequest) returns (RefreshResponse) {} // NodeGroup specific RPC functions rpc NodeGroupTargetSize(NodeGroupTargetSizeRequest) - returns (NodeGroupTargetSizeResponse) {} + returns (NodeGroupTargetSizeResponse) {} rpc NodeGroupIncreaseSize(NodeGroupIncreaseSizeRequest) - returns (NodeGroupIncreaseSizeResponse) {} + returns (NodeGroupIncreaseSizeResponse) {} rpc NodeGroupDeleteNodes(NodeGroupDeleteNodesRequest) - returns (NodeGroupDeleteNodesResponse) {} + 
      returns (NodeGroupDeleteNodesResponse) {}
  rpc NodeGroupDecreaseTargetSize(NodeGroupDecreaseTargetSizeRequest)
-      returns (NodeGroupDecreaseTargetSizeResponse) {}
+      returns (NodeGroupDecreaseTargetSizeResponse) {}
  rpc NodeGroupNodes(NodeGroupNodesRequest)
      returns (NodeGroupNodesResponse) {}
  rpc NodeGroupTemplateNodeInfo(NodeGroupTemplateNodeInfoRequest)
-      returns (NodeGroupTemplateNodeInfoResponse) {}
+      returns (NodeGroupTemplateNodeInfoResponse) {}
}
```
diff --git a/cluster-autoscaler/proposals/pricing.md b/cluster-autoscaler/proposals/pricing.md
index 8bdbd9cbdce1..8de336f7eec6 100644
--- a/cluster-autoscaler/proposals/pricing.md
+++ b/cluster-autoscaler/proposals/pricing.md
@@ -1,60 +1,60 @@
# Cost-based node group ranking function for Cluster Autoscaler
##### Author: mwielgus
-
+
# Introduction
-
+
Cluster autoscaler tries to increase the cluster size if there are some pods that are unable to fit into the nodes currently present in the cluster. If there are two or more types of nodes in the cluster, CA has to decide which one to add. At this moment it, by default, just picks a random one (there are some other expanders but they are also relatively simple). So it may add a 32 cpu node with gpu and 500 gb of ram to accommodate a pod that requires just a single cpu and a bit of memory. In order to properly support heterogeneous clusters we need to properly choose a node group and decide which one to expand, and be able to tell an expensive expansion option from a cost-effective one.
-
+
# Node cost estimation
-
+
To correctly choose the cheapest option from the set of available options we need to be able to calculate the cost of a single node. For example, for a GKE/GCE node the cost is well known; however, these numbers are not available through the api, so some short config file would probably be needed. As we don’t do real billing here but try to estimate the differences between the cluster expansion options, the numbers don’t have to be super exact.
CA should just price the cheaper instances lower than the more expensive ones.
-
+
# Choosing the best node pool to expand
-
+
Knowing the cost of a single node is only a part of the story. We need to choose the best node pool to accommodate the unscheduled pods, as a whole. However, putting all of the pending pods into a single node pool may not always be an option because:
-
+
* Node pool min/max boundaries. Pending pods may require expanding it beyond the max range.
* Some pods may not fit into the machines.
* Different node selectors.
-
+
Different node pools may accept different pods and accommodate them on different numbers of nodes, which can result in completely different costs. Let's denote the cost of an expansion as C. Let me give you an example.
-
+
* Option1: requires 3 nodes of Type1, accommodates pods P1, P2, P3 and costs C1=10$
-* Option2: requires 2 nodes of Type2, accommodates pods P1, P3, P4, P5 and costs C2=20$.
-
+* Option2: requires 2 nodes of Type2, accommodates pods P1, P3, P4, P5 and costs C2=20$.
+
It is hard to tell whether we will get a better deal by paying 10$ for having 3 pods running or by paying 20$ for a different set of pods. We need to make C1 and C2 somehow comparable.
-
+
We can compute how much it would cost to run a pod on a machine that is perfectly suited to its needs. From GKE pricing we know that:
-
- * 1 cpu cost $0.033174 / hour,
+
+ * 1 cpu costs $0.033174 / hour,
 * 1 gb of memory costs $0.004446 / hour
- * 1 GPU is 0.7 / hour.
-
-We can simplify pod ssd storage request and assume that it needs 50gb for local storage that cost 0.01$ per hour.
-
-For two expansion options we could compute what is the theoretical cost of having all of these pods running on perfectly-fitted machines and mark it as T1 and T2.
-
-Then C1/T1 and C2/T2 would denote how effective are we with a particular node group expansion. For example C1/T1 may equal to 2 which means that we pay 2 times more than we would need in a perfect world.
And if C2/T2 equal to 1.05 it means that we are super effective and it’s hard to expect that some other option would be much better.
-
-If we consistently pick the option with the lowest real to theoretical cost we should get quite good approximation of the perfect (from the money perspective) cluster layout. It may not be the most optimal but finding the most optimal one seems to be an NP problem (it’s a kind of binpacking).
-
-
+ * 1 GPU costs $0.7 / hour.
+
+We can simplify the pod’s ssd storage request and assume that it needs 50gb of local storage that costs 0.01$ per hour.
+
+For two expansion options we could compute the theoretical cost of having all of these pods running on perfectly-fitted machines and mark it as T1 and T2.
+
+Then C1/T1 and C2/T2 would denote how effective we are with a particular node group expansion. For example C1/T1 may equal 2, which means that we pay 2 times more than we would in a perfect world. And if C2/T2 equals 1.05 it means that we are super effective and it’s hard to expect that some other option would be much better.
+
+If we consistently pick the option with the lowest real-to-theoretical cost we should get a quite good approximation of the perfect (from the money perspective) cluster layout. It may not be the most optimal, but finding the most optimal one seems to be an NP problem (it’s a kind of binpacking).
+
+
# Adding “preferred” node type to the formula
-
+
C/T is a simple formula that doesn’t include other aspects such as, for example, the preference to have bigger machines in a bigger cluster or a more consistent set of nodes. The advantage of big nodes is that they usually offer smaller resource fragmentation and are more likely to have enough resources to accept new pods. For example 2 x n1-standard-2 packed to 75% will only accept pods requesting less than 0.5 cpu, while 1 x n1-standard-4 can take 2 pods requesting 0.5 cpu OR 1 pod requesting
-1 cpu. Having more consistent set of machines makes cluster management easier.
+1 cpu. Having a more consistent set of machines makes cluster management easier.
+
+To include this preference in the formula we introduce a per-node metric called NodeUnFitness.
+It will be small for nodes that match the “overall” cluster shape and big for nodes that don’t match it well.
One of the possible (simple) NodeUnFitness implementations can be defined as a ratio between a perfect node for the cluster and the node from the pool. To be more precise:
-```
+```
NodeUnFitness = max(preferred_cpu/node_cpu, node_cpu/preferred_cpu)
-```
+```
Max is used to ensure that NodeUnfitness is equal to 1 for a perfect node and greater than that for not-so-perfect nodes. For example, if n1-standard-8 is the preferred node then the unfitness of n1-standard-2 is 4.
@@ -63,87 +63,87 @@ This number can be, in theory, combined with the existing number using a linear
```
W1 * C/T + W2 * NodeUnFitness
-```
-
+```
+
While this linear combination sounds cool, it is a bit problematic.
-For small or single pods C/T strongly prefers smaller machines that may not be the best for the overall cluster well-being. For example C/T for a 100m pod with n1-standard-8 is ~80.
+For small or single pods C/T strongly prefers smaller machines that may not be the best for the overall cluster well-being. For example C/T for a 100m pod with n1-standard-8 is ~80.
C/T for n1-standard-2 is 20. Assuming that n1-standard-8 would be the node of choice in a 100-node cluster, W2*NodeUnFitness would have to be 60 (assuming, for simplicity, W1 = 1).
-n1-standard-2 is only 4 times smaller than n1-standard-8 so W2 = 15. But then everything collapses with even smaller pod. For a 50milli cpu pod W2 would have to be 30. So it’s bad.
C/T is not good for the linear combination.
-
-So we need something better.
-
+n1-standard-2 is only 4 times smaller than n1-standard-8, so W2 = 15. But then everything collapses with an even smaller pod: for a 50-milli-cpu pod W2 would have to be 30. So it’s bad. C/T is not good for the linear combination.
+
+So we need something better.
+
We are looking for a pricing function that:
-
-* [I1] Doesn’t go 2 times up with a small change in absolute terms (100->50 mill cpu), is more or less constant for small pods.
-* [I2] Still penalizes node types that have some completely unneeded resources (GPU).
-
-C/T can be stabilized by adding some value X to C and T. Lets call it a big cluster damper.
-So the formula is like (C+X)/(T+X).
-
+
+* [I1] Doesn’t go 2 times up with a small change in absolute terms (100 -> 50 milli cpu); is more or less constant for small pods.
+* [I2] Still penalizes node types that have some completely unneeded resources (GPU).
+
+C/T can be stabilized by adding some value X to C and T. Let’s call it a big cluster damper.
+So the formula is like (C+X)/(T+X).
+
Let’s see what happens if X is the cost of running a 0.5 cpu pod and we have 1 pending pod of size 0.1 cpu. The preferred node is n1-standard-8.
-
+
| Machine type | Calculation | Rank |
-|--------------|-------|------|
+|--------------|-------|------|
| n1-standard-2 | (0.095 + 0.016) / (0.003 + 0.016) | 5.84 |
| n1-standard-8 | (0.380 + 0.016) / (0.003 + 0.016) | 20.84 |
| n1-standard-2+GPU | (0.795 + 0.016) / (0.003 + 0.016) | 42 |
-
+
And what if the pod requests 1.5 cpu?
-
+
| Machine type | Calculation | Rank |
-|--------------|-------|------|
+|--------------|-------|------|
| n1-standard-2 | (0.095 + 0.016) / (0.003*15 + 0.016)| 1.81 |
| n1-standard-8 | (0.380 + 0.016) / (0.003*15 + 0.016) | 6.49 |
| n1-standard-2+GPU | (0.795 + 0.016) / (0.003*15 + 0.016) | 13.0 |
-
+
Slightly better, but still hard to combine linearly with NodeUnfitness being equal to 1 or 4.
- -Let’s try something different: + +Let’s try something different: ``` NodeUnfitness*(C + X)/(T+X) ``` 0.1 cpu request: | Machine type | Calculation | Rank | -|--------------|-------|------| +|--------------|-------|------| | n1-standard-2 | 4 * (0.095 + 0.016) / (0.003 + 0.016) | 23.36 | | n1-standard-8 | 1 * (0.380 + 0.016) / (0.003 + 0.016) | 20.84 | | n1-standard-2+GPU | 4 * (0.795 + 0.016) / (0.003 + 0.016) | 168.0 | - + 1.5 cpu request: | Machine type | Calculation | Rank | -|--------------|-------|------| +|--------------|-------|------| | n1-standard-2 | 4 * (0.095 + 0.016) / (0.003*15 + 0.016) | 7.24 | | n1-standard-8 | 1 * (0.380 + 0.016) / (0.003*15 + 0.016) | 6.49 | | n1-standard-2+GPU | 4 * (0.795 + 0.016) / (0.003*15 + 0.016) | 52 | - + Looks better. So we are able to promote having bigger nodes if needed. However, what if we were to create 50 n1-standard-8 nodes to accommodate 50 x 1.5 cpu pods with strict PodAntiAffinity? Well, in that case we should probably go for n1-standard-2 nodes, however the above formula doesn’t promote that, because it considers the node unfit. So when requesting a larger number of nodes (in a single scale-up) we should probably suppress NodeUnfitness a bit. The suppress function should reduce the effect of NodeUnfitness when there is a good reason to do it. One of the good reasons is that the other option is significantly cheaper. In general the more nodes we are requesting the bigger the price difference can be. And if we are requesting just a single node -then this node should be well fitted to the cluster (than to the pod) so that other pods can also use it and the cluster administrator has less types of nodes to focus on. - +then this node should be well fitted to the cluster (than to the pod) so that other pods can also use it and the cluster administrator has less types of nodes to focus on. 
+
We are looking for a function suppress(NodeUnfitness, NodeCount) that:
-
+
* For NodeCount = 1 returns NodeUnfitness
* For NodeCount = 2 returns NodeUnfitness * 0.95
* For NodeCount = 5 returns NodeUnfitness * 0.8
* For NodeCount = 50 returns ~1, which means that the node is perfectly OK for the cluster.
-
+
Where NodeCount is the number of nodes that need to be added to the cluster for that particular option. In the future we will
-probably have to use some more sophisticated/normalized number as the node count obviously depends on the machine type.
+probably have to use some more sophisticated/normalized number, as the node count obviously depends on the machine type.
A slightly modified sigmoid function has such properties. Let’s define
-
+
```
suppress(u,n) = (u-1)*(1-math.tanh((n-1)/15.0))+1
```
Please keep in mind that unfitness is >= 1.
-
+
Then:
-```
+```
suppress(4, 1)=4.000000 == 4 * 1.00
suppress(4, 2)=3.800296 == 4 * 0.95
suppress(4, 3)=3.602354 == 4 * 0.90
@@ -153,18 +153,18 @@ suppress(4,10)=2.388851 == 4 * 0.60
suppress(4,20)=1.441325 == 4 * 0.36
suppress(4,50)=1.008712 == 4 * 0.25
```
-
-Exactly what we wanted to have! However, should we need a steeper function we can replace 15 with a smaller number.
-
+
+Exactly what we wanted to have! However, should we need a steeper function, we can replace 15 with a smaller number.
+
So, to summarize, the whole ranking function would be:
-```
+```
suppress(NodeUnfitness, NodeCount) * (C + X)/(T + X)
```
where:
-```
+```
suppress(u,n) = (u-1)*(1-math.tanh((n-1)/15.0))+1
nodeUnFitness = max(preferred_cpu/node_cpu, node_cpu/preferred_cpu)
```
diff --git a/cluster-autoscaler/proposals/scalability_tests.md b/cluster-autoscaler/proposals/scalability_tests.md
index 68b53b3b4529..fe66c0c0df52 100644
--- a/cluster-autoscaler/proposals/scalability_tests.md
+++ b/cluster-autoscaler/proposals/scalability_tests.md
@@ -7,7 +7,7 @@ As a part of Cluster Autoscaler graduation to GA we want to guarantee a certain
## CA scales to 1000 nodes
-Cluster Autoscaler scales up to a certain number of nodes if it stays responsive. It performs scales up and scale down operations on the cluster within reasonable time frame. If CA is not responsive it can be killed by the liveness probe or fail to provide/release computational resources in cluster when needed, resulting in inability of the cluster to handle additional workload, or in higher cloud provider bills.
+Cluster Autoscaler scales up to a certain number of nodes while staying responsive. It performs scale-up and scale-down operations on the cluster within a reasonable time frame. If CA is not responsive it can be killed by the liveness probe or fail to provide/release computational resources in the cluster when needed, resulting in the inability of the cluster to handle additional workload, or in higher cloud provider bills.
## Expected performance
@@ -19,7 +19,7 @@ Using Kubernetes and [kubemark](https://github.com/kubernetes/community/blob/mas
* 1 master - 1-core VM
* 17 nodes - 8-core VMs, each core running up to 8 Kubemark nodes.
* 1 Kubemark master - 32-core VM
-* 1 dedicated VM for Cluster Autoscaler
+* 1 dedicated VM for Cluster Autoscaler
## Test execution
@@ -53,7 +53,7 @@ We have run multiple test scenarios with a general setup targeting load of ~1000
   * Do: nothing
   * Expected result: 30 nodes are removed from cluster
-5. [Scale-down] Doesn't scale down with underutilized but unremovable nodes
+5. [Scale-down] Doesn't scale down with underutilized but unremovable nodes
   * Scenario: With a cluster that has a significant number of underutilized but unremovable nodes, we simulate a sudden drop of activity.
   * Start with: 1000 pods running on 1000 nodes, 700 nodes 90% full, 300 nodes about 30% full (underutilized, but unremovable due to host port conflicts)
   * Do: nothing
diff --git a/hack/boilerplate/boilerplate.go.txt b/hack/boilerplate/boilerplate.go.txt
index b7c650da4701..0926592d3895 100644
--- a/hack/boilerplate/boilerplate.go.txt
+++ b/hack/boilerplate/boilerplate.go.txt
@@ -13,4 +13,3 @@
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
-
diff --git a/hack/boilerplate/boilerplate.py.txt b/hack/boilerplate/boilerplate.py.txt
index 6118b2faf279..a1866b015ce5 100644
--- a/hack/boilerplate/boilerplate.py.txt
+++ b/hack/boilerplate/boilerplate.py.txt
@@ -13,4 +13,3 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
-
diff --git a/hack/boilerplate/boilerplate.sh.txt b/hack/boilerplate/boilerplate.sh.txt
index 069e282bc855..c06e635b3e7c 100644
--- a/hack/boilerplate/boilerplate.sh.txt
+++ b/hack/boilerplate/boilerplate.sh.txt
@@ -11,4 +11,3 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and # limitations under the License. - diff --git a/hack/scripts/ca_metrics_parser.py b/hack/scripts/ca_metrics_parser.py index 51083c0d74ec..1f08439d3923 100755 --- a/hack/scripts/ca_metrics_parser.py +++ b/hack/scripts/ca_metrics_parser.py @@ -121,4 +121,3 @@ def main(): if __name__ == '__main__': main() -