Commit 2693c3e

Merge branch 'main' into feature/support-tz-for-disruption-budgets-schedule
2 parents 1e24532 + 740331f commit 2693c3e

70 files changed (+2647 / -504 lines)


.github/actions/install-deps/action.yaml

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ runs:
   # Root path permission workaround for caching https://github.com/actions/cache/issues/845#issuecomment-1252594999
   - run: sudo chown "$USER" /usr/local
     shell: bash
-  - uses: actions/cache@0c907a75c2c80ebcb7f088228285e798b750cf8f # v4.2.1
+  - uses: actions/cache@d4323d4df104b026a6aa633fdb11d772146be0bf # v4.2.2
     id: cache-toolchain
     with:
       path: |

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -6,6 +6,7 @@ coverage.html
 *.heapprofile
 go.work
 go.work.sum
+report.json

 # Common in OSs and IDEs
 .idea

OWNERS_ALIASES

Lines changed: 3 additions & 3 deletions
@@ -9,10 +9,10 @@ aliases:
     - ellistarn
     - jonathan-innis
     - tzneal
-    - bwagner5
     - njtran
+    - jmdeal
   karpenter-reviewers:
-    - jackfrancis
     - tallaxes
     - engedaam
-    - jmdeal
+  emeritus-karpenter-maintainers:
+    - bwagner5

contributing-guidelines.md

Lines changed: 6 additions & 2 deletions
@@ -4,7 +4,7 @@ This document’s goal is to define clear, scalable, and transparent criteria to

 Ultimately, the criteria in this doc are aspirational. No set of written requirements can encapsulate the full criteria when determining if someone meets the bar to be a reviewer or approver, as some of the criteria are subjective and rely on the trust that each nominee has established with the community. To help guide readers, this document outlines ways to demonstrate expertise of the code base, sound judgement on decision tradeoffs, end user advocacy, care for community, and ability to work as a distributed team.

-Much of this document uses the [SIG-Node Contributor Ladder](https://github.com/kubernetes/community/blob/master/sig-node/sig-node-contributor-ladder.md) as prior art. The goal is to mold these requirements to fit the Karpenter’s community. These requirements also lean on the established Kubernetes[membership documentation](https://github.com/kubernetes/community/blob/master/community-membership.md) for terminology.
+Much of this document uses the [SIG-Node Contributor Ladder](https://github.com/kubernetes/community/blob/master/sig-node/sig-node-contributor-ladder.md) as prior art. The goal is to mold these requirements to fit the Karpenter community. These requirements also lean on the established Kubernetes [membership documentation](https://github.com/kubernetes/community/blob/master/community-membership.md) for terminology.

 As a final precursor, to become a reviewer or approver, users must nominate themselves. They are responsible for cutting a PR to the upstream repository, providing evidence in line with the suggested requirements. Users should feel free to reach out to an existing approver to understand how they land with respect to the criteria. The following sections are guiding criteria and guidelines, where the final decision lies with the maintainers.

@@ -92,4 +92,8 @@ In addition to the formal requirements for the [approver role](https://github.co
 * Have approval rights in a well-known cloud provider implementation of Karpenter or in an adjacent SIG Autoscaling sub-project.
 * Be a primary PR reviewer for numerous PRs in multiple areas listed as a requirement for a reviewer.
 * Actively triage issues and PRs, provide support to contributors to drive their PRs to completion.
-* Be present, and participate in Karpenter meetings by speaking about features or improvements driven, or find some other way to prove the identity behind GitHub handle.
+* Be present and participate in Karpenter meetings by speaking about features or improvements driven, or find some other way to prove the identity behind the GitHub handle.
+
+### Cleanup and Emeritus
+
+The `kubernetes-sigs/karpenter` sub-project abides by the same cleanup process prescribed in https://github.com/kubernetes/community/blob/master/contributors/guide/owners.md#cleanup. It is generally recommended that reviewers or approvers who know that they are no longer going to be actively involved _remove themselves_ from the OWNERS_ALIASES file; however, approvers may also initiate a PR and reach out to the relevant reviewer/approver if they recognize that the user is no longer actively involved in the project (as defined in the community contributors doc linked above).

designs/capacity-reservations.md

Lines changed: 3 additions & 3 deletions
@@ -22,13 +22,13 @@ Karpenter doesn't currently support reasoning about this capacity type. Karpente
 3. Karpenter should add logic to its scheduler to reason about this availability as an `int` -- ensuring that the scheduler never schedules more offerings of an instance type for a capacity type than are available
 4. Karpenter should extend its CloudProvider [InstanceType](https://github.com/kubernetes-sigs/karpenter/blob/35d6197e38e64cd6abfef71a082aee80e38d09fd/pkg/cloudprovider/types.go#L75) struct to allow offerings to represent availability of an offering as an `int` rather than a `bool` -- allowing Cloud Providers to represent the constrained capacity of `reserved`
 5. Karpenter should consolidate between `on-demand` and/or `spot` instance types to `reserved` when the capacity type is available
-6. Karpenter should introduce a feature flag `FEATURE_FLAG=CapacityReservations` to gate this new feature in `ALPHA` when it's introduced
+6. Karpenter should introduce a feature flag `FEATURE_FLAG=ReservedCapacity` to gate this new feature in `ALPHA` when it's introduced

 ### `karpenter.sh/capacity-type` API

 _Note: Some excerpts taken from [`aws/karpenter-provider-aws` RFC](https://github.com/aws/karpenter-provider-aws/blob/main/designs/odcr.md#nodepool-api)._

-This RFC proposes the addition of a new `karpenter.sh/capacity-type` label value, called `reserved`. A cluster admin could then select to support only launching reserved node capacity and falling back between reserved capacity to on-demand (or even spot) capacity respectively.
+This RFC proposes the addition of a new `karpenter.sh/capacity-type` label value, called `reserved`. A cluster admin could then select to support only launching reserved node capacity and falling back between reserved capacity to on-demand (or even spot) capacity respectively.

 _Note: This option requires any applications (pods) that are using node selection on `karpenter.sh/capacity-type: "on-demand"` to expand their selection to include `reserved` or to update it to perform a `NotIn` node affinity on `karpenter.sh/capacity-type: spot`_

@@ -140,4 +140,4 @@ In practice, this means that if a user has two capacity reservation offerings av

 ## Appendix

-1. AWS Cloud Provider's RFC for On-Demand Capacity Reservations: https://github.com/aws/karpenter-provider-aws/blob/main/designs/odcr.md
+1. AWS Cloud Provider's RFC for On-Demand Capacity Reservations: https://github.com/aws/karpenter-provider-aws/blob/main/designs/odcr.md
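The offering-availability idea in items 3-4 of the first hunk above can be illustrated with a small Go sketch. The `offering` type, its field names, and the `schedulable` helper are assumptions for illustration only; they do not mirror Karpenter's actual `cloudprovider` types.

```go
package main

import "fmt"

// offering is a hypothetical, simplified stand-in for a cloud provider offering.
// Instead of a boolean "available", it carries an integer count so a scheduler
// can reason about how many reserved instances remain.
type offering struct {
	CapacityType string // e.g. "spot", "on-demand", "reserved"
	Zone         string
	Available    int // remaining capacity; effectively unbounded for non-reserved types
}

// schedulable reports whether another node can be launched against this offering,
// given how many nodes the scheduler has already planned against it in this pass.
func schedulable(o offering, alreadyPlanned int) bool {
	return alreadyPlanned < o.Available
}

func main() {
	reserved := offering{CapacityType: "reserved", Zone: "us-west-2a", Available: 2}
	for planned := 0; planned < 4; planned++ {
		fmt.Printf("planned=%d -> can launch another: %v\n", planned, schedulable(reserved, planned))
	}
}
```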
Lines changed: 94 additions & 0 deletions (new file)

# RFC: NodeRegistrationHealthy Status Condition on NodePool

## Motivation

Karpenter may initiate the creation of nodes based on a NodePool configuration, but these nodes might fail to join the cluster due to unforeseen registration issues that Karpenter cannot anticipate or prevent. One example is network connectivity being impeded by an incorrect cluster security group configuration, such as a missing outbound rule that allows outbound access to any IPv4 address. In such cases, Karpenter will continue its attempts to provision compute resources, but these resources will fail to join the cluster until the security group's outbound rule is updated. Currently, there isn't a way to surface these kinds of failures to users.

This RFC proposes enhancing the visibility of these failure modes by introducing a `NodeRegistrationHealthy` status condition on the NodePool. We can then create new metrics around this status condition, which will improve observability by alerting cluster administrators to potential issues within a NodePool that require investigation and resolution.

The `NodeRegistrationHealthy` status would specifically highlight instance launch/registration failures that Karpenter cannot fully diagnose or predict. However, this status should not be a mechanism to catch all types of launch/registration failures. Karpenter should not use `NodeRegistrationHealthy` for failures it can definitively determine, based on the NodePool/NodeClass configurations or through a dry run, will occur. For instance, if a NodePool is restricted to a specific zone using the `topology.kubernetes.io/zone` label, but the specified zone is not accessible through the provided subnet configurations, this inconsistency shouldn't trigger a `NodeRegistrationHealthy: False` status. CloudProviders should also try to implement deterministic mechanisms to identify such launch failures, like the [validation controller](https://github.com/aws/karpenter-provider-aws/blob/main/pkg/controllers/nodeclass/validation.go) added for the AWS provider.

Currently, while launch and registration processes have defined timeouts, the initialization phase does not. As a result, there is no concept of initialization failures today. However, the proposed design can be extended to incorporate initialization failure detection in future iterations.

## 🔑 Introduce a NodeRegistrationHealthy Status Condition on the NodePool Status

```
// 'NodeRegistrationHealthy' condition indicates if a misconfiguration exists that prevents the normal, successful use of a Karpenter resource
Status:
  Conditions:
    Last Transition Time:  2025-01-13T18:57:20Z
    Message:
    Observed Generation:   1
    Reason:                NodeRegistrationHealthy
    Status:                True
    Type:                  NodeRegistrationHealthy
```

The `NodeRegistrationHealthy` status condition is introduced in the NodePool status and can be set to:
1. Unknown - When the NodePool is first created, `NodeRegistrationHealthy` is set to Unknown. This means that we don't have enough data to tell whether nodes launched using this NodePool can successfully register or not.
2. False - The NodePool has configuration issues that require user investigation and resolution. Since Karpenter cannot automatically detect these specific launch or registration failures, we will document common failure scenarios and possible fixes in our troubleshooting guide to assist users. The cause of the failure will also be surfaced through the status condition's reason and message fields.
3. True - There has been a successful node registration using this unique combination of NodePool and NodeClass spec.

A NodePool marked with `NodeRegistrationHealthy: False` can still be used for provisioning workloads, as this status isn't a precondition for readiness. We can expand this in a follow-up that adds a cooldown period before trying to schedule with a NodePool that has `NodeRegistrationHealthy: False`.

## Goals
1. Tolerate transient errors.
2. Respond to corrections in external configuration (i.e. remove the `NodeRegistrationHealthy` status condition from a NodePool if an external fix allows nodes to register).

### Option 1: Track if a node was successfully launched using a NodePool - Recommended
This option sets the `NodeRegistrationHealthy` status condition on the NodePool by checking whether a node launched using this NodePool failed to launch/register (a minimal sketch follows the list below).

![](./images/noderegistrationhealthy-nodepools2.png)

Evaluation conditions -

1. When a NodePool is first created, set `NodeRegistrationHealthy: Unknown`. A NodePool which already has `NodeRegistrationHealthy: True` will not go back to Unknown unless there is an update to the NodePool or the referenced NodeClass.
2. On a failed launch/registration, set `NodeRegistrationHealthy: False`. Do not update the status condition to False if the NodePool already has `NodeRegistrationHealthy: True`.
3. On successful registration, set `NodeRegistrationHealthy: True`.
4. Do not update the `NodeRegistrationHealthy` status condition when Karpenter restarts.

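A minimal Go sketch of the Option 1 transitions, using `meta.SetStatusCondition` from apimachinery. The `registrationOutcome` type and `updateCondition` helper are illustrative assumptions, not Karpenter's actual controller code; resetting to Unknown on NodePool/NodeClass updates (rule 1) would be handled elsewhere.

```go
package nodepoolhealth

import (
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

const conditionType = "NodeRegistrationHealthy"

// registrationOutcome is a hypothetical summary of a single launch/registration attempt.
type registrationOutcome struct {
	Succeeded bool
	Reason    string
	Message   string
}

// updateCondition applies the Option 1 rules to a NodePool's conditions in place:
// a success always sets True; a failure sets False only if the condition is not already True.
func updateCondition(conditions *[]metav1.Condition, outcome registrationOutcome, generation int64) {
	current := meta.FindStatusCondition(*conditions, conditionType)
	switch {
	case outcome.Succeeded:
		meta.SetStatusCondition(conditions, metav1.Condition{
			Type:               conditionType,
			Status:             metav1.ConditionTrue,
			Reason:             "NodeRegistrationHealthy",
			ObservedGeneration: generation,
		})
	case current != nil && current.Status == metav1.ConditionTrue:
		// Rule 2: a NodePool that has already registered a node successfully
		// is not flipped back to False by later failures.
	default:
		meta.SetStatusCondition(conditions, metav1.Condition{
			Type:               conditionType,
			Status:             metav1.ConditionFalse,
			Reason:             outcome.Reason,
			Message:            outcome.Message,
			ObservedGeneration: generation,
		})
	}
}
```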
#### Considerations

1. 👍 This approach is particularly helpful for pod binding/ready metrics for pods that are scheduled against NodePools that have `NodeRegistrationHealthy: True`. Metrics collection will begin only after the NodePool's first successful node registration. Once `NodeRegistrationHealthy: True`, if a node is then launched with a bad AMI, we will still see pod metrics that can help identify node launch failures due to bad AMIs.

### Option 2: In-memory Buffer to store history

This option maintains an in-memory FIFO buffer, which will grow to a maximum size of 10 (this can be changed later). The buffer stores the success or failure of each launch/registration and is evaluated by a controller to determine the relative health of the NodePool. It would be implemented as a `[]bool`, where `true` indicates a launch success and `false` represents a failure. The state of the `NodeRegistrationHealthy` condition would be based on the number of `false` entries in the buffer.

![](./images/noderegistrationhealthy-nodepools1.png)

Evaluation conditions -

1. We start with an empty buffer and `NodeRegistrationHealthy: Unknown`.
2. There have to be at least 2 failures in the buffer for `NodeRegistrationHealthy` to transition to `False`.
3. If the buffer starts with a success then `NodeRegistrationHealthy: True`.
4. If Karpenter restarts, we flush the buffer but don't change the existing state of the `NodeRegistrationHealthy` status condition.
5. If there is an update to a NodePool/NodeClass, flush the buffer and set `NodeRegistrationHealthy: Unknown`.
6. Since the buffer is FIFO, we remove the oldest launch result when the maximum buffer size is reached.
7. If no new launches/registrations happen for 2 consecutive registration TTL cycles (currently 15 minutes each), expire/remove the oldest entry in the buffer and re-evaluate the status condition if it was previously set to False. This ensures that a NodePool's unhealthy status does not persist indefinitely when no new launch attempts have occurred within 30 minutes.

See below for example evaluations (a sketch of this evaluation logic follows the example block):

```
Successful Launch: true
Unsuccessful Launch: false

[] = 'NodeRegistrationHealthy: Unknown'
[true] = 'NodeRegistrationHealthy: True'
[false, true] = 'NodeRegistrationHealthy: True'
[false, false] = 'NodeRegistrationHealthy: False'
[false, true, false] = 'NodeRegistrationHealthy: False'
[false, true, true, true, true, true, true, true, true, true] = 'NodeRegistrationHealthy: True'
```

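A self-contained Go sketch of one way the Option 2 buffer and its evaluation could work. The `buffer` type and the tie-breaking rules (two or more failures wins over any success) are inferred from the examples above and are assumptions rather than a settled design.

```go
package main

import "fmt"

const maxEntries = 10 // proposed max buffer size; could be tuned later

// buffer is a FIFO history of launch/registration results: true = success, false = failure.
type buffer struct {
	results []bool
}

// record appends a result, evicting the oldest entry once the buffer is full.
func (b *buffer) record(success bool) {
	if len(b.results) == maxEntries {
		b.results = b.results[1:]
	}
	b.results = append(b.results, success)
}

// evaluate maps the buffer contents to a condition status, matching the examples:
// two or more failures -> "False"; otherwise any success -> "True"; otherwise "Unknown".
func (b *buffer) evaluate() string {
	failures, successes := 0, 0
	for _, ok := range b.results {
		if ok {
			successes++
		} else {
			failures++
		}
	}
	switch {
	case failures >= 2:
		return "False"
	case successes > 0:
		return "True"
	default:
		return "Unknown"
	}
}

func main() {
	b := &buffer{}
	fmt.Println(b.evaluate()) // Unknown
	b.record(false)
	b.record(true)
	fmt.Println(b.evaluate()) // True
	b.record(false)
	fmt.Println(b.evaluate()) // False
}
```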
#### Considerations

1. 👍 Tolerates transient failures, such as those caused by underlying hardware failure, because we keep track of recent launch history and set `NodeRegistrationHealthy: False` only when there are 2 or more launch/registration failures.
2. 👍 Can be easily extended if we want to adjust the buffer size depending on the cluster size.
3. 👎 This approach tends to get more complex as we need to scale with the cluster size.
4. 👎 Managing buffer entry expiration adds another layer of complexity. We need to ensure that a NodePool's unhealthy status doesn't persist indefinitely when no new launch attempts have occurred within a specified timeframe.

### How Does this Improve Observability?
The introduction of the `NodeRegistrationHealthy` status condition serves two key purposes:

1. It provides users with clearer visibility into NodeClaim launch and registration failures (one possible metric is sketched below).
2. It establishes groundwork for future scheduling optimizations, where we can assign lower priority to NodePools marked with `NodeRegistrationHealthy: False`.
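One way the proposed condition could back a metric, sketched with `prometheus/client_golang` (already a dependency per the go.mod diff below). The metric name, labels, and registration on the default registry are assumptions for illustration, not metrics the project has committed to.

```go
package metrics

import "github.com/prometheus/client_golang/prometheus"

// nodeRegistrationHealthy is a hypothetical gauge: 1 when a NodePool's
// NodeRegistrationHealthy condition is True, 0 when it is False or Unknown.
var nodeRegistrationHealthy = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{
		Namespace: "karpenter",
		Subsystem: "nodepools",
		Name:      "node_registration_healthy",
		Help:      "Whether the NodePool's NodeRegistrationHealthy status condition is True.",
	},
	[]string{"nodepool"},
)

func init() {
	// Registering on the default registry; Karpenter's actual metrics wiring may differ.
	prometheus.MustRegister(nodeRegistrationHealthy)
}

// Record updates the gauge for a NodePool from its current condition status.
func Record(nodePoolName string, conditionTrue bool) {
	value := 0.0
	if conditionTrue {
		value = 1.0
	}
	nodeRegistrationHealthy.WithLabelValues(nodePoolName).Set(value)
}
```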

go.mod

Lines changed: 8 additions & 8 deletions
@@ -6,21 +6,21 @@ require (
   github.com/Pallinder/go-randomdata v1.2.0
   github.com/avast/retry-go v3.0.0+incompatible
   github.com/awslabs/operatorpkg v0.0.0-20241205163410-0fff9f28d115
-  github.com/docker/docker v28.0.0+incompatible
+  github.com/docker/docker v28.0.1+incompatible
   github.com/go-logr/logr v1.4.2
   github.com/imdario/mergo v0.3.16
   github.com/klauspost/compress v1.17.11 // indirect
   github.com/mitchellh/hashstructure/v2 v2.0.2
-  github.com/onsi/ginkgo/v2 v2.22.2
+  github.com/onsi/ginkgo/v2 v2.23.0
   github.com/onsi/gomega v1.36.2
   github.com/patrickmn/go-cache v2.1.0+incompatible
-  github.com/prometheus/client_golang v1.21.0
+  github.com/prometheus/client_golang v1.21.1
   github.com/prometheus/client_model v0.6.1
   github.com/samber/lo v1.49.1
   go.uber.org/multierr v1.11.0
   go.uber.org/zap v1.27.0
-  golang.org/x/text v0.22.0
-  golang.org/x/time v0.10.0
+  golang.org/x/text v0.23.0
+  golang.org/x/time v0.11.0
   k8s.io/api v0.32.2
   k8s.io/apiextensions-apiserver v0.32.2
   k8s.io/apimachinery v0.32.2
@@ -30,7 +30,7 @@ require (
   k8s.io/csi-translation-lib v0.32.2
   k8s.io/klog/v2 v2.130.1
   k8s.io/utils v0.0.0-20241104100929-3ea5e8cea738
-  sigs.k8s.io/controller-runtime v0.20.2
+  sigs.k8s.io/controller-runtime v0.20.3
 )

 require (
@@ -66,7 +66,7 @@ require (
   github.com/spf13/cobra v1.8.1 // indirect
   github.com/spf13/pflag v1.0.6 // indirect
   github.com/x448/float16 v0.8.4 // indirect
-  golang.org/x/net v0.35.0 // indirect
+  golang.org/x/net v0.36.0 // indirect
   golang.org/x/oauth2 v0.24.0 // indirect
   golang.org/x/sys v0.30.0 // indirect
   golang.org/x/term v0.29.0 // indirect
@@ -86,7 +86,7 @@ require (
   github.com/fsnotify/fsnotify v1.7.0 // indirect
   github.com/google/btree v1.1.3 // indirect
   github.com/rogpeppe/go-internal v1.13.1 // indirect
-  golang.org/x/sync v0.11.0 // indirect
+  golang.org/x/sync v0.12.0 // indirect
 )

 retract (

0 commit comments
