Merge branch 'main' into multi-dns
woehrl01 authored Jan 19, 2025
2 parents 1197759 + 69ed8b9 commit 62cb956
Showing 64 changed files with 2,001 additions and 1,701 deletions.
1 change: 1 addition & 0 deletions ADOPTERS.md
Original file line number Diff line number Diff line change
@@ -59,5 +59,6 @@ If you are open to others contacting you about your use of Karpenter on Slack, a
| Whoosh | Using Karpenter to scale the EKS clusters for many purposes | `@vainkop` | [Whoosh](https://whoosh.bike) |
| Next Insurance | Using Karpenter to manage the nodes in all our EKS clusters, including dev and prod, on demand and spots | `@moshebs` | [Homepage](https://www.nextinsurance.com)|
| Grover Group GmbH | We use Karpenter for efficient and cost effective scaling of our nodes in all of our EKS clusters | `@suraj2410` | [Homepage](https://www.grover.com/de-en) & [Engineering Techblog](https://engineering.grover.com)|
| Legit Security | We run Karpenter across all our EKS clusters to ensure efficient and cost-effective scaling across our infrastructure | `@Tal Balash`, `@Matan Ryngler` | [Homepage](https://www.legitsecurity.com)|
| Logz.io | Using Karpenter in all of our EKS clusters for efficient and cost effective scaling of all our K8s workloads | `@pincher95`, `@Samplify` | [Homepage](https://logz.io/)|
| X3M ads | We have been using Karpenter for (almost) all our workloads since 2023 | `@mreparaz`, `@fmansilla`, `@mrmartinez95` | [Homepage](https://x3mads.com) |
4 changes: 4 additions & 0 deletions charts/karpenter-crd/templates/karpenter.sh_nodeclaims.yaml
@@ -38,6 +38,10 @@ spec:
- jsonPath: .metadata.creationTimestamp
name: Age
type: date
- jsonPath: .status.imageID
name: ImageID
priority: 1
type: string
- jsonPath: .status.providerID
name: ID
priority: 1
3 changes: 3 additions & 0 deletions cmd/controller/main.go
@@ -21,6 +21,7 @@ import (

"sigs.k8s.io/karpenter/pkg/cloudprovider/metrics"
corecontrollers "sigs.k8s.io/karpenter/pkg/controllers"
"sigs.k8s.io/karpenter/pkg/controllers/state"
coreoperator "sigs.k8s.io/karpenter/pkg/operator"
)

@@ -36,6 +37,7 @@ func main() {
op.SecurityGroupProvider,
)
cloudProvider := metrics.Decorate(awsCloudProvider)
clusterState := state.NewCluster(op.Clock, op.GetClient(), cloudProvider)

op.
WithControllers(ctx, corecontrollers.NewControllers(
@@ -45,6 +47,7 @@ func main() {
op.GetClient(),
op.EventRecorder,
cloudProvider,
clusterState,
)...).
WithControllers(ctx, controllers.NewControllers(
ctx,
58 changes: 30 additions & 28 deletions designs/interruption-handling.md
@@ -29,15 +29,17 @@ There are two ways in-which Spot interruption notifications and Rebalance Recomm
EC2 IMDS is an HTTP API that can only be locally accessed from an EC2 instance.

```
`curl 169.254.169.254/latest/meta-data/spot/instance-action
# Termination Check
curl 169.254.169.254/latest/meta-data/spot/instance-action
{
"action": "terminate",
"time": "2022-07-11T17:11:44Z"
}
curl 169.254.169.254``/``latest``/``meta``-``data``/``events``/``recommendations``/``rebalance`
`{`
` ``"noticeTime"``:`` ``"2022-07-16T19:18:24Z"`
# Rebalance Check
curl 169.254.169.254/latest/meta-data/events/recommendations/rebalance
{
"noticeTime": "2022-07-16T19:18:24Z"
}
```
@@ -47,19 +49,19 @@ curl 169.254.169.254``/``latest``/``meta``-``data``/``events``/``recommendations
EventBridge is an Event Bus service within AWS that allows users to set rules on events to capture and then target destinations for those events. Relevant targets for Spot interruption notifications include SQS, Lambda, and EC2-Terminate-Instance.

```
`# Example spot interruption notification EventBridge rule`
`$ aws events put``-``rule \`
` ``--``name ``MyK8sSpotTermRule`` \`
` ``--``event``-``pattern ``"{\"source\": [\"aws.ec2\"],\"detail-type\": [\"EC2 Spot Instance Interruption\"]}"`
`# Example rebalance recommendation EventBridge rule``
$ aws events put-rule \
--name MyK8sRebalanceRule \
--event-pattern "{\"source\": [\"aws.ec2\"],\"detail-type\": [\"EC2 Instance Rebalance Recommendation\"]}"
`` `
`# Example targeting an SQS queue`
`$ aws events put``-``targets ``--``rule ``MyK8sSpotTermRule`` \`
` ``--``targets ``"Id"``=``"1"``,``"Arn"``=``"arn:aws:sqs:us-east-1:123456789012:MyK8sTermQueue"`` `
# Example spot interruption notification EventBridge rule
aws events put-rule \
--name MyK8sSpotTermRule \
--event-pattern "{\"source\": [\"aws.ec2\"],\"detail-type\": [\"EC2 Spot Instance Interruption\"]}"
# Example rebalance recommendation EventBridge rule
aws events put-rule \
--name MyK8sRebalanceRule \
--event-pattern "{\"source\": [\"aws.ec2\"],\"detail-type\": [\"EC2 Instance Rebalance Recommendation\"]}"
# Example targeting an SQS queue
aws events put-targets --rule MyK8sSpotTermRule \
--targets "Id=1,Arn=arn:aws:sqs:us-east-1:123456789012:MyK8sTermQueue"
```


@@ -113,17 +115,17 @@ SQS exposes a VPC Endpoint which will fulfill the isolated VPC use-case.
Dynamically creating the SQS infrastructure and EventBridge rules means that Karpenter’s IAM role would need permissions to SQS and EventBridge:

```
`"sqs:GetQueueUrl",`
`"sqs:ListQueues"``,`
`"sqs:ReceiveMessage"``,`
`"sqs:CreateQueue"``,`
`"sqs:DeleteMessage"``,`
`"events:ListRules",`
"`events:DescribeRule`",
"events:PutRule",
"sqs:GetQueueUrl",
"sqs:ListQueues",
"sqs:ReceiveMessage",
"sqs:CreateQueue",
"sqs:DeleteMessage",
"events:ListRules",
"events:DescribeRule",
"events:PutRule",
"events:PutTargets",
"`events:DeleteRule`",
`"events:RemoveTargets"`
"events:DeleteRule",
"events:RemoveTargets"
```

The policy can be setup with a predefined name based on the cluster name. For example, `karpenter-events-${CLUSTER_NAME}` which would allow for a more constrained resource policy.
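A minimal Go sketch of the naming scheme follows. It assumes the predefined name is also used for the SQS queue, which the text does not state explicitly, and therefore checks the result against the SQS queue-name constraint (up to 80 characters: letters, digits, hyphens, underscores). The helper `eventsResourceName` is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// sqsNamePattern reflects the constraint for standard SQS queue names:
// 1 to 80 characters drawn from letters, digits, hyphens, and underscores.
var sqsNamePattern = regexp.MustCompile(`^[A-Za-z0-9_-]{1,80}$`)

// eventsResourceName derives the predefined name from the cluster name,
// following the karpenter-events-${CLUSTER_NAME} form. It is a hypothetical
// helper, and applying the SQS limit to it is an assumption of this sketch.
func eventsResourceName(clusterName string) (string, error) {
	name := "karpenter-events-" + clusterName
	if !sqsNamePattern.MatchString(name) {
		return "", fmt.Errorf("%q is not a valid SQS queue name", name)
	}
	return name, nil
}

func main() {
	name, err := eventsResourceName("prod-us-east-1")
	if err != nil {
		panic(err)
	}
	fmt.Println(name)
}
```

Because the name is deterministic, an IAM resource constraint can reference it directly instead of using a wildcard.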
@@ -144,7 +146,7 @@ The simplest option is to include [NTH IMDS mode](https://quip-amazon.com/EUgPAQ

**3B: Build a System Daemon (nthd)**

An option to transparently handle spot interruption notifications is to build a system daemon in a separate repo that performs the IMDS monitoring and triggers an instance shutdown when an interruption is observed. This would rely on K8s’ new [graceful shutdown](https://kubernetes.io/docs/concepts/architecture/nodes/#graceful-node-shutdown) feature which went beta in K8s 1.21.
An option to transparently handle spot interruption notifications is to build a system daemon in a separate repo that performs the IMDS monitoring and triggers an instance shutdown when an interruption is observed. This would rely on K8s’ new [graceful shutdown](https://kubernetes.io/docs/concepts/cluster-administration/node-shutdown/#graceful-node-shutdown) feature which went beta in K8s 1.21.

With graceful shutdown, the kubelet registers [systemd-inhibitor-locks](https://www.freedesktop.org/wiki/Software/systemd/inhibit/) to stop the shutdown flow until locks are relinquished, which in this case would be when the kubelet has drained pods off of the node. Two parameters were added to the kubelet to tune the drain timeouts: `shutdownGracePeriod` & `shutdownGracePeriodCriticalPods`

74 changes: 38 additions & 36 deletions go.mod
@@ -4,20 +4,21 @@ go 1.23.2

require (
github.com/Pallinder/go-randomdata v1.2.0
github.com/PuerkitoBio/goquery v1.10.0
github.com/PuerkitoBio/goquery v1.10.1
github.com/avast/retry-go v3.0.0+incompatible
github.com/aws/aws-sdk-go-v2 v1.32.6
github.com/aws/aws-sdk-go-v2/config v1.28.6
github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.16.21
github.com/aws/aws-sdk-go-v2/service/ec2 v1.197.0
github.com/aws/aws-sdk-go-v2/service/eks v1.54.0
github.com/aws/aws-sdk-go-v2/service/fis v1.31.2
github.com/aws/aws-sdk-go-v2/service/iam v1.38.2
github.com/aws/aws-sdk-go-v2/service/pricing v1.32.7
github.com/aws/aws-sdk-go-v2/service/sqs v1.37.2
github.com/aws/aws-sdk-go-v2/service/ssm v1.56.1
github.com/aws/aws-sdk-go-v2/service/sts v1.33.2
github.com/aws/aws-sdk-go-v2/service/timestreamwrite v1.29.8
github.com/aws/amazon-vpc-resource-controller-k8s v1.6.3
github.com/aws/aws-sdk-go-v2 v1.32.7
github.com/aws/aws-sdk-go-v2/config v1.28.7
github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.16.22
github.com/aws/aws-sdk-go-v2/service/ec2 v1.198.1
github.com/aws/aws-sdk-go-v2/service/eks v1.56.0
github.com/aws/aws-sdk-go-v2/service/fis v1.31.3
github.com/aws/aws-sdk-go-v2/service/iam v1.38.3
github.com/aws/aws-sdk-go-v2/service/pricing v1.32.8
github.com/aws/aws-sdk-go-v2/service/sqs v1.37.4
github.com/aws/aws-sdk-go-v2/service/ssm v1.56.2
github.com/aws/aws-sdk-go-v2/service/sts v1.33.3
github.com/aws/aws-sdk-go-v2/service/timestreamwrite v1.29.9
github.com/aws/karpenter-provider-aws/tools/kompat v0.0.0-20240410220356-6b868db24881
github.com/aws/smithy-go v1.22.1
github.com/awslabs/amazon-eks-ami/nodeadm v0.0.0-20240229193347-cfab22a10647
@@ -26,8 +27,8 @@ require (
github.com/imdario/mergo v0.3.16
github.com/jonathan-innis/aws-sdk-go-prometheus v0.1.1
github.com/mitchellh/hashstructure/v2 v2.0.2
github.com/onsi/ginkgo/v2 v2.22.0
github.com/onsi/gomega v1.36.1
github.com/onsi/ginkgo/v2 v2.22.2
github.com/onsi/gomega v1.36.2
github.com/patrickmn/go-cache v2.1.0+incompatible
github.com/pelletier/go-toml/v2 v2.2.3
github.com/prometheus/client_golang v1.20.5
@@ -42,29 +43,30 @@ require (
k8s.io/client-go v0.32.0
k8s.io/klog/v2 v2.130.1
k8s.io/utils v0.0.0-20241104100929-3ea5e8cea738
sigs.k8s.io/controller-runtime v0.19.3
sigs.k8s.io/karpenter v1.1.1
sigs.k8s.io/controller-runtime v0.19.4
sigs.k8s.io/karpenter v1.1.2-0.20250117235835-ff44f7325bf0
sigs.k8s.io/yaml v1.4.0
)

require (
github.com/Masterminds/semver/v3 v3.2.1 // indirect
github.com/andybalholm/cascadia v1.3.2 // indirect
github.com/aws/aws-sdk-go-v2/credentials v1.17.47 // indirect
github.com/aws/aws-sdk-go-v2/internal/configsources v1.3.25 // indirect
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.6.25 // indirect
github.com/andybalholm/cascadia v1.3.3 // indirect
github.com/aws/aws-sdk-go-v2/credentials v1.17.48 // indirect
github.com/aws/aws-sdk-go-v2/internal/configsources v1.3.26 // indirect
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.6.26 // indirect
github.com/aws/aws-sdk-go-v2/internal/ini v1.8.1 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.12.1 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/endpoint-discovery v1.10.6 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.12.6 // indirect
github.com/aws/aws-sdk-go-v2/service/sso v1.24.7 // indirect
github.com/aws/aws-sdk-go-v2/service/ssooidc v1.28.6 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/endpoint-discovery v1.10.7 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.12.7 // indirect
github.com/aws/aws-sdk-go-v2/service/sso v1.24.8 // indirect
github.com/aws/aws-sdk-go-v2/service/ssooidc v1.28.7 // indirect
github.com/beorn7/perks v1.0.1 // indirect
github.com/cespare/xxhash/v2 v2.3.0 // indirect
github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect
github.com/emicklei/go-restful/v3 v3.11.0 // indirect
github.com/evanphx/json-patch v5.7.0+incompatible // indirect
github.com/evanphx/json-patch/v5 v5.9.0 // indirect
github.com/fsnotify/fsnotify v1.7.0 // indirect
github.com/fxamacker/cbor/v2 v2.7.0 // indirect
github.com/go-logr/logr v1.4.2 // indirect
github.com/go-openapi/jsonpointer v0.21.0 // indirect
@@ -73,10 +75,10 @@ require (
github.com/go-task/slim-sprig/v3 v3.0.0 // indirect
github.com/gogo/protobuf v1.3.2 // indirect
github.com/golang/protobuf v1.5.4 // indirect
github.com/google/gnostic-models v0.6.8 // indirect
github.com/google/gnostic-models v0.6.9-0.20230804172637-c7be7c783f49 // indirect
github.com/google/go-cmp v0.6.0 // indirect
github.com/google/gofuzz v1.2.0 // indirect
github.com/google/pprof v0.0.0-20241029153458-d1b30febd7db // indirect
github.com/google/pprof v0.0.0-20241210010833-40e02aabc2ad // indirect
github.com/google/uuid v1.6.0 // indirect
github.com/inconshreveable/mousetrap v1.1.0 // indirect
github.com/jmespath/go-jmespath v0.4.0 // indirect
@@ -99,21 +101,21 @@ require (
github.com/spf13/cobra v1.8.1 // indirect
github.com/spf13/pflag v1.0.5 // indirect
github.com/x448/float16 v0.8.4 // indirect
golang.org/x/net v0.30.0 // indirect
golang.org/x/net v0.33.0 // indirect
golang.org/x/oauth2 v0.23.0 // indirect
golang.org/x/sys v0.26.0 // indirect
golang.org/x/term v0.25.0 // indirect
golang.org/x/text v0.20.0 // indirect
golang.org/x/time v0.8.0 // indirect
golang.org/x/tools v0.26.0 // indirect
golang.org/x/sys v0.28.0 // indirect
golang.org/x/term v0.27.0 // indirect
golang.org/x/text v0.21.0 // indirect
golang.org/x/time v0.9.0 // indirect
golang.org/x/tools v0.28.0 // indirect
gomodules.xyz/jsonpatch/v2 v2.4.0 // indirect
google.golang.org/protobuf v1.35.1 // indirect
google.golang.org/protobuf v1.36.1 // indirect
gopkg.in/evanphx/json-patch.v4 v4.12.0 // indirect
gopkg.in/inf.v0 v0.9.1 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
k8s.io/cloud-provider v0.31.3 // indirect
k8s.io/cloud-provider v0.32.0 // indirect
k8s.io/component-base v0.32.0 // indirect
k8s.io/csi-translation-lib v0.31.3 // indirect
k8s.io/csi-translation-lib v0.32.0 // indirect
k8s.io/kube-openapi v0.0.0-20241105132330-32ad38e42d3f // indirect
sigs.k8s.io/json v0.0.0-20241010143419-9aa6b5e7a4b3 // indirect
sigs.k8s.io/structured-merge-diff/v4 v4.4.2 // indirect