Description
Note:
This is a known BUG with a known cause. It also applies to other forms of Disruption that provision replacement nodes, not just Drift. Fixing it requires changes in both kubernetes-sigs/karpenter and the Azure/karpenter-provider-azure repo, so a BUG is being opened in both (tracking the Azure/karpenter-provider-azure side in #600).
Observed Behavior:
Karpenter is only provisioning and replacing 1 Drifted node at a time, even with a Budget of "5".
All 10 nodeclaims were marked as Drifted:
kubectl get nodeclaims -o go-template='{{range $item := .items}}{{with $nodeclaimname := $item.metadata.name}}{{range $condition := $item.status.conditions}}{{if and (eq $condition.type "Drifted") (eq $condition.status "True")}}{{printf "%s\n" $nodeclaimname}}{{end}}{{end}}{{end}}{{end}}'
general-purpose-4d8rn
general-purpose-brt2r
general-purpose-l9pv6
general-purpose-n6xv7
general-purpose-s9rnn
general-purpose-v76nm
general-purpose-vrnbl
general-purpose-vt4mv
general-purpose-wkk62
general-purpose-x6b47
Even after more than a minute, only one new nodeclaim had been created:
kubectl get karpenter
NAME                                       AGE
aksnodeclass.karpenter.azure.com/default   99m

NAME                                           TYPE               CAPACITY    ZONE        NODE                        READY     AGE
nodeclaim.karpenter.sh/general-purpose-4d8rn   Standard_D2ls_v5   on-demand   westus2-3   aks-general-purpose-4d8rn   True      78m
nodeclaim.karpenter.sh/general-purpose-brt2r   Standard_D2ls_v5   on-demand   westus2-1   aks-general-purpose-brt2r   True      70m
nodeclaim.karpenter.sh/general-purpose-l9pv6   Standard_D2ls_v5   on-demand   westus2-3   aks-general-purpose-l9pv6   True      52m
nodeclaim.karpenter.sh/general-purpose-n6xv7   Standard_D2as_v5   on-demand   westus2-1   aks-general-purpose-n6xv7   True      85m
nodeclaim.karpenter.sh/general-purpose-s9rnn   Standard_D2as_v5   on-demand   westus2-2   aks-general-purpose-s9rnn   True      48m
nodeclaim.karpenter.sh/general-purpose-v76nm   Standard_D2ls_v5   on-demand   westus2-1   aks-general-purpose-v76nm   True      68m
nodeclaim.karpenter.sh/general-purpose-vrnbl   Standard_D2as_v5   on-demand   westus2-3   aks-general-purpose-vrnbl   True      82m
nodeclaim.karpenter.sh/general-purpose-vt4mv   Standard_D2ls_v5   on-demand   westus2-3   aks-general-purpose-vt4mv   True      76m
nodeclaim.karpenter.sh/general-purpose-wkk62   Standard_D2ls_v5   on-demand   westus2-1   aks-general-purpose-wkk62   True      67m
nodeclaim.karpenter.sh/general-purpose-x6b47   Standard_D2ls_v5   on-demand   westus2-2   aks-general-purpose-x6b47   True      50m
nodeclaim.karpenter.sh/general-purpose-xf2ss                                                                          Unknown   82s

NAME                                    NODECLASS   NODES   READY   AGE
nodepool.karpenter.sh/general-purpose   default     10      True    98m
And only one node was tainted as disrupted:
kubectl get nodes -o go-template='{{range $item := .items}}{{with $nodename := $item.metadata.name}}{{range $taint := $item.spec.taints}}{{if and (eq $taint.key "karpenter.sh/disrupted") (eq $taint.effect "NoSchedule")}}{{printf "%s\n" $nodename}}{{end}}{{end}}{{end}}{{end}}'
aks-general-purpose-4d8rn
Looking at the logs, we can see that the disruption controller is stalled, waiting on cluster sync:
{
  "level": "DEBUG",
  "time": "2024-12-03T00:20:35.932Z",
  "logger": "controller",
  "caller": "singleton/controller.go:26",
  "message": "waiting on cluster sync",
  "commit": "f1b33a7",
  "controller": "disruption",
  "namespace": "",
  "name": "",
  "reconcileID": "c11d69f3-97eb-4677-8c48-54bccc32a1bf"
}
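For context, this log line comes from a gate at the top of the disruption controller's reconcile loop in kubernetes-sigs/karpenter. A minimal sketch of that gate (types and names are approximations for illustration, not copied from the repo) looks roughly like:

package disruption

import (
	"context"
	"time"

	"sigs.k8s.io/controller-runtime/pkg/log"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"
)

// clusterState stands in for karpenter's in-memory cluster state type.
type clusterState interface {
	Synced(ctx context.Context) bool
}

type Controller struct {
	cluster clusterState
}

func (c *Controller) Reconcile(ctx context.Context) (reconcile.Result, error) {
	// No disruption decisions (including starting the next Drift replacement
	// within the Budget) are made until the in-memory cluster state is synced.
	// While a replacement NodeClaim is still blocked inside the CloudProvider
	// Create call, Synced() stays false, so the controller only requeues here.
	if !c.cluster.Synced(ctx) {
		log.FromContext(ctx).V(1).Info("waiting on cluster sync")
		return reconcile.Result{RequeueAfter: time.Second}, nil
	}
	// ... budget evaluation and candidate selection would happen below ...
	return reconcile.Result{}, nil
}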
This is due to the interaction pattern between kubernetes-sigs/karpenter and Azure/karpenter-provider-azure, which stalls NodeClaim creation at the CloudProvider Create call:
Internally, the Azure provider runs a few LROs (long-running operations) as part of this creation call, polling Azure to ensure the resources have been created correctly and reporting back any issues otherwise. If this polling were skipped, certain types of errors on node creation would be missed, and those errors would then have to wait for the registrationTTL of 15 min to surface, which is unacceptable in these cases.
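As a rough illustration of why Create blocks (a simplified sketch using the azure-sdk-for-go compute client, not the provider's actual code), the creation path boils down to starting an Azure LRO and polling it to completion before returning:

package instance

import (
	"context"

	"github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/compute/armcompute"
)

// createVirtualMachine is a stand-in for the kind of call made from the
// CloudProvider Create path. BeginCreateOrUpdate starts an Azure LRO, and
// PollUntilDone blocks until Azure reports success or failure, so provisioning
// errors surface immediately instead of waiting for the ~15 min registrationTTL.
// The side effect is that Create does not return until the LRO completes,
// which is what keeps the cluster state unsynced and stalls the next replacement.
func createVirtualMachine(ctx context.Context, client *armcompute.VirtualMachinesClient, resourceGroup, name string, vm armcompute.VirtualMachine) (*armcompute.VirtualMachine, error) {
	poller, err := client.BeginCreateOrUpdate(ctx, resourceGroup, name, vm, nil)
	if err != nil {
		return nil, err
	}
	// Blocks the calling goroutine until the long-running operation finishes.
	resp, err := poller.PollUntilDone(ctx, nil)
	if err != nil {
		return nil, err
	}
	return &resp.VirtualMachine, nil
}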
Expected Behavior:
Karpenter should provision nodes asynchronously, up to the Budget for Drift (in this case 5), to replace the Drifted nodes in accordance with the Budget.
Reproduction Steps (Please include YAML):
Deployed a dev version of Azure Karpenter from the HEAD of main (commit de7dee7), using make az-all.
Ran make az-taintsystemnodes to ensure workload pods will only land on Karpenter nodes.
Updated and applied examples/v1/general-purpose.yaml, with karpenter.azure.com/sku-cpu set to less than 3, budgets.nodes set to "5", and consolidateAfter set to Never:
# This example NodePool will provision general purpose instances
---
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: general-purpose
  annotations:
    kubernetes.io/description: "General purpose NodePool"
spec:
  disruption:
    consolidateAfter: Never
    budgets:
      - nodes: "5"
  # Optional: Uncomment if you want to put a cap on the max resources available for provisioning
  # limits:
  #   cpu: "30"
  template:
    metadata:
      labels:
        # required for Karpenter to predict overhead from cilium DaemonSet
        kubernetes.azure.com/ebpf-dataplane: cilium
    spec:
      nodeClassRef:
        group: karpenter.azure.com
        kind: AKSNodeClass
        name: default
      startupTaints:
        # https://karpenter.sh/docs/concepts/nodepools/#cilium-startup-taint
        - key: node.cilium.io/agent-not-ready
          effect: NoExecute
          value: "true"
      expireAfter: Never
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
        - key: karpenter.azure.com/sku-family
          operator: In
          values: [D]
        # Optional: Uncomment if you want to add a restriction on max sku cpus.
        # Useful for ensuring karpenter provisions multiple nodes for feature testing.
        - key: karpenter.azure.com/sku-cpu
          operator: Lt
          values: ["3"]
---
apiVersion: karpenter.azure.com/v1alpha2
kind: AKSNodeClass
metadata:
  name: default
  annotations:
    kubernetes.io/description: "General purpose AKSNodeClass for running Ubuntu2204 nodes"
spec:
  imageFamily: Ubuntu2204
Updated and applied examples/workloads/inflate.yaml, with replicas set to 10 and cpu requests set to 1.1:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 10
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      containers:
        - image: mcr.microsoft.com/oss/kubernetes/pause:3.6
          name: inflate
          resources:
            requests:
              cpu: "1.1"
              memory: 256M
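Before triggering the drift, it can help to confirm the 10 replicas have spread across the 10 Karpenter nodes, for example with:

kubectl get pods -l app=inflate -o wide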
Updated imageFamily to AzureLinux in the AKSNodeClass, to trigger a Drift. One way to apply that change without re-editing the example file is shown below.
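For reference, a merge patch along these lines works as well (either approach triggers the same Drift):

kubectl patch aksnodeclass default --type merge -p '{"spec":{"imageFamily":"AzureLinux"}}'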
Versions:
- Chart Version: Using a dev build of the Azure provider on HEAD of main, with commit d84e7c8 (sigs.k8s.io/karpenter: v1.0.5)
- Kubernetes Version (kubectl version): v1.29.9
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment