Description
What version of descheduler are you using?
descheduler version: v0.34.0
Does this issue reproduce with the latest release?
yes
Which descheduler CLI options are you using?
- "--policy-config-file=/policy-dir/policy.yaml"
- "--v=3"
Please provide a copy of your descheduler policy config file
apiVersion: "descheduler/v1alpha2"
kind: "DeschedulerPolicy"
maxNoOfPodsToEvictPerNamespace: 5
maxNoOfPodsToEvictTotal: 30
profiles:
- name: default
  pluginConfig:
  - args:
      podProtections:
        defaultDisabled:
        - PodsWithLocalStorage
        - FailedBarePods
        extraEnabled: null
    name: DefaultEvictor
  - args:
      includingInitContainers: true
      podRestartThreshold: 20
    name: RemovePodsHavingTooManyRestarts
  - name: RemovePodsViolatingNodeTaints
  - name: RemovePodsViolatingInterPodAntiAffinity
  - args:
      constraints:
      - DoNotSchedule
    name: RemovePodsViolatingTopologySpreadConstraint
  - args:
      nodeAffinityType:
      - requiredDuringSchedulingIgnoredDuringExecution
    name: RemovePodsViolatingNodeAffinity
  - args:
      excludeOwnerKinds:
      - Job
      minPodLifetimeSeconds: 120
      reasons:
      - Evicted
      - Shutdown
      - NodeLost
      - NodeAffinity
      - UnexpectedAdmissionError
      - Preempting
    name: RemoveFailedPods
  plugins:
    balance:
      enabled: []
    deschedule:
      enabled:
      - RemovePodsHavingTooManyRestarts
      - RemovePodsViolatingNodeAffinity
      - RemovePodsViolatingInterPodAntiAffinity
      - RemoveFailedPods
What k8s version are you using (kubectl version)?
kubectl version Output
$ kubectl version
Client Version: v1.33.5
Kustomize Version: v5.6.0
Server Version: v1.33.5-eks-3025e55
What did you do?
What did you expect to see?
Failed Pods are removed, including those with system critical priority.
While not evicting Pods with system critical priority makes sense for eviction plugins, this filter logic might not make sense for the RemoveFailedPods plugin. RemoveFailedPods deals with Failed Pods, that is, Pods that are no longer running, so removing them should not break anything.
What did you see instead?
Not all Failed Pods are removed. Failed Pods with system critical priority are not removed.
I1117 09:24:03.412833 1 defaultevictor.go:356] "Pod fails the following checks" plugin="DefaultEvictor" ExtensionPoint="Filter" pod="kube-system/cert-manager-fb975885b-clf87" checks="[pod has system critical priority and is protected against eviction, pod has higher priority than specified priority class threshold]"
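The first check in the log line above is consistent with a priority-based protection that never looks at the Pod's phase. The following is a minimal, self-contained sketch of that behaviour. The `pod` type and the `protectionChecks` function are hypothetical stand-ins, not the descheduler's actual code, and the threshold is assumed to be the Kubernetes `SystemCriticalPriority` value (2000000000). Only the first of the two logged checks is modelled.

```go
package main

import "fmt"

// systemCriticalPriority is assumed to match the Kubernetes
// SystemCriticalPriority constant; Pods at or above it count as system critical.
const systemCriticalPriority int32 = 2_000_000_000

// pod is a hypothetical, trimmed-down stand-in for a Kubernetes Pod.
type pod struct {
	name     string
	phase    string // e.g. "Running" or "Failed"
	priority int32
}

// protectionChecks returns the reasons a Pod is protected in this model.
// It deliberately ignores p.phase, which is the behaviour this issue describes.
func protectionChecks(p pod) []string {
	var reasons []string
	if p.priority >= systemCriticalPriority {
		reasons = append(reasons,
			"pod has system critical priority and is protected against eviction")
	}
	return reasons
}

func main() {
	// A Failed Pod with system critical priority is still reported as protected.
	failed := pod{name: "kube-system/example", phase: "Failed", priority: systemCriticalPriority}
	fmt.Println(protectionChecks(failed))
}
```

In this model a Failed Pod is rejected for the same reason as a Running one, which matches what the log shows for a Pod handled by RemoveFailedPods.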
I suspect this is caused by https://github.com/kubernetes-sigs/descheduler/blob/v0.34.0/pkg/framework/plugins/removefailedpods/failedpods.go#L63
// We can combine Filter and PreEvictionFilter since for this strategy it does not matter where we run PreEvictionFilter
podFilter, err := podutil.NewOptions().
	WithFilter(podutil.WrapFilterFuncs(handle.Evictor().Filter, handle.Evictor().PreEvictionFilter)). // <- maybe we should remove this line
	WithNamespaces(includedNamespaces).
	WithoutNamespaces(excludedNamespaces).
	WithLabelSelector(failedPodsArgs.LabelSelector).
	BuildFilterFunc()

Note, disabling the SystemCriticalPods default pod protection is not an acceptable work-around, as this would also allow for eviction of Pods with system critical priority. The goal is to protect system critical Pods from eviction but to allow removal of Failed Pods.
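The goal stated in the note can be illustrated with a small, self-contained model. This is only a sketch under assumptions: `pod`, `filterFunc`, `evictorFilter`, `wrapFilters`, and `skipForFailed` are hypothetical names, `wrapFilters` is assumed to combine filters the way `podutil.WrapFilterFuncs` does (a Pod passes only if every filter passes), and the phase-aware bypass is one possible shape of a fix, not necessarily the change the maintainers would choose.

```go
package main

import "fmt"

// systemCriticalPriority is assumed to match the Kubernetes constant.
const systemCriticalPriority int32 = 2_000_000_000

// pod is a hypothetical, trimmed-down stand-in for a Kubernetes Pod.
type pod struct {
	name     string
	phase    string
	priority int32
}

// filterFunc reports whether a Pod may be acted on.
type filterFunc func(p pod) bool

// evictorFilter is a hypothetical stand-in for the evictor's checks:
// it rejects system critical Pods regardless of their phase.
func evictorFilter(p pod) bool { return p.priority < systemCriticalPriority }

// wrapFilters passes a Pod only if every non-nil filter passes.
func wrapFilters(filters ...filterFunc) filterFunc {
	return func(p pod) bool {
		for _, f := range filters {
			if f != nil && !f(p) {
				return false
			}
		}
		return true
	}
}

// skipForFailed bypasses the wrapped filter for Pods that are already Failed
// and keeps it in force for every other phase.
func skipForFailed(f filterFunc) filterFunc {
	return func(p pod) bool { return p.phase == "Failed" || f(p) }
}

func main() {
	current := wrapFilters(evictorFilter)
	proposed := wrapFilters(skipForFailed(evictorFilter))

	pods := []pod{
		{name: "critical-failed", phase: "Failed", priority: systemCriticalPriority},
		{name: "critical-running", phase: "Running", priority: systemCriticalPriority},
	}
	for _, p := range pods {
		fmt.Printf("%s: current=%v proposed=%v\n", p.name, current(p), proposed(p))
	}
}
```

With this shape, a running system critical Pod stays protected under both filters, while a Failed one passes only the phase-aware variant.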