Failed Pods with system critical priority are not removed by RemoveFailedPods plugin #1775

@alex-berger

What version of descheduler are you using?

descheduler version: v0.34.0

Does this issue reproduce with the latest release?
yes

Which descheduler CLI options are you using?

        - "--policy-config-file=/policy-dir/policy.yaml"
        - "--v=3"

Please provide a copy of your descheduler policy config file

apiVersion: "descheduler/v1alpha2"
kind: "DeschedulerPolicy"
maxNoOfPodsToEvictPerNamespace: 5
maxNoOfPodsToEvictTotal: 30
profiles:
- name: default
  pluginConfig:
  - args:
      podProtections:
        defaultDisabled:
        - PodsWithLocalStorage
        - FailedBarePods
        extraEnabled: null
    name: DefaultEvictor
  - args:
      includingInitContainers: true
      podRestartThreshold: 20
    name: RemovePodsHavingTooManyRestarts
  - name: RemovePodsViolatingNodeTaints
  - name: RemovePodsViolatingInterPodAntiAffinity
  - args:
      constraints:
      - DoNotSchedule
    name: RemovePodsViolatingTopologySpreadConstraint
  - args:
      nodeAffinityType:
      - requiredDuringSchedulingIgnoredDuringExecution
    name: RemovePodsViolatingNodeAffinity
  - args:
      excludeOwnerKinds:
      - Job
      minPodLifetimeSeconds: 120
      reasons:
      - Evicted
      - Shutdown
      - NodeLost
      - NodeAffinity
      - UnexpectedAdmissionError
      - Preempting
    name: RemoveFailedPods
  plugins:
    balance:
      enabled: []
    deschedule:
      enabled:
      - RemovePodsHavingTooManyRestarts
      - RemovePodsViolatingNodeAffinity
      - RemovePodsViolatingInterPodAntiAffinity
      - RemoveFailedPods

What k8s version are you using (kubectl version)?

kubectl version Output
$ kubectl version
Client Version: v1.33.5
Kustomize Version: v5.6.0
Server Version: v1.33.5-eks-3025e55

What did you do?

What did you expect to see?

Failed Pods are removed, including those with system critical priority.

While not evicting Pods with system critical priority makes sense for eviction plugins, this filter logic might not make sense for the RemoveFailedPods plugin. The RemoveFailedPods plugin only deals with Failed Pods, that is, Pods which are no longer running, so removing them should not break anything.

What did you see instead?

Not all Failed Pods are removed. Failed Pods with system critical priority are not removed.

I1117 09:24:03.412833       1 defaultevictor.go:356] "Pod fails the following checks" plugin="DefaultEvictor" ExtensionPoint="Filter" pod="kube-system/cert-manager-fb975885b-clf87" checks="[pod has system critical priority and is protected against eviction, pod has higher priority than specified priority class threshold]"

I suspect this is caused by https://github.com/kubernetes-sigs/descheduler/blob/v0.34.0/pkg/framework/plugins/removefailedpods/failedpods.go#L63

	// We can combine Filter and PreEvictionFilter since for this strategy it does not matter where we run PreEvictionFilter
	podFilter, err := podutil.NewOptions().
		WithFilter(podutil.WrapFilterFuncs(handle.Evictor().Filter, handle.Evictor().PreEvictionFilter)). // <- maybe we should remove this line
		WithNamespaces(includedNamespaces).
		WithoutNamespaces(excludedNamespaces).
		WithLabelSelector(failedPodsArgs.LabelSelector).
		BuildFilterFunc()

Note that disabling the SystemCriticalPods default pod protection is not an acceptable workaround, as it would also allow eviction of running Pods with system critical priority. The goal is to keep system critical Pods protected from eviction while still allowing removal of Failed Pods.
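
To illustrate the behavior I am asking for, here is a minimal, self-contained sketch. It does not use the descheduler packages: `Pod`, `FilterFunc`, `wrapFilterFuncs`, `evictorFilter` and `skipForFailed` are all hypothetical stand-ins that I made up for this example, and I have not verified how the real evictor and plugin interact beyond the snippet quoted above. The idea is only that the protection check would be bypassed for Pods in the Failed phase and still applied to everything else.

```go
package main

// Pod is a hypothetical, stripped-down stand-in for a Kubernetes Pod.
type Pod struct {
	Name           string
	Phase          string // for example "Running" or "Failed"
	SystemCritical bool   // true when the Pod has a system critical priority
}

// FilterFunc reports whether a Pod is eligible for removal.
type FilterFunc func(p Pod) bool

// wrapFilterFuncs combines filters with logical AND. It is only similar in
// spirit to podutil.WrapFilterFuncs; it is not the real implementation.
func wrapFilterFuncs(filters ...FilterFunc) FilterFunc {
	return func(p Pod) bool {
		for _, f := range filters {
			if f != nil && !f(p) {
				return false
			}
		}
		return true
	}
}

// evictorFilter is an assumed stand-in for the SystemCriticalPods protection:
// it rejects every Pod with system critical priority.
func evictorFilter(p Pod) bool {
	return !p.SystemCritical
}

// skipForFailed bypasses the wrapped filter for Pods in the Failed phase and
// applies it unchanged to all other Pods.
func skipForFailed(f FilterFunc) FilterFunc {
	return func(p Pod) bool {
		if p.Phase == "Failed" {
			return true
		}
		return f(p)
	}
}
```

With these stand-ins, `wrapFilterFuncs(evictorFilter)` rejects a Failed Pod with system critical priority (the behavior reported here), while `skipForFailed(evictorFilter)` accepts it and still rejects a Running Pod with system critical priority.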

Metadata

Labels: kind/bug (categorizes issue or PR as related to a bug)
