
maxSurge for node draining or how to meet availability requirements when draining nodes by adding pods #114877

Open
@txomon

Description

What would you like to be added?

A way to drain nodes by adding more pods elsewhere to meet PodDisruptionBudgets.

Why is this needed?

Currently, a Deployment can be configured with a maxSurge so that a new release can be rolled out without dropping below the number of replicas the Deployment requires. This parameter allows extra pods to be added before the old ones are removed, so that the required "replicas" count is always met as a minimum.
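For reference, a minimal sketch of where maxSurge lives today (the name and image are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                # illustrative name
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # up to 1 pod above 'replicas' may exist during a rollout
      maxUnavailable: 0    # never drop below 'replicas' available pods during a rollout
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25    # illustrative image
```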

To my knowledge, this feature is only available when releasing new versions of an application; however, it would be extremely useful when draining nodes as well.

Usual cluster maintenance is done by adding new nodes before removing old ones. This means all the pods on the old node need to be evicted, and there is usually room on the new node for one more pod of each workload from the old node. Current solutions such as the PodDisruptionBudget or the Eviction API try to make sure that subtracting pods from the current count doesn't break anything; the possibility of temporarily having one extra pod of each Deployment is not contemplated at the moment.

This request is asking for the ability to use a surplus of pods to meet all constraints for safe eviction.
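Purely to illustrate the shape of the request (this is hypothetical; no such field exists in the Kubernetes API today), the drain-time surge could be imagined as a knob next to the existing disruption constraints:

```yaml
# Hypothetical sketch only: the 'evictionSurge' field below does not exist.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 1
  evictionSurge: 1         # hypothetical: allow one replacement pod to be created
                           # before the evicted pod is removed, mirroring maxSurge
  selector:
    matchLabels:
      app: web
```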

Some side notes to stress the importance: when operating evictions on large workloads, either the absence of PDBs or PDBs with minAvailable/maxUnavailable settings work fine. However, when moving Deployments with 1 replica, or HPA-controlled Deployments that are currently scaled down far enough, the problem is aggravated and can only be solved through a few inefficient means; this is exacerbated when node maintenance is done automatically (as on GKE and other cloud services).
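As a concrete sketch of that worst case (names are illustrative), a single-replica Deployment whose PDB requires one available pod cannot be drained at all: the Eviction API keeps rejecting the eviction because it would violate minAvailable, yet nothing is allowed to start a replacement pod first.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: singleton          # illustrative name
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: singleton
  template:
    metadata:
      labels:
        app: singleton
    spec:
      containers:
        - name: app
          image: nginx:1.25
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: singleton-pdb
spec:
  minAvailable: 1          # evicting the only pod would violate this, so the drain blocks
  selector:
    matchLabels:
      app: singleton
```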

To be clear, this limitation should only be counted against Deployments with strategy.type=RollingUpdate.

Ways to deal with this situation currently:

  1. Set minReplicas/replicas to >1 and add a PDB with maxUnavailable=1, even when it's known that the autoscaler, if in use, usually sits at the lower end of its range (see the sketch after this list). Pros: there is no downtime. Cons: waste of resources.
  2. Do nothing and deal with the eventual downtime. Pros: no waste of resources. Cons: there is downtime in the Deployment.
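A sketch of workaround 1 under illustrative names: keep the workload at two or more replicas at all times and cap disruptions at one, so an eviction always leaves at least one pod running.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2           # kept above 1 purely so that draining can proceed
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  maxUnavailable: 1        # at most one pod may be disrupted at a time
  selector:
    matchLabels:
      app: web
```

The cost is exactly the "waste of resources" noted above: one replica more than the workload needs, kept around only for maintenance windows.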


Labels

kind/feature: Categorizes issue or PR as related to a new feature.
needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
sig/apps: Categorizes an issue or PR as relevant to SIG Apps.
sig/node: Categorizes an issue or PR as relevant to SIG Node.
