Group and queue nodes for termination

**Describe the feature**
I'd like NTH to be able to group nodes (similar to the Cluster Autoscaler's `--balance-similar-node-groups`) and to process at most `n` nodes per group at a time (this can still be constrained by the workers configuration).
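To make the idea concrete, here is a rough sketch of the queueing behaviour. It is written in Python for brevity and is not NTH code: the class name `GroupedDrainQueue`, its methods, and the `per_group_limit` and `workers` parameters are all hypothetical names chosen for illustration. It assumes each node already carries a group key (for example its ASG name).

```python
from collections import defaultdict, deque


class GroupedDrainQueue:
    """Hypothetical sketch: queue nodes per group and cap concurrent drains."""

    def __init__(self, per_group_limit, workers):
        self.per_group_limit = per_group_limit  # the `n` from this request
        self.workers = workers                  # global cap, standing in for the workers setting
        self.pending = defaultdict(deque)       # group -> nodes waiting to be drained
        self.in_flight = defaultdict(set)       # group -> nodes currently draining

    def enqueue(self, group, node):
        """Queue a node for termination under its group key."""
        self.pending[group].append(node)

    def next_batch(self):
        """Return the (group, node) pairs that may start draining now."""
        started = []
        active = sum(len(nodes) for nodes in self.in_flight.values())
        for group in list(self.pending):
            queue = self.pending[group]
            # Start nodes only while both the global and per-group caps allow it.
            while (queue
                   and active < self.workers
                   and len(self.in_flight[group]) < self.per_group_limit):
                node = queue.popleft()
                self.in_flight[group].add(node)
                started.append((group, node))
                active += 1
        return started

    def done(self, group, node):
        """Mark a drain as finished, freeing a slot in its group."""
        self.in_flight[group].discard(node)
```

With `per_group_limit=1` and `workers=2`, two nodes queued for one group and one node for another would start one node from each group; the second node of the first group only starts after `done` is called for the first. Groups are visited in insertion order here, so a real implementation would probably want a fairer policy and a priority path for time-critical events.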

I assume that v2 would be designed around this kind of concept, but I think it'd be worth doing in v1, provided it wouldn't take too much effort.

**Is the feature request related to a problem?**
When using NTH to manage ASG instance refresh events, it is very easy to get a cluster into a blocking race condition: pods are terminated off different nodes at the same time, and PDBs then prevent any of those nodes from fully shutting down. This results in hard terminations and general cluster instability.

**Describe alternatives you've considered**
Using a single worker would work, but it would be too slow to respond to time-critical events, and even for instance refresh it could be too slow for good usability.


Group and queue nodes for termination #576
