Adds Node Ignore from Deletion annotation (#187)
* Added annotation skip delete. test skeleton

Added unit test for scale down

changed annotation, addressed comments

updated docs

* address docs formatting comments

* fixed grammar
Jacobious52 authored Apr 22, 2020
1 parent 8d2353e commit 8c5f6d2
Showing 5 changed files with 201 additions and 2 deletions.
3 changes: 3 additions & 0 deletions docs/README.md
@@ -19,8 +19,11 @@
- Scale up
- Scale down
- Scale lock
- Tainting of nodes
- Cordoning of nodes
- [**Node Termination**](./node-termination.md)
- Node selection method for termination
- Annotating nodes / Stopping termination of selected nodes
- [**Pod and Node Selectors**](./pod-node-selectors.md)
- Nodes
- Pods
38 changes: 37 additions & 1 deletion docs/node-termination.md
@@ -15,4 +15,40 @@ where we compare the creation timestamps of each node.

This method is useful to ensure there are always new nodes in the cluster. If you want to deploy a configuration change
to your nodes, you can use Escalator to cycle the nodes by terminating the oldest first until all of the nodes are
using the latest configuration.

### Annotating nodes / Stopping termination of selected nodes

In most cases where you want to exclude a node from termination for maintenance, you should first consider the [**cordon function**](./scale-process.md).
Cordoning a node filters it out of calculations **and** scaling processes (tainting and deletion). Additionally, cordoning a node in Kubernetes marks it unschedulable, so it won't accept workloads while in this state.
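
A node can be cordoned with `kubectl cordon <node-name>` and made schedulable again with `kubectl uncordon <node-name>`.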

In cases where cordoning is not acceptable, because you want your node to continue to receive workloads but not be deleted, you can annotate the node to tell Escalator to treat it as normal, except that it must never be deleted.
This means the node will still factor into scaling calculations and be tainted and untainted as the cluster scales, but whenever it would normally be deleted, Escalator will skip it and leave it active.

#### To annotate a node:

**Annotation key:** `atlassian.com/no-delete`

The annotation value can be any **non-empty** string that conforms to the Kubernetes annotation value standards. The value can be used to indicate the reason for the annotation, and it will appear in Escalator's logs.

**Example:**

`kubectl edit node <node-name>`
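
The annotation can also be applied in a single step with `kubectl annotate node <node-name> atlassian.com/no-delete="reason for exclusion"`.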

**Simple example:**
```yaml
apiVersion: v1
kind: Node
metadata:
  annotations:
    atlassian.com/no-delete: "true"
```
**An elaborate reason:**
```yaml
apiVersion: v1
kind: Node
metadata:
  annotations:
    atlassian.com/no-delete: "testing some long running bpf tracing"
```
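
To remove the annotation once the node no longer needs to be protected, delete it with a trailing dash: `kubectl annotate node <node-name> atlassian.com/no-delete-`.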
2 changes: 1 addition & 1 deletion docs/scale-process.md
@@ -77,5 +77,5 @@ Escalator taints are given the `atlassian.com/escalator` key.
Escalator does not use the cordoning command anywhere in its process. This is done to preserve the cordoning command
for system administrators, and Escalator filters cordoned nodes out of its calculations. This way, a faulty or misbehaving node
can be cordoned by the system administrator and debugged or troubleshot without worrying about the node being
tainted and then terminated by Escalator.

25 changes: 25 additions & 0 deletions pkg/controller/scale_down.go
@@ -12,6 +12,13 @@ import (
v1 "k8s.io/api/core/v1"
)

const (
	// NodeEscalatorIgnoreAnnotation is the key of a node annotation that signifies the node
	// should be ignored from ASG deletion. The value can be any non-empty string and is
	// typically used to record the reason. If set, the node will not be deleted, but it can
	// still be tainted and factored into scaling calculations.
	NodeEscalatorIgnoreAnnotation = "atlassian.com/no-delete"
)

// ScaleDown performs the taint and remove node logic
func (c *Controller) ScaleDown(opts scaleOpts) (int, error) {
	removed, err := c.TryRemoveTaintedNodes(opts)
@@ -29,12 +36,30 @@ func (c *Controller) ScaleDown(opts scaleOpts) (int, error) {
	return c.scaleDownTaint(opts)
}

// safeFromDeletion reports whether the node carries a non-empty NodeEscalatorIgnoreAnnotation,
// returning the annotation's value as the reason the node is protected.
func safeFromDeletion(node *v1.Node) (string, bool) {
	for key, val := range node.ObjectMeta.Annotations {
		if key == NodeEscalatorIgnoreAnnotation && val != "" {
			return val, true
		}
	}
	return "", false
}
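
// Note: annotation keys are unique, so the loop above is equivalent to a direct map
// lookup: val, ok := node.ObjectMeta.Annotations[NodeEscalatorIgnoreAnnotation], plus
// a check that val is non-empty.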

// TryRemoveTaintedNodes attempts to remove nodes that:
// * are tainted and empty
// * have passed their grace period
func (c *Controller) TryRemoveTaintedNodes(opts scaleOpts) (int, error) {
	var toBeDeleted []*v1.Node
	for _, candidate := range opts.taintedNodes {

		// Skip any node marked with a non-empty NodeEscalatorIgnoreAnnotation.
		// These nodes are filtered out as late as possible so that the rest of Escalator's
		// scaling calculations remain unaffected: the nodes still exist and use resources,
		// so removing them earlier would cause inconsistencies. The annotation makes a node
		// safe from deletion, not from tainting.
		if why, ok := safeFromDeletion(candidate); ok {
			log.Infof("node %s has escalator ignore annotation %s. Reason: %s. Removing from deletion options", candidate.Name, NodeEscalatorIgnoreAnnotation, why)
			continue
		}

		// If the node has been tainted for longer than the hard period, it is deleted no matter what.
		// If the soft period has passed and the node is empty (excluding daemonsets), it can be deleted.
		taintedTime, err := k8s.GetToBeRemovedTime(candidate)
135 changes: 135 additions & 0 deletions pkg/controller/scale_down_test.go
@@ -365,3 +365,138 @@ func TestControllerTaintOldestN(t *testing.T) {
func TestControllerScaleDown(t *testing.T) {
	t.Skip("test not implemented")
}

func TestController_TryRemoveTaintedNodes(t *testing.T) {

	minNodes := 10
	maxNodes := 20
	nodeGroup := NodeGroupOptions{
		Name:                    "default",
		CloudProviderGroupName:  "default",
		MinNodes:                minNodes,
		MaxNodes:                maxNodes,
		ScaleUpThresholdPercent: 100,
	}
	nodeGroups := []NodeGroupOptions{nodeGroup}

	nodes := test.BuildTestNodes(10, test.NodeOpts{
		CPU:     1000,
		Mem:     1000,
		Tainted: true,
	})

	pods := buildTestPods(10, 1000, 1000)
	client, opts := buildTestClient(nodes, pods, nodeGroups, ListerOptions{})

	// For these test cases we only use 1 node group/cloud provider node group
	nodeGroupSize := 1

	// Create a test (mock) cloud provider
	testCloudProvider := test.NewCloudProvider(nodeGroupSize)
	testNodeGroup := test.NewNodeGroup(
		nodeGroup.CloudProviderGroupName,
		nodeGroup.Name,
		int64(minNodes),
		int64(maxNodes),
		int64(len(nodes)),
	)

	testCloudProvider.RegisterNodeGroup(testNodeGroup)

	// Create a node group state with the mapping of node groups to the cloud provider's node groups
	nodeGroupsState := BuildNodeGroupsState(nodeGroupsStateOpts{
		nodeGroups: nodeGroups,
		client:     *client,
	})

	nodeGroupsState[testNodeGroup.ID()].NodeInfoMap = k8s.CreateNodeNameToInfoMap(pods, nodes)

	c := &Controller{
		Client:        client,
		Opts:          opts,
		stopChan:      nil,
		nodeGroups:    nodeGroupsState,
		cloudProvider: testCloudProvider,
	}

	// taint the oldest N according to the controller
	taintedIndex := c.taintOldestN(nodes, nodeGroupsState[testNodeGroup.ID()], 2)
	assert.Equal(t, len(taintedIndex), 2)

	// add the remaining untainted nodes to the untainted list
	taintedNodes := []*v1.Node{nodes[taintedIndex[0]], nodes[taintedIndex[1]]}
	var untaintedNodes []*v1.Node
	for i, n := range nodes {
		if n == taintedNodes[0] || n == taintedNodes[1] {
			continue
		}

		untaintedNodes = append(untaintedNodes, nodes[i])
	}
	assert.Equal(t, len(nodes)-2, len(untaintedNodes))

	tests := []struct {
		name                 string
		opts                 scaleOpts
		annotateFirstTainted bool
		want                 int
		wantErr              bool
	}{
		{
			"test normal delete all tainted",
			scaleOpts{
				nodes,
				taintedNodes,
				untaintedNodes,
				nodeGroupsState[testNodeGroup.ID()],
				0, // not used in TryRemoveTaintedNodes
			},
			false,
			-2,
			false,
		},
		{
			"test normal skip first tainted",
			scaleOpts{
				nodes,
				taintedNodes,
				untaintedNodes,
				nodeGroupsState[testNodeGroup.ID()],
				0, // not used in TryRemoveTaintedNodes
			},
			true,
			-1,
			false,
		},
		{
			"test none tainted",
			scaleOpts{
				nodes,
				[]*v1.Node{},
				nodes,
				nodeGroupsState[testNodeGroup.ID()],
				0, // not used in TryRemoveTaintedNodes
			},
			false,
			0,
			false,
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			if tt.annotateFirstTainted {
				tt.opts.taintedNodes[0].Annotations = map[string]string{
					NodeEscalatorIgnoreAnnotation: "skip for testing",
				}
			}
			got, err := c.TryRemoveTaintedNodes(tt.opts)
			if tt.wantErr {
				assert.Error(t, err)
			} else {
				assert.NoError(t, err)
			}
			assert.Equal(t, tt.want, got)
		})
	}
}
