
Find a generic solution for maintenance tasks that cause a server (node) to move in and out of the "Ready" state due to reboots #469

@nagadeesh-nagaraja

Summary of Issue

When servers are rebooted during maintenance, the system boots from disk and the kubelet brings the node into the "Ready" state. When the maintenance involves several reboots or a longer maintenance cycle, this can cause the server to be added back to the cluster while maintenance is still in progress.

Motivation

Multiple kinds of ServerMaintenance can lead to server reboots. We need a generic solution that keeps servers under maintenance out of the cluster, regardless of whether they are rebooted.

Example case:
A server keeps rebooting due to a power supply, motherboard, or memory issue, etc. Simply transferring it to Maintenance should prevent the system from being added back to the cluster (or the READY state) when it reboots again.

Example case:
The claimed Server is approved by its owner for maintenance. It would be good to ensure that the server does not return to the Ready state while it is in the Maintenance state, whatever operations might lead to a reboot, instead of expecting every maintenance operator to handle this individually.

Proposed Solution

Solution 1)
When a Server is transferred to the Maintenance state, we somehow prevent the kubelet from starting, maybe by having a hook that is called before the startup script runs.
Cons: this means we would need to update the OS and the start-up script provided by the upstream projects.
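
A minimal sketch of what such a pre-start hook could look like, assuming the maintenance state is made visible on the host through a marker file. The marker path and the function name are hypothetical; the issue does not define how the host would learn about the maintenance state.

```python
import sys
from pathlib import Path

# Hypothetical marker file; how the maintenance state reaches the host
# is an open question in this issue.
MAINTENANCE_MARKER = Path("/etc/maintenance/in-maintenance")


def kubelet_may_start(marker: Path = MAINTENANCE_MARKER) -> bool:
    """Return False while the maintenance marker file exists."""
    return not marker.exists()


def main() -> int:
    # Exit code 0 lets the startup continue, non-zero blocks it.
    return 0 if kubelet_may_start() else 1


if __name__ == "__main__":
    sys.exit(main())
```

A script like this could be wired in as a pre-start step of the kubelet service (for example a systemd `ExecStartPre=` line), which is exactly the kind of change to the OS image and start-up script that the con above refers to.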

Solution 2)
The ServerMaintenance resource changes the BootOrder (or disables the server's network, or both) so that the server does not boot from disk and the kubelet cannot report itself as Ready.
Once maintenance is complete, it restores the previously existing BootOrder (or re-enables the network, or both) so that the server boots from disk again and the kubelet can communicate.

Pros: This gives a generic way to solve the issue of a server in Maintenance being reported as "READY" to upstream projects.
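
The save, override, and restore flow of Solution 2 could be sketched roughly as follows. This is an in-memory illustration only: the `Server` class, its field names, and the boot device strings are assumptions made for the example, not the actual resource schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Server:
    """Hypothetical stand-in for a server and its boot configuration."""
    name: str
    boot_order: List[str]
    saved_boot_order: Optional[List[str]] = None


def enter_maintenance(server: Server, maintenance_boot_order: List[str]) -> None:
    """Remember the current boot order once, then apply the maintenance one."""
    # Only save on the first call so that repeated calls cannot overwrite
    # the original boot order with the maintenance boot order.
    if server.saved_boot_order is None:
        server.saved_boot_order = list(server.boot_order)
    server.boot_order = list(maintenance_boot_order)


def exit_maintenance(server: Server) -> None:
    """Restore the boot order that was active before maintenance."""
    if server.saved_boot_order is not None:
        server.boot_order = server.saved_boot_order
        server.saved_boot_order = None


server = Server(name="server-1", boot_order=["disk", "pxe"])
enter_maintenance(server, ["pxe"])   # reboots no longer load the OS from disk
exit_maintenance(server)             # original boot order is back in place
```

Saving the previous boot order only on the first call keeps the operation idempotent, which matters if the maintenance is reconciled more than once before it finishes.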

Status

In Progress
