Summary
This proposes a Kubernetes-node-style health-check mechanism for MachinePools. It is meant to address the problems that arise when a previously Ready MachinePool becomes degraded.
A possible solution could involve:
- Heartbeats from the MachinePoollet to the APIServer/MachinePool.
- The MachinePool controller declaring a MachinePool to be Unknown or NotReady, based on pre-determined configuration, when heartbeats are missing.
This is similar to how the Kubelet regularly updates the Ready NodeCondition in Node.Status.Conditions; when those updates are missing, the node controller declares the Node to be Unknown/NotReady.
A possible consumer of this condition is the Scheduler, which could stop placing further workloads on the affected MachinePool. Eviction controllers could likewise use it to evict workloads if needed.
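On the consumer side, a scheduler filter might look something like the sketch below. The Pool type and the SchedulablePools function are hypothetical and reduced to the minimum; a real scheduler would read the Ready condition from the MachinePool status rather than from a plain string field.

```go
package main

import "fmt"

// Pool is a hypothetical, reduced view of a MachinePool:
// its name and the status of its Ready condition.
type Pool struct {
	Name        string
	ReadyStatus string // "True", "False" or "Unknown"
}

// SchedulablePools keeps only the pools whose Ready condition is "True",
// so pools that are NotReady or Unknown receive no further workloads.
func SchedulablePools(pools []Pool) []Pool {
	var out []Pool
	for _, p := range pools {
		if p.ReadyStatus == "True" {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	pools := []Pool{
		{Name: "pool-a", ReadyStatus: "True"},
		{Name: "pool-b", ReadyStatus: "Unknown"},
		{Name: "pool-c", ReadyStatus: "False"},
	}
	for _, p := range SchedulablePools(pools) {
		fmt.Println("schedulable:", p.Name)
	}
}
```

Treating Unknown the same as False is the conservative choice assumed here; whether existing workloads should also be evicted is a separate decision for the eviction controllers.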
Basic example
- lastHeartbeatTime: "2023-12-05T10:58:27Z"
  lastTransitionTime: "2023-11-13T13:22:23Z"
  message: MachinePoollet is posting ready status.
  reason: MachinePoolReady
  status: "True"
  type: Ready
Motivation
To improve the means of disaster recovery.
Note
Considering this is a bigger epic, it is highly recommended to prepare an enhancement proposal first.
This may also have touch-points with a Node-problem-detector-like design for MachinePools, which is better discussed separately.