Description
The dask_kubernetes.operator.controller.daskcluster_create_components handler can run for a long time because it creates resources and waits for them to be ready. #611 extended this: the handler now also waits for the load balancer to be fully provisioned if the scheduler service is of type LoadBalancer. If the controller is killed or restarted while the handler is waiting, we will run into edge-case issues when the controller resumes.
Handlers in kopf are intended to be very short-lived. An event happens, the handler does something quickly, and it ends. When creating new resources, the handler should create the resource and finish. If something else needs to happen after the resource has been created, that should be done in a new handler that is triggered by the resource's creation event.
We should refactor dask_kubernetes.operator.controller.daskcluster_create_components to remove wait_for_service and create a new handler for any events that need to happen as a result of the service being created.
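As a rough illustration of the proposed split, the follow-up logic could live in a small, non-blocking function that reacts to each Service event instead of polling inside the create handler. This is only a sketch under assumptions: the function names load_balancer_ready and on_scheduler_service_event are hypothetical, the returned strings are placeholders for real follow-up work, and the kopf registration mentioned in the comment (including the label selector) is an assumed wiring, not the project's actual code.

```python
from typing import Any, Mapping


def load_balancer_ready(service: Mapping[str, Any]) -> bool:
    """Return True once a Service body needs no further waiting."""
    spec = service.get("spec") or {}
    if spec.get("type") != "LoadBalancer":
        # Other service types are usable as soon as they exist.
        return True
    status = service.get("status") or {}
    ingress = (status.get("loadBalancer") or {}).get("ingress")
    # A provisioned load balancer reports at least one ingress entry.
    return bool(ingress)


# Hypothetical handler body. With kopf it might be registered with something
# like @kopf.on.event("services", labels={"dask.org/component": "scheduler"}),
# where the label selector is an assumption.
def on_scheduler_service_event(body: Mapping[str, Any], **_: Any) -> str:
    """Inspect one Service event, act if possible, and return immediately."""
    if not load_balancer_ready(body):
        # No waiting here: a later Service event triggers this handler again.
        return "pending"
    # Follow-up work that used to sit behind wait_for_service would go here.
    return "ready"
```

Because the handler returns immediately in both cases, a controller restart between events loses no in-flight state; the next Service event simply re-evaluates readiness.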