Description
Is your feature request related to a problem? Please describe.
In its current incarnation, k0s runs the Kubernetes cloud provider code as a component, providing its own bare-bones "cloud provider implementation". That code is more tailored to be run as a separate binary. There's currently no good logging and error handling around that. When the cloud provider code fails, it won't be restarted/retried, and there's just a single log line indicating this.
In the end, the k0s cloud provider is not tied to k0s in any way. It's just a little helper in case folks need to put external IPs on nodes when not running a "real" cloud provider.
Moreover, the cloud provider opens up an HTTP port (10258 by default) that's, besides the --k0s-cloud-provider-port CLI arg, neither documented nor configurable (e.g. certs, TLS algorithms...). There's probably not much to see there besides metrics, is there?
Describe the solution you would like
The k0s cloud provider component could potentially be its own thing, living outside of the k0s repo and deployed as a usual Kubernetes Deployment, e.g. via k0sctl and the manifests folder, a Helm chart, or whatever automation tool is preferred by cluster operators. It could even be useful outside of the context of k0s itself.
Moreover, this might have some security benefits as well. Having it as a Deployment means that it can simply use a ServiceAccount with locked-down permissions. Right now, that code operates as cluster admin.
There are also quite a few knobs in the cloud provider code (exposed via CLI args) that might be useful for some people. Having the k0s cloud provider in a separate deployment opens up many more options for custom configs than we'd be comfortable building into k0s directly.
Describe alternatives you've considered
Run the cloud provider as a supervised binary, much like the k0s API. This would be some sort of middle-ground, and it would keep the current k0s controller CLI flags and general ease of use.
Additional context
Whichever route we take, I think the following things should be considered:
- Retries/restarts in case of failures
- TLS configuration hardening and port documentation (is that port even required?)
- Use of over-privileged credentials
- Observability inside k0s, i.e. logging and status integration (would be kinda free in the stand-alone case)