Description
In some circumstances, typically in very large clusters, the built-in client-side rate limiting in the Kubernetes client-go library can cause leader election failures that force a restart of the SPIRE server. An example of a log message that eventually leads to this behavior is below.
E0225 20:20:22.334896 2920 leaderelection.go:429] Failed to update lock optimistically: client rate limiter Wait returned an error: context deadline exceeded, falling back to slow path
Exposing a way to tune the client QPS and max burst on SPIRE components would be extremely helpful for dealing with these scenarios. Adding --kube-api-qps and --kube-api-burst CLI arguments to all components, with sane default values, would let users tune these limits for their environments. As an alternative to the CLI flags, config file settings would be fine. Either form could then be exposed for configuration in the Helm chart(s) as well.
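As a rough illustration of the proposal, the flag handling could look something like the sketch below. It uses only the Go standard library so it runs on its own. The `kubeClientOptions` type and `parseKubeClientFlags` function are hypothetical names, not existing SPIRE code. The two fields mirror the `QPS` (float32) and `Burst` (int) fields on client-go's `rest.Config`, and the defaults of 5 and 10 are assumed here because they match client-go's own defaults; the actual defaults would be up to the maintainers.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// kubeClientOptions is a hypothetical holder for the two proposed settings.
// Its fields mirror the QPS and Burst fields on client-go's rest.Config.
type kubeClientOptions struct {
	QPS   float32
	Burst int
}

// parseKubeClientFlags parses the proposed --kube-api-qps and
// --kube-api-burst flags. The defaults (5 and 10) are an assumption that
// matches client-go's built-in defaults.
func parseKubeClientFlags(args []string) (kubeClientOptions, error) {
	fs := flag.NewFlagSet("spire-component", flag.ContinueOnError)
	qps := fs.Float64("kube-api-qps", 5, "maximum sustained queries per second to the Kubernetes API server")
	burst := fs.Int("kube-api-burst", 10, "maximum burst of queries to the Kubernetes API server")
	if err := fs.Parse(args); err != nil {
		return kubeClientOptions{}, err
	}
	return kubeClientOptions{QPS: float32(*qps), Burst: *burst}, nil
}

func main() {
	opts, err := parseKubeClientFlags(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	// In a real component these values would be copied onto the
	// rest.Config before the Kubernetes client is constructed, e.g.
	// cfg.QPS = opts.QPS and cfg.Burst = opts.Burst.
	fmt.Printf("qps=%v burst=%d\n", opts.QPS, opts.Burst)
}
```

Running the sketch with `--kube-api-qps=50 --kube-api-burst=100` would raise both limits, and omitting the flags keeps the assumed defaults.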
Would a change for this be accepted? Happy to write up a pull request if so.