Description
What would you like to be added:
KSM should be able to set an upper limit on the number of objects it ingests.
Why is this needed:
We observed an incident in which an autoscaler accidentally created 10k+ ReplicaSets, all of which KSM tried to report on. This caused KSM to run out of memory, and we lost visibility into the cluster.
I know we can already limit this on the scraping end in Prometheus. This request is about preventing KSM itself from running out of resources, and about providing another signal of what is going on in the cluster.
Describe the solution you'd like:
- Have a generic and a resource-level command-line option that KSM uses to limit the number of items read from the Kubernetes API.
- Expose the metrics `kube_objects_watched{group="foo", kind="bar", version="baz"}` and `kube_objects_watched_max`, the latter showing the configured limit, so that alerts can fire when the threshold is hit.
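
To make the proposal concrete, here is a minimal sketch in Go of how a generic limit with resource-level overrides could be enforced and surfaced as the two metrics. It is not KSM code: `GVK`, `CappedCounter`, `TryAdd`, and `Exposition` are hypothetical names, treating `0` as "unlimited" is an assumption, and so is attaching the same labels to `kube_objects_watched_max`.

```go
package main

import (
	"fmt"
	"sync"
)

// GVK identifies a resource type (hypothetical helper type; it mirrors the
// group/version/kind labels proposed for kube_objects_watched).
type GVK struct{ Group, Version, Kind string }

// CappedCounter tracks how many objects are watched per resource type and
// refuses additions beyond a configured maximum.
type CappedCounter struct {
	mu         sync.Mutex
	defaultMax int         // generic limit; 0 means unlimited (assumption)
	overrides  map[GVK]int // resource-level limits, read-only after creation
	watched    map[GVK]int
}

func NewCappedCounter(defaultMax int, overrides map[GVK]int) *CappedCounter {
	return &CappedCounter{
		defaultMax: defaultMax,
		overrides:  overrides,
		watched:    make(map[GVK]int),
	}
}

// Max returns the resource-level limit if one is set, else the generic one.
func (c *CappedCounter) Max(k GVK) int {
	if m, ok := c.overrides[k]; ok {
		return m
	}
	return c.defaultMax
}

// TryAdd records one more watched object, or returns false if the limit for
// this resource type has already been reached.
func (c *CappedCounter) TryAdd(k GVK) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if m := c.Max(k); m > 0 && c.watched[k] >= m {
		return false
	}
	c.watched[k]++
	return true
}

// Remove records that one watched object of this type was deleted.
func (c *CappedCounter) Remove(k GVK) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.watched[k] > 0 {
		c.watched[k]--
	}
}

// Exposition renders both proposed metrics for one resource type in the
// Prometheus text format.
func (c *CappedCounter) Exposition(k GVK) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	labels := fmt.Sprintf("{group=%q,kind=%q,version=%q}", k.Group, k.Kind, k.Version)
	return fmt.Sprintf("kube_objects_watched%s %d\nkube_objects_watched_max%s %d\n",
		labels, c.watched[k], labels, c.Max(k))
}

func main() {
	rs := GVK{Group: "apps", Version: "v1", Kind: "ReplicaSet"}
	// No generic limit; ReplicaSets are capped at 2 for the demonstration.
	c := NewCappedCounter(0, map[GVK]int{rs: 2})
	for i := 0; i < 3; i++ {
		fmt.Println("accepted:", c.TryAdd(rs))
	}
	fmt.Print(c.Exposition(rs))
}
```

With both series exposed, an alert could compare `kube_objects_watched` against `kube_objects_watched_max` for the same labels and fire once the two are equal.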