Given that:
- Right now, the recommended installation is to have a dd-agent pod on each node, through a DaemonSet (see the sketch after this list). That works reasonably well, save for minor issues around regular apps discovering where the node-local statsd endpoint actually is (I'm working to make this easier upstream).
- But also, if you collect events, kubernetes.yaml.example tells you to only do that from a single agent in the cluster. How would one do that? It's not trivial to do with just a DaemonSet, unless you resort to a StatefulSet or some label-based hack, which is brittle and prone to fail in unexpected ways.
- Last but not least, kubernetes_state.yaml.example does NOT tell you to only run the check from a single agent. In my experience, that causes nothing but headaches: you get alerts from every node in the cluster. I'm sure you folks aren't too thrilled about receiving a lot of duplicate data, either.
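For concreteness, here is a minimal sketch of that per-node DaemonSet. The API version, image tag, and env var names are assumptions for illustration, not the official manifest:

```yaml
# Sketch only: a per-node agent DaemonSet. Image name and env vars
# are assumed for illustration; adapt to the official install docs.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: dd-agent
spec:
  selector:
    matchLabels:
      app: dd-agent
  template:
    metadata:
      labels:
        app: dd-agent
    spec:
      containers:
        - name: dd-agent
          image: datadog/docker-dd-agent:latest  # assumed image name
          ports:
            - containerPort: 8125
              hostPort: 8125   # the usual trick for node-local statsd discovery
              protocol: UDP
          env:
            - name: API_KEY    # assumed env var name
              valueFrom:
                secretKeyRef:
                  name: datadog-secret
                  key: api-key
```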
I have a proposal that addresses all of the above:
1. Keep running regular statsd/agents as a DaemonSet.
2. Recommend Docker-based (i.e. based on image name) service discovery for kubernetes_state. The agent that lives on the same node as the kube-state-metrics pod will be the lucky one to send kubernetes_state data (see the annotation sketch after this list). Make clear in the example that the check should only run from one place in the cluster and that service discovery is the best way to do it. In other words, tell users that in most cases configuring the check manually is not a good idea.
3. Have a separate Deployment, with one replica, which only runs checks that are global in scope (see the manifest sketch after this list):
   - event collection
   - control plane checks (conveniently, I just opened #3112, "More checks for the Kubernetes control plane", for exactly that)
   - anything else that needs to talk to the API servers
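To make 2) concrete, the check template could ride along on the kube-state-metrics pod as annotations. This is a sketch based on my reading of the agent's service discovery conventions; treat the annotation prefix and the %%host%%/%%port%% template variables as assumptions to verify against the docs:

```yaml
# Sketch: annotations on the kube-state-metrics pod template, so the
# agent co-located with this pod picks up the kubernetes_state check.
# Annotation keys and template variables are assumed, not verified.
metadata:
  annotations:
    service-discovery.datadoghq.com/kube-state-metrics.check_names: '["kubernetes_state"]'
    service-discovery.datadoghq.com/kube-state-metrics.init_configs: '[{}]'
    service-discovery.datadoghq.com/kube-state-metrics.instances: '[{"kube_state_url": "http://%%host%%:%%port%%/metrics"}]'
```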
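And a minimal sketch of the single-replica Deployment from 3). Everything here is illustrative; in particular, the GLOBAL_CHECKS env var is purely hypothetical and stands in for whatever mechanism ends up enabling the new mode:

```yaml
# Sketch: one-replica Deployment for cluster-global work (events,
# control plane checks). GLOBAL_CHECKS is a hypothetical toggle.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dd-agent-global
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dd-agent-global
  template:
    metadata:
      labels:
        app: dd-agent-global
    spec:
      containers:
        - name: dd-agent
          image: datadog/docker-dd-agent:latest  # assumed image name
          env:
            - name: API_KEY          # assumed env var name
              valueFrom:
                secretKeyRef:
                  name: datadog-secret
                  key: api-key
            - name: GLOBAL_CHECKS    # hypothetical toggle for the new mode
              value: "true"
```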
That's all very easy to implement, except for 3). The current check logic is:
- Get the node's list of pods from the Kubelet
- Perform Kubelet health checks
- Fetch cAdvisor/Docker metrics about every pod
- If event collection is enabled, call the API server and fetch them. Curiously enough, `_process_events` gets passed the list of pods, but it doesn't do anything with it.
The simplest fix is a new setting ("global checks"?) that prevents the check from talking to the Kubelet, Docker, and cAdvisor. The new, alternate mode would only talk to the API servers and the rest of the control plane in order to collect events and control plane data.
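In config terms, the alternate mode might look like this in kubernetes.yaml. The global_checks flag is the hypothetical setting from above; collect_events is the existing event collection switch:

```yaml
# Hypothetical kubernetes.yaml for the single global agent. global_checks
# does not exist today; it is the proposed setting. In this mode the check
# would skip the Kubelet/cAdvisor/Docker and only talk to the API servers.
init_config:

instances:
  - global_checks: true
    collect_events: true
```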
Does it sound reasonable? Anything missing?