Description
URL
https://opentelemetry.io/docs/kubernetes/collector/components/#kubeletstats-receiver
Recommended change
The ClusterRole documented for the kubeletstats receiver should also list its optional permissions, perhaps commented out. For example:
...
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector
rules:
  - apiGroups: ['']
    resources: ['nodes/stats']
    verbs: ['get', 'watch', 'list']
  # The following is needed when using extra_metadata_labels or any of the
  # {request|limit}_utilization metrics
  #- apiGroups: ['']
  #  resources: ['nodes/proxy']
  #  verbs: ['get']
...
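For reference, here is a sketch of a receiver configuration that would trigger the need for the commented-out rule. The endpoint and collection interval are illustrative placeholders; the parts that matter are extra_metadata_labels and the metric toggle:
...
receivers:
  kubeletstats:
    collection_interval: 20s
    auth_type: serviceAccount
    endpoint: https://${env:K8S_NODE_NAME}:10250
    # Either of the following two settings requires the nodes/proxy permission:
    extra_metadata_labels:
      - container.id
    metrics:
      k8s.container.memory_limit_utilization:
        enabled: true
...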
Context
After enabling the optional metric k8s.container.memory_limit_utilization in the kubeletstats receiver, my OTel Collectors started partially failing with these errors:
[email protected]/scraper.go:113 call to /pods endpoint failed {"kind": "receiver", "name": "kubeletstats", "data_type": "metrics", "error": "kubelet request GET https://<snipped_node_ip>:10250/pods failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:otel:otel-node-service-account, verb=get, resource=nodes, subresource=proxy)""}
This caught me by surprise, as the ClusterRole described on the linked page was assigned to these pods and had worked fine for weeks. After granting the permission listed in the error, my otelcol DaemonSet was back in operation. I found that fix after looking around and seeing the same optional permission addressed in open-telemetry/opentelemetry-operator#3155.
https://github.com/open-telemetry/opentelemetry-operator/blob/1980f0877e5cff8e41ff3eafafe4c57133d7c899/internal/components/receivers/kubeletstats.go#L65-L93 shows that nodes/proxy
is needed "when using extra_metadata_labels or any of the {request|limit}_utilization metrics".
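For completeness, here is a minimal sketch of a ClusterRoleBinding attaching the documented ClusterRole to the ServiceAccount named in the error above; the subject name and namespace are taken from that error message and will differ per setup:
...
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otel-collector
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: otel-collector  # the ClusterRole from the recommended change above
subjects:
  - kind: ServiceAccount
    name: otel-node-service-account  # from the 403 error message
    namespace: otel
...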