Metrics Server could not scrape log with “tls: failed to verify certificate: x509: certificate is valid for 127.0.0.1, not 10.124.4.238" node="fargate-ip-“ error in log #1468

Open
@dkelim1

What happened:

Logs from the metrics-server pod show this error repeatedly:
E0410 22:04:01.247686 1 scraper.go:149] "Failed to scrape node" err="Get \"https://10.124.4.238:10250/metrics/resource\": tls: failed to verify certificate: x509: certificate is valid for 127.0.0.1, not 10.124.4.238" node="fargate-ip-10-124-4-238.ap-southeast-1.compute.internal"
E0410 22:04:16.201141 1 scraper.go:149] "Failed to scrape node" err="Get \"https://10.124.4.238:10250/metrics/resource\": tls: failed to verify certificate: x509: certificate is valid for 127.0.0.1, not 10.124.4.238" node="fargate-ip-10-124-4-238.ap-southeast-1.compute.internal"
E0410 22:04:31.201853 1 scraper.go:149] "Failed to scrape node" err="Get \"https://10.124.4.238:10250/metrics/resource\": tls: failed to verify certificate: x509: certificate is valid for 127.0.0.1, not 10.124.4.238" node="fargate-ip-10-124-4-238.ap-southeast-1.compute.internal"
E0410 22:04:46.277913 1 scraper.go:149] "Failed to scrape node" err="Get \"https://10.124.4.238:10250/metrics/resource\": tls: failed to verify certificate: x509: certificate is valid for 127.0.0.1, not 10.124.4.238" node="fargate-ip-10-124-4-238.ap-southeast-1.compute.internal"
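To make the mismatch in this error explicit, here is a minimal Python sketch that pulls the two addresses out of one log line: the IP the certificate lists as valid and the IP that metrics-server actually dialed. The helper names `explain_mismatch` and `ip_san_matches` are hypothetical and only illustrate the comparison; they are not the verification code that metrics-server or Go's TLS stack runs.

```python
import ipaddress
import re

# Matches the x509 mismatch message seen in the log lines above.
MISMATCH = re.compile(r'certificate is valid for (?P<sans>.+?), not (?P<host>[^\s"]+)')


def explain_mismatch(log_line):
    """Return (addresses the certificate is valid for, address dialed), or None."""
    m = MISMATCH.search(log_line)
    if m is None:
        return None
    sans = [s.strip() for s in m.group("sans").split(",")]
    return sans, m.group("host")


def ip_san_matches(sans, host):
    """Illustrative rule: an IP target must equal one of the listed IP addresses."""
    target = ipaddress.ip_address(host)
    for san in sans:
        try:
            if ipaddress.ip_address(san) == target:
                return True
        except ValueError:
            continue  # not an IP literal (for example a DNS name); ignored in this sketch
    return False


line = (
    r'err="Get \"https://10.124.4.238:10250/metrics/resource\": tls: failed to verify '
    r'certificate: x509: certificate is valid for 127.0.0.1, not 10.124.4.238" '
    r'node="fargate-ip-10-124-4-238.ap-southeast-1.compute.internal"'
)
sans, host = explain_mismatch(line)
print(sans, host, ip_san_matches(sans, host))  # ['127.0.0.1'] 10.124.4.238 False
```

The certificate presented on port 10250 only lists 127.0.0.1, while the scrape targets the node's InternalIP (10.124.4.238), so verification fails regardless of the metrics-server version.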

What you expected to happen:
Metrics Server should be able to scrape the nodes without errors.

Anything else we need to know?:

  1. Initially I was using the metrics server that came with VPA. Errors similar to the above appeared.
# metrics-server -- configuration options for the [metrics server Helm chart](https://github.com/kubernetes-sigs/metrics-server/tree/master/charts/metrics-server). See the projects [README.md](https://github.com/kubernetes-sigs/metrics-server/tree/master/charts/metrics-server#configuration) for all available options
metrics-server:
  # metrics-server.enabled -- Whether or not the metrics server Helm chart should be installed
  enabled: true
  # CHANGE ABOVE from original value false

  defaultArgs:
  - --cert-dir=/tmp
  - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
  - --kubelet-use-node-status-port
  - --metric-resolution=15s


  2. Later I switched to the metrics server installed directly from eks_blueprints_kubernetes_addons. Errors similar to the above appeared.
  enable_metrics_server = true
  metrics_server = {
    name          = "metrics-server"
    chart_version = "3.12.1"
    repository    = "https://kubernetes-sigs.github.io/metrics-server/"
    namespace     = "kube-system"
    values        = [templatefile("${path.module}/metrics-svr.yaml", {})]
  }
  
  3. Tried upgrading metrics server from version 0.6.x to 0.7.x. Errors similar to the above appeared.
  4. Tried to bypass the certificate check by passing in '--kubelet-insecure-tls':
defaultArgs:
  - --cert-dir=/tmp
  - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
  - --kubelet-use-node-status-port
  - --metric-resolution=15s
  - --kubelet-insecure-tls

However, the following errors then appear:
E0410 22:13:28.928630 1 scraper.go:149] "Failed to scrape node" err="request failed, status: \"403 Forbidden\"" node="fargate-ip-10-124-4-186.ap-southeast-1.compute.internal"
E0410 22:13:43.827793 1 scraper.go:149] "Failed to scrape node" err="request failed, status: \"403 Forbidden\"" node="fargate-ip-10-124-4-186.ap-southeast-1.compute.internal"
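A 403 is an HTTP status, so with '--kubelet-insecure-tls' the TLS handshake now completes and the request is rejected one layer later, by the kubelet itself. The Python sketch below groups "Failed to scrape node" lines by node and by the layer that failed, which makes that shift visible when both kinds of error are mixed in one log. The function names and stage labels (`failure_stage`, `summarize`, "tls-verification", "kubelet-authorization") are hypothetical names chosen for this illustration, not metrics-server terminology.

```python
import re
from collections import Counter

# Extracts the node name from a klog "Failed to scrape node" line.
NODE = re.compile(r'node="(?P<node>[^"]+)"')


def failure_stage(log_line):
    """Map a scrape error to the layer that rejected the request (illustrative labels)."""
    if "x509:" in log_line or "tls:" in log_line:
        return "tls-verification"
    if "403 Forbidden" in log_line:
        return "kubelet-authorization"
    if "401 Unauthorized" in log_line:
        return "kubelet-authentication"
    return "other"


def summarize(log_lines):
    """Count failures per (node, stage) pair."""
    counts = Counter()
    for line in log_lines:
        m = NODE.search(line)
        if m:
            counts[(m.group("node"), failure_stage(line))] += 1
    return counts


sample = [
    r'"Failed to scrape node" err="Get \"https://10.124.4.238:10250/metrics/resource\": '
    r'tls: failed to verify certificate: x509: certificate is valid for 127.0.0.1, '
    r'not 10.124.4.238" node="fargate-ip-10-124-4-238.ap-southeast-1.compute.internal"',
    r'"Failed to scrape node" err="request failed, status: \"403 Forbidden\"" '
    r'node="fargate-ip-10-124-4-186.ap-southeast-1.compute.internal"',
    r'"Failed to scrape node" err="request failed, status: \"403 Forbidden\"" '
    r'node="fargate-ip-10-124-4-186.ap-southeast-1.compute.internal"',
]
for (node, stage), count in sorted(summarize(sample).items()):
    print(node, stage, count)
```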

Environment:

  • Kubernetes distribution: EKS Fargate
    Server Version: v1.27.11-eks-b9c9ed7
  • Metrics Server manifest
spoiler for Metrics Server manifest:

---
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    meta.helm.sh/release-name: metrics-server
    meta.helm.sh/release-namespace: kube-system
  creationTimestamp: "2024-04-10T21:48:44Z"
  labels:
    app.kubernetes.io/instance: metrics-server
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: metrics-server
    app.kubernetes.io/version: 0.7.1
    helm.sh/chart: metrics-server-3.12.1
  name: metrics-server
  namespace: kube-system
  resourceVersion: "1044967"
  uid: bbd89fdf-d933-4fd3-9bfa-2c8351bc9159

---
apiVersion: v1
kind: Service
metadata:
  annotations:
    meta.helm.sh/release-name: metrics-server
    meta.helm.sh/release-namespace: kube-system
  creationTimestamp: "2024-04-10T21:48:44Z"
  labels:
    app.kubernetes.io/instance: metrics-server
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: metrics-server
    app.kubernetes.io/version: 0.7.1
    helm.sh/chart: metrics-server-3.12.1
  name: metrics-server
  namespace: kube-system
  resourceVersion: "1044976"
  uid: fe68eb2b-9ecf-4c57-996e-6836955f614c
spec:
  clusterIP: 172.20.20.200
  clusterIPs:
  - 172.20.20.200
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    app.kubernetes.io/instance: metrics-server
    app.kubernetes.io/name: metrics-server
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}

---
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "3"
    meta.helm.sh/release-name: metrics-server
    meta.helm.sh/release-namespace: kube-system
  creationTimestamp: "2024-04-10T21:48:44Z"
  generation: 3
  labels:
    app.kubernetes.io/instance: metrics-server
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: metrics-server
    app.kubernetes.io/version: 0.7.1
    helm.sh/chart: metrics-server-3.12.1
  name: metrics-server
  namespace: kube-system
  resourceVersion: "1048455"
  uid: 51c7e198-d10b-4ec4-b96d-69e151de778b
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/instance: metrics-server
      app.kubernetes.io/name: metrics-server
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/instance: metrics-server
        app.kubernetes.io/name: metrics-server
    spec:
      containers:
      - args:
        - --secure-port=10250
        - --cert-dir=/tmp
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls
        image: registry.k8s.io/metrics-server/metrics-server:v0.7.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /livez
            port: https
            scheme: HTTPS
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: metrics-server
        ports:
        - containerPort: 10250
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: https
            scheme: HTTPS
          initialDelaySeconds: 20
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          requests:
            cpu: 100m
            memory: 200Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
          seccompProfile:
            type: RuntimeDefault
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /tmp
          name: tmp
      dnsPolicy: ClusterFirst
      priorityClassName: system-cluster-critical
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: metrics-server
      serviceAccountName: metrics-server
      terminationGracePeriodSeconds: 30
      volumes:
      - emptyDir: {}
        name: tmp
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2024-04-10T21:50:06Z"
    lastUpdateTime: "2024-04-10T21:50:06Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2024-04-10T21:48:44Z"
    lastUpdateTime: "2024-04-10T22:13:47Z"
    message: ReplicaSet "metrics-server-578bc9bf64" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  observedGeneration: 3
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1

---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  annotations:
    meta.helm.sh/release-name: metrics-server
    meta.helm.sh/release-namespace: kube-system
  creationTimestamp: "2024-04-10T21:48:44Z"
  labels:
    app.kubernetes.io/instance: metrics-server
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: metrics-server
    app.kubernetes.io/version: 0.7.1
    helm.sh/chart: metrics-server-3.12.1
  name: v1beta1.metrics.k8s.io
  resourceVersion: "1048453"
  uid: 84cc08c7-27bc-4a4e-a7b8-efcd7b428ea2
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
    port: 443
  version: v1beta1
  versionPriority: 100
status:
  conditions:

  • Kubelet config:
spoiler for Kubelet config:
  • Metrics server logs:
spoiler for Metrics Server logs:
  • Status of Metrics API:
spoiler for Status of Metrics API:
kubectl describe apiservice v1beta1.metrics.k8s.io

/kind bug

Labels: kind/bug, kind/support, triage/accepted
