v4.11.1 unexpected error obtaining nginx status info #11689

Open
@Kampe

Description

Seeing issues during NGINX startup after the upgrade. The logs don't say much about why the healthcheck endpoint is not responding.

I0727 00:08:28.342380       7 nginx.go:317] "Starting NGINX process"
I0727 00:08:28.342455       7 leaderelection.go:250] attempting to acquire leader lease ingress-nginx/ingress-nginx-internal-leader...
I0727 00:08:28.342749       7 nginx.go:337] "Starting validation webhook" address=":8443" certPath="/usr/local/certificates/cert" keyPath="/usr/local/certificates/key"
I0727 00:08:28.345201       7 controller.go:193] "Configuration changes detected, backend reload required"
I0727 00:08:28.358021       7 status.go:85] "New leader elected" identity="ingress-nginx-internal-controller-67bfb7fd4b-nzkdt"
2024/07/27 00:08:35 Get "http://127.0.0.1:10246/nginx_status": dial tcp 127.0.0.1:10246: connect: connection refused
W0727 00:08:35.677958       7 nginx_status.go:171] unexpected error obtaining nginx status info: Get "http://127.0.0.1:10246/nginx_status": dial tcp 127.0.0.1:10246: connect: connection refused
2024/07/27 00:09:05 Get "http://127.0.0.1:10246/nginx_status": dial tcp 127.0.0.1:10246: connect: connection refused
W0727 00:09:05.683341       7 nginx_status.go:171] unexpected error obtaining nginx status info: Get "http://127.0.0.1:10246/nginx_status": dial tcp 127.0.0.1:10246: connect: connection refused
I0727 00:09:07.380630       7 controller.go:213] "Backend successfully reloaded"
I0727 00:09:07.380716       7 controller.go:224] "Initial sync, sleeping for 1 second"
I0727 00:09:07.380802       7 event.go:377] Event(v1.ObjectReference{Kind:"Pod", Namespace:"ingress-nginx", Name:"ingress-nginx-internal-controller-dbcc4dc9c-29mpv", UID:"4ee6bf1d-df1f-4bb4-8e37-04d6978dfd6d", APIVersion:"v1", ResourceVersion:"214163955", FieldPath:""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration
W0727 00:09:08.382382       7 controller.go:244] Dynamic reconfiguration failed (retrying; 15 retries left): Post "http://127.0.0.1:10246/configuration/backends": dial tcp 127.0.0.1:10246: connect: connection refused
W0727 00:09:09.394353       7 controller.go:244] Dynamic reconfiguration failed (retrying; 14 retries left): Post "http://127.0.0.1:10246/configuration/backends": dial tcp 127.0.0.1:10246: connect: connection refused
W0727 00:09:10.797697       7 controller.go:244] Dynamic reconfiguration failed (retrying; 13 retries left): Post "http://127.0.0.1:10246/configuration/backends": dial tcp 127.0.0.1:10246: connect: connection refused
W0727 00:09:12.616922       7 controller.go:244] Dynamic reconfiguration failed (retrying; 12 retries left): Post "http://127.0.0.1:10246/configuration/backends": dial tcp 127.0.0.1:10246: connect: connection refused
W0727 00:09:14.913299       7 controller.go:244] Dynamic reconfiguration failed (retrying; 11 retries left): Post "http://127.0.0.1:10246/configuration/backends": dial tcp 127.0.0.1:10246: connect: connection refused
I0727 00:09:16.276657       7 sigterm.go:36] "Received SIGTERM, shutting down"
I0727 00:09:16.276928       7 nginx.go:393] "Shutting down controller queues"
I0727 00:09:16.289355       7 nginx.go:401] "Stopping admission controller"
E0727 00:09:16.289652       7 nginx.go:340] "Error listening for TLS connections" err="http: Server closed"
I0727 00:09:16.289815       7 nginx.go:409] "Stopping NGINX process"
W0727 00:09:17.931239       7 controller.go:244] Dynamic reconfiguration failed (retrying; 10 retries left): Post "http://127.0.0.1:10246/configuration/backends": dial tcp 127.0.0.1:10246: connect: connection refused
W0727 00:09:21.837363       7 controller.go:244] Dynamic reconfiguration failed (retrying; 9 retries left): Post "http://127.0.0.1:10246/configuration/backends": dial tcp 127.0.0.1:10246: connect: connection refused
W0727 00:09:26.847362       7 controller.go:244] Dynamic reconfiguration failed (retrying; 8 retries left): Post "http://127.0.0.1:10246/configuration/backends": dial tcp 127.0.0.1:10246: connect: connection refused
W0727 00:09:33.648965       7 controller.go:244] Dynamic reconfiguration failed (retrying; 7 retries left): Post "http://127.0.0.1:10246/configuration/backends": dial tcp 127.0.0.1:10246: connect: connection refused
2024/07/27 00:09:16 [notice] 2486#2486: ModSecurity-nginx v1.0.3 (rules loaded inline/local/remote: 0/14418/0)
2024/07/27 00:09:16 [notice] 2486#2486: signal process started
W0727 00:09:41.869474       7 controller.go:244] Dynamic reconfiguration failed (retrying; 6 retries left): Post "http://127.0.0.1:10246/configuration/backends": dial tcp 127.0.0.1:10246: connect: connection refused
W0727 00:09:53.470106       7 controller.go:244] Dynamic reconfiguration failed (retrying; 5 retries left): Post "http://127.0.0.1:10246/configuration/backends": dial tcp 127.0.0.1:10246: connect: connection refused
I0727 00:09:59.244212       7 nginx.go:422] "NGINX process has stopped"
I0727 00:09:59.244234       7 sigterm.go:44] Handled quit, delaying controller exit for 10 seconds
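
To put numbers on the timeline in the log above, here is a minimal sketch that parses the controller's klog-style timestamps and measures how long each phase took. It is only an illustration: the header layout in the regex is an assumption based on the lines shown, the year 2024 is taken from NGINX's own `2024/07/27` lines, the log lines are abridged copies, and the helper names (`parse`, `first_time`) are hypothetical.

```python
import re
from datetime import datetime

# Assumed klog header layout: severity letter, MMDD, time with microseconds,
# thread id, source file:line, then the message.
KLOG = re.compile(r"^[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})\s+\d+ \S+\] (.*)$")

# Lines abridged from the log above (long messages cut after "retries left)").
LOG = """\
I0727 00:08:28.342380       7 nginx.go:317] "Starting NGINX process"
I0727 00:09:07.380630       7 controller.go:213] "Backend successfully reloaded"
W0727 00:09:08.382382       7 controller.go:244] Dynamic reconfiguration failed (retrying; 15 retries left)
W0727 00:09:09.394353       7 controller.go:244] Dynamic reconfiguration failed (retrying; 14 retries left)
W0727 00:09:10.797697       7 controller.go:244] Dynamic reconfiguration failed (retrying; 13 retries left)
W0727 00:09:12.616922       7 controller.go:244] Dynamic reconfiguration failed (retrying; 12 retries left)
I0727 00:09:16.276657       7 sigterm.go:36] "Received SIGTERM, shutting down"
"""


def parse(line, year="2024"):
    """Return (timestamp, message) for a klog line, or None if it does not match."""
    m = KLOG.match(line)
    if m is None:
        return None
    # klog omits the year, so it is supplied separately (assumed 2024 here).
    ts = datetime.strptime(f"{year}{m.group(1)} {m.group(2)}", "%Y%m%d %H:%M:%S.%f")
    return ts, m.group(3)


def first_time(events, needle):
    """Timestamp of the first event whose message contains `needle`."""
    return next(ts for ts, msg in events if needle in msg)


events = [e for e in map(parse, LOG.splitlines()) if e is not None]

start = first_time(events, "Starting NGINX process")
reload_s = (first_time(events, "Backend successfully reloaded") - start).total_seconds()
sigterm_s = (first_time(events, "Received SIGTERM") - start).total_seconds()

# Intervals between consecutive "Dynamic reconfiguration failed" retries.
retries = [ts for ts, msg in events if "Dynamic reconfiguration failed" in msg]
gaps = [round((b - a).total_seconds(), 2) for a, b in zip(retries, retries[1:])]

print(f"start -> reload:  {reload_s:.3f} s")
print(f"start -> SIGTERM: {sigterm_s:.3f} s")
print(f"retry gaps (s):   {gaps}")
```

On these lines, the backend reload completes roughly 39 s after the NGINX process starts, SIGTERM arrives roughly 48 s after start, and the retry intervals grow from about 1.0 s to about 1.8 s. The sketch only measures the gaps; it does not explain why port 10246 refuses connections.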

What happened:

Upgraded my Helm chart from v4.10.0 to v4.11.1.

What you expected to happen:

All pods are replaced and working without issue.

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):

NGINX Ingress controller
  Release:       v1.11.1
  Build:         7c44f992012555ff7f4e47c08d7c542ca9b4b1f7
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.25.5

Kubernetes version (use kubectl version):

Client Version: v1.30.0
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.4-eks-036c24b

Environment:
AWS EKS

  • How was the ingress-nginx-controller installed:
values: |
        fullnameOverride: ingress-nginx-internal
        controller:
          replicaCount: 3
          autoscaling:
            enabled: true
            minReplicas: 3
            targetCPUUtilizationPercentage: 80
            targetMemoryUtilizationPercentage: 80
          resources:
            requests:
              cpu: "500m"
              memory: "512Mi"
          ingressClassResource:
            name: "nginx-internal"
            controllerValue: "k8s.io/ingress-nginx-internal"
            enabled: true
            default: true
          opentelemetry:
            enabled: true
          admissionWebhooks:
            timeoutSeconds: 30

          config:
            allow-snippet-annotations: "true"
            otlp-collector-host: "opentelemetry-collector.monitoring.svc"
            otlp-collector-port: "4317"
            enable-opentelemetry: "true"
            otel-sampler: "AlwaysOn"
            otel-sampler-ratio: "1.0"
            enable-underscores-in-headers: "true"
            opentelemetry-config: "/etc/nginx/opentelemetry.toml"
            opentelemetry-operation-name: "HTTP $request_method $service_name $uri"
            opentelemetry-trust-incoming-span: "false"
            otel-sampler-parent-based: "false"
            otel-max-queuesize: "2048"
            otel-schedule-delay-millis: "5000"
            otel-max-export-batch-size: "512"
            server-snippet: |
              opentelemetry_attribute "ingress.namespace" "$namespace";
              opentelemetry_attribute "ingress.service_name" "$service_name";
              opentelemetry_attribute "ingress.name" "$ingress_name";
              opentelemetry_attribute "ingress.upstream" "$proxy_upstream_name";

          metrics:
            enabled: true
            serviceMonitor:
              enabled: true
          service:
            public: false
            subdomain: "ingress-internal"
            external:
              enabled: false
            internal:
              enabled: true
              annotations: 
                service.beta.kubernetes.io/aws-load-balancer-type: nlb-ip
                service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
                service.beta.kubernetes.io/aws-load-balancer-scheme: internal
                service.beta.kubernetes.io/aws-load-balancer-internal: "true"
                service.beta.kubernetes.io/aws-load-balancer-attributes: deletion_protection.enabled=true
  • Current State of the controller:
Name:         nginx-internal
Labels:       app.kubernetes.io/component=controller
              app.kubernetes.io/instance=ingress-nginx-internal
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=ingress-nginx
              app.kubernetes.io/part-of=ingress-nginx
              app.kubernetes.io/version=1.11.1
              argocd.argoproj.io/instance=ingress-nginx-internal
              helm.sh/chart=ingress-nginx-4.11.1
Annotations:  argocd.argoproj.io/tracking-id: ingress-nginx-internal:networking.k8s.io/IngressClass:ingress-nginx/nginx-internal
              ingressclass.kubernetes.io/is-default-class: true
Controller:   k8s.io/ingress-nginx-internal
Events:       <none>

Labels: kind/support, lifecycle/frozen, needs-priority, needs-triage, triage/needs-information
