Skip to content

Upload is broken (9: Bad file descriptor) #12875

Open
@sepich

Description

@sepich

What happened:

POST with 10mb body leads to dropped request with error:

2025/02/20 15:24:11 [crit] 517#517: *74177 pread() "/tmp/nginx/client-body/0000000041" failed (9: Bad file descriptor) while sending request to upstream

What you expected to happen:

Http request succeeds.

NGINX Ingress controller version (exec into the pod and run /nginx-ingress-controller --version):
v1.12.0

Kubernetes version (use kubectl version):
v1.30.8

Environment:

  • Cloud provider or hardware configuration: EKS/GKE/on-prem

How to reproduce this issue:

Could be reproduced in kind:

kind create cluster
kubectl apply -f https://kind.sigs.k8s.io/examples/ingress/deploy-ingress-nginx.yaml
kubectl -n ingress-nginx port-forward ingress-nginx-controller-696d4c4c5-x7kl6 8080:80

Create test Ingress accepting uploads:

yaml spec
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo
spec:
  selector:
    matchLabels:
      app: echo
  template:
    metadata:
      labels:
        app: echo
    spec:
      containers:
        - image: inanimate/echo-server
          name: echo
      enableServiceLinks: false
---
apiVersion: v1
kind: Service
metadata:
  name: echo
spec:
  selector:
    app: echo
  ports:
    - port: 8080
      targetPort: 8080
      protocol: TCP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "0"
  #    nginx.ingress.kubernetes.io/proxy-request-buffering: "off"
  name: echo
spec:
  ingressClassName: nginx
  rules:
    - http:
        paths:
          - backend:
              service:
                name: echo
                port:
                  number: 8080
            path: /
            pathType: ImplementationSpecific

Upload file, see the connection dropped:

dd if=/dev/urandom of=10M bs=1M count=10
curl -d @10M http://localhost:8080 > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (7) Failed to connect to localhost port 8080 after 0 ms: Couldn't connect to server

In logs:

$ k -n ingress-nginx logs --tail=5 ingress-nginx-controller-696d4c4c5-x7kl6

I0220 16:14:51.539815      11 event.go:377] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"kube-system", Name:"echo", UID:"eb48935e-1c17-40d5-9563-599b29c8d844", APIVersion:"networking.k8s.io/v1", ResourceVersion:"1177", FieldPath:""}): type: 'Normal' reason: 'Sync' Scheduled for sync
127.0.0.1 - - [20/Feb/2025:16:57:25 +0000] "GET / HTTP/1.1" 200 1779 "-" "curl/8.7.1" 77 0.029 [kube-system-echo-8080] [] 10.244.0.8:8080 1779 0.029 200 3d0c1d9f60cec40e163e0f2713eb9821
2025/02/20 16:57:38 [warn] 573#573: *32687 a client request body is buffered to a temporary file /tmp/nginx/client-body/0000000001, client: 127.0.0.1, server: _, request: "POST / HTTP/1.1", host: "localhost:8080"
2025/02/20 16:57:38 [crit] 573#573: *32687 pread() "/tmp/nginx/client-body/0000000001" failed (9: Bad file descriptor) while sending request to upstream, client: 127.0.0.1, server: _, request: "POST / HTTP/1.1", upstream: "http://10.244.0.8:8080/", host: "localhost:8080"
127.0.0.1 - - [20/Feb/2025:16:57:39 +0000] "POST / HTTP/1.1" 200 3959 "-" "curl/8.7.1" 10363454 1.035 [kube-system-echo-8080] [] 10.244.0.8:8080 3939 0.003 502 73dfb8ab83944211aadfc70382080cd8

Per debug logs it happens after upstream already answered 200 OK, and response is forwarded back to client:

2025/02/20 17:21:31 [debug] 1114#1114: *101569 http body new buf t:1 f:0 0000FFFF802C1690, pos 0000FFFF802C1690, size: 8192 file: 0, size: 0
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http write client request body, bufs 0000FFFF80287290
2025/02/20 17:21:31 [debug] 1114#1114: *101569 add cleanup: 0000FFFF802872A0
2025/02/20 17:21:31 [debug] 1114#1114: *101569 hashed path: /tmp/nginx/client-body/0000000002
2025/02/20 17:21:31 [debug] 1114#1114: *101569 temp fd:12
2025/02/20 17:21:31 [warn] 1114#1114: *101569 a client request body is buffered to a temporary file /tmp/nginx/client-body/0000000002, client: 127.0.0.1, server: _, request: "POST / HTTP/1.1", host: "localhost:8080"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 write: 12, 0000FFFF802C1690, 8192, 0
...
2025/02/20 17:21:31 [debug] 1114#1114: *101569 read: 12, 0000FFFF802C1690, 8192, 360448
...
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http proxy status 200 "200 OK"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http proxy header: "Content-Type: text/plain"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http proxy header: "X-Real-Server: echo-server"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http proxy header: "Date: Thu, 20 Feb 2025 17:21:31 GMT"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http proxy header: "Connection: close"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http proxy header: "Transfer-Encoding: chunked"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http proxy header done
2025/02/20 17:21:31 [debug] 1114#1114: *101569 lua header filter for user lua code, uri "/"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 code cache lookup (key='nhlf_9f4fbd7160880a0dbc1df5df3dfe0b59', ref=-1)
2025/02/20 17:21:31 [debug] 1114#1114: *101569 code cache miss (key='nhlf_9f4fbd7160880a0dbc1df5df3dfe0b59', ref=-1)
2025/02/20 17:21:31 [debug] 1114#1114: *101569 lua capture header filter, uri "/"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 headers more header filter, uri "/"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 HTTP/1.1 200 OK
Date: Thu, 20 Feb 2025 17:21:31 GMT
Content-Type: text/plain
Transfer-Encoding: chunked
Connection: keep-alive
X-Real-Server: echo-server

2025/02/20 17:21:31 [debug] 1114#1114: *101569 write new buf t:1 f:0 0000FFFF80287F70, pos 0000FFFF80287F70, size: 162 file: 0, size: 0
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http write filter: l:0 f:0 s:162
2025/02/20 17:21:31 [debug] 1114#1114: *101569 file cleanup: fd:12
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http proxy filter init s:200 h:0 c:1 l:-1
2025/02/20 17:21:31 [debug] 1114#1114: *101569 tcp_nodelay
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 38 s:0
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 30 s:1
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 30 s:1
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 0D s:1
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 0A s:3
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 57 s:4
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http proxy out buf 0000FFFF80199592 2048
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 0D s:5
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 0A s:6
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 64 s:0
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 63 s:1
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 62 s:1
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 0D s:1
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 0A s:3
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunked byte: 2C s:4
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http proxy out buf 0000FFFF80199D99 1879
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http upstream process non buffered downstream
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http output filter "/?"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http copy filter: "/?"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 lua capture body filter, uri "/"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http postpone filter "/?" 0000FFFF80288150
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunk: 2048
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http chunk: 1879
2025/02/20 17:21:31 [debug] 1114#1114: *101569 write old buf t:1 f:0 0000FFFF80287F70, pos 0000FFFF80287F70, size: 162 file: 0, size: 0
2025/02/20 17:21:31 [debug] 1114#1114: *101569 write new buf t:1 f:0 0000FFFF802881F0, pos 0000FFFF802881F0, size: 5 file: 0, size: 0
2025/02/20 17:21:31 [debug] 1114#1114: *101569 write new buf t:0 f:0 0000000000000000, pos 0000FFFF80199592, size: 2048 file: 0, size: 0
2025/02/20 17:21:31 [debug] 1114#1114: *101569 write new buf t:0 f:0 0000000000000000, pos 0000FFFF80199D99, size: 1879 file: 0, size: 0
2025/02/20 17:21:31 [debug] 1114#1114: *101569 write new buf t:0 f:0 0000000000000000, pos 0000AAAABA2B2288, size: 2 file: 0, size: 0
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http write filter: l:0 f:1 s:4096
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http write filter limit 2097152
2025/02/20 17:21:31 [debug] 1114#1114: *101569 writev: 4096 of 4096
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http write filter 0000000000000000
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http copy filter: 0 "/?"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 recv: eof:0, avail:0
2025/02/20 17:21:31 [debug] 1114#1114: *101569 event timer add: 15: 60000:17774858
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http upstream request: "/?"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http upstream send request handler
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http upstream send request
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http upstream send request body
2025/02/20 17:21:31 [debug] 1114#1114: *101569 chain writer in: 0000FFFF80287B90
2025/02/20 17:21:31 [debug] 1114#1114: *101569 writev: 1920 of 1920
2025/02/20 17:21:31 [debug] 1114#1114: *101569 chain writer out: 0000000000000000
2025/02/20 17:21:31 [debug] 1114#1114: *101569 read: -1, 0000FFFF802C1690, 8192, 368640
2025/02/20 17:21:31 [crit] 1114#1114: *101569 pread() "/tmp/nginx/client-body/0000000002" failed (9: Bad file descriptor) while sending request to upstream, client: 127.0.0.1, server: _, request: "POST / HTTP/1.1", upstream: "http://10.244.0.8:8080/", host: "localhost:8080"
2025/02/20 17:21:31 [debug] 1114#1114: *101569 http next upstream, 2
2025/02/20 17:21:31 [debug] 1114#1114: *101569 free keepalive peer
2025/02/20 17:21:31 [debug] 1114#1114: *101569 lua balancer free peer, tries: 2
2025/02/20 17:21:31 [debug] 1114#1114: *101569 finalize http upstream request: 502
2025/02/20 17:21:31 [debug] 1114#1114: *101569 finalize http proxy request

So, 0000FFFF802C1690 - is the buffer for /tmp/nginx/client-body/0000000002
Last read was at pos 360448, and then it tries to continue reads from pos 360448+8192=368640 to same buffer but from fh=-1?

Anything else we need to know:

We noticed the issue in both prod GKE and EKS clusters with ingress-nginx v1.11.2.
Tested that v1.12.0 and latest image with nginx 1.27 still have the issue.
The only found current workaround is to add:
nginx.ingress.kubernetes.io/proxy-request-buffering: "off"
But that slows down all uploads to 2Mbit/s comparing to usual >300Mbit/s with buffering enabled, and is not acceptable.

Possibly related:
#4474
openresty/lua-nginx-module#2374

Also tried, without success:

  • reduce nginx workers to 1
  • bumping client_body_buffer_size can make 10Mb file pass, but then 11Mb fails
  • client_body_in_file_only on;
  • all variants of chunked_transfer_encoding, proxy_buffering

How can I help fixing this issue? Do you have more ideas to try?

Metadata

Metadata

Assignees

No one assigned

    Labels

    kind/supportCategorizes issue or PR as a support question.lifecycle/frozenIndicates that an issue or PR should not be auto-closed due to staleness.needs-prioritytriage/needs-informationIndicates an issue needs more information in order to work on it.

    Type

    No type

    Projects

    • Status

      No status

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions