
Mountpoint's file closing logic for writing breaks in V2 when multiple processes access the same open file #626

@dannycjones

Description

What happened?

File writes fail because a FLUSH from one file descriptor completes the S3 upload prematurely, after which further writes to the still-open file are rejected.

...
50604:2025-09-23T16:23:08.047121Z DEBUG fuser::request: FUSE(91744) ino 0x0000000000000011 WRITE fh FileHandle(101), offset 40542208, size 4096, write flags 0x0
50606:2025-09-23T16:23:08.047210Z DEBUG fuser::request: FUSE(91748) ino 0x0000000000000011 WRITE fh FileHandle(101), offset 40546304, size 4096, write flags 0x0
50609:2025-09-23T16:23:08.047292Z DEBUG fuser::request: FUSE(91754) ino 0x0000000000000011 FLUSH fh FileHandle(101), lock owner LockOwner(4103602521915412644)
50610:2025-09-23T16:23:08.047296Z DEBUG fuser::request: FUSE(91756) ino 0x0000000000000011 WRITE fh FileHandle(101), offset 40550400, size 4096, write flags 0x0
50612:2025-09-23T16:23:08.047354Z DEBUG fuser::request: FUSE(91760) ino 0x0000000000000011 WRITE fh FileHandle(101), offset 40554496, size 4096, write flags 0x0
51713:2025-09-23T16:23:08.119019Z DEBUG flush{req=91754 ino=17 fh=101 pid=0 name="part-00000-5b5cbfde-a336-40c5-9bf1-a4298fb20b1b-c000.snappy.parquet"}: mountpoint_s3_fs::fs::handles: put succeeded etag="\"47d379521fe24ad28da6fadd434a912d-5\"" key="order/output/ondemand/_temporary/0/_temporary/attempt_202509231623041559459485160075385_0007_m_000000_51/part-00000-5b5cbfde-a336-40c5-9bf1-a4298fb20b1b-c000.snappy.parquet" size=40554496
51714:2025-09-23T16:23:08.119073Z  WARN write{req=91760 ino=17 fh=101 offset=40554496 length=4096 pid=0 name="part-00000-5b5cbfde-a336-40c5-9bf1-a4298fb20b1b-c000.snappy.parquet"}: mountpoint_s3_fs::fuse: write failed with errno 5: upload already completed for key "order/output/ondemand/_temporary/0/_temporary/attempt_202509231623041559459485160075385_0007_m_000000_51/part-00000-5b5cbfde-a336-40c5-9bf1-a4298fb20b1b-c000.snappy.parquet"
55863:2025-09-23T16:23:08.367917Z DEBUG fuser::request: FUSE(102002) ino 0x0000000000000011 FLUSH fh FileHandle(101), lock owner LockOwner(13709436311748740162)
57344:2025-09-23T16:23:08.456849Z DEBUG fuser::request: FUSE(104866) ino 0x0000000000000011 FLUSH fh FileHandle(101), lock owner LockOwner(2560387006992014556)
58457:2025-09-23T16:23:08.521675Z DEBUG fuser::request: FUSE(107006) ino 0x0000000000000011 FLUSH fh FileHandle(101), lock owner LockOwner(15889247685641757733)
207094:2025-09-23T16:23:18.046015Z DEBUG fuser::request: FUSE(403344) ino 0x0000000000000011 FLUSH fh FileHandle(101), lock owner LockOwner(7237174021238096036)
...
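
The log pattern above (a FLUSH on fh 101 that completes the upload, followed by WRITEs on the same handle that fail with errno 5) is consistent with multiple processes sharing one open file descriptor. The sketch below is a minimal, hypothetical reproduction of that pattern outside Spark: a forked child exits while the parent is still writing, the child's implicit close reaches FUSE as a FLUSH, and if Mountpoint treats that FLUSH as the final close it completes the S3 upload so the parent's remaining writes fail with EIO. The path and write sizes are placeholders, not taken from the job in the logs.

/*
 * Hypothetical reproduction: parent and forked child share one open fd on a
 * Mountpoint for Amazon S3 mount. The child's exit closes its copy of the fd,
 * which the kernel surfaces to FUSE as a FLUSH; if Mountpoint completes the
 * upload on that FLUSH, the parent's later writes fail with EIO.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    /* Placeholder path; replace with a file on the actual Mountpoint mount. */
    const char *path = "/mnt/s3/shared-fd-test.bin";
    char buf[4096];
    memset(buf, 'x', sizeof(buf));

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    pid_t pid = fork();
    if (pid < 0) { perror("fork"); return 1; }

    if (pid == 0) {
        /* Child: write once, then exit. Exiting closes the child's copy of
         * the shared fd, which FUSE sees as a FLUSH on the same file handle. */
        if (write(fd, buf, sizeof(buf)) < 0) perror("child write");
        _exit(0);
    }

    /* Parent: wait for the child (and its FLUSH), then keep writing.
     * With the V2 regression these writes are expected to fail with EIO
     * ("upload already completed"). */
    waitpid(pid, NULL, 0);
    for (int i = 0; i < 10; i++) {
        if (write(fd, buf, sizeof(buf)) < 0) {
            perror("parent write");
            close(fd);
            return 1;
        }
    }

    if (close(fd) < 0) perror("close");
    return 0;
}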

What you expected to happen?

Behavior should not change between V1 and V2: the file should be written in full, and the upload to S3 should complete only after all file descriptors have been closed.

How to reproduce it (as minimally and precisely as possible)?

Running this example is enough to trigger the failure: https://github.com/awslabs/data-on-eks/blob/main/analytics/terraform/spark-k8s-operator/examples/karpenter/spark-app-graviton.yaml

Anything else we need to know?:

Internal tracking: D306200245

Workaround (not recommended): kubectl patch daemonset s3-csi-node -n kube-system -p '{"spec":{"template":{"spec":{"hostPID":true}}}}' && kubectl rollout restart daemonset s3-csi-node -n kube-system

This is a regression caused by the containerization of Mountpoint in V2 (V2 issue: #504).

Environment

  • Kubernetes version (use kubectl version): TBC
  • Driver version: v2.0.0
