Description
My setup:
Running 5 volume servers (spinning disks) on a 1-node Kubernetes cluster. The SeaweedFS deployment itself seems to be working well and stays online - I've spent a handful of days scanning from a mount on my dev laptop without issue.
From there I have a couple of additional Kubernetes clusters where I use the CSI driver for shared storage. You can see that part of my stack here: https://github.com/OwnYourIO/SpencersLab/blob/main/services/proxy-local/prod/values.yaml#L1-L11
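For reference, the shared volume in the logs below is the statically defined "media" volume. Roughly, the PV/PVC side looks like the sketch here - the driver name and volumeHandle match the logs, but the volumeAttributes keys are my guesses based on the weed mount flags that show up further down, so treat them as illustrative rather than exact:

```yaml
# Illustrative sketch only - the volumeAttributes key names are guesses inferred
# from the weed mount flags in the logs (-collection, -replication, -disk);
# check the seaweedfs-csi-driver docs for the exact supported keys.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: media
spec:
  capacity:
    storage: 50Ti            # matches the 54975581388800-byte quota in the logs
  accessModes:
    - ReadWriteMany
  csi:
    driver: seaweedfs-csi-driver
    volumeHandle: media
    volumeAttributes:
      collection: media
      replication: "001"
      diskType: hdd
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: media
spec:
  storageClassName: ""
  volumeName: media
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 50Ti
```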
My issues:
CSI Driver crashing
Under load, the CSI driver consumes RAM and doesn't release it, resulting in OOM events.
CSI Driver not recovering from crash
After the above happens, all pods using the CSI driver for a mount give I/O errors when accessing those paths. To get them functional again I have to restart the mount pod, the node pod, and whatever is actually accessing the files. I'd bet it's the same issue as #200.
What I've done
Mostly I've tried updating the CSI driver so that it's current (image version: v1.4.3, chart version: 0.2.9), but I've also tried increasing and decreasing the memory limit (below, it's set to 8GB).
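Concretely, that limit is set on the node plugin pods through the chart values, roughly like this - the key layout (image.tag, node.resources) is my recollection of the chart rather than a verbatim copy, so double-check it against the chart's own values.yaml:

```yaml
# Rough sketch of the relevant chart values; key names are assumptions about the
# chart layout, not copied verbatim from chart version 0.2.9.
image:
  tag: v1.4.3
node:
  resources:
    limits:
      memory: 8Gi   # the limit the OOM killer enforces in the logs below
```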
I think this started when I updated the driver and the mount had to be split out... But I'm not sure whether I don't recall an issue back then because there genuinely wasn't one, or because other instabilities were causing worse havoc and had my attention instead.
The task that kills the driver the fastest is having qbittorrent recheck its files. I expect this to take a day+ for all the files, and it usually crashes out after a handful of hours. I figured maybe the problem was the app checking files in parallel, so I reduced Asynchronous I/O threads from 10 to 1, but that gave the same result.
At this point I've got a fairly robust monitoring stack, so if you're curious about anything else I can turn up or try, let me know! I think SeaweedFS is great, and while I'm not stoked to have it crash so consistently, I'm happy to have an opportunity to contribute :)
Logs
Below are some logs of the driver starting and crashing. I get a pretty much constant stream of [media] stderr: read new data2 messages and am not sure whether that's related.
Here are some logs of the CSI driver starting:
02/04/2026, 08:31:55.010 PM
I0205 03:31:55.010707 main.go:69 will run node: true, controller: false, attacher: true
02/04/2026, 08:31:55.011 PM
I0205 03:31:55.010993 main.go:81 connect to filer seaweedfs-filer-0.infra.spencerslab.com:8880,seaweedfs-filer-1.infra.spencerslab.com:8881,seaweedfs-filer-2.infra.spencerslab.com:8882,seaweedfs-filer-3.infra.spencerslab.com:8883,seaweedfs-filer-4.infra.spencerslab.com:8884
02/04/2026, 08:31:55.011 PM
I0205 03:31:55.011005 driver.go:58 Driver: seaweedfs-csi-driver version: 1.0.0
02/04/2026, 08:31:55.011 PM
I0205 03:31:55.011589 driver.go:145 Enabling volume access mode: MULTI_NODE_MULTI_WRITER
02/04/2026, 08:31:55.011 PM
I0205 03:31:55.011602 driver.go:145 Enabling volume access mode: SINGLE_NODE_WRITER
02/04/2026, 08:31:55.011 PM
I0205 03:31:55.011604 driver.go:145 Enabling volume access mode: SINGLE_NODE_MULTI_WRITER
02/04/2026, 08:31:55.011 PM
I0205 03:31:55.011606 driver.go:145 Enabling volume access mode: SINGLE_NODE_SINGLE_WRITER
02/04/2026, 08:31:55.011 PM
I0205 03:31:55.011620 driver.go:152 Enabling controller service capability: CREATE_DELETE_VOLUME
02/04/2026, 08:31:55.011 PM
I0205 03:31:55.011624 driver.go:152 Enabling controller service capability: SINGLE_NODE_MULTI_WRITER
02/04/2026, 08:31:55.011 PM
I0205 03:31:55.011625 driver.go:152 Enabling controller service capability: EXPAND_VOLUME
02/04/2026, 08:31:55.011 PM
I0205 03:31:55.011627 driver.go:152 Enabling controller service capability: PUBLISH_UNPUBLISH_VOLUME
02/04/2026, 08:31:55.011 PM
I0205 03:31:55.011631 driver.go:108 starting
02/04/2026, 08:31:55.012 PM
I0205 03:31:55.011972 server.go:92 Listening for connections on address: &net.UnixAddr{Name:"//csi/csi.sock", Net:"unix"}
02/04/2026, 08:31:55.099 PM
Started libcontainer container c88a2e16a7853f88efb140791d3f829ff57a28ecad156d9a5422b753e9fe0f97.
02/04/2026, 08:31:55.314 PM
Started libcontainer container 1e4dd277298fbee7fc863b4d9a63289f43ca1b8e936ca767c613bd480791fd16.
02/04/2026, 08:31:55.371 PM
I0205 03:31:55.371741 1 main.go:149] calling CSI driver to discover driver name
02/04/2026, 08:31:55.372 PM
I0205 03:31:55.372495 1 main.go:155] CSI driver name: "seaweedfs-csi-driver"
02/04/2026, 08:31:55.372 PM
I0205 03:31:55.372523 1 main.go:183] ServeMux listening at ":9808"
02/04/2026, 08:31:55.505 PM
I0205 03:31:55.505047 1205 pod_startup_latency_tracker.go:104] "Observed pod startup duration" pod="default/proxy-local-seaweedfs-csi-driver-node-zjrzp" podStartSLOduration=1.505030042 podStartE2EDuration="1.505030042s" podCreationTimestamp="2026-02-05 03:31:54 +0000 UTC" firstStartedPulling="0001-01-01 00:00:00 +0000 UTC" lastFinishedPulling="0001-01-01 00:00:00 +0000 UTC" observedRunningTime="2026-02-05 03:31:55.502244252 +0000 UTC m=+31319.589208067" watchObservedRunningTime="2026-02-05 03:31:55.505030042 +0000 UTC m=+31319.591993838"
02/04/2026, 08:31:55.850 PM
I0205 03:31:55.850161 1205 csi_plugin.go:106] kubernetes.io/csi: Trying to validate a new CSI Driver with name: seaweedfs-csi-driver endpoint: /var/lib/kubelet/plugins/seaweedfs-csi-driver/csi.sock versions: 1.0.0
02/04/2026, 08:31:55.850 PM
I0205 03:31:55.850198 1205 csi_plugin.go:119] kubernetes.io/csi: Register new plugin with name: seaweedfs-csi-driver at endpoint: /var/lib/kubelet/plugins/seaweedfs-csi-driver/csi.sock
02/04/2026, 08:31:58.030 PM
cri-containerd-628c3eba1e9d58020bebfae99a67044d30424bf96c605572aa597094ac47fcc3.scope: Deactivated successfully.
02/04/2026, 08:31:58.093 PM
run-k3s-containerd-io.containerd.runtime.v2.task-k8s.io-628c3eba1e9d58020bebfae99a67044d30424bf96c605572aa597094ac47fcc3-rootfs.mount: Deactivated successfully.
02/04/2026, 08:31:58.103 PM
Created slice libcontainer container kubepods-burstable-pod7319c3ba_3cb8_42d2_956b_41be1c37353f.slice.
02/04/2026, 08:31:58.170 PM
run-k3s-containerd-io.containerd.grpc.v1.cri-sandboxes-628c3eba1e9d58020bebfae99a67044d30424bf96c605572aa597094ac47fcc3-shm.mount: Deactivated successfully.
02/04/2026, 08:31:58.187 PM
cni0: port 39(veth22588d82) entered disabled state
02/04/2026, 08:31:58.188 PM
veth22588d82 (unregistering): left allmulticast mode
02/04/2026, 08:31:58.188 PM
veth22588d82 (unregistering): left promiscuous mode
02/04/2026, 08:31:58.188 PM
cni0: port 39(veth22588d82) entered disabled state
02/04/2026, 08:31:58.207 PM
run-netns-cni\x2dfb1f7f6e\x2d4427\x2df455\x2dcfb2\x2dd2df96e4938f.mount: Deactivated successfully.
02/04/2026, 08:31:58.242 PM
I0205 03:31:58.241945 1205 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-z45c4\" (UniqueName: \"kubernetes.io/projected/7319c3ba-3cb8-42d2-956b-41be1c37353f-kube-api-access-z45c4\") pod \"proxy-local-qbittorrent-98c78989c-zrl67\" (UID: \"7319c3ba-3cb8-42d2-956b-41be1c37353f\") " pod="default/proxy-local-qbittorrent-98c78989c-zrl67"
02/04/2026, 08:31:58.242 PM
I0205 03:31:58.242150 1205 operation_generator.go:515] "MountVolume.WaitForAttach entering for volume \"media\" (UniqueName: \"kubernetes.io/csi/seaweedfs-csi-driver^media\") pod \"proxy-local-qbittorrent-98c78989c-zrl67\" (UID: \"7319c3ba-3cb8-42d2-956b-41be1c37353f\") DevicePath \"csi-92e3312e0f52af45fdc11cbbc0d7aa1bb2c3e137b62c4ed7e5aa4bedaf883af1\"" pod="default/proxy-local-qbittorrent-98c78989c-zrl67"
02/04/2026, 08:31:58.258 PM
I0205 03:31:58.258870 1205 operation_generator.go:525] "MountVolume.WaitForAttach succeeded for volume \"media\" (UniqueName: \"kubernetes.io/csi/seaweedfs-csi-driver^media\") pod \"proxy-local-qbittorrent-98c78989c-zrl67\" (UID: \"7319c3ba-3cb8-42d2-956b-41be1c37353f\") DevicePath \"csi-92e3312e0f52af45fdc11cbbc0d7aa1bb2c3e137b62c4ed7e5aa4bedaf883af1\"" pod="default/proxy-local-qbittorrent-98c78989c-zrl67"
02/04/2026, 08:31:58.269 PM
I0205 03:31:58.269010 nodeserver.go:112 node publish volume media to /var/lib/kubelet/pods/7319c3ba-3cb8-42d2-956b-41be1c37353f/volumes/kubernetes.io~csi/media/mount
02/04/2026, 08:31:58.269 PM
I0205 03:31:58.269032 nodeserver.go:140 volume media not found in cache, attempting self-healing
02/04/2026, 08:31:58.269 PM
W0205 03:31:58.269077 mount_util.go:26 staging path /var/lib/kubelet/plugins/kubernetes.io/csi/seaweedfs-csi-driver/721c9525ade2ea8903d343ef25cf68b9bf4ab0aad56bb7b01fbe48d09bc7fcf4/globalmount has corrupted mount: stat /var/lib/kubelet/plugins/kubernetes.io/csi/seaweedfs-csi-driver/721c9525ade2ea8903d343ef25cf68b9bf4ab0aad56bb7b01fbe48d09bc7fcf4/globalmount: transport endpoint is not connected
02/04/2026, 08:31:58.269 PM
I0205 03:31:58.269833 nodeserver.go:150 volume media staging path /var/lib/kubelet/plugins/kubernetes.io/csi/seaweedfs-csi-driver/721c9525ade2ea8903d343ef25cf68b9bf4ab0aad56bb7b01fbe48d09bc7fcf4/globalmount is not healthy, re-staging
02/04/2026, 08:31:58.270 PM
I0205 03:31:58.269899 mount_util.go:69 cleaning up stale staging path /var/lib/kubelet/plugins/kubernetes.io/csi/seaweedfs-csi-driver/721c9525ade2ea8903d343ef25cf68b9bf4ab0aad56bb7b01fbe48d09bc7fcf4/globalmount
02/04/2026, 08:31:58.278 PM
var-lib-kubelet-plugins-kubernetes.io-csi-seaweedfs\x2dcsi\x2ddriver-721c9525ade2ea8903d343ef25cf68b9bf4ab0aad56bb7b01fbe48d09bc7fcf4-globalmount.mount: Deactivated successfully.
02/04/2026, 08:31:58.278 PM
I0205 03:31:58.278294 mount_util.go:98 successfully cleaned up staging path /var/lib/kubelet/plugins/kubernetes.io/csi/seaweedfs-csi-driver/721c9525ade2ea8903d343ef25cf68b9bf4ab0aad56bb7b01fbe48d09bc7fcf4/globalmount
02/04/2026, 08:31:58.278 PM
W0205 03:31:58.278417 mounter.go:180 VolumeContext 'path' ignored
02/04/2026, 08:31:58.278 PM
W0205 03:31:58.278422 mounter.go:180 VolumeContext 'csi.storage.k8s.io/pod.uid' ignored
02/04/2026, 08:31:58.278 PM
W0205 03:31:58.278425 mounter.go:180 VolumeContext 'csi.storage.k8s.io/serviceAccount.name' ignored
02/04/2026, 08:31:58.278 PM
W0205 03:31:58.278427 mounter.go:180 VolumeContext 'csi.storage.k8s.io/pod.name' ignored
02/04/2026, 08:31:58.278 PM
W0205 03:31:58.278429 mounter.go:180 VolumeContext 'csi.storage.k8s.io/ephemeral' ignored
02/04/2026, 08:31:58.278 PM
W0205 03:31:58.278430 mounter.go:180 VolumeContext 'csi.storage.k8s.io/pod.namespace' ignored
02/04/2026, 08:31:58.280 PM
I0205 03:31:58.279797 manager.go:245 [media] Starting weed mount: weed -logtostderr=true mount -dirAutoCreate=true -umask=000 -dir=/var/lib/kubelet/plugins/kubernetes.io/csi/seaweedfs-csi-driver/721c9525ade2ea8903d343ef25cf68b9bf4ab0aad56bb7b01fbe48d09bc7fcf4/globalmount -localSocket=/var/lib/seaweedfs-mount/seaweedfs-mount-721c9525ade2ea89.sock -cacheDir=/var/cache/seaweedfs/media -concurrentReaders=128 -disk=hdd -replication=001 -concurrentWriters=128 -filer=seaweedfs-filer-0.infra.spencerslab.com:8880,seaweedfs-filer-1.infra.spencerslab.com:8881,seaweedfs-filer-2.infra.spencerslab.com:8882,seaweedfs-filer-3.infra.spencerslab.com:8883,seaweedfs-filer-4.infra.spencerslab.com:8884 -cacheCapacityMB=0 -collection=media -filer.path=/buckets/media
02/04/2026, 08:31:58.343 PM
I0205 03:31:58.342468 1205 reconciler_common.go:163] "operationExecutor.UnmountVolume started for volume \"config\" (UniqueName: \"kubernetes.io/local-volume/pvc-49535829-c73e-4b50-99f3-028d21ed4e76\") pod \"88fc0060-efbe-4609-b85a-1cc7fc933392\" (UID: \"88fc0060-efbe-4609-b85a-1cc7fc933392\") "
02/04/2026, 08:31:58.343 PM
I0205 03:31:58.342592 1205 reconciler_common.go:163] "operationExecutor.UnmountVolume started for volume \"kube-api-access-z664h\" (UniqueName: \"kubernetes.io/projected/88fc0060-efbe-4609-b85a-1cc7fc933392-kube-api-access-z664h\") pod \"88fc0060-efbe-4609-b85a-1cc7fc933392\" (UID: \"88fc0060-efbe-4609-b85a-1cc7fc933392\") "
02/04/2026, 08:31:58.346 PM
I0205 03:31:58.346353 1205 operation_generator.go:781] UnmountVolume.TearDown succeeded for volume "kubernetes.io/projected/88fc0060-efbe-4609-b85a-1cc7fc933392-kube-api-access-z664h" (OuterVolumeSpecName: "kube-api-access-z664h") pod "88fc0060-efbe-4609-b85a-1cc7fc933392" (UID: "88fc0060-efbe-4609-b85a-1cc7fc933392"). InnerVolumeSpecName "kube-api-access-z664h". PluginName "kubernetes.io/projected", VolumeGIDValue ""
02/04/2026, 08:31:58.351 PM
I0205 03:31:58.351696 1205 operation_generator.go:781] UnmountVolume.TearDown succeeded for volume "kubernetes.io/local-volume/pvc-49535829-c73e-4b50-99f3-028d21ed4e76" (OuterVolumeSpecName: "config") pod "88fc0060-efbe-4609-b85a-1cc7fc933392" (UID: "88fc0060-efbe-4609-b85a-1cc7fc933392"). InnerVolumeSpecName "pvc-49535829-c73e-4b50-99f3-028d21ed4e76". PluginName "kubernetes.io/local-volume", VolumeGIDValue ""
02/04/2026, 08:31:58.439 PM
I0205 03:31:58.439789 manager.go:335 [media] stdout: mount point owner uid=0 gid=0 mode=drwxrwxrwx
02/04/2026, 08:31:58.439 PM
I0205 03:31:58.439808 manager.go:335 [media] stdout: current uid=0 gid=0
02/04/2026, 08:31:58.441 PM
I0205 03:31:58.441150 manager.go:335 [media] stderr: I0205 03:31:58.440955 leveldb_store.go:48 filer store dir: /var/cache/seaweedfs/media/e8e05423/meta
02/04/2026, 08:31:58.441 PM
I0205 03:31:58.441170 manager.go:335 [media] stderr: I0205 03:31:58.441031 file_util.go:28 Folder /var/cache/seaweedfs/media/e8e05423/meta Permission: -rwxr-xr-x
02/04/2026, 08:31:58.443 PM
I0205 03:31:58.443368 1205 reconciler_common.go:299] "Volume detached for volume \"kube-api-access-z664h\" (UniqueName: \"kubernetes.io/projected/88fc0060-efbe-4609-b85a-1cc7fc933392-kube-api-access-z664h\") on node \"cloud-proxy\" DevicePath \"\""
02/04/2026, 08:31:58.499 PM
I0205 03:31:58.499427 manager.go:78 started weed mount process for volume media at /var/lib/kubelet/plugins/kubernetes.io/csi/seaweedfs-csi-driver/721c9525ade2ea8903d343ef25cf68b9bf4ab0aad56bb7b01fbe48d09bc7fcf4/globalmount
02/04/2026, 08:31:58.507 PM
I0205 03:31:58.507902 manager.go:335 [media] stderr: I0205 03:31:58.507794 weedfs_grpc_server.go:15 quota changed from 0 to 54975581388800
02/04/2026, 08:31:58.508 PM
I0205 03:31:58.508475 nodeserver.go:170 volume media successfully re-staged to /var/lib/kubelet/plugins/kubernetes.io/csi/seaweedfs-csi-driver/721c9525ade2ea8903d343ef25cf68b9bf4ab0aad56bb7b01fbe48d09bc7fcf4/globalmount
02/04/2026, 08:31:58.548 PM
I0205 03:31:58.547966 manager.go:335 [media] stderr: I0205 03:31:58.547914 filer_conf.go:41 fuse filer conf /etc/seaweedfs/filer.conf not found
02/04/2026, 08:31:58.548 PM
I0205 03:31:58.547989 manager.go:335 [media] stderr: I0205 03:31:58.547946 mount_std.go:311 mounted seaweedfs-filer-0.infra.spencerslab.com:8880,seaweedfs-filer-1.infra.spencerslab.com:8881,seaweedfs-filer-2.infra.spencerslab.com:8882,seaweedfs-filer-3.infra.spencerslab.com:8883,seaweedfs-filer-4.infra.spencerslab.com:8884/buckets/media to /var/lib/kubelet/plugins/kubernetes.io/csi/seaweedfs-csi-driver/721c9525ade2ea8903d343ef25cf68b9bf4ab0aad56bb7b01fbe48d09bc7fcf4/globalmount
02/04/2026, 08:31:58.548 PM
I0205 03:31:58.547992 manager.go:335 [media] stderr: I0205 03:31:58.547956 mount_std.go:312 This is SeaweedFS version 30GB 4.07 linux amd64
02/04/2026, 08:31:58.548 PM
I0205 03:31:58.548479 manager.go:335 [media] stderr: I0205 03:31:58.548436 weedfs_metadata_flush.go:34 periodic metadata flush enabled, interval: 2m0s
02/04/2026, 08:31:58.550 PM
I0205 03:31:58.550236 nodeserver.go:180 volume media successfully published to /var/lib/kubelet/pods/7319c3ba-3cb8-42d2-956b-41be1c37353f/volumes/kubernetes.io~csi/media/mount
02/04/2026, 08:31:58.645 PM
I0205 03:31:58.645385 1205 reconciler_common.go:163] "operationExecutor.UnmountVolume started for volume \"media\" (UniqueName: \"kubernetes.io/csi/seaweedfs-csi-driver^media\") pod \"88fc0060-efbe-4609-b85a-1cc7fc933392\" (UID: \"88fc0060-efbe-4609-b85a-1cc7fc933392\") "
02/04/2026, 08:31:58.646 PM
I0205 03:31:58.646472 nodeserver.go:214 node unpublish volume media from /var/lib/kubelet/pods/88fc0060-efbe-4609-b85a-1cc7fc933392/volumes/kubernetes.io~csi/media/mount
02/04/2026, 08:31:58.647 PM
I0205 03:31:58.647754 nodeserver.go:242 volume media successfully unpublished from /var/lib/kubelet/pods/88fc0060-efbe-4609-b85a-1cc7fc933392/volumes/kubernetes.io~csi/media/mount
02/04/2026, 08:31:58.648 PM
I0205 03:31:58.648180 1205 operation_generator.go:781] UnmountVolume.TearDown succeeded for volume "kubernetes.io/csi/seaweedfs-csi-driver^media" (OuterVolumeSpecName: "media") pod "88fc0060-efbe-4609-b85a-1cc7fc933392" (UID: "88fc0060-efbe-4609-b85a-1cc7fc933392"). InnerVolumeSpecName "media". PluginName "kubernetes.io/csi", VolumeGIDValue ""
02/04/2026, 08:31:58.766 PM
<info> [1770262318.7661] manager: (vethdc54a712): new Veth device (/org/freedesktop/NetworkManager/Devices/52)
02/04/2026, 08:31:58.770 PM
cni0: port 39(vethdc54a712) entered blocking state
02/04/2026, 08:31:58.770 PM
cni0: port 39(vethdc54a712) entered disabled state
02/04/2026, 08:31:58.770 PM
vethdc54a712: entered allmulticast mode
02/04/2026, 08:31:58.770 PM
vethdc54a712: entered promiscuous mode
02/04/2026, 08:31:58.782 PM
cni0: port 39(vethdc54a712) entered blocking state
02/04/2026, 08:31:58.782 PM
cni0: port 39(vethdc54a712) entered forwarding state
02/04/2026, 08:31:58.782 PM
<info> [1770262318.7821] device (vethdc54a712): carrier: link connected
02/04/2026, 08:31:58.818 PM
Removed slice libcontainer container kubepods-burstable-pod88fc0060_efbe_4609_b85a_1cc7fc933392.slice.
02/04/2026, 08:31:58.939 PM
Started libcontainer container dcf937e221a1ce701650616656c3b33536453300ac08b3d6429f83ec30872478.
02/04/2026, 08:31:59.098 PM
var-lib-kubelet-pods-88fc0060\x2defbe\x2d4609\x2db85a\x2d1cc7fc933392-volumes-kubernetes.io\x7eprojected-kube\x2dapi\x2daccess\x2dz664h.mount: Deactivated successfully.
02/04/2026, 08:31:59.099 PM
var-lib-kubelet-pods-88fc0060\x2defbe\x2d4609\x2db85a\x2d1cc7fc933392-volumes-kubernetes.io\x7ecsi-media-mount.mount: Deactivated successfully.
02/04/2026, 08:31:59.100 PM
var-lib-kubelet-pods-88fc0060\x2defbe\x2d4609\x2db85a\x2d1cc7fc933392-volumes-kubernetes.io\x7elocal\x2dvolume-pvc\x2d49535829\x2dc73e\x2d4b50\x2d99f3\x2d028d21ed4e76.mount: Deactivated successfully.
02/04/2026, 08:31:59.147 PM
Started libcontainer container f1fc4a3ba21c71523ae5c55dc2ad2b2560c3d924f233f163bdd64a2033b89b00.
02/04/2026, 08:31:59.531 PM
I0205 03:31:59.530968 1205 pod_startup_latency_tracker.go:104] "Observed pod startup duration" pod="default/proxy-local-qbittorrent-98c78989c-zrl67" podStartSLOduration=1.530952622 podStartE2EDuration="1.530952622s" podCreationTimestamp="2026-02-05 03:31:58 +0000 UTC" firstStartedPulling="0001-01-01 00:00:00 +0000 UTC" lastFinishedPulling="0001-01-01 00:00:00 +0000 UTC" observedRunningTime="2026-02-05 03:31:59.528660964 +0000 UTC m=+31323.615624779" watchObservedRunningTime="2026-02-05 03:31:59.530952622 +0000 UTC m=+31323.617916417"
02/04/2026, 08:31:59.580 PM
I0205 03:31:59.580444 1205 event.go:389] "Event occurred" object="default/proxy-local-qbittorrent-bittorent" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="UpdatedLoadBalancer" message="Updated LoadBalancer with new IPs: [] -> [10.0.77.54]"
02/04/2026, 08:32:00.618 PM
I0205 03:32:00.618220 1205 kubelet_volumes.go:163] "Cleaned up orphaned pod volumes dir" podUID="88fc0060-efbe-4609-b85a-1cc7fc933392" path="/var/lib/kubelet/pods/88fc0060-efbe-4609-b85a-1cc7fc933392/volumes"
02/04/2026, 08:32:11.635 PM
wtmpdbd.service: Deactivated successfully.
02/04/2026, 08:32:35.561 PM
I0205 03:32:35.561559 manager.go:335 [media] stderr: read new data2 1770262350832676296 ns
02/04/2026, 08:32:35.561 PM
I0205 03:32:35.561704 manager.go:335 [media] stderr: read new data2 1770262350832920131 ns
02/04/2026, 08:32:35.561 PM
I0205 03:32:35.561724 manager.go:335 [media] stderr: read new data2 1770262350852472633 ns
02/04/2026, 08:32:35.561 PM
I0205 03:32:35.561819 manager.go:335 [media] stderr: read new data2 1770262350852537234 ns
02/04/2026, 08:32:35.561 PM
I0205 03:32:35.561830 manager.go:335 [media] stderr: read new data2 1770262350857176895 ns
02/04/2026, 08:32:35.561 PM
I0205 03:32:35.561866 manager.go:335 [media] stderr: read new data2 1770262350857258958 ns
02/04/2026, 08:32:35.561 PM
I0205 03:32:35.561869 manager.go:335 [media] stderr: read new data2 1770262350857344458 ns
02/04/2026, 08:32:35.562 PM
I0205 03:32:35.562005 manager.go:335 [media] stderr: read new data2 1770262350857391316 ns
02/04/2026, 08:32:35.562 PM
I0205 03:32:35.562037 manager.go:335 [media] stderr: read new data2 1770262350857494198 ns
And here are the logs where you can see it getting OOM-killed:
02/04/2026, 10:13:41.852 PM
I0205 05:13:41.851606 main.go:70 mount service listening on unix:///var/lib/seaweedfs-mount/seaweedfs-mount.sock
02/04/2026, 10:13:41.779 PM
Started libcontainer container fbacde1c495526ce09e788b00a8e6d10c693f0ef77ffb3aa347fcbb4da8320af.
02/04/2026, 10:13:41.712 PM
var-lib-rancher-k3s-agent-containerd-tmpmounts-containerd\x2dmount1686074154.mount: Deactivated successfully.
02/04/2026, 10:13:41.531 PM
I0205 05:13:41.531749 1205 scope.go:117] "RemoveContainer" containerID="ca4cf40145b96062c618efd1f26aea190f854ab8b2d6dd817198858e30a3c374"
02/04/2026, 10:13:40.956 PM
run-k3s-containerd-io.containerd.runtime.v2.task-k8s.io-ca4cf40145b96062c618efd1f26aea190f854ab8b2d6dd817198858e30a3c374-rootfs.mount: Deactivated successfully.
02/04/2026, 10:13:40.893 PM
cri-containerd-ca4cf40145b96062c618efd1f26aea190f854ab8b2d6dd817198858e30a3c374.scope: Consumed 7min 14.362s CPU time, 7.8G memory peak, 56.1G read from disk, 23G written to disk.
02/04/2026, 10:13:40.858 PM
cri-containerd-ca4cf40145b96062c618efd1f26aea190f854ab8b2d6dd817198858e30a3c374.scope: Deactivated successfully.
02/04/2026, 10:13:40.359 PM
cri-containerd-ca4cf40145b96062c618efd1f26aea190f854ab8b2d6dd817198858e30a3c374.scope: A process of this unit has been killed by the OOM killer.
02/04/2026, 10:13:40.358 PM
cri-containerd-ca4cf40145b96062c618efd1f26aea190f854ab8b2d6dd817198858e30a3c374.scope: A process of this unit has been killed by the OOM killer.
02/04/2026, 10:13:40.358 PM
Memory cgroup out of memory: Killed process 228533 (weed) total-vm:9629040kB, anon-rss:8184848kB, file-rss:60kB, shmem-rss:0kB, UID:0 pgtables:16504kB oom_score_adj:-997
02/04/2026, 10:13:40.358 PM
Memory cgroup out of memory: Killed process 227851 (seaweedfs-mount) total-vm:1235092kB, anon-rss:7072kB, file-rss:172kB, shmem-rss:0kB, UID:0 pgtables:124kB oom_score_adj:-997
02/04/2026, 10:13:40.358 PM
Tasks in /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podd239fe1b_39b5_4f06_aace_7287494cdc65.slice/cri-containerd-ca4cf40145b96062c618efd1f26aea190f854ab8b2d6dd817198858e30a3c374.scope are going to be killed due to memory.oom.group set
02/04/2026, 10:13:40.358 PM
Memory cgroup out of memory: Killed process 228528 (weed) total-vm:9629040kB, anon-rss:8184848kB, file-rss:60kB, shmem-rss:0kB, UID:0 pgtables:16504kB oom_score_adj:-997
02/04/2026, 10:13:40.356 PM
oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=cri-containerd-ca4cf40145b96062c618efd1f26aea190f854ab8b2d6dd817198858e30a3c374.scope,mems_allowed=0,oom_memcg=/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podd239fe1b_39b5_4f06_aace_7287494cdc65.slice,task_memcg=/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podd239fe1b_39b5_4f06_aace_7287494cdc65.slice/cri-containerd-ca4cf40145b96062c618efd1f26aea190f854ab8b2d6dd817198858e30a3c374.scope,task=weed,pid=228528,uid=0
02/04/2026, 10:13:40.356 PM
[ 228528] 0 228528 2407260 2046227 2046212 15 0 16900096 0 -997 weed
02/04/2026, 10:13:40.356 PM
[ 227851] 0 227851 308773 1811 1768 43 0 126976 0 -997 seaweedfs-mount
02/04/2026, 10:13:40.356 PM
[ 227825] 65535 227825 245 13 0 13 0 36864 0 -998 pause
02/04/2026, 10:13:40.356 PM
[ pid ] uid tgid total_vm rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
02/04/2026, 10:13:40.356 PM
Tasks state (memory values in pages):
02/04/2026, 10:13:40.356 PM
numa_hint_faults 0
02/04/2026, 10:13:40.356 PM
numa_pte_updates 0
02/04/2026, 10:13:40.356 PM
numa_pages_migrated 0
02/04/2026, 10:13:40.355 PM
thp_swpout_fallback 0
02/04/2026, 10:13:40.354 PM
thp_swpout 0
02/04/2026, 10:13:40.354 PM
thp_collapse_alloc 126
02/04/2026, 10:13:40.354 PM
thp_fault_alloc 1184
02/04/2026, 10:13:40.354 PM
zswpwb 0
02/04/2026, 10:13:40.354 PM
zswpout 0
02/04/2026, 10:13:40.354 PM
zswpin 0
02/04/2026, 10:13:40.354 PM
swpout_zero 0
02/04/2026, 10:13:40.354 PM
swpin_zero 0
02/04/2026, 10:13:40.354 PM
pglazyfreed 0
02/04/2026, 10:13:40.354 PM
pglazyfree 0
02/04/2026, 10:13:40.354 PM
pgdeactivate 3386026
02/04/2026, 10:13:40.354 PM
pgactivate 591782
02/04/2026, 10:13:40.354 PM
pgrefill 3480894
02/04/2026, 10:13:40.354 PM
pgmajfault 6269
02/04/2026, 10:13:40.354 PM
pgfault 1597060
02/04/2026, 10:13:40.354 PM
pgsteal_proactive 0
02/04/2026, 10:13:40.354 PM
pgsteal_khugepaged 1783
02/04/2026, 10:13:40.354 PM
pgsteal_direct 2147378
02/04/2026, 10:13:40.354 PM
pgsteal_kswapd 16616823
02/04/2026, 10:13:40.354 PM
pgscan_proactive 0
02/04/2026, 10:13:40.354 PM
pgscan_khugepaged 1783
02/04/2026, 10:13:40.354 PM
pgscan_direct 10179751
02/04/2026, 10:13:40.353 PM
pgscan_kswapd 16911573
02/04/2026, 10:13:40.353 PM
pswpout 0
02/04/2026, 10:13:40.353 PM
pswpin 0
02/04/2026, 10:13:40.353 PM
pgsteal 18765984
02/04/2026, 10:13:40.353 PM
pgscan 27093107
02/04/2026, 10:13:40.353 PM
pgpromote_success 0
02/04/2026, 10:13:40.353 PM
pgdemote_proactive 0
02/04/2026, 10:13:40.353 PM
pgdemote_khugepaged 0
02/04/2026, 10:13:40.353 PM
pgdemote_direct 0
02/04/2026, 10:13:40.353 PM
pgdemote_kswapd 0
02/04/2026, 10:13:40.353 PM
workingset_nodereclaim 0
02/04/2026, 10:13:40.353 PM
workingset_restore_file 974536
02/04/2026, 10:13:40.353 PM
workingset_restore_anon 0
02/04/2026, 10:13:40.353 PM
workingset_activate_file 3047139
02/04/2026, 10:13:40.353 PM
workingset_activate_anon 0
02/04/2026, 10:13:40.353 PM
workingset_refault_file 14604982
02/04/2026, 10:13:40.353 PM
workingset_refault_anon 0
02/04/2026, 10:13:40.353 PM
slab 6325240
02/04/2026, 10:13:40.353 PM
slab_unreclaimable 744600
02/04/2026, 10:13:40.353 PM
slab_reclaimable 5580640
02/04/2026, 10:13:40.353 PM
unevictable 0
02/04/2026, 10:13:40.353 PM
active_file 167936
02/04/2026, 10:13:40.353 PM
inactive_file 11300864
02/04/2026, 10:13:40.353 PM
active_anon 20480
02/04/2026, 10:13:40.353 PM
inactive_anon 8388886528
02/04/2026, 10:13:40.353 PM
shmem_thp 0
02/04/2026, 10:13:40.353 PM
file_thp 0
02/04/2026, 10:13:40.353 PM
anon_thp 2382364672
02/04/2026, 10:13:40.353 PM
swapcached 0
02/04/2026, 10:13:40.353 PM
file_writeback 0
02/04/2026, 10:13:40.353 PM
file_dirty 0
02/04/2026, 10:13:40.353 PM
file_mapped 65536
02/04/2026, 10:13:40.353 PM
zswapped 0
02/04/2026, 10:13:40.353 PM
zswap 0
02/04/2026, 10:13:40.353 PM
shmem 0
02/04/2026, 10:13:40.353 PM
vmalloc 0
02/04/2026, 10:13:40.353 PM
sock 26472448
02/04/2026, 10:13:40.353 PM
percpu 20448
02/04/2026, 10:13:40.353 PM
sec_pagetables 0
02/04/2026, 10:13:40.353 PM
pagetables 0
02/04/2026, 10:13:40.352 PM
kernel_stack 638976
02/04/2026, 10:13:40.352 PM
kernel 24088576
02/04/2026, 10:13:40.352 PM
file 8536064
02/04/2026, 10:13:40.352 PM
anon 8388907008
02/04/2026, 10:13:40.352 PM
Memory cgroup stats for /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podd239fe1b_39b5_4f06_aace_7287494cdc65.slice:
02/04/2026, 10:13:40.352 PM
swap: usage 0kB, limit 9007199254740988kB, failcnt 0
02/04/2026, 10:13:40.352 PM
memory: usage 8253440kB, limit 8253440kB, failcnt 52967
02/04/2026, 10:13:40.352 PM
</TASK>
02/04/2026, 10:13:40.351 PM
R13: 0000000000000008 R14: 000000c0001021c0 R15: 000000000000000e
02/04/2026, 10:13:40.351 PM
R10: 00007f3448e938b0 R11: 0000000000000000 R12: 000000c004f67500
02/04/2026, 10:13:40.351 PM
RBP: 000000c00010bbe0 R08: 0000000007918e20 R09: 0000000000000030
02/04/2026, 10:13:40.351 PM
RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000101
02/04/2026, 10:13:40.351 PM
RAX: 0000000000000000 RBX: 000000c00010bd70 RCX: 000000c00010bc40
02/04/2026, 10:13:40.351 PM
RSP: 002b:000000c00010bb70 EFLAGS: 00010206
02/04/2026, 10:13:40.350 PM
Code: Unable to access opcode bytes at 0x463b7b.
02/04/2026, 10:13:40.350 PM
RIP: 0033:0x463ba5
02/04/2026, 10:13:40.350 PM
asm_exc_page_fault+0x26/0x30
02/04/2026, 10:13:40.350 PM
exc_page_fault+0x69/0x170
02/04/2026, 10:13:40.350 PM
do_user_addr_fault+0x21a/0x690
02/04/2026, 10:13:40.350 PM
handle_mm_fault+0xe7/0x2d0
02/04/2026, 10:13:40.350 PM
__handle_mm_fault+0x8b8/0xf00
02/04/2026, 10:13:40.350 PM
do_fault+0x38d/0x600
02/04/2026, 10:13:40.350 PM
__do_fault+0x34/0x1d0
02/04/2026, 10:13:40.350 PM
filemap_fault+0x112/0x1610
02/04/2026, 10:13:40.349 PM
? page_cache_ra_unbounded+0x1ac/0x270
02/04/2026, 10:13:40.349 PM
__filemap_get_folio+0x1c6/0x550
02/04/2026, 10:13:40.349 PM
? alloc_pages_mpol+0x86/0x170
02/04/2026, 10:13:40.349 PM
filemap_add_folio+0x7e/0x210
02/04/2026, 10:13:40.349 PM
__mem_cgroup_charge+0x30/0x90
02/04/2026, 10:13:40.349 PM
charge_memcg+0x2f/0x70
02/04/2026, 10:13:40.347 PM
try_charge_memcg+0x430/0x650
02/04/2026, 10:13:40.347 PM
mem_cgroup_out_of_memory+0xc2/0xd0
02/04/2026, 10:13:40.345 PM
out_of_memory+0x210/0x500
02/04/2026, 10:13:40.336 PM
oom_kill_process.cold+0x8/0x90
02/04/2026, 10:13:40.336 PM
dump_header+0x43/0x1aa
02/04/2026, 10:13:40.336 PM
dump_stack_lvl+0x5b/0x80
02/04/2026, 10:13:40.336 PM
<TASK>
02/04/2026, 10:13:40.336 PM
Call Trace:
02/04/2026, 10:13:40.334 PM
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 3.20230228-4 06/06/2023
02/04/2026, 10:13:40.333 PM
CPU: 0 UID: 0 PID: 228533 Comm: weed Not tainted 6.18.8-1-default #1 PREEMPT(voluntary) openSUSE Tumbleweed 4eb7683dc1a12cae14b0b6e4bc7d5605cb13a285
02/04/2026, 10:13:40.274 PM
weed invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-997
02/04/2026, 10:13:36.354 PM
I0205 05:13:36.353953 manager.go:335 [media] stderr: read new data2 1770268406108804553 ns
02/04/2026, 10:13:36.354 PM
I0205 05:13:36.353951 manager.go:335 [media] stderr: read new data2 1770268406108722929 ns
02/04/2026, 10:13:36.354 PM
I0205 05:13:36.353949 manager.go:335 [media] stderr: read new data2 1770268406108621400 ns
02/04/2026, 10:13:36.354 PM
I0205 05:13:36.353941 manager.go:335 [media] stderr: read new data2 1770268406108485164 ns
02/04/2026, 10:13:36.354 PM
I0205 05:13:36.353936 manager.go:335 [media] stderr: read new data2 1770268406108313723 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353934 manager.go:335 [media] stderr: read new data2 1770268406108163552 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353923 manager.go:335 [media] stderr: read new data2 1770268406167524207 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353291 manager.go:335 [media] stderr: read new data2 1770268406162929235 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353278 manager.go:335 [media] stderr: read new data2 1770268406174086373 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353274 manager.go:335 [media] stderr: read new data2 1770268406141236602 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353272 manager.go:335 [media] stderr: read new data2 1770268406170717926 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353258 manager.go:335 [media] stderr: read new data2 1770268406157001988 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353255 manager.go:335 [media] stderr: read new data2 1770268404549432031 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353251 manager.go:335 [media] stderr: read new data2 1770268406162494801 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353248 manager.go:335 [media] stderr: read new data2 1770268406116112414 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353243 manager.go:335 [media] stderr: read new data2 1770268406126996902 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353239 manager.go:335 [media] stderr: read new data2 1770268404525444892 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353237 manager.go:335 [media] stderr: read new data2 1770268405826195551 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353235 manager.go:335 [media] stderr: read new data2 1770268406172990591 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353233 manager.go:335 [media] stderr: read new data2 1770268406125892965 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353231 manager.go:335 [media] stderr: read new data2 1770268404524028129 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353201 manager.go:335 [media] stderr: read new data2 1770268404522823745 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353199 manager.go:335 [media] stderr: read new data2 1770268404543921163 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353197 manager.go:335 [media] stderr: read new data2 1770268404549331292 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353195 manager.go:335 [media] stderr: read new data2 1770268406111751510 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353193 manager.go:335 [media] stderr: read new data2 1770268406107030851 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353191 manager.go:335 [media] stderr: read new data2 1770268406107946666 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353142 manager.go:335 [media] stderr: read new data2 1770268404501580012 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353139 manager.go:335 [media] stderr: read new data2 1770268404540559749 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353136 manager.go:335 [media] stderr: read new data2 1770268406162169212 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353133 manager.go:335 [media] stderr: read new data2 1770268404509416954 ns
02/04/2026, 10:13:36.353 PM
I0205 05:13:36.353122 manager.go:335 [media] stderr: read new data2 1770268404509274848 ns
02/04/2026, 10:13:36.352 PM
I0205 05:13:36.352528 manager.go:335 [media] stderr: read new data2 1770268404501452964 ns
02/04/2026, 10:13:36.352 PM
I0205 05:13:36.352525 manager.go:335 [media] stderr: read new data2 1770268404502299008 ns
02/04/2026, 10:13:36.352 PM
I0205 05:13:36.352522 manager.go:335 [media] stderr: read new data2 1770268404534914731 ns
02/04/2026, 10:13:36.352 PM
I0205 05:13:36.352520 manager.go:335 [media] stderr: read new data2 1770268404509098447 ns
02/04/2026, 10:13:36.352 PM
I0205 05:13:36.352515 manager.go:335 [media] stderr: read new data2 1770268404525334144 ns
02/04/2026, 10:13:36.352 PM
I0205 05:13:36.352508 manager.go:335 [media] stderr: read new data2 1770268404525155891 ns
02/04/2026, 10:13:36.352 PM
I0205 05:13:36.352505 manager.go:335 [media] stderr: read new data2 1770268404537952960 ns
02/04/2026, 10:13:36.352 PM
I0205 05:13:36.352502 manager.go:335 [media] stderr: read new data2 1770268404195716213 ns
02/04/2026, 10:13:36.352 PM
I0205 05:13:36.352500 manager.go:335 [media] stderr: read new data2 1770268404508996878 ns
02/04/2026, 10:13:36.352 PM
I0205 05:13:36.352497 manager.go:335 [media] stderr: read new data2 1770268404508896499 ns
02/04/2026, 10:13:36.352 PM
I0205 05:13:36.352487 manager.go:335 [media] stderr: read new data2 1770268406762411758 ns
02/04/2026, 10:13:36.352 PM
I0205 05:13:36.352248 manager.go:335 [media] stderr: read new data2 1770268404501343048 ns
02/04/2026, 10:13:36.352 PM
I0205 05:13:36.352245 manager.go:335 [media] stderr: read new data2 1770268404501208847 ns
02/04/2026, 10:13:36.352 PM
I0205 05:13:36.352231 manager.go:335 [media] stderr: read new data2 1770268404501062153 ns
02/04/2026, 10:13:36.352 PM
I0205 05:13:36.352059 manager.go:335 [media] stderr: read new data2 1770268404500957947 ns
02/04/2026, 10:13:36.351 PM
I0205 05:13:36.351746 manager.go:335 [media] stderr: read new data2 1770268404500844965 ns
02/04/2026, 10:13:36.351 PM
I0205 05:13:36.351139 manager.go:335 [media] stderr: read new data2 1770268404500666641 ns
02/04/2026, 10:13:36.351 PM
I0205 05:13:36.351137 manager.go:335 [media] stderr: read new data2 1770268404500552318 ns
02/04/2026, 10:13:36.351 PM
I0205 05:13:36.351136 manager.go:335 [media] stderr: read new data2 1770268404500435168 ns
02/04/2026, 10:13:36.351 PM
I0205 05:13:36.351134 manager.go:335 [media] stderr: read new data2 1770268404500311087 ns
02/04/2026, 10:13:36.351 PM
I0205 05:13:36.351132 manager.go:335 [media] stderr: read new data2 1770268404500105110 ns
02/04/2026, 10:13:36.351 PM
I0205 05:13:36.351131 manager.go:335 [media] stderr: read new data2 1770268403703536312 ns
02/04/2026, 10:13:36.351 PM
I0205 05:13:36.351129 manager.go:335 [media] stderr: read new data2 1770268403708519230 ns