[Bug] soci-snapshotter-grpc.service takes up huge memory #1850

@prafgup

Description

I profiled SOCI's memory usage both locally and on a k8s node and found that it keeps increasing even with no pods running on the node.

I have tried containerd v1.7.27 and v2.2.0, and both have the same issue.

Command to check the memory usage of system.slice/soci-snapshotter-grpc.service:

systemd-cgtop -m

Output:

[Image: screenshot of the systemd-cgtop output]

As the output above shows, soci is using 2.6+ GB of memory even though no containers are currently using soci.
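The figure reported for the service is a cgroup total, and on cgroup v2 that total normally includes page cache as well as process heap. A minimal sketch for splitting the two is below. It assumes a cgroup v2 host with the unified hierarchy mounted at /sys/fs/cgroup (the path is an assumption, not taken from this node), and the helper names are made up for illustration.

```python
from pathlib import Path

# Assumption: cgroup v2 unified hierarchy; adjust the path for other layouts.
CGROUP_DIR = Path("/sys/fs/cgroup/system.slice/soci-snapshotter-grpc.service")


def parse_memory_stat(text: str) -> dict[str, int]:
    """Parse the 'key value' lines of a cgroup v2 memory.stat file into byte counts."""
    stats: dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not exactly "<name> <non-negative integer>".
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def human(n: int) -> str:
    """Format a byte count using binary units, capped at GiB."""
    value = float(n)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if value < 1024 or unit == "GiB":
            return f"{value:.1f} {unit}"
        value /= 1024
    return f"{value:.1f} GiB"  # not reached; keeps type checkers happy


if __name__ == "__main__":
    stat_file = CGROUP_DIR / "memory.stat"
    try:
        stats = parse_memory_stat(stat_file.read_text())
    except OSError as exc:
        print(f"could not read {stat_file}: {exc}")
    else:
        # "anon" is anonymous memory (heap, stacks); "file" is page cache.
        print("anon:", human(stats.get("anon", 0)))
        print("file:", human(stats.get("file", 0)))
```

If most of the total turns out to be `file`, the kernel is caching on-disk data on behalf of the service; if it is mostly `anon`, the process itself is holding the memory.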

ls -lah /var/lib/soci-snapshotter-grpc

total 72M
drwxr-xr-x  6 root root 4.0K Jan 19 18:38 .
drwxr-xr-x 56 root root 4.0K Jan 19 18:41 ..
drwxr-xr-x  4 root root 4.0K Jan 20 17:58 content
-rw-------  1 root root  88M Jan 20 18:26 metadata.db
drwx------  3 root root 4.0K Jan 19 18:38 snapshotter
drwx------  4 root root 4.0K Jan 20 17:58 soci
drwx------  2 root root 4.0K Jan 19 18:38 unpack



sudo du -h --max-depth=2

4.0K	./unpack
11G	./snapshotter/snapshots
11G	./snapshotter
2.0G	./soci/spancache
180K	./soci/httpcache
2.0G	./soci
263M	./content/blobs
4.0K	./content/ingest
263M	./content
13G	.

I am using a basic soci config:

cat /etc/soci-snapshotter-grpc/config.toml


[http]
  DialTimeoutMsec = 10000
  ResponseHeaderTimeoutMsec = 10000
  RequestTimeoutMsec = 300000
  MaxRetries = 10
  MinWaitMsec = 300
  MaxWaitMsec = 300000


[cri_keychain]
enable_keychain = true
image_service_path = "/run/containerd/containerd.sock"

This would be an issue on any system that runs many unique images. RAM usage keeps increasing, and so does CPU usage, even after the pods are deleted.

sudo systemctl status soci-snapshotter-grpc
● soci-snapshotter-grpc.service - SOCI Snapshotter gRPC
     Loaded: loaded (/etc/systemd/system/soci-snapshotter-grpc.service; enabled; vendor preset: enabled)
     Active: active (running) since Mon 2026-01-19 18:38:18 UTC; 24h ago
   Main PID: 924 (soci-snapshotte)
      Tasks: 107 (limit: 75875)
     Memory: 2.6G
        CPU: 6min 58.194s
     CGroup: /system.slice/soci-snapshotter-grpc.service
             └─924 /usr/local/bin/soci-snapshotter-grpc --address /run/soci-snapshotter-grpc/soci-snapshotter-grpc.sock --config /etc/soci-snapshotter-grpc/config.toml --log-level info
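The Memory, CPU and Tasks values in the status output can also be read in machine-readable form, which makes it easier to log them over time. The sketch below assumes that `systemctl show` exposes them as the `MemoryCurrent`, `CPUUsageNSec` and `TasksCurrent` properties and that resource accounting is enabled for the unit; the function names are made up for illustration.

```python
import subprocess
from typing import Optional

UNIT = "soci-snapshotter-grpc.service"
# Assumption: these systemd properties back the Memory/CPU/Tasks lines of `systemctl status`.
PROPERTIES = ("MemoryCurrent", "CPUUsageNSec", "TasksCurrent")


def parse_show_output(text: str) -> dict[str, Optional[int]]:
    """Parse KEY=VALUE lines; non-numeric values such as "[not set]" become None."""
    result: dict[str, Optional[int]] = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line
        result[key] = int(value) if value.isdigit() else None
    return result


def query_unit(unit: str = UNIT) -> dict[str, Optional[int]]:
    """Run `systemctl show` for the unit and return the selected properties."""
    args = ["systemctl", "show", unit]
    for prop in PROPERTIES:
        args += ["-p", prop]
    out = subprocess.run(args, capture_output=True, text=True, check=True).stdout
    return parse_show_output(out)


if __name__ == "__main__":
    try:
        print(query_unit())
    except (OSError, subprocess.CalledProcessError) as exc:
        print(f"could not query systemd: {exc}")
```

`MemoryCurrent` is in bytes and `CPUUsageNSec` is in nanoseconds, so two readings taken some time apart give both the memory growth and the CPU time consumed in between.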

Steps to reproduce the bug

  1. Launch soci-snapshotter-grpc as a systemd service
  2. Check memory usage with systemd-cgtop -m (soci uses ~10 MB)
  3. Pull a soci image and run it (use a pod on k8s). Memory spikes to 100-200 MB
  4. Wait for some time while the background spans are fetched (memory keeps climbing to 1-2 GB, which is more than the size of the whole image)
  5. (Optional) Repeat step 3 multiple times
  6. Delete the pod (memory is still not released)
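To put numbers on the steps above, the service's memory can be sampled at a fixed interval while the pod is started, left to fetch spans, and deleted. This is a rough sketch only: it assumes cgroup v2 and the `memory.current` path shown (not verified on this node), and `sample` and `released_fraction` are hypothetical helpers, not part of soci.

```python
import time
from pathlib import Path
from typing import Callable

# Assumption: cgroup v2 unified hierarchy; adjust the path for other layouts.
MEMORY_CURRENT = Path(
    "/sys/fs/cgroup/system.slice/soci-snapshotter-grpc.service/memory.current"
)


def read_memory_current(path: Path = MEMORY_CURRENT) -> int:
    """Return the cgroup's current memory usage in bytes."""
    return int(path.read_text().strip())


def sample(
    reader: Callable[[], int],
    count: int,
    interval_s: float,
    sleep: Callable[[float], None] = time.sleep,
) -> list[int]:
    """Call reader() count times, sleeping interval_s seconds between calls."""
    samples: list[int] = []
    for i in range(count):
        samples.append(reader())
        if i < count - 1:
            sleep(interval_s)
    return samples


def released_fraction(samples: list[int]) -> float:
    """Fraction of the peak usage that has been given back by the last sample."""
    peak = max(samples)
    if peak == 0:
        return 0.0
    return (peak - samples[-1]) / peak


if __name__ == "__main__":
    if MEMORY_CURRENT.exists():
        # 10 samples, 30 s apart; run the pod lifecycle while this is sampling.
        readings = sample(read_memory_current, count=10, interval_s=30)
        print("samples (bytes):", readings)
        print(f"released after peak: {released_fraction(readings):.0%}")
    else:
        print(f"{MEMORY_CURRENT} not found (is this host using cgroup v2?)")
```

A released fraction that stays near 0% after the pod is deleted corresponds to the behavior described in step 6.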

Describe the results you expected

I expect soci not to have constantly high memory usage, and I expect the memory to be released once the pod is deleted. Soci should cache the blobs on disk rather than keep them in memory; otherwise memory keeps filling up with new unique chunks and the node will soon OOM.

Host information

  1. OS: x86_64
  2. Snapshotter Version: v0.12.0
  3. Containerd Version: v1.7.27 and v2.2.0

Any additional context or information about the bug

No response
