PostgreSQL HugePages on Colima vz (Apple Silicon): 4 gotchas that aren't documented anywhere #1553
Brietat71 started this conversation in Show and tell
Running PG 17 on a Mac Mini M1 16 GB with Colima 0.10.1 (vz). Spent an evening getting HugePages to work. Cache hit ratio went from 84% to 99%. Sharing the gotchas because each one fails silently — PG just falls back to 4 KB pages without a single log line.
Setup
colima.yaml:
docker-compose.yml:
The 4 gotchas
1. `gosu` drops your capabilities

The postgres Alpine image uses `gosu` to switch root → postgres. It clears all effective caps (`CapEff=0x0`). So `cap_add: [IPC_LOCK]` is in the bounding set but not effective. PG calls `mmap(MAP_HUGETLB)`, kernel checks effective caps → nope → silent fallback.

Fix: `sysctl -w vm.hugetlb_shm_group=70` (postgres GID in Alpine). The kernel accepts group membership as fallback.

2. PG 17 needs 1618 pages for 3 GB, not 1536

3 GB / 2 MB = 1536. Sounds right? PG actually requests 3,393,191,936 bytes = 1618 pages. We set 1600 ("with margin"). Were 18 pages short. Silent ENOMEM, silent fallback. An hour of debugging for 36 MB.

Fix: `postgres -C shared_memory_size_in_huge_pages` tells you the exact number. Add 5% margin.

3. Kernel 6.8 counts HugePages in cgroup memory.max

Container with `memory: 8g` tries to allocate 3.2 GB HugePages + 5 GB process = OOM. Even though `HugePages_Free` shows plenty available.

Fix: `memory: 12g` (it's a ceiling, not a reservation).

4. `mode: system` is not optional

HugePages need contiguous memory. After dockerd starts, the VM memory is fragmented. `nr_hugepages` silently allocates fewer pages than requested. `mode: system` runs before dockerd, when memory is clean.

How to verify

`colima exec -- grep HugePages /proc/meminfo`

After `docker compose up`, `HugePages_Free` should drop by ~1618. If unchanged: PG is not using them. It's always gosu, the page count, or the cgroup limit. Probably all three at once.

Bonus: the cgroup trick
When HugePages work, shared_buffers live outside the container's memory cgroup. Docker thinks PG uses 53 MB. The freed 3 GB becomes Linux page cache. That's what actually drove our cache hit from 84% to 99% — we didn't even plan it.
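For the gosu and "How to verify" checks above, a few small shell helpers make the before/after comparison less error-prone. This is only a sketch: the function names (`hugepages_free`, `pages_taken`, `has_ipc_lock`) are made up for illustration, and the capability check assumes the mask is given the way `/proc/<pid>/status` prints `CapEff` (hex, no `0x` prefix) and that `CAP_IPC_LOCK` is capability number 14.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Colima, Docker, or PostgreSQL.

# Print the HugePages_Free count from meminfo-formatted text on stdin.
hugepages_free() {
    awk '$1 == "HugePages_Free:" { print $2 }'
}

# Pages consumed between two HugePages_Free readings (before, after).
pages_taken() {
    echo $(( $1 - $2 ))
}

# Succeed if a CapEff-style hex mask (no 0x prefix) has bit 14 set.
# Assumption: CAP_IPC_LOCK is capability number 14.
has_ipc_lock() {
    mask=$(( 0x$1 ))
    [ $(( (mask >> 14) & 1 )) -eq 1 ]
}
```

Typical use: run `hugepages_free < /proc/meminfo` inside the VM before and after `docker compose up`, then pass both readings to `pages_taken`. For the gosu check, read the `CapEff` line from `/proc/<pid>/status` of the postgres process and pass the value to `has_ipc_lock`. PG 17 also has a read-only `huge_pages_status` setting, so `SHOW huge_pages_status;` in psql should give a second opinion.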
Also useful for vz backend
Environment
Mac Mini M1, 16 GB, Sequoia 15.7.5, Colima 0.10.1 vz, Linux 6.8, PG 17 Alpine.
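With the sizes from this setup, the arithmetic behind gotchas 2 and 3 can be sketched in plain shell. The helper names below are hypothetical, and the sketch assumes 2 MB huge pages and that the byte count comes from whatever PG actually requests (3,393,191,936 in our case), not from `shared_buffers` alone.

```shell
#!/bin/sh
# Hypothetical helpers for sizing; assumes 2 MB huge pages by default.

# Ceil-divide a byte count by the huge page size.
pages_for_bytes() {
    bytes=$1
    page=${2:-2097152}
    echo $(( (bytes + page - 1) / page ))
}

# Add a percentage margin (default 5%), rounding up.
with_margin() {
    pages=$1
    pct=${2:-5}
    echo $(( (pages * (100 + pct) + 99) / 100 ))
}

# Succeed if huge pages (2 MB each) plus process memory fit the container limit.
# Arguments: limit in MB, number of huge pages, process memory in MB.
fits_cgroup_limit() {
    limit_mb=$1
    pages=$2
    process_mb=$3
    [ $(( pages * 2 + process_mb )) -le "$limit_mb" ]
}
```

`pages_for_bytes 3393191936` gives 1618, while the naive `pages_for_bytes $((3 * 1024 * 1024 * 1024))` gives 1536. `with_margin 1618` gives 1699, which is the kind of number to put in `nr_hugepages`. `fits_cgroup_limit 8192 1618 5120` fails (the `memory: 8g` case), and `fits_cgroup_limit 12288 1618 5120` succeeds.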
Hope this saves someone an evening.