Commit 8145911

zram: enable zram swap for the vms to reduce pressure
The VMs have limited memory in some cases, and during tasks such as a resume some drivers fail to acquire enough memory to reinitialize. This is likely to bite in other situations too. Adding compressed zram swap can absorb some of that pressure and allow the system to continue normally.

Signed-off-by: Brian McGillion <bmg.avoin@gmail.com>
1 parent 6a25444 commit 8145911

12 files changed: +120 -2 lines changed

docs/src/content/docs/ghaf/dev/architecture/vm-composition.mdx

Lines changed: 1 addition & 0 deletions
@@ -458,5 +458,6 @@ The extendModules pattern enables:
 
 For more details:
 - [Configuration Propagation](/ghaf/dev/architecture/config-propagation) - How globalConfig/hostConfig work
+- [VM Memory Management](/ghaf/dev/architecture/vm-memory-management) - How VMs handle memory pressure with zram and balloon deflation
 - [Creating VMs Guide](/ghaf/dev/guides/creating-vms) - Step-by-step VM creation
 - [Downstream Setup Guide](/ghaf/dev/guides/downstream-setup) - Building on top of Ghaf

Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
+---
+title: "VM Memory Management"
+---
+{/*
+SPDX-FileCopyrightText: 2022-2026 TII (SSRC) and the Ghaf contributors
+SPDX-License-Identifier: CC-BY-SA-4.0
+*/}
+
+# VM Memory Management
+
+Ghaf VMs run with fixed memory allocations and, without swap, have no safety net for transient memory pressure. This is a problem during events like S3 (suspend-to-RAM) resume, where device drivers must reinitialize and temporarily allocate buffers, DMA regions, and firmware blobs. If the VM is already near its memory ceiling, these allocations can trigger OOM kills or kernel panics.
+
+Ghaf addresses this with a three-layer defense against OOM crashes: zram compressed swap inside every VM, virtio-balloon deflation for app VMs, and host-level disk swap that absorbs the resulting pressure.
+
+## Layer 1: zram Compressed Swap (All VMs)
+
+Every VM enables [zram](https://docs.kernel.org/admin-guide/blockdev/zram.html), a kernel feature that creates a compressed block device in RAM and uses it as swap space. When memory pressure rises, the kernel compresses inactive pages (file caches, idle application memory) into the zram device instead of killing processes.
+
+Configuration (from `modules/microvm/common/vm-swap.nix`):
+
+- **Algorithm**: `lzo-rle` (fast, low CPU overhead)
+- **Size**: 25% of VM RAM (`memoryPercent = 25`)
+- **Swappiness**: `vm.swappiness = 10` (prefer reclaiming file caches before swapping)
+
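+With a typical 2.5x compression ratio, a full zram device (sized at 25% of VM RAM, counted as uncompressed data) occupies about 10% of RAM, for a net gain of roughly 15% effective memory. For a 512 MB VM, this means 128 MB of swapped-out pages held in about 51 MB of RAM, or approximately 589 MB of effective memory. The overhead is microseconds of CPU time for compression, with zero disk I/O.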
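
The sizing arithmetic for this layer can be sketched as a small Nix expression. This is an illustration rather than part of the commit. It assumes that `memoryPercent` caps the amount of uncompressed data the zram device can hold (which is how the NixOS `zramSwap.memoryPercent` option is documented) and that the 2.5x ratio actually holds for the workload. All names in the `let` block are hypothetical.

```nix
# zram-math.nix (hypothetical file name): sizing sketch for one VM.
let
  vmRamMiB = 512;          # example VM size
  memoryPercent = 25;      # value set in vm-swap.nix
  compressionRatio = 2.5;  # assumed ratio; real ratios depend on the workload

  # Assumption: the device size is counted in uncompressed data.
  zramCapacityMiB = vmRamMiB * memoryPercent / 100; # 128

  # RAM the device occupies once it is full at the assumed ratio.
  zramFootprintMiB = zramCapacityMiB / compressionRatio;

  # Memory gained: pages moved out minus the RAM needed to store them.
  netGainMiB = zramCapacityMiB - zramFootprintMiB;
in
{
  inherit zramCapacityMiB zramFootprintMiB netGainMiB;
  effectiveMiB = vmRamMiB + netGainMiB;
}
```

Evaluating the file with `nix-instantiate --eval --strict zram-math.nix` prints the resulting attribute set. Changing the first three bindings shows how the gain scales with VM size and compression ratio.
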
+
+## Layer 2: Balloon deflateOnOOM (App VMs)
+
+App VMs use the [virtio-balloon](https://blog.pmhahn.de/virtio-balloon/) device for dynamic memory management. The host memory manager can inflate the balloon (reclaiming guest pages for the host) or the guest can deflate it (reclaiming pages back from the host).
+
+With `deflateOnOOM = true`, the guest kernel automatically deflates the balloon when an OOM condition is imminent. This reclaims pages that were previously loaned to the host, giving the guest more memory without any host intervention.
+
+For example, a 4 GB base app VM with `balloonRatio = 2` has a 12 GB QEMU allocation. If the balloon has inflated to 6 GB (guest sees approximately 6 GB), an OOM event causes the balloon to deflate, and the guest can reclaim up to the full 12 GB allocation.
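
The numbers in this example follow from the defaults in `appvm-base.nix`. Below is a minimal sketch of that arithmetic, reusing the `mem` and `balloon` expressions from the module. The `vm` attribute set and the `inflatedMiB` value are hypothetical inputs chosen to match the example.

```nix
# balloon-math.nix (hypothetical file name): allocation sketch for one app VM.
let
  # Hypothetical VM definition; balloonRatio is unset, so it falls back to 2.
  vm = { mem = 4096; };

  balloonRatio = vm.balloonRatio or 2;

  # Same formula as the `mem` default in appvm-base.nix.
  qemuMemMiB = (vm.mem or 4096) * (balloonRatio + 1); # 12288 MiB = 12 GiB

  # Example balloon size from the text (6 GiB handed back to the host).
  inflatedMiB = 6144;
in
{
  inherit qemuMemMiB;
  balloonEnabled = balloonRatio > 0;          # true
  guestVisibleMiB = qemuMemMiB - inflatedMiB; # 6144
}
```

Setting `balloonRatio = 0` in the hypothetical `vm` set makes `balloonEnabled` false and reduces `qemuMemMiB` to the base `mem`, in which case `inflatedMiB` no longer applies.
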
+
+System VMs (gui-vm, net-vm, audio-vm, admin-vm, ids-vm) do not use balloons, so this layer applies only to app VMs.
+
+## Layer 3: Host Disk Swap
+
+When app VM balloons deflate, the QEMU process on the host consumes more physical memory. The host has an 8 GB disk swap partition that absorbs this pressure. The host kernel can page out its own processes or inflate balloons on other idle VMs (via `ghaf-mem-manager`) to rebalance.
+
+## How the Layers Cascade During S3 Resume
+
+1. **Suspend**: The host enters S3 sleep. QEMU freezes VMs. RAM stays powered and contents are preserved.
+2. **Resume**: The host wakes. QEMU unfreezes VMs. Device drivers reinitialize, allocating temporary buffers.
+3. **zram absorbs the spike**: The kernel compresses idle pages into zram to make room for driver buffers (microsecond latency).
+4. **Balloon deflates if needed**: If more memory is required, the balloon shrinks and the guest reclaims pages from the host.
+5. **Host swap absorbs the host-side impact**: The host uses its disk swap if balloon deflation increases host memory pressure.
+6. **Drivers initialize successfully**: Temporary buffers are freed and the system stabilizes.
+
+## Configuration Reference
+
+### vm-swap.nix Options
+
+| Option | Type | Default | Description |
+|--------|------|---------|-------------|
+| `ghaf.virtualization.microvm.swap.enable` | bool | `false` | Enable zram compressed swap |
+
+All VM bases set this to `true`. To disable zram for a specific VM, override it in that VM's `extraModules`.
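
A sketch of such an override, assuming the module is listed in the VM's `extraModules` and evaluated together with the VM base. Because the base already defines the option as `true`, a plain `false` would likely be rejected as a conflicting definition, so the sketch uses `lib.mkForce`. The file name is hypothetical.

```nix
# no-zram.nix (hypothetical file name): module to list in a VM's extraModules.
{ lib, ... }:
{
  # mkForce gives this definition priority over the `true` set by the VM base.
  ghaf.virtualization.microvm.swap.enable = lib.mkForce false;
}
```

With the option forced to `false`, the `lib.mkIf cfg.enable` guard in `vm-swap.nix` drops both the `zramSwap` block and the `vm.swappiness` sysctl for that VM.
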
+
+### vm-config.nix Balloon Options
+
+| Option | Description |
+|--------|-------------|
+| `balloonRatio` | Multiplier for the extra QEMU memory allocated beyond base `mem`. QEMU allocates `mem * (balloonRatio + 1)`, so a ratio of 2 triples the base allocation. |
+| `deflateOnOOM` | Set in `appvm-base.nix`. When `true`, the guest automatically reclaims ballooned pages on OOM. |
+
+## Further Reading
+
+- [Linux zram documentation](https://docs.kernel.org/admin-guide/blockdev/zram.html)
+- [VirtIO Memory Ballooning](https://blog.pmhahn.de/virtio-balloon/)
+- [Firecracker ballooning documentation](https://github.com/firecracker-microvm/firecracker/blob/main/docs/ballooning.md)
+- [VM Memory Zeroing and Wipe on Shutdown](/ghaf/overview/arch/vm-memory-wipe) -- how Ghaf handles memory clearing
+- [VM Composition with extendModules](/ghaf/dev/architecture/vm-composition) -- how VM configuration flows

docs/src/content/docs/ghaf/overview/arch/vm-memory-wipe.mdx

Lines changed: 4 additions & 0 deletions
@@ -46,3 +46,7 @@ For implementation details and configuration knobs, see [Memory Wipe on Boot and
 ## Summary
 
 VM shutdown is a sensitive moment because memory pages leave one trust domain and return to the host allocator. Ghaf eliminates the risk of residual data reuse by enabling kernel features that wipe memory on free and zero memory on allocation. This protects secrets, prevents cross-VM leakage, and keeps the VM boundary trustworthy throughout the VM lifecycle.
+
+## See Also
+
+- [VM Memory Management](/ghaf/dev/architecture/vm-memory-management) -- how Ghaf uses zram swap and balloon deflation to prevent OOM crashes during runtime

modules/hardware/x86_64-generic/kernel/guest/configs/ghaf_guest_hardened_baseline-x86

Lines changed: 4 additions & 1 deletion
@@ -1106,6 +1106,8 @@ CONFIG_BLK_DEV=y
 #
 # CONFIG_BLK_DEV_NBD is not set
 # CONFIG_BLK_DEV_RAM is not set
+CONFIG_ZRAM=y
+CONFIG_ZSMALLOC=y
 # CONFIG_CDROM_PKTCDVD is not set
 # CONFIG_ATA_OVER_ETH is not set
 # CONFIG_BLK_DEV_UBLK is not set
@@ -2260,7 +2262,7 @@ CONFIG_CRYPTO_CRC32C=y
 # Compression
 #
 # CONFIG_CRYPTO_DEFLATE is not set
-# CONFIG_CRYPTO_LZO is not set
+CONFIG_CRYPTO_LZO=y
 # CONFIG_CRYPTO_842 is not set
 # CONFIG_CRYPTO_LZ4 is not set
 # CONFIG_CRYPTO_LZ4HC is not set
@@ -2411,6 +2413,7 @@ CONFIG_CRC32_SLICEBY8=y
 CONFIG_XXHASH=y
 # CONFIG_RANDOM32_SELFTEST is not set
 CONFIG_ZLIB_INFLATE=y
+CONFIG_LZO_COMPRESS=y
 CONFIG_LZO_DECOMPRESS=y
 CONFIG_LZ4_DECOMPRESS=y
 CONFIG_ZSTD_COMMON=y

modules/microvm/common/vm-swap.nix

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+# SPDX-FileCopyrightText: 2022-2026 TII (SSRC) and the Ghaf contributors
+# SPDX-License-Identifier: Apache-2.0
+{
+  config,
+  lib,
+  ...
+}:
+let
+  cfg = config.ghaf.virtualization.microvm.swap;
+in
+{
+  options.ghaf.virtualization.microvm.swap = {
+    enable = lib.mkEnableOption "zram compressed swap for VMs";
+  };
+
+  config = lib.mkIf cfg.enable {
+    zramSwap = {
+      enable = true;
+      algorithm = "lzo-rle";
+      memoryPercent = 25;
+    };
+    boot.kernel.sysctl."vm.swappiness" = 10;
+  };
+}

modules/microvm/flake-module.nix

Lines changed: 1 addition & 0 deletions
@@ -31,6 +31,7 @@ _: {
 ./common/shared-directory.nix
 ./common/storagevm.nix
 ./common/vm-networking.nix
+./common/vm-swap.nix
 ./common/vm-tpm.nix
 ./common/waypipe.nix
 ./common/xdghandlers.nix

modules/microvm/sysvms/adminvm-base.nix

Lines changed: 2 additions & 0 deletions
@@ -98,6 +98,8 @@ in
 
 # Networking
 virtualization.microvm = {
+  swap.enable = true;
+
   vm-networking = {
     enable = true;
     inherit vmName;

modules/microvm/sysvms/appvm-base.nix

Lines changed: 3 additions & 1 deletion
@@ -207,6 +207,8 @@ in
   encryption.enable = globalConfig.storage.encryption.enable or false;
 };
 
+virtualization.microvm.swap.enable = true;
+
 # Networking
 virtualization.microvm.vm-networking = {
   enable = true;
@@ -319,7 +321,7 @@ in
 # Sensible defaults based on vm definition - can be further overridden via vmConfig
 mem = lib.mkDefault ((vm.mem or 4096) * ((vm.balloonRatio or 2) + 1));
 balloon = (vm.balloonRatio or 2) > 0;
-deflateOnOOM = false;
+deflateOnOOM = true;
 vcpu = lib.mkDefault (vm.vcpu or 4);
 hypervisor = "qemu";
 

modules/microvm/sysvms/audiovm-base.nix

Lines changed: 2 additions & 0 deletions
@@ -92,6 +92,8 @@ in
 
 # Networking
 virtualization.microvm = {
+  swap.enable = true;
+
   vm-networking = {
     enable = true;
     inherit vmName;

modules/microvm/sysvms/guivm-base.nix

Lines changed: 2 additions & 0 deletions
@@ -153,6 +153,8 @@ in
 
 # Networking
 virtualization.microvm = {
+  swap.enable = true;
+
   vm-networking = {
     enable = true;
     inherit vmName;
