Use case
I'm building forkd, an open-source fork() primitive for microVMs aimed at AI agent fan-out. v0.4 wants a "live-fork" path where an external snapshot manager arms UFFDIO_WRITEPROTECT on the source VM's memory and captures dirty pages asynchronously — moving the memory write out of the BRANCH pause window. Empirically on kernel 6.14, this drops the BRANCH pause for a 1 GiB parent from ~150 ms (v0.3.4 floor on ext4) to ~3 ms (just the UFFDIO_WRITEPROTECT arm).
The blocker is that MemBackendType::File in snapshot_load mmaps the backing file with MAP_PRIVATE (see src/vmm/src/vstate/memory.rs::snapshot_file). If forkd hands Firecracker (FC) a /proc/<forkd_pid>/fd/<memfd> path as mem_backend.backend_path, FC opens it, but the resulting mapping is MAP_PRIVATE: guest writes copy-on-write into FC-private pages and never propagate back to forkd's mmap of the same memfd. That makes the cooperative-snapshot design impossible with the current File backend.
I verified this empirically with forkd's memfd-share spike (a 30-line Python script). The Uffd backend isn't a substitute: it gives forkd the uffd fd but no read access to FC's memory, because there is no shared mapping.
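The private-versus-shared behaviour described above can be reproduced outside Firecracker. The following is a minimal sketch, not forkd's actual spike (which is a Python script). It assumes 64-bit Linux, declares mmap and munmap by hand so that no external crate is needed, and uses an unlinked temporary file in place of a memfd. The helper name write_is_visible is made up for illustration.

```rust
use std::ffi::c_void;
use std::fs::OpenOptions;
use std::io;
use std::os::raw::c_int;
use std::os::unix::io::AsRawFd;
use std::ptr;

// Flag values as defined on Linux; hard-coded so no external crate is needed.
const PROT_READ: c_int = 0x1;
const PROT_WRITE: c_int = 0x2;
const MAP_SHARED: c_int = 0x01;
const MAP_PRIVATE: c_int = 0x02;
const LEN: usize = 4096;

// Hand-written libc bindings (assumption: 64-bit off_t).
unsafe extern "C" {
    fn mmap(addr: *mut c_void, len: usize, prot: c_int, flags: c_int, fd: c_int, off: i64) -> *mut c_void;
    fn munmap(addr: *mut c_void, len: usize) -> c_int;
}

/// Maps one file twice: an observer (always MAP_SHARED, standing in for forkd)
/// and a writer using `writer_flags` (standing in for FC's guest memory).
/// Returns whether a byte stored through the writer is visible to the observer.
fn write_is_visible(writer_flags: c_int) -> io::Result<bool> {
    let name = format!("map-demo-{}-{}", std::process::id(), writer_flags);
    let path = std::env::temp_dir().join(name);
    let file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .truncate(true)
        .open(&path)?;
    std::fs::remove_file(&path)?; // the open fd keeps the file alive, like a memfd
    file.set_len(LEN as u64)?;

    let fd = file.as_raw_fd();
    let prot = PROT_READ | PROT_WRITE;
    let map_failed = usize::MAX as *mut c_void; // MAP_FAILED is (void *) -1

    unsafe {
        let observer = mmap(ptr::null_mut(), LEN, prot, MAP_SHARED, fd, 0);
        if observer == map_failed {
            return Err(io::Error::last_os_error());
        }
        let writer = mmap(ptr::null_mut(), LEN, prot, writer_flags, fd, 0);
        if writer == map_failed {
            let err = io::Error::last_os_error();
            munmap(observer, LEN);
            return Err(err);
        }
        // Store through the writer mapping, then read through the observer.
        ptr::write_volatile(writer as *mut u8, 0xAB);
        let seen = ptr::read_volatile(observer as *const u8) == 0xAB;
        munmap(writer, LEN);
        munmap(observer, LEN);
        Ok(seen)
    }
}
```

Calling write_is_visible(MAP_PRIVATE) models today's File backend, where the store lands in a private copy-on-write page. Calling write_is_visible(MAP_SHARED) models the requested behaviour, where both mappings are backed by the same pages.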
Proposed minimal API
Opt-in shared: bool on MemBackendConfig, default false (unchanged behavior):
    pub struct MemBackendConfig {
        pub backend_path: PathBuf,
        pub backend_type: MemBackendType,
        /// If true and backend_type == File, mmap with MAP_SHARED.
        /// Defaults to false; ignored for Uffd.
        #[serde(default)]
        pub shared: bool,
    }
Implementation is ~39 lines across 4 files (MemBackendConfig + plumbing through guest_memory_from_file to snapshot_file). Build verified against main (commit 053f521d9); full patch saved as a unified diff here.
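As a rough illustration of what the plumbing has to decide at the mmap call, here is a minimal sketch of the flag selection. It is not the patch itself: file_mapping_flags is a hypothetical helper, the enum is a stand-in for Firecracker's MemBackendType, and the flag constants are the Linux values written out by hand.

```rust
/// Stand-in for the two backend kinds named in the proposal.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MemBackendType {
    File,
    Uffd,
}

// Linux mmap flag values, hard-coded so the sketch has no dependencies.
const MAP_SHARED: i32 = 0x01;
const MAP_PRIVATE: i32 = 0x02;

/// Hypothetical helper: choose the sharing flag for the snapshot memory file.
/// Returns None for Uffd, where the proposal says `shared` is ignored.
fn file_mapping_flags(backend_type: MemBackendType, shared: bool) -> Option<i32> {
    match (backend_type, shared) {
        (MemBackendType::File, true) => Some(MAP_SHARED),
        // Default: identical to the current behaviour.
        (MemBackendType::File, false) => Some(MAP_PRIVATE),
        (MemBackendType::Uffd, _) => None,
    }
}
```

With shared left at its default of false, the helper returns MAP_PRIVATE, so existing callers would see no change.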
Open API question for you
Three shapes I considered. I'd prefer your guidance before sending a PR:
- Opt-in field (above). Smallest surface. My current preference.
- New backend type MemBackendType::SharedFile. More explicit, but doubles the match arms.
- A shared_mmap: bool at the top level of LoadSnapshotConfig instead of nested in mem_backend.
I'll send a PR in whichever shape you confirm works.
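To make the trade-off between the first two shapes concrete, here is a small sketch that resolves each of them to the same internal decision. All type and function names are hypothetical and chosen only for this comparison; the third shape carries the same boolean as the first, just at a different nesting level, so it is not shown separately.

```rust
/// The decision the loader ultimately needs for a file-backed snapshot.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mapping {
    Private,
    Shared,
}

/// Shape 1: the existing backend types plus an opt-in `shared` flag.
#[derive(Debug, Clone, Copy)]
enum BackendWithFlag {
    File,
    Uffd,
}

fn mapping_from_flag(backend: BackendWithFlag, shared: bool) -> Option<Mapping> {
    match backend {
        BackendWithFlag::File if shared => Some(Mapping::Shared),
        BackendWithFlag::File => Some(Mapping::Private),
        // The flag is ignored for Uffd.
        BackendWithFlag::Uffd => None,
    }
}

/// Shape 2: a dedicated SharedFile variant, so every match gains an arm.
#[derive(Debug, Clone, Copy)]
enum BackendWithVariant {
    File,
    SharedFile,
    Uffd,
}

fn mapping_from_variant(backend: BackendWithVariant) -> Option<Mapping> {
    match backend {
        BackendWithVariant::File => Some(Mapping::Private),
        BackendWithVariant::SharedFile => Some(Mapping::Shared),
        BackendWithVariant::Uffd => None,
    }
}
```

Both shapes express the same three outcomes. The difference is whether the invalid combination (Uffd with shared set) is representable and silently ignored, as in shape 1, or ruled out by the type, as in shape 2.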
Backing context
forkd is Apache 2.0 licensed, has 720+ stars, and is a real, production-aimed project. I'm happy to test whatever shape lands, and to maintain the patch in forkd's repo as a user-applied diff until it lands, if that helps.
Supporting material:
- PoC results (UFFD_WP arm; EPT-mediated guest writes do propagate to UFFD_WP on the host VMA; snapshot restore round-trip works): experiments/v0.4-*-poc/RESULTS.md
- Design notes (VmstateOnly snapshot type, process_vm_readv bypass): DESIGN-v0.4-PHASE3-SPIKE.md
Thanks for reading.