
Allow MAP_SHARED for snapshot-restore mem_backend (cooperative-snapshot tooling) #5912


@WaylandYang

Use case

I'm building forkd, an open-source fork() primitive for microVMs aimed at AI agent fan-out. v0.4 wants a "live-fork" path where an external snapshot manager arms UFFDIO_WRITEPROTECT on the source VM's memory and captures dirty pages asynchronously — moving the memory write out of the BRANCH pause window. Empirically on kernel 6.14, this drops the BRANCH pause for a 1 GiB parent from ~150 ms (v0.3.4 floor on ext4) to ~3 ms (just the UFFDIO_WRITEPROTECT arm).

The blocker is that MemBackendType::File in snapshot_load mmaps the backing file with MAP_PRIVATE (see src/vmm/src/vstate/memory.rs::snapshot_file). If forkd hands FC a /proc/<forkd_pid>/fd/<memfd> as mem_backend.backend_path, FC opens that path, but the resulting mapping is MAP_PRIVATE: guest writes are copied on write into FC-private pages and never propagate back to forkd's mmap of the same memfd. As long as that holds, the cooperative-snapshot design cannot work.

I verified this empirically with forkd's memfd-share spike, a 30-line Python script. The Uffd backend isn't a substitute: it gives forkd the uffd fd but no read access to FC's memory, because there is no shared mapping.
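For readers who want to see the mapping semantics in isolation, here is a minimal sketch (not the forkd spike, and not Firecracker code). It assumes 64-bit Linux, declares mmap/munmap by hand so it needs no crates, and uses a plain temp file instead of a memfd. The helper names make_backing_file and write_through_mapping are made up for this illustration.

```rust
use std::fs::OpenOptions;
use std::io::{Read, Seek, SeekFrom, Write};
use std::os::raw::{c_int, c_void};
use std::os::unix::io::AsRawFd;
use std::path::{Path, PathBuf};

// Linux values of the mmap(2) constants, declared by hand (assumption: Linux).
const PROT_READ: c_int = 0x1;
const PROT_WRITE: c_int = 0x2;
const MAP_SHARED: c_int = 0x01;
const MAP_PRIVATE: c_int = 0x02;

extern "C" {
    // off_t is written as i64 here, which assumes a 64-bit Linux target.
    fn mmap(addr: *mut c_void, len: usize, prot: c_int, flags: c_int,
            fd: c_int, offset: i64) -> *mut c_void;
    fn munmap(addr: *mut c_void, len: usize) -> c_int;
}

/// Hypothetical helper: creates a 4 KiB zero-filled file in the temp dir.
fn make_backing_file(name: &str) -> std::io::Result<PathBuf> {
    let path = std::env::temp_dir().join(name);
    let mut f = std::fs::File::create(&path)?;
    f.write_all(&[0u8; 4096])?;
    Ok(path)
}

/// Hypothetical helper: maps `path` with `flags`, stores `byte` at offset 0
/// through the mapping, unmaps, and returns the first byte read back from
/// the file itself (i.e. what another holder of the file would observe).
fn write_through_mapping(path: &Path, flags: c_int, byte: u8) -> std::io::Result<u8> {
    let mut file = OpenOptions::new().read(true).write(true).open(path)?;
    let len = 4096usize;
    unsafe {
        let ptr = mmap(std::ptr::null_mut(), len, PROT_READ | PROT_WRITE,
                       flags, file.as_raw_fd(), 0);
        if ptr as isize == -1 {
            // MAP_FAILED
            return Err(std::io::Error::last_os_error());
        }
        *(ptr as *mut u8) = byte; // the stand-in for a "guest write"
        munmap(ptr, len);
    }
    let mut first = [0u8; 1];
    file.seek(SeekFrom::Start(0))?;
    file.read_exact(&mut first)?;
    Ok(first[0])
}
```

On a zero-filled file, write_through_mapping with MAP_PRIVATE returns 0x00 (the write stayed in a private copy), while MAP_SHARED returns the written byte. That difference is the whole request.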

Proposed minimal API

Opt-in shared: bool on MemBackendConfig, default false (unchanged behavior):

pub struct MemBackendConfig {
    pub backend_path: PathBuf,
    pub backend_type: MemBackendType,
    /// If true and backend_type == File, mmap with MAP_SHARED.
    /// Defaults to false; ignored for Uffd.
    #[serde(default)]
    pub shared: bool,
}

Implementation is ~39 lines across 4 files (MemBackendConfig + plumbing through guest_memory_from_file to snapshot_file). Build verified against main (commit 053f521d9); full patch saved as a unified diff here.
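To make the intended semantics concrete, here is a simplified sketch of how the flag could select the mapping type. It is not the actual patch: the serde attributes are dropped so it compiles with std only, MemBackendType is re-declared locally with just the File and Uffd variants, the MAP_* values are the Linux ones, and file_mmap_flags is a hypothetical helper name. Any other flags the real snapshot_file passes to mmap are left out.

```rust
use std::path::PathBuf;

// Local stand-ins for illustration only; the real types live in Firecracker.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum MemBackendType {
    File,
    Uffd,
}

pub struct MemBackendConfig {
    pub backend_path: PathBuf,
    pub backend_type: MemBackendType,
    /// Proposed field; false keeps today's MAP_PRIVATE behavior.
    pub shared: bool,
}

// Linux values of the mmap(2) sharing flags (assumption: Linux).
const MAP_SHARED: i32 = 0x01;
const MAP_PRIVATE: i32 = 0x02;

/// Hypothetical helper: returns the sharing flag for a file-backed restore,
/// or None for Uffd, where `shared` is ignored.
pub fn file_mmap_flags(cfg: &MemBackendConfig) -> Option<i32> {
    match cfg.backend_type {
        MemBackendType::File if cfg.shared => Some(MAP_SHARED),
        MemBackendType::File => Some(MAP_PRIVATE),
        MemBackendType::Uffd => None,
    }
}
```

With shared left at its default of false, the File arm still yields MAP_PRIVATE, so existing callers see no change.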

Open API question for you

Three shapes I considered. I'd prefer your guidance before sending a PR:

  1. Opt-in field (above). Smallest surface. My current preference.
  2. New backend type MemBackendType::SharedFile. More explicit, but doubles the match arms.
  3. A shared_mmap: bool at the top-level LoadSnapshotConfig instead of nested in mem_backend.
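
For comparison, shape 2 could look roughly like the sketch below. This is illustrative only: the enum is a local stand-in, SharedFile is the proposed (not existing) variant, the MAP_* values are the Linux ones, and file_mmap_flags is a hypothetical helper name.

```rust
// Local stand-in for illustration; SharedFile is the proposed new variant.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum MemBackendType {
    File,
    SharedFile,
    Uffd,
}

// Linux values of the mmap(2) sharing flags (assumption: Linux).
const MAP_SHARED: i32 = 0x01;
const MAP_PRIVATE: i32 = 0x02;

/// Hypothetical helper: the sharing mode is carried by the variant itself,
/// so there is no separate bool to plumb through.
pub fn file_mmap_flags(backend_type: MemBackendType) -> Option<i32> {
    match backend_type {
        MemBackendType::File => Some(MAP_PRIVATE),
        MemBackendType::SharedFile => Some(MAP_SHARED),
        MemBackendType::Uffd => None,
    }
}
```

The trade-off is that every exhaustive match on MemBackendType elsewhere in the codebase would need a SharedFile arm, whereas the opt-in field only touches the code path that calls mmap.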

I'll send a PR in whichever shape you confirm works.

Backing context

forkd is Apache 2.0 licensed, has 720+ stars, and is a real, production-aimed project. I'm happy to test whatever shape lands, and to maintain the patch in forkd's repo as a user-applied diff until it lands, if that helps.

Thanks for reading.
