Refactor Vmm into trait-based backend dispatch (VmBackend) + split HV/VZ into their own crates #248

@AprilNEA

Description

Problem

The Vmm struct in arcbox-vmm hosts both the VZ (Virtualization.framework) and HV (Hypervisor.framework) backends via inherent impl Vmm { ... } blocks. This causes three concrete issues:

  1. No isolation between backends. VZ-specific fields (vz_vm, etc.) and HV-specific fields (hv_vm, hv_dax_mmap, hv_device_manager, hv_vcpu_thread_handles, hv_cpu_on_senders, hv_blk_worker_threads, ...) all live on the same Vmm struct. A VZ bug can touch HV state and vice versa.
  2. Impossible to extract into separate crates. Rust only permits inherent impl blocks in the crate that defines the type, so any attempt to move darwin_hv/ or darwin/ into a sibling crate runs straight into this rule.
  3. Vmm's impl blocks are oversized. Even after the recent module splits (see commits e8118b1..83d8ea4), darwin_hv/mod.rs is still 2047 lines, dominated by impl Vmm for HV-specific methods. The cross-platform Vmm impl in vmm/mod.rs adds another ~1000 lines on top.

Recent conversation context: this came out of a code-quality review of the HV backend, where the natural next step after splitting the monolithic files was identified as "flip Vmm to a trait-based backend pattern so the backends can genuinely own their state."

Proposed design

Introduce a VmBackend trait. Vmm becomes a thin container that owns cross-platform state (config, running flag, device manager where shared, snapshot machinery) and delegates backend-specific operations to Box<dyn VmBackend>.

```rust
// arcbox-vmm/src/backend.rs
pub trait VmBackend: Send + Sync {
    fn initialize(&mut self, cfg: &VmmConfig, shared: &SharedVmState) -> Result<()>;
    fn start(&mut self) -> Result<()>;
    fn stop(&mut self) -> Result<()>;
    fn connect_vsock(&self, port: u32) -> Result<RawFd>;
    fn snapshot(&self) -> Result<BackendSnapshot>;
    fn restore(&mut self, snap: BackendSnapshot) -> Result<()>;
    // ... minimal surface area
}

pub struct Vmm {
    config: VmmConfig,
    running: Arc<AtomicBool>,
    shared: SharedVmState,          // device_manager, irq chip, etc.
    backend: Box<dyn VmBackend>,
}
```

Each backend owns its own fields:

```rust
// arcbox-vmm-hv/src/lib.rs  (new crate)
pub struct HvBackend {
    hv_vm: Option<HvVm>,
    hv_device_manager: Option<Arc<DeviceManager>>,
    hv_dax_mmap: Option<(usize, usize)>,
    hv_vcpu_thread_handles: VcpuThreadHandles,
    hv_cpu_on_senders: Option<CpuOnSenders>,
    hv_blk_worker_threads: Vec<JoinHandle<()>>,
    // ... all HV-specific state, previously on Vmm
}

impl VmBackend for HvBackend { ... }
```

```rust
// arcbox-vmm-vz/src/lib.rs  (new crate)
pub struct VzBackend {
    vz_vm: Option<VzVm>,
    // ... VZ-specific state
}

impl VmBackend for VzBackend { ... }
```
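To make the delegation concrete, here is a minimal, self-contained sketch (all names hypothetical, mirroring the proposal, with the trait surface cut down to start/stop) of Vmm forwarding lifecycle calls to a boxed backend. It uses a phase-1-style NullBackend, so it also shows the shape of the stub that would exercise the wiring before any real backend is migrated:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

type Result<T> = std::result::Result<T, String>;

pub trait VmBackend: Send + Sync {
    fn start(&mut self) -> Result<()>;
    fn stop(&mut self) -> Result<()>;
}

/// Phase-1 stub: implements the trait with no-ops so the container
/// wiring can be tested before HV/VZ are migrated.
#[derive(Default)]
pub struct NullBackend;

impl VmBackend for NullBackend {
    fn start(&mut self) -> Result<()> { Ok(()) }
    fn stop(&mut self) -> Result<()> { Ok(()) }
}

pub struct Vmm {
    running: Arc<AtomicBool>,
    backend: Box<dyn VmBackend>,
}

impl Vmm {
    pub fn new(backend: Box<dyn VmBackend>) -> Self {
        Self { running: Arc::new(AtomicBool::new(false)), backend }
    }

    /// Cross-platform entry point: delegate to the backend,
    /// then flip the shared running flag.
    pub fn start(&mut self) -> Result<()> {
        self.backend.start()?;
        self.running.store(true, Ordering::SeqCst);
        Ok(())
    }

    pub fn stop(&mut self) -> Result<()> {
        self.backend.stop()?;
        self.running.store(false, Ordering::SeqCst);
        Ok(())
    }

    pub fn is_running(&self) -> bool {
        self.running.load(Ordering::SeqCst)
    }
}
```

Note that cross-platform state (the running flag here) stays on Vmm, while everything behind the trait object is backend-private by construction.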

Sequencing (one PR per phase)

  1. Define VmBackend trait + SharedVmState inside arcbox-vmm. Write a stub NullBackend to exercise the wiring. No behavior change yet.
  2. Migrate HV to HvBackend (still inside arcbox-vmm). Move all hv_* fields off Vmm into HvBackend; rewrite impl Vmm { fn initialize_darwin_hv, fn start_darwin_hv, fn connect_vsock_hv, ... } as methods on HvBackend implementing the trait. darwin_hv/ submodule files keep their current organization but their impl Vmm blocks become impl HvBackend.
  3. Migrate VZ to VzBackend (still inside arcbox-vmm). Same pattern for darwin.rs.
  4. Split into crates. Move arcbox-vmm/src/vmm/darwin_hv/ → new virt/arcbox-vmm-hv/ crate. Move arcbox-vmm/src/vmm/darwin.rs → new virt/arcbox-vmm-vz/ crate. arcbox-vmm retains the Vmm + VmBackend trait + SharedVmState and depends on the two new crates.

After phase 4 the workspace gains a cleanly abstracted seam that makes adding a third backend (Linux/KVM, currently a stub in vmm/linux.rs) a straightforward matter of writing a new crate.
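That seam is easiest to see at construction time. A hedged sketch (names hypothetical; the real selection would live wherever Vmm is built and read VmmConfig) of a factory that picks the backend, which is all a future KVM crate would have to hook into:

```rust
pub trait VmBackend: Send + Sync {
    fn name(&self) -> &'static str;
}

// Stubs standing in for the real arcbox-vmm-hv / arcbox-vmm-vz crates.
pub struct HvBackend;
pub struct VzBackend;

impl VmBackend for HvBackend { fn name(&self) -> &'static str { "hv" } }
impl VmBackend for VzBackend { fn name(&self) -> &'static str { "vz" } }

/// Hypothetical config knob; the real VmmConfig would carry this choice.
pub enum BackendKind { Hv, Vz }

/// Adding a third backend (e.g. a future arcbox-vmm-kvm crate) means
/// adding a variant here and implementing VmBackend in that crate;
/// nothing else in arcbox-vmm changes.
pub fn make_backend(kind: BackendKind) -> Box<dyn VmBackend> {
    match kind {
        BackendKind::Hv => Box::new(HvBackend),
        BackendKind::Vz => Box::new(VzBackend),
    }
}
```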

Tradeoffs / open questions

  • SharedVmState contents — which fields genuinely belong on Vmm vs owned by each backend? DeviceManager is used by both darwin backends today (it lives in device/, handles virtio-mmio for both). Likely shared. running: Arc<AtomicBool> is shared. Snapshot storage is shared. IRQ chip is currently VZ-only; HV uses GIC directly — may or may not belong in shared.
  • Async trait methods — methods that are async fn on impl Vmm today cannot move onto the trait as-is: native async fn in traits (stable since Rust 1.75) is not dyn-compatible, so it cannot be called through Box<dyn VmBackend>. The options are the async-trait crate or hand-written methods returning BoxFuture. Decide style upfront.
  • Dynamic dispatch cost — negligible per backend call (these are coarse-grained operations like start, stop, connect_vsock, not per-MMIO exits). Per-vCPU exit handling stays inside the backend and is never trait-dispatched.
  • device/ ownership — HV-specific methods on DeviceManager (net TX/RX, bridge NIC, vsock RX injection) are already isolated in device/bridge_nic.rs. Eventually these might move to arcbox-vmm-hv too, but that is out of scope for this issue.
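On the async question, a std-only sketch of the BoxFuture style (trait and backend names mirror the proposal but are otherwise hypothetical). Returning a boxed future keeps the trait dyn-compatible, which native `async fn` in traits would not:

```rust
use std::future::Future;
use std::pin::Pin;

type Result<T> = std::result::Result<T, String>;
// Same alias the futures crate provides; spelled out here to stay std-only.
type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

pub trait VmBackend: Send + Sync {
    // A boxed return type is callable through Box<dyn VmBackend>,
    // unlike a native `async fn` in the trait.
    fn start<'a>(&'a mut self) -> BoxFuture<'a, Result<()>>;
}

pub struct NullBackend;

impl VmBackend for NullBackend {
    fn start<'a>(&'a mut self) -> BoxFuture<'a, Result<()>> {
        Box::pin(async move { Ok(()) })
    }
}
```

The async-trait crate generates essentially this desugaring via a macro; writing BoxFuture by hand trades a little noise for zero extra dependencies.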

Cost estimate

Roughly one focused week of work, dominated by phase 2 (moving ~40 hv_* field references and re-homing ~1500 lines of HV-specific impl Vmm into HvBackend). Phases 1, 3, 4 are each ~1 day. Testing requires an actual macOS M-series machine with Developer ID signing for HV and VZ integration tests.

Non-goals

  • Not a performance refactor. No runtime behavior change.
  • Not adding a Linux/KVM backend; only making it easy to add one later.
  • Not splitting device/ further — that is tracked separately if needed.

Related

  • Recent cleanup commits on feat/custom-vmm-phase2: bd38511 (SAFETY comments), e8118b1..83d8ea4 (module splits for darwin_hv/ and device/).
  • Prior project issues: ABX-286 (DarwinHvVm boot), ABX-287 (virtio MMIO integration), ABX-288 (dual backend switching) — all Done. This issue is the architectural follow-up those left behind.
