Skip to content

feat(volumes): per-volume size cap (size=N[K|M|G])#636

Draft
G4614 wants to merge 13 commits into
boxlite-ai:mainfrom
G4614:feat/sized-volumes
Draft

feat(volumes): per-volume size cap (size=N[K|M|G])#636
G4614 wants to merge 13 commits into
boxlite-ai:mainfrom
G4614:feat/sized-volumes

Conversation

@G4614
Copy link
Copy Markdown
Contributor

@G4614 G4614 commented Jun 1, 2026

Each shim has a virtro-blk mounted to the box so that a host-volume-cap is added for operations of each box
(optional)

Test plan

coverage what it pins
parse_size_token_accepts_k_m_g_and_bare_bytes K/M/G suffix (case-insensitive, binary multiples) + bare bytes + whitespace
parse_size_token_rejects_garbage_and_overflow empty, negative, unsupported suffix (T), non-numeric, overflow
test_parse_volume_spec_anonymous_with_size /data:size=10G → anon vol, size_bytes = 10 GiB
test_parse_volume_spec_anonymous_with_size_and_ro /data:ro,size=500M and /data:size=500M,ro (order-independent)
test_parse_volume_spec_size_on_bind_rejected /host:/box:size=10G → error: "size= is only valid on anonymous volumes"
test_parse_volume_spec_unknown_option_rejected /data:zise=10G → error: "unknown volume option" (no silent typo drop)
existing test_parse_volume_spec_* (10) regression: 2-part heuristic change ("second is ro/rw" → "second starts with /") preserves all existing accept/reject behavior, including the /data: trailing-colon rejection

cargo nextest run -p boxlite-cli -E 'test(~volume_spec) or test(~parse_size_token)' — 19/19 passed (0.06 s).

Summary by CodeRabbit

  • New Features
    • Added support for sized volumes with configurable storage capacity limits using the -v <path>:size=N[K|M|G] syntax.
    • Sized volumes enforce write limits and prevent overflow with ENOSPC errors.
    • Support for read-only sized volumes with -v <path>:size=N,ro.
    • Volume data persists across stop/start cycles.

Review Change Stack

Ubuntu and others added 6 commits June 1, 2026 08:48
…schema)

Phase 1 of per-volume size caps. Lands the API surface (schema + CLI) so the
shape can be reviewed independently from the runtime mount lifecycle.

Schema (boxlite/runtime/options.rs):
- VolumeSpec gains size_bytes: Option<u64> with serde default + skip_if_none,
  so existing serialized box configs continue to parse unchanged.

CLI (cli/cli.rs):
- New token size=N[K|M|G] accepted by -v (case-insensitive suffix, binary
  multiples: K=1024, M=1024^2, G=1024^3; bare digits = bytes).
- New VolumeOptions struct replaces the bool-returning parse_volume_read_only;
  parses comma-separated ro / rw / size=N. Unknown tokens are rejected
  (no silent typo drop).
- 2-part heuristic switched from "second is ro/rw" to "second starts with /"
  so /data:size=10G parses as anon-with-opts instead of bind.
- size= on bind mounts is rejected — we don't reformat operator host dirs.

Tests (8 new):
- parse_size_token: K/M/G/bare, whitespace, negative, overflow, T unsupported.
- parse_volume_spec: size-only, size+ro (both orders), size-on-bind rejected,
  unknown-option rejected, trailing-colon-properly-rejected regression.

Phase 2 (separate commit, this PR): wire VolumeSpec.size_bytes through to a
per-volume loop FS at box init (sparse file + mkfs.ext4 + fuse2fs mount +
virtiofs share) and tear it down at box rm.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…fuse2fs)

Phase 2a of per-volume size caps. Lands the host-side mount-lifecycle
module (`runtime::sized_volume`) so it can be reviewed + tested in isolation
from the box init/rm pipeline integration (Phase 2b).

Architecture (host → box):
  sparse image at <img_path>            ← bounded host consumption
    ↓ mkfs.ext4
  ext4 inside the image                  ← in-image FS sized at create time
    ↓ fuse2fs (-f -o fakeroot)
  <mount_point> on host                  ← caller virtiofs-shares into box
    ↓ virtiofs (caller's responsibility)
  box's /data                            ← what the workload sees

Every layer sees ENOSPC at the cap — that's the point.

SizedVolumeMount handles only the host-side lifecycle:
  - create(img, mount, size, mkfs_bin, fuse2fs_bin) → image + format + mount,
    polls up to 5 s for the mount to register with the kernel, rolls back
    every created artifact on any failure step.
  - teardown(self) → fusermount -u -z + kill+wait fuse2fs + rm image; Drop
    is a safety-net repeat so a leaked handle still cleans up.
  - mount_point() → the path the caller virtiofs-shares into the box.

MIN_SIZED_VOLUME_BYTES = 16 MiB. Below this ext4 can't fit its journal +
reserved blocks; reject at create time with a clear Config error.

Tests:
  - rejects_too_small_size: < 16 MiB → BoxliteError::Config, no image created.
  - create_hits_enospc_at_cap_and_teardown_cleans_up: 16 MiB cap, write
    32 MiB → must hit ENOSPC, teardown deletes the image (skips if fuse2fs
    not installed — apt install fuse2fs).

13 unrelated VolumeSpec literals updated to set size_bytes: None (Phase 1's
new field is required on construction).

Phase 2b: wire SizedVolumeMount into box init / rm pipelines + integration
test of -v vol:/data:size=N end-to-end. fuse2fs bundling (separate from
mke2fs/debugfs that are already bundled) deferred — production currently
relies on the system fuse2fs binary; clearly-typed error if missing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…o-blk path)

Phase 2a used fuse2fs to mount the sized image on the host before sharing via
virtiofs. Under the virtio-blk architecture chosen for Phase 2b, that mount is
unnecessary: the image file is attached directly to the VM as another
`/dev/vdN` block device (libkrun's `krun_add_disk2`, same path the rootfs
already uses), and the guest agent mounts it (BlockDeviceMount, already
present). No host-side daemon, no FUSE permission issues, no cross-CLI-
session lifetime story to manage — just an image file that the kernel handles.

`SizedVolumeMount` (fuse2fs lifecycle) is replaced by `create_sized_volume_image`
(sparse + mkfs.ext4 only). The two unit tests are rewritten to verify the
image is the right length, sparse on disk, and carries the ext4 super-block
magic `0xEF53` — all without spawning any daemon, so the tests no longer
depend on `fuse2fs` being installed.

Phase 2b continues with: ResolvedVolume routing (virtiofs vs block-device
per VolumeSpec.size_bytes), libkrun disk wiring, and an end-to-end test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… disks

Phase 2b core integration. Materialise the ext4 image at box-create time and
attach it to the VM as an extra `/dev/vdN` block device; the guest agent's
existing BlockDeviceMount path mounts it at the requested guest_path. No
fuse2fs, no host-side mount, no daemon lifetime — just an image file the VM
kernel sees as a block device, same as the rootfs.

resolve_user_volumes (litebox/init/types.rs):
- Signature: gains `volumes_dir` + `mkfs_bin` so it can materialise sized
  images under a boxlite-owned dir.
- Branches on VolumeSpec.size_bytes: Some(n) → create sparse + mkfs.ext4 at
  `<volumes_dir>/<tag>.img` via runtime::sized_volume::create_sized_volume_image,
  emit a ResolvedVolume with host_path = img file + size_bytes set; None →
  existing virtiofs/bind path (host_path must already be a directory).
- ResolvedVolume gains `size_bytes: Option<u64>` so downstream consumers
  can route to virtio-blk vs virtiofs.

vmm_spawn.rs (the init task that turns ResolvedVolumes into VM config):
- Pass `<box_root>/volumes` as the sized-volumes dir; use `find_binary("mke2fs")`
  for the bundled mkfs.
- Sized resolved volumes → `volume_mgr.add_block_device(.., Ext4, .., guest_path, false, false)`
  (no host-side mount; the image is already formatted by resolve_user_volumes).
- Non-sized resolved volumes → `ContainerVolumeManager::add_volume(...)` virtiofs path,
  unchanged.
- Two separate loops so ContainerVolumeManager's `&mut volume_mgr` borrow
  doesn't conflict with the sized-vol direct calls.

Tests:
- types.rs: 3 existing tests updated for new signature.
- types.rs: NEW `resolve_sized_volume_creates_image_and_carries_size` —
  host_path doesn't need to pre-exist, image materialises under volumes_dir,
  size_bytes carries through.
- sized_volume.rs (Phase 2a refactor commit): 2 tests still cover the
  image-prep helper.

Phase 2b remaining: end-to-end integration test starting a real box with
`-v vol:/data:size=100M`, writing past 100M, asserting ENOSPC. That requires
the rest of the init pipeline + bundled mke2fs + libkrun, so it lands as a
separate integration test (heavier; needs the VM stack).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adding `pub size_bytes: Option<u64>` to `VolumeSpec` broke the three
SDK conversion sites that build `VolumeSpec` via positional struct
literal (no `..Default::default()` available because `VolumeSpec` does
not derive `Default`). The CLI parser, jailer, init/types, and the
integration tests on this branch already wired the field; the SDK
glue did not.

Compile errors caught by `cargo check --workspace --tests`:
  - sdks/c/src/options.rs:266 — `options_add_volume` C ABI call
  - sdks/node/src/options.rs:256 — `From<JsVolumeSpec> for VolumeSpec`
  - sdks/python/src/options.rs:575 — `From<PyVolumeSpec> for VolumeSpec`

Fix: pass `size_bytes: None` at each site, with an inline comment
naming the eventual extension point (e.g. `options_add_volume_size`,
JS `sizeBytes` field, Python `size_bytes` on `PyVolumeSpec`). No SDK
API surface change — size caps remain CLI-only until phase-2 lands.

Also unblocks rustfmt + clippy on this branch:
  - rustfmt fixed indent drift on the `size_bytes: None,` lines added
    in jailer/mod.rs, litebox/init/types.rs, and the two boxlite
    integration tests (mount_security, security_enforcement); and a
    multi-line format!() in init/types.rs that fits one line.
  - clippy::err_expect: `runtime/sized_volume.rs:103` had
    `.err().expect(...)`. Replaced with `.expect_err(...)`.

All workspace tests + clippy under `-D warnings` clean post-fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 2b completion. The previous commit only mounted the sized ext4 image
in the guest agent's namespace — the container's own mount namespace is
separate, and a virtiofs-style volume normally crosses that boundary via a
ContainerVolumeManager bind. Apply the same pattern to sized volumes so the
container actually sees /data.

vmm_spawn.rs:
- Sized vols are now mounted at the SAME convention path a virtiofs volume
  would use: `<SHARED>/containers/<cid>/volumes/<tag>` (via
  `SharedGuestLayout`). The container init bind-mount (added through
  `ContainerVolumeManager::add_bind`) then sees the same source path
  regardless of whether the volume is backed by virtiofs or virtio-blk.
- Two-phase processing avoids a borrow conflict: add_block_device calls
  happen before ContainerVolumeManager takes &mut volume_mgr; the
  (volume_name, dest, ro) triples are stashed and replayed as
  `add_bind`s once the manager is up.

New `src/cli/tests/sized_volume.rs`:
- End-to-end test `sized_volume_caps_writes_and_rm_cleans_up_image`:
  start a box with `-v /data:size=64M`, verify the in-box `df /data`
  reports ~64 MiB (NOT the host's tens of millions of 1K-blocks),
  `dd /data/fill` hits ENOSPC at the volume cap, box stays alive after
  the fill (it's an isolated block device, not the rootfs), the image
  file is at `<box_home>/volumes/uservol0.img` while the box runs, and
  `boxlite rm -f` deletes it. 34.8 s, real VM, real ext4 in real
  virtio-blk, real ENOSPC.

Phase 2b done. Per-volume size caps are fully wired: CLI parse →
VolumeSpec.size_bytes → image materialisation in resolve_user_volumes →
libkrun virtio-blk attach → guest BlockDeviceMount at convention path →
container bind-mount at user-specified guest_path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ubuntu and others added 2 commits June 1, 2026 09:54
…istence tests

Two integration tests on the gaps the single-vol e2e left open + the
production fix the second test surfaced.

NEW: `two_sized_volumes_on_one_box_are_independent`
  -v /a:size=32M -v /b:size=64M alpine sleep 600. Catches regressions in
  `uservol{i}` naming, /dev/vdN index handoff, or anything else that
  shares state between entries: verifies /a and /b mount at distinct caps
  (32 vs 64 MiB, /b roughly double /a), fills /a to ENOSPC, asserts /b's
  available space doesn't shrink and /b still accepts writes.

NEW: `sized_volume_data_persists_across_stop_start`
  Write a marker into /data, `boxlite stop`, `boxlite start`, expect the
  marker AND the volume's size cap to survive. This is the user-most-
  likely-to-want behaviour for a sized volume (data store across box
  restarts).

FIX: this test failed at first with "/data/marker.txt: No such file or
directory". `create_sized_volume_image` uses `File::create` (O_CREAT |
O_TRUNC), and vmm_spawn calls it every box start — so each `boxlite
start` was silently truncating and re-mkfs.ext4'ing the image, wiping
user data. Make resolve_user_volumes idempotent: if the image file
already exists, reuse it. First-create still goes through the full
sparse + mkfs.ext4 path; subsequent starts just log the reuse and
attach the existing image as /dev/vdN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI's rust fmt:check rejected three long-line layouts in the prior commit
(types.rs idempotency call, sized_volume.rs two stdout-parse chains).
Run cargo fmt to restore the canonical layout. No behaviour change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ubuntu and others added 2 commits June 1, 2026 11:03
`-v /data:size=999G` previously succeeded on any host because the image
file is sparse — `set_len(999G)` doesn't allocate blocks, and mke2fs
only writes a few hundred KiB of metadata. A misbehaving workload could
then drain the host root fs through the sparse image and corrupt the
SQLite WAL / lock the operator out of recovery commands.

Add a statvfs-based admission check at create: refuse when
`requested + HOST_RESERVE_BYTES > host_free`. Default reserve is 10 GiB
— large enough to keep image pulls, state DB writes, and log rotation
working on a single-disk dev host; small enough not to be absurd on a
TiB-scale server.

Operator now gets a clean Config error at `boxlite run` instead of
EIO mid-write three days later.

Two-side verified: with the production guard reverted, the new test
fails with `Storage("set image size: File too large", errno 27)` from
set_len hitting ext4's 16 TiB file-size limit — proving the test
genuinely exercises the new admission path and that without it the
failure mode is opaque (and only fires above 16 TiB on ext4; under
that, oversubscription is silent).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The existing three sized-volume tests pinned the per-box invariants
(cap, multi-vol independence, persistence). This adds the multi-box
invariant: a runaway box that saturates its sized volume must not
starve the *next* box that lands on the same host.

On the pre-boxlite-ai#636 (legacy `-v /data`) path the same scenario hits a
hard fail — empirically verified in a 1 GiB loop ext4: box A's
`dd > /data/fill` drains the host fs via virtio-fs passthrough, then
box B's `boxlite run` errors with
  Failed to write COW child header to .../disks/guest-rootfs.qcow2:
  No space left on device (os error 28)
in `guest_rootfs_init`, leaving a partial box dir behind that a
third box's qcow2 backing-chain walker later trips over with
"failed to fill whole buffer." None of that surfaces without the
multi-box assertion this test adds.

On the sized path (the whole point of this PR) box A's writes
terminate at the volume's own ext4 inside the virtio-blk loop file,
the host fs is untouched, and box B's create + start + exec all
succeed cleanly. The test runs end-to-end on the standard
PerTestBoxHome — no BOXLITE_DISKTEST_HOME-style isolation needed,
because the cap is what *makes* it isolated. ~16 s wall-clock,
passes on a fresh build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@G4614 G4614 marked this pull request as ready for review June 2, 2026 05:33
@G4614 G4614 marked this pull request as draft June 2, 2026 06:53
G4614 and others added 2 commits June 2, 2026 06:53
Background: when a sized volume spec is `-v /data:size=N,ro`, the host
passed `read_only=true` to libkrun's `add_block_device`, which opens
the backing image with O_RDONLY at the device layer — the guest sees a
read-only /dev/vdN. But the guest agent's `BlockDeviceMount::mount`
did not accept a read_only argument and always built its mount(2)
call with `MS_NOATIME | MS_NODIRATIME` only. mount(2) against a
read-only block device without MS_RDONLY returns EACCES, so the box
never finished init — every `-v /data:size=N,ro` run died with

  Failed to mount /dev/vdb to /run/boxlite/shared/.../volumes/uservol0:
  EACCES: Permission denied

This was structurally invisible to the existing test matrix: the
CLI parse layer pins that `ro,size=N` builds the right VolumeSpec
(`cli.rs::test_parse_volume_spec_anonymous_with_size_and_ro`), but
no integration test ever combined size= with ro, so the fact that
the box couldn't even boot was hidden behind a parse-layer green tick.

Fix: thread `read_only` through one extra hop.

  - shared proto: `BlockDeviceSource` gets `bool read_only = 5;`
  - host: `VolumeConfig::BlockDevice` carries `read_only`,
    `into_proto` populates it, `build_guest_mounts` reads
    `block_devices[i].read_only` and passes it down.
  - guest: `BlockDeviceMount::mount` takes a `read_only: bool` and
    ORs `MS_RDONLY` into mount_flags when set; `volume.rs::mount_volume`
    passes `block.read_only`; `container.rs` rootfs caller passes
    `false` explicitly (rootfs is always rw).

The mechanism that breaks pre-fix is asymmetric:
the host-side ro is correct (defense in depth at the libkrun device
layer); what was missing was the guest's mirror flag. With both,
mount(2) succeeds and writes inside the box hit EROFS at the
kernel — the contract `-v ...:ro` promises.

The companion test in src/cli/tests/sized_volume.rs pins this end-to-end
and was the test that caught the bug. See the next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two gaps in the prior test matrix:

1. ro never crossed a real box boundary. The CLI parse layer pinned
   that `-v /data:size=32M,ro` parses into the right VolumeSpec, but
   no integration test ever started a box with that spec and exercised
   the guest mount. This test is what caught the pre-fix EACCES bug
   on the previous commit — the assertion is two-layered on purpose:

     - touch /data/should-fail must exit non-zero AND
     - the kernel must surface "Read-only file system" / EROFS

   Either alone could be faked by an unrelated failure (e.g. /data
   not mounted at all would also fail touch); the conjunction pins
   "mount succeeded but is ro." The image file's existence on the
   host is checked separately so a regression that turned ro into
   "skip materialise" doesn't also pass by accident.

2. Declared-size mismatch on re-resolve was an unwritten contract.
   `resolve_user_volumes` reuses an existing image as-is via
   `if img_path.exists() { ... }` so the user's persistent data
   isn't wiped on every `boxlite start`. This means an operator who
   edits `size=N` after the first create gets silently ignored
   (image stays at original size). That's the right behaviour
   today — silently re-mkfsing on a size change would lose data —
   but it was load-bearing on a single `if exists()` line with no
   test. This commit pins both halves of the contract: image length
   and on-disk block count are byte-for-byte preserved across a
   second `resolve_user_volumes` with a 8× larger declared size.
   If someone later wires up an explicit online-grow or a loud
   refusal, this is the canary that fires.

Both tests are two-side verified.

  - test 1 (ro): reverting the guest's MS_RDONLY branch reproduces
    the original EACCES error on box start — confirmed against a
    rebuilt + re-embedded guest binary, EACCES error string matches
    the production failure verbatim.
  - test 2 (mismatch): removing the `if img_path.exists()` branch
    fails the length assertion (`left: 134217728, right: 16777216`)
    — proving the test guards the reuse contract, not just the
    happy path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@G4614 G4614 marked this pull request as ready for review June 2, 2026 07:00
@G4614 G4614 marked this pull request as draft June 3, 2026 11:52
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 3, 2026

📝 Walkthrough

Walkthrough

This PR implements sized volumes, a new feature enabling per-volume storage capacity limits via ext4 image files. The implementation spans schema updates, image creation and resolution, box startup integration, guest-side mounting, CLI parsing, and comprehensive integration testing.

Changes

Sized Volumes Feature

Layer / File(s) Summary
Core schema and type contracts
src/boxlite/src/runtime/options.rs, src/shared/proto/boxlite/v1/service.proto, src/boxlite/src/portal/interfaces/guest.rs, src/boxlite/src/litebox/init/types.rs
VolumeSpec gains optional size_bytes field; ResolvedVolume tracks capacity for sized vs. legacy volumes; BlockDeviceSource protobuf adds read_only flag; VolumeConfig::BlockDevice variant includes read_only to support read-only mounting.
SDK bridges for sized volumes
sdks/c/src/options.rs, sdks/node/src/options.rs, sdks/python/src/options.rs
C, Node, and Python SDKs initialize size_bytes: None on core VolumeSpec with documentation that sizing is not yet exposed via these APIs.
Sized volume image creation utility
src/boxlite/src/runtime/mod.rs, src/boxlite/src/runtime/sized_volume.rs
New sized_volume.rs module exports create_sized_volume_image with ext4 sizing policy constants, sparse image creation, in-place mke2fs formatting, statvfs-based overcommit prevention, and unit tests verifying size validation and image integrity.
Volume resolution with sized volume materialization
src/boxlite/src/litebox/init/types.rs
resolve_user_volumes refactored to materialize sized volumes as persistent ext4 .img files while preserving legacy directory-backed volumes; test coverage extended for image creation, reuse, and sized-volume contract enforcement.
Block device mounting with read-only support
src/guest/src/storage/block_device.rs, src/guest/src/storage/volume.rs, src/guest/src/service/container.rs
BlockDeviceMount::mount signature extended with read_only parameter, conditionally applying MS_RDONLY flag; guest container init always mounts rootfs writable to ensure root filesystem availability.
VMM spawn integration for sized volumes
src/boxlite/Cargo.toml, src/boxlite/src/litebox/init/tasks/vmm_spawn.rs
Enables nix crate fs feature; build_config derives per-box sized_volumes_dir, materializes sized user volumes as ext4 block devices via volume_mgr, and replays as bind mounts before legacy virtiofs volumes.
Guest volume manager block device configuration
src/boxlite/src/volumes/guest_volume.rs
GuestVolumeManager::build_guest_mounts passes block entry's read_only flag into VolumeConfig::block_device constructor.
CLI parsing for size= option on anonymous volumes
src/cli/src/cli.rs
Introduces parse_size_token helper for N[K|M|G] + bare-byte parsing; VolumeOptions parser handles ro/rw/size=... flags; parse_volume_spec refactored to distinguish anonymous vs. bind-mount volumes, reject size= on bind mounts, and validate unknown options; parsed size_bytes wired into VolumeSpec.
End-to-end integration test suite
src/cli/tests/sized_volume.rs
Five end-to-end scenarios: (1) capacity capping and host cleanup, (2) volume independence, (3) persistence across stop/start, (4) no resource starvation, (5) read-only enforcement.
Unit test updates for struct field additions
src/boxlite/src/jailer/builder.rs, src/boxlite/src/jailer/command.rs, src/boxlite/src/jailer/mod.rs, src/boxlite/tests/mount_security.rs, src/boxlite/tests/security_enforcement.rs
Jailer and integration tests updated to include explicit size_bytes: None in VolumeSpec literals.

🎯 4 (Complex) | ⏱️ ~60 minutes

🐰 A hopping new volume appears,
Sized and bounded, no overflow fears!
Ext4 images spark to life,
Guest writes capped without strife,
The box grows wise with storage might!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(volumes): per-volume size cap (size=N[K|M|G])' clearly and specifically describes the main change—adding per-volume storage size capacity limits with CLI syntax.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


Comment @coderabbitai help to get the list of available commands and usage tips.

@cla-assistant
Copy link
Copy Markdown

cla-assistant Bot commented Jun 3, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ G4614
❌ Ubuntu


Ubuntu seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (1)
src/boxlite/src/runtime/sized_volume.rs (1)

77-77: 💤 Low value

Confirm fragment_size() type and keep overflow check as optional

nix::sys::statvfs::Statvfs::fragment_size() returns c_ulong, so the multiplication at src/boxlite/src/runtime/sized_volume.rs (line 77) is type-consistent with blocks_available() being c_ulong (cast to u64). Using checked_mul would be purely defensive hardening; it isn’t required to fix a type/API mismatch.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/boxlite/src/runtime/sized_volume.rs` at line 77, Confirm that
Statvfs::fragment_size() returns c_ulong and that blocks_available() is cast to
u64 already; keep the current multiplication but optionally harden it by using a
checked multiplication when computing free_bytes in sized_volume.rs (the
expression assigning to free_bytes that uses vfs.blocks_available() and
vfs.fragment_size()). If you add the defensive check, use checked_mul on the u64
operands and handle the None case (e.g., cap at u64::MAX or return an error) so
overflow is handled explicitly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/boxlite/src/litebox/init/types.rs`:
- Around line 72-73: The tag creation using the loop index ("tag =
format!(\"uservol{}\", i)") is unstable and can mismatch volumes if order
changes; instead derive a stable tag from the volume identity (e.g., sanitize
vol.guest_path or compute a short hash of vol.guest_path and other identifying
fields like vol.size) so the filename remains tied to the mount target rather
than array position—replace the index-based tag with a deterministic tag using
vol.guest_path (or hash of guest_path+size) and ensure any sanitization yields a
filesystem-safe string before using it as the tag/filename.

In `@src/cli/src/cli.rs`:
- Around line 564-570: The function parse_size_token currently parses num_part
into a u64 but allows zero; add a check after parsing to reject 0 so the CLI
enforces "positive integer" semantics early. Specifically, in parse_size_token
(the block that parses num_part into num), after the parse succeeds do if num ==
0 { return Err(anyhow::anyhow!("size must be a positive integer, got {:?}",
num_part)); } before performing num.checked_mul(mult). Ensure the error message
and returned Err match the existing style and still reference the original token
(num_part or s) as in the surrounding code.
- Around line 670-675: The Windows-drive parsing branch currently accepts
four-or-more segments and ignores parts[4..], which can hide malformed specs; in
the 4.. branch (where is_windows_drive(parts[0]) is true) validate that
parts.len() == 4 and, if greater, return an error instead of proceeding—locate
the logic around is_windows_drive and VolumeOptions::parse(parts[3]) and add a
check to reject inputs with extra ':' segments (produce a clear parse error
message) so malformed Windows volume specs are not silently accepted.

In `@src/cli/tests/sized_volume.rs`:
- Around line 233-250: The code currently silences failures by using
.unwrap_or(0) when parsing df output into b_avail_before; change this to fail
fast: after running boxlite(...) capture the stdout string and call
parse::<u64>().expect(...) (or explicitly match the Result and panic) so the
test errors out with a clear message including the raw stdout when df/exec/parse
fails; reference the b_avail_before binding and the boxlite(...) call that
produces o.stdout to locate the change.

---

Nitpick comments:
In `@src/boxlite/src/runtime/sized_volume.rs`:
- Line 77: Confirm that Statvfs::fragment_size() returns c_ulong and that
blocks_available() is cast to u64 already; keep the current multiplication but
optionally harden it by using a checked multiplication when computing free_bytes
in sized_volume.rs (the expression assigning to free_bytes that uses
vfs.blocks_available() and vfs.fragment_size()). If you add the defensive check,
use checked_mul on the u64 operands and handle the None case (e.g., cap at
u64::MAX or return an error) so overflow is handled explicitly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 61669e98-1684-4151-b44c-0dc589a779b2

📥 Commits

Reviewing files that changed from the base of the PR and between f21e5a4 and ad8bc48.

📒 Files selected for processing (22)
  • sdks/c/src/options.rs
  • sdks/node/src/options.rs
  • sdks/python/src/options.rs
  • src/boxlite/Cargo.toml
  • src/boxlite/src/jailer/builder.rs
  • src/boxlite/src/jailer/command.rs
  • src/boxlite/src/jailer/mod.rs
  • src/boxlite/src/litebox/init/tasks/vmm_spawn.rs
  • src/boxlite/src/litebox/init/types.rs
  • src/boxlite/src/portal/interfaces/guest.rs
  • src/boxlite/src/runtime/mod.rs
  • src/boxlite/src/runtime/options.rs
  • src/boxlite/src/runtime/sized_volume.rs
  • src/boxlite/src/volumes/guest_volume.rs
  • src/boxlite/tests/mount_security.rs
  • src/boxlite/tests/security_enforcement.rs
  • src/cli/src/cli.rs
  • src/cli/tests/sized_volume.rs
  • src/guest/src/service/container.rs
  • src/guest/src/storage/block_device.rs
  • src/guest/src/storage/volume.rs
  • src/shared/proto/boxlite/v1/service.proto

Comment on lines 72 to +73
for (i, vol) in volumes.iter().enumerate() {
let host_path = PathBuf::from(&vol.host_path);
let tag = format!("uservol{}", i);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Tag derivation from index position could cause data mismatch if volume order changes.

The tag uservol{i} is derived from the volume's position in the array. If a user reorders volumes between restarts (e.g., swaps -v /a:size=16M -v /b:size=32M to -v /b:size=32M -v /a:size=16M), the existing uservol0.img (originally for /a) would be reused for /b, potentially exposing the wrong data at the wrong mount point.

Consider deriving the tag from the guest_path (sanitized) or a hash of identifying properties to ensure stable mapping across restarts.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/boxlite/src/litebox/init/types.rs` around lines 72 - 73, The tag creation
using the loop index ("tag = format!(\"uservol{}\", i)") is unstable and can
mismatch volumes if order changes; instead derive a stable tag from the volume
identity (e.g., sanitize vol.guest_path or compute a short hash of
vol.guest_path and other identifying fields like vol.size) so the filename
remains tied to the mount target rather than array position—replace the
index-based tag with a deterministic tag using vol.guest_path (or hash of
guest_path+size) and ensure any sanitization yields a filesystem-safe string
before using it as the tag/filename.

Comment thread src/cli/src/cli.rs
Comment on lines +564 to 570
let num: u64 = num_part
.trim()
.parse()
.map_err(|_| anyhow::anyhow!("size must be a positive integer, got {:?}", num_part))?;
num.checked_mul(mult)
.ok_or_else(|| anyhow::anyhow!("size overflow: {:?}", s))
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

parse_size_token says “positive integer” but currently accepts 0.

0 passes parse and only fails later in runtime admission, which weakens CLI-level validation and gives less direct feedback.

Suggested fix
     let num: u64 = num_part
         .trim()
         .parse()
         .map_err(|_| anyhow::anyhow!("size must be a positive integer, got {:?}", num_part))?;
+    if num == 0 {
+        anyhow::bail!("size must be greater than 0");
+    }
     num.checked_mul(mult)
         .ok_or_else(|| anyhow::anyhow!("size overflow: {:?}", s))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let num: u64 = num_part
.trim()
.parse()
.map_err(|_| anyhow::anyhow!("size must be a positive integer, got {:?}", num_part))?;
num.checked_mul(mult)
.ok_or_else(|| anyhow::anyhow!("size overflow: {:?}", s))
}
let num: u64 = num_part
.trim()
.parse()
.map_err(|_| anyhow::anyhow!("size must be a positive integer, got {:?}", num_part))?;
if num == 0 {
anyhow::bail!("size must be greater than 0");
}
num.checked_mul(mult)
.ok_or_else(|| anyhow::anyhow!("size overflow: {:?}", s))
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cli/src/cli.rs` around lines 564 - 570, The function parse_size_token
currently parses num_part into a u64 but allows zero; add a check after parsing
to reject 0 so the CLI enforces "positive integer" semantics early.
Specifically, in parse_size_token (the block that parses num_part into num),
after the parse succeeds do if num == 0 { return Err(anyhow::anyhow!("size must
be a positive integer, got {:?}", num_part)); } before performing
num.checked_mul(mult). Ensure the error message and returned Err match the
existing style and still reference the original token (num_part or s) as in the
surrounding code.

Comment thread src/cli/src/cli.rs
Comment on lines 670 to 675
4.. => {
if is_windows_drive(parts[0]) {
let host = format!("{}:{}", parts[0], parts[1]);
let ro = parse_volume_read_only(parts[3]);
(Some(host), parts[2].to_string(), ro)
let opts = VolumeOptions::parse(parts[3])?;
(Some(host), parts[2].to_string(), opts)
} else {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reject Windows volume specs with extra : segments instead of silently ignoring them.

For Windows-drive inputs in the 4.. branch, only parts[3] is parsed as options and any parts[4..] are ignored. This can silently accept malformed specs and hide typos/options you intended to reject.

Suggested fix
-        4.. => {
+        4 => {
             if is_windows_drive(parts[0]) {
                 let host = format!("{}:{}", parts[0], parts[1]);
                 let opts = VolumeOptions::parse(parts[3])?;
                 (Some(host), parts[2].to_string(), opts)
             } else {
                 anyhow::bail!(
                     "invalid volume spec {:?}; use hostPath:boxPath[:options] (e.g. /data:/app/data or C:\\data:/app/data:ro)",
                     s
                 );
             }
         }
+        5.. => {
+            anyhow::bail!(
+                "invalid volume spec {:?}; too many ':' segments in volume spec",
+                s
+            );
+        }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cli/src/cli.rs` around lines 670 - 675, The Windows-drive parsing branch
currently accepts four-or-more segments and ignores parts[4..], which can hide
malformed specs; in the 4.. branch (where is_windows_drive(parts[0]) is true)
validate that parts.len() == 4 and, if greater, return an error instead of
proceeding—locate the logic around is_windows_drive and
VolumeOptions::parse(parts[3]) and add a check to reject inputs with extra ':'
segments (produce a clear parse error message) so malformed Windows volume specs
are not silently accepted.

Comment on lines +233 to +250
let b_avail_before = {
let o = boxlite(
home.path.as_path(),
&[
"exec",
&box_id,
"--",
"sh",
"-c",
"df -P /b | awk 'NR==2{print $4}'",
],
Duration::from_secs(20),
);
String::from_utf8_lossy(&o.stdout)
.trim()
.parse::<u64>()
.unwrap_or(0)
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t default b_avail_before to 0; fail fast on bad df output.

Using .unwrap_or(0) can hide exec/parse failures and let this isolation test pass incorrectly.

Suggested fix
     let b_avail_before = {
         let o = boxlite(
             home.path.as_path(),
             &[
                 "exec",
                 &box_id,
                 "--",
                 "sh",
                 "-c",
                 "df -P /b | awk 'NR==2{print $4}'",
             ],
             Duration::from_secs(20),
         );
-        String::from_utf8_lossy(&o.stdout)
+        assert!(
+            o.status.success(),
+            "failed to read /b available space before fill: {}",
+            String::from_utf8_lossy(&o.stderr)
+        );
+        String::from_utf8_lossy(&o.stdout)
             .trim()
             .parse::<u64>()
-            .unwrap_or(0)
+            .unwrap_or_else(|_| panic!("invalid /b available-space output: {:?}", String::from_utf8_lossy(&o.stdout)))
     };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let b_avail_before = {
let o = boxlite(
home.path.as_path(),
&[
"exec",
&box_id,
"--",
"sh",
"-c",
"df -P /b | awk 'NR==2{print $4}'",
],
Duration::from_secs(20),
);
String::from_utf8_lossy(&o.stdout)
.trim()
.parse::<u64>()
.unwrap_or(0)
};
let b_avail_before = {
let o = boxlite(
home.path.as_path(),
&[
"exec",
&box_id,
"--",
"sh",
"-c",
"df -P /b | awk 'NR==2{print $4}'",
],
Duration::from_secs(20),
);
assert!(
o.status.success(),
"failed to read /b available space before fill: {}",
String::from_utf8_lossy(&o.stderr)
);
String::from_utf8_lossy(&o.stdout)
.trim()
.parse::<u64>()
.unwrap_or_else(|_| panic!("invalid /b available-space output: {:?}", String::from_utf8_lossy(&o.stdout)))
};
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cli/tests/sized_volume.rs` around lines 233 - 250, The code currently
silences failures by using .unwrap_or(0) when parsing df output into
b_avail_before; change this to fail fast: after running boxlite(...) capture the
stdout string and call parse::<u64>().expect(...) (or explicitly match the
Result and panic) so the test errors out with a clear message including the raw
stdout when df/exec/parse fails; reference the b_avail_before binding and the
boxlite(...) call that produces o.stdout to locate the change.

@G4614
Copy link
Copy Markdown
Contributor Author

G4614 commented Jun 3, 2026

wait for #639

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant