
qcow2-rs: fallocate via F_PUNCHHOLE on macOS #11

Merged
ming1 merged 3 commits into ublk-org:main from sandrewh:feat/macos-fallocate-punchhole
May 15, 2026

@sandrewh

Contributor

qcow2-rs: fallocate via F_PUNCHHOLE on macOS

Summary

Replace the macOS fallback in Qcow2IoTokio::fallocate (which previously wrote zeros via pwrite) with a real hole-punch using fcntl(F_PUNCHHOLE, &fpunchhole_t). This makes discard semantics actually reclaim host file space on Darwin hosts, mirroring the existing Linux fallocate(FALLOC_FL_PUNCH_HOLE) behavior.

Motivation

Qcow2IoOps::fallocate is the file-level punch primitive that Qcow2Dev calls (via call_fallocate) when it wants to zero a range AND release the host extents. On Linux this works as intended. On macOS the existing implementation in tokio_io.rs falls back to writing zeros via pwrite, which preserves the reads-as-zero contract but doesn't shrink the file — wasting disk space when callers discard large regions.

F_PUNCHHOLE has been available on macOS since 10.10 (Yosemite, 2014). It's the canonical macOS analogue of Linux's FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE.
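
For readers unfamiliar with the call, here is a minimal sketch of the punch itself. It is not the code from this PR. The struct layout and the value 99 for F_PUNCHHOLE are copied by hand from the macOS <sys/fcntl.h> header so the sketch has no dependencies (the libc crate exposes the same definitions), and the names FPunchHole, punch_args, and punch_hole are hypothetical.

```rust
use std::io;
use std::os::raw::c_uint;

/// Hand-written mirror of macOS `fpunchhole_t` (assumption: matches <sys/fcntl.h>).
#[repr(C)]
#[derive(Debug, Default)]
struct FPunchHole {
    fp_flags: c_uint, // unused, left as 0
    reserved: c_uint, // padding that keeps the offsets 8-byte aligned
    fp_offset: i64,   // off_t: start of the region to punch
    fp_length: i64,   // off_t: size of the region to punch
}

/// Builds the argument struct, rejecting values that do not fit in off_t.
fn punch_args(offset: u64, len: u64) -> io::Result<FPunchHole> {
    let too_big = |_| io::Error::from(io::ErrorKind::InvalidInput);
    Ok(FPunchHole {
        fp_flags: 0,
        reserved: 0,
        fp_offset: i64::try_from(offset).map_err(too_big)?,
        fp_length: i64::try_from(len).map_err(too_big)?,
    })
}

/// Issues the punch. Only compiled on macOS; other targets skip this item.
#[cfg(target_os = "macos")]
fn punch_hole(fd: std::os::raw::c_int, offset: u64, len: u64) -> io::Result<()> {
    use std::os::raw::c_int;
    const F_PUNCHHOLE: c_int = 99; // assumption: value from <sys/fcntl.h>
    // Edition 2024 spells this block `unsafe extern "C"`.
    extern "C" {
        fn fcntl(fd: c_int, cmd: c_int, ...) -> c_int;
    }
    let args = punch_args(offset, len)?;
    // SAFETY: `args` outlives the call and has the layout the kernel expects.
    let rc = unsafe { fcntl(fd, F_PUNCHHOLE, &args as *const FPunchHole) };
    if rc == -1 {
        Err(io::Error::last_os_error())
    } else {
        Ok(())
    }
}
```

On success the file length is unchanged; the caller still has to decide what to do when the call fails.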

Behavior

  • macOS: fcntl(F_PUNCHHOLE, &fpunchhole_t). File size unchanged, punched range reads as zeros, allocated host extents released.
  • macOS APFS requires fp_offset and fp_length to be multiples of the volume block size (4096). Sub-block ranges return EINVAL. The implementation catches EINVAL / EOPNOTSUPP / ENOSYS and falls back to the existing zero-write path so the reads-as-zero contract still holds — the host file just doesn't shrink for that one call.
  • Linux: unchanged.
  • Other non-Linux non-macOS targets (Windows, FreeBSD with tokio backend): unchanged.
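
The macOS behavior above amounts to "try the punch, and on a soft failure write zeros instead". A rough sketch of that control flow follows, with the punch and the zero-write passed in as closures so it stays platform-independent. Discard, is_block_aligned, is_soft_failure, and punch_or_zero are hypothetical names, and matching on io::ErrorKind is an assumption: the actual implementation may well compare raw errno values.

```rust
use std::io;

/// Which path satisfied the discard request in this sketch.
#[derive(Debug, PartialEq)]
enum Discard {
    Punched,     // host extents released
    ZeroWritten, // range zeroed, file not shrunk
}

/// True when both offset and length are multiples of the block size.
fn is_block_aligned(offset: u64, len: u64, block: u64) -> bool {
    block != 0 && offset % block == 0 && len % block == 0
}

/// Errors treated as "punch not possible here", not as hard failures.
/// EINVAL surfaces as InvalidInput and ENOSYS as Unsupported; whether
/// EOPNOTSUPP also maps to Unsupported depends on the toolchain version.
fn is_soft_failure(err: &io::Error) -> bool {
    matches!(
        err.kind(),
        io::ErrorKind::InvalidInput | io::ErrorKind::Unsupported
    )
}

/// Tries `punch`; on a soft failure runs `zero_write` so the range
/// still reads as zeros. Any other error is returned to the caller.
fn punch_or_zero<P, Z>(punch: P, zero_write: Z) -> io::Result<Discard>
where
    P: FnOnce() -> io::Result<()>,
    Z: FnOnce() -> io::Result<()>,
{
    match punch() {
        Ok(()) => Ok(Discard::Punched),
        Err(e) if is_soft_failure(&e) => {
            zero_write()?;
            Ok(Discard::ZeroWritten)
        }
        Err(e) => Err(e),
    }
}
```

Checking is_block_aligned before the punch would skip a syscall that is known to fail, at the cost of hard-coding the block size; relying on EINVAL keeps the code independent of the volume's actual block size.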

Tests

Three new tests in tests/fallocate.rs, platform-gated:

| Test | Platform | What it asserts |
| --- | --- | --- |
| fallocate_punch_hole_shrinks_st_blocks_on_linux | Linux | fallocate(PUNCH_HOLE) shrinks st_blocks; logical file size unchanged; punched range reads as zero; bytes outside untouched |
| fallocate_punch_hole_shrinks_st_blocks_on_macos | macOS | Same shape with F_PUNCHHOLE on a 4-KiB-aligned range |
| fallocate_sub_block_range_falls_back_to_zero_write_on_macos | macOS | Sub-block-aligned range (1 KiB offset, 1 KiB length) soft-fails to zero-write cleanly: no error, range reads as zeros, surrounding bytes untouched |
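
As an illustration of what "shrinks st_blocks; logical file size unchanged" means in Rust terms, the two quantities can be read from file metadata as sketched below. This is not the test code from tests/fallocate.rs; size_and_blocks and write_prefilled are hypothetical helpers, and only std is assumed.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Returns (logical size in bytes, allocated blocks in 512-byte units).
/// The second value is st_blocks, which a successful punch should reduce.
fn size_and_blocks(path: &Path) -> io::Result<(u64, u64)> {
    let meta = fs::metadata(path)?;
    Ok((meta.len(), meta.blocks()))
}

/// Creates `path` filled with `len` copies of `fill` and flushes it to disk,
/// so that allocation is visible before the first measurement.
fn write_prefilled(path: &Path, len: usize, fill: u8) -> io::Result<()> {
    let mut f = File::create(path)?;
    f.write_all(&vec![fill; len])?;
    f.sync_all()
}
```

A test in this style would call size_and_blocks before and after the punch and compare: the first component must stay equal, and the second must drop when the punch reported success.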

Reproduce

On macOS:

cargo test --release --test fallocate

On Linux:

cargo test --release --test fallocate

Out of scope (intentional)

  • No change to the Linux fallocate path.
  • No new Qcow2IoOps trait methods or flags.
  • No change to Qcow2Dev cluster-allocator semantics.

Disclosure

Co-developed with Claude Code (Claude Opus 4.7, 1M context). I reviewed every line and can answer questions about the design or implementation.

sandrewh added 2 commits May 13, 2026 19:16
On macOS the existing Qcow2IoTokio::fallocate path wrote zeros via
pwrite, which preserves the reads-as-zero contract but does not
release host extents. This commit replaces that fallback with a real
fcntl(F_PUNCHHOLE, &fpunchhole_t) punch, available on macOS since
10.10 and behaviorally equivalent to fallocate(FALLOC_FL_PUNCH_HOLE |
FALLOC_FL_KEEP_SIZE) on Linux: file size unchanged, punched range
reads as zero, allocated host extents released.

APFS requires fp_offset and fp_length to be multiples of the volume
block size (4096). Sub-block ranges return EINVAL, which the
implementation catches alongside EOPNOTSUPP and ENOSYS and falls back
to the pre-existing zero-write path so the reads-as-zero contract
still holds — the host file just doesn't shrink for that one call.

Mechanically the fd: i32 field on Qcow2IoTokio is now populated on
macOS too (previously Linux-only); the new() cfg arms are merged
accordingly. The Linux fallocate path and the non-Linux non-macOS
fallback are unchanged.

Assisted-by: Claude Opus 4.7 (1M context)

GHA Ubuntu runners' filesystem (likely overlay-backed) returns
EOPNOTSUPP for `FALLOC_FL_PUNCH_HOLE | FALLOC_FL_ZERO_RANGE`, the
combination `Qcow2OpsFlags::FALLOCATE_ZERO_RAGE` maps to.
`Qcow2Dev::call_fallocate` already has a write-zeros fallback for
exactly this case; the test was wrong to be strict.

Now the test accepts the soft-fail path: on EOPNOTSUPP / "Operation
not supported" / "Unsupported", skip the strict shrinkage assertion
with an eprintln note. The read-as-zero contract is still asserted
unconditionally — that's the user-facing guarantee on both paths.

macOS test path unchanged (APFS supports F_PUNCHHOLE; the test
gates strict shrinkage only when the punch reported success).

Assisted-by: Claude Opus 4.7 (1M context)
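
A rough sketch of the soft-fail check described in this commit message, assuming the error surfaces as a std::io::Error (the crate's actual error type may differ). The function name looks_unsupported is hypothetical; the two message substrings are the ones quoted above.

```rust
use std::io;

/// True when the error indicates that the filesystem cannot punch holes,
/// either by kind or by one of the message texts named in the commit.
fn looks_unsupported(err: &io::Error) -> bool {
    if err.kind() == io::ErrorKind::Unsupported {
        return true;
    }
    let msg = err.to_string();
    msg.contains("Operation not supported") || msg.contains("Unsupported")
}
```

A test would use this to decide whether to skip the strict st_blocks comparison while still asserting that the range reads as zero.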
sandrewh added a commit to sandrewh/qcow2-rs that referenced this pull request May 14, 2026
@ming1

ming1 commented May 15, 2026

Collaborator

Maybe something like below is needed for fixing the test failure?

diff --git a/tests/fallocate.rs b/tests/fallocate.rs
index a0cf3bb..2328fed 100644
--- a/tests/fallocate.rs
+++ b/tests/fallocate.rs
@@ -86,6 +86,12 @@ fn fallocate_punch_hole_shrinks_st_blocks_on_linux() {
                 blocks_after < blocks_before,
                 "punch_hole must shrink allocated blocks: before={blocks_before} after={blocks_after}",
             );
+        } else {
+              // Production code (Qcow2Dev::call_fallocate) falls back to
+              // write-zeros when fallocate is not supported. Replicate
+              // that here so the reads-as-zero contract holds.
+              let zero_buf = vec![0u8; 32 * 1024];
+              io.write_from(16 * 1024, &zero_buf).await.unwrap();
         }
         // Either way: logical file size unchanged, punched/zeroed range
         // reads as zero, surrounding bytes untouched.

Follow-up to review feedback from @ming1 on PR ublk-org#11.

The previous fix replaced the `.expect()` panic with a soft-fail
skip, but `Qcow2IoTokio::fallocate` is the raw IO layer and has no
write-zeros fallback (that lives in `Qcow2Dev::call_fallocate`). So
on the EOPNOTSUPP path the range stayed the 0xAB prefill and the
reads-as-zero assertion would correctly fail — the test was still
broken, just at a different line.

Now the not-punched branch replicates exactly what production does:
write zeros to the range, then fsync. The reads-as-zero contract
then genuinely holds on both the punch path and the fallback path.

Assisted-by: Claude Opus 4.7 (1M context)
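
The fallback branch and the reads-as-zero check can be sketched with plain std file I/O as shown below. This is only an illustration: the real test goes through the crate's io.write_from, while zero_range and range_is are hypothetical helpers built on std::os::unix::fs::FileExt.

```rust
use std::fs::{File, OpenOptions};
use std::io;
use std::os::unix::fs::FileExt;
use std::path::Path;

/// Zero-write fallback: overwrites `[offset, offset + len)` with zeros,
/// then fsyncs so the data is on disk before anything is asserted.
fn zero_range(path: &Path, offset: u64, len: usize) -> io::Result<()> {
    let f = OpenOptions::new().write(true).open(path)?;
    f.write_all_at(&vec![0u8; len], offset)?;
    f.sync_all()
}

/// True when every byte in `[offset, offset + len)` equals `expected`.
fn range_is(path: &Path, offset: u64, len: usize, expected: u8) -> io::Result<bool> {
    let f = File::open(path)?;
    let mut buf = vec![0u8; len];
    f.read_exact_at(&mut buf, offset)?;
    Ok(buf.iter().all(|&b| b == expected))
}
```

With a file prefilled with 0xAB, zeroing 32 KiB at offset 16 KiB should leave range_is true for 0 inside the range and true for 0xAB on either side, with the file length unchanged.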
@sandrewh

Contributor Author

Thanks @ming1 — good catch, and you're exactly right. Qcow2IoTokio::fallocate is the raw IO layer with no write-zeros fallback (that's in Qcow2Dev::call_fallocate), so on the EOPNOTSUPP path the range stayed the 0xAB prefill and the reads-as-zero assertion would still fail — the test was broken at a different line than before.

Applied your suggested approach in f83a390: the not-punched branch now writes zeros + fsyncs, replicating exactly what production does, so the reads-as-zero contract holds on both paths. Pushed.

sandrewh added a commit to sandrewh/qcow2-rs that referenced this pull request May 15, 2026
ming1 merged commit 8d298a0 into ublk-org:main on May 15, 2026
6 checks passed
2 participants