Add OCI image support: pull, unpack, run, prune, status, policy #34

Open
Max042004 wants to merge 4 commits into sysprog21:main from Max042004:oci-image

Collaborator

@Max042004 Max042004 commented May 15, 2026

This PR lands full OCI image support for elfuse. It supersedes the
original Phase 1 scope of this PR (CLI scaffold + pull/inspect) and
now covers Phases 1-4 plus the post-Phase-3 improvements plan: image
layout alignment, GC/prune, layer + stack snapshot caches, store
status, parallel pull, registry policy.json, and a heavy-mode compat
matrix.

Scope

  • Pull / inspect: content-addressable blob store, HTTPS + bearer
    token, OCI index walk to the linux/arm64 leaf manifest,
    partial-store-aware inspect renderer.
  • Unpack: tar reader (ustar + PAX x/g records), gzip + decode-only
    vendored zstd, whiteout-aware layer apply (typeflag '1'/'2'/'5'
    + .wh.* markers), per-image sysroot on a case-sensitive APFS
    sparsebundle.
  • Run: elfuse oci run clones the unpacked tree via clonefile(2),
    honors Entrypoint / Cmd / Env / WorkingDir / User, and reuses the
    existing elfuse launch path so a dynamically-linked guest binary
    runs through the same shim + syscall surface as the non-OCI mode.
  • Lifecycle: oci prune with --older-than / --keep-bytes;
    layer + stack prune sweep; oci status (text + --json);
    oci rebuild-cache for pre-snapshot stores.
  • Performance: parallel blob fetch with HTTP Range resume;
    per-layer raw snapshot cache; ChainID stack snapshot cache; APFS
    COW clone-rootfs reuse between runs.
  • Policy: podman / skopeo-style policy.json + registries.d
    overlay (per-registry insecure / ca_bundle / auth_file). CLI flags
    override; loopback-only --insecure.
  • Test coverage: 25 OCI unit suites (test-oci-*), compat-shell
    smoke (tests/test-oci-compat.sh), and an opt-in heavy mode
    (OCI_COMPAT_TEST=1) that drives three layered fixtures
    (alpine-shaped, busybox-shaped hardlink dispatch, two-layer
    whiteout) end-to-end through a freshly-provisioned scratch
    sparsebundle.
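
For readers unfamiliar with the ChainID key behind the stack snapshot cache, here is a minimal Python sketch of the recurrence as the OCI image spec defines it. It assumes elfuse keys its cache the same way, which this PR text does not confirm; `chain_ids` and the sample DiffIDs are illustrative names, not code from the patch.

```python
import hashlib


def chain_ids(diff_ids):
    """Return the ChainID of every prefix of a layer stack.

    OCI image-spec recurrence (assumed to match elfuse's cache key):
      ChainID(L0)     = DiffID(L0)
      ChainID(L0..Ln) = sha256(ChainID(L0..Ln-1) + " " + DiffID(Ln))
    """
    chain = []
    for diff_id in diff_ids:
        if not chain:
            chain.append(diff_id)
        else:
            joined = (chain[-1] + " " + diff_id).encode("ascii")
            chain.append("sha256:" + hashlib.sha256(joined).hexdigest())
    return chain


# Hypothetical DiffIDs for a three-layer image.
layer_a = "sha256:" + "a" * 64
layer_b = "sha256:" + "b" * 64
layer_c = "sha256:" + "c" * 64

stack = chain_ids([layer_a, layer_b, layer_c])
for depth, chain_id in enumerate(stack):
    print(depth, chain_id)
```

Because each ChainID depends on every layer beneath it, two images that share a base prefix share the ChainIDs for that prefix, which is what would let a cached stack tree be reused across images.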

Manual smoke test (docker.io/library/python:3.12)

A real end-to-end pull-and-run against a mainstream multi-layer glibc
image. The image's default Entrypoint is docker-entrypoint.sh (a
shell script, which elfuse does not execute), so the commands below
override --entrypoint to the python3 binary directly.

make elfuse
SCRATCH=$(mktemp -d)
echo "store: $SCRATCH"

# 1. Pull (~400 MB across 7 layers, ~3 minutes on a fast link).
#    If your terminal mishandles CSI cursor-up and the progress
#    output stacks duplicate rows, prepend ELFUSE_OCI_PROGRESS=plain
#    to fall back to one summary line per blob.
./build/elfuse oci pull --store "$SCRATCH" python:3.12

# 2. Offline inspect: image index -> linux/arm64 manifest -> config
#    runtime block (Entrypoint / Cmd / Env / WorkingDir / User).
./build/elfuse oci inspect --store "$SCRATCH" python:3.12

# 3. Cold run. First invocation triggers layer unpack onto the
#    sysroot APFS sparsebundle, then clone-rootfs, then launch. The
#    unpack step dominates the ~50 s wall on a fresh store.
./build/elfuse oci run --store "$SCRATCH" \
    --entrypoint /usr/local/bin/python3 python:3.12 \
    -c 'print("hello from elfuse", 1+2)'
# expected stdout:  hello from elfuse 3

# 4. Warm run. clone-rootfs reuses the unpacked image tree, so wall
#    drops to ~2 s and is dominated by VM bring-up + dynamic-linker
#    bring-up + Python interp init.
./build/elfuse oci run --store "$SCRATCH" \
    --entrypoint /usr/local/bin/python3 python:3.12 \
    -c 'import sys, platform; print(sys.version); print(platform.platform()); print(platform.machine())'
# expected stdout:  Python 3.12.x ... / Linux-<kernel>-aarch64-with-glibc2.41 / aarch64

# 5. stdlib smoke. Confirms json + math + f-string formatting all
#    flow through the emulated syscall surface.
./build/elfuse oci run --store "$SCRATCH" \
    --entrypoint /usr/local/bin/python3 python:3.12 \
    -c 'import json, math; print(json.dumps({"pi": round(math.pi, 5), "ok": True}))'
# expected stdout:  {"pi": 3.14159, "ok": true}

Performance characterization (vs OrbStack)

Measured on Apple M4 / macOS 15.4.1 (Darwin 24.4.0). OrbStack 2.1.3
acts as the ground-truth aarch64-linux runtime: it executes the same
docker.io/library/python:3.12 image inside a Virtualization.framework-
backed Linux VM with a real Linux kernel, so the comparison isolates
the cost of elfuse's user-mode ABI emulation against a native syscall
surface.

Pure CPU (factorial big-int multiply, no syscall)

import sys, math, time
sys.set_int_max_str_digits(0)   # Python 3.12 default cap is 4300 digits
N = 200000
t = time.perf_counter()
f = math.factorial(N)
s = sum(int(d) for d in str(f))
print("fact(%d) digit_sum=%d digits=%d compute=%.3fs" %
      (N, s, len(str(f)), time.perf_counter() - t))

Each engine ran twice; the second is warm. compute is the
time.perf_counter() delta inside Python (pure interpreter +
big-int multiply work); real is the outer wall (includes engine
startup); startup ≈ real - compute.

| Engine | Run | compute (s) | real (s) | startup (s) |
|---|---|---|---|---|
| elfuse | 1 | 0.791 | 3.72 | 2.93 |
| elfuse | 2 (warm) | 0.804 | 3.35 | 2.55 |
| orbstack | 1 | 0.792 | 1.10 | 0.31 |
| orbstack | 2 (warm) | 0.796 | 0.97 | 0.17 |

Both engines emit digit_sum=4154076 digits=973351, confirming
correctness parity. Pure compute ratio: 1.01× (within measurement
noise). HVF runs guest aarch64 instructions directly, so big-int
multiply and Python bytecode dispatch pay no translation overhead.
Startup ratio: 15.0× (a constant ~2.5 s for elfuse vs ~0.17 s for
orbstack), independent of N: verified separately at N=50000, where
compute drops to ~0.14 s on both engines but elfuse startup stays
at 2.53 s.
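
The startup column and the two headline ratios can be re-derived from the measured figures. The short Python check below uses only the numbers reported above; the variable names are illustrative.

```python
# (engine, run, compute_s, real_s) as reported in the table above.
runs = [
    ("elfuse", "1", 0.791, 3.72),
    ("elfuse", "2 warm", 0.804, 3.35),
    ("orbstack", "1", 0.792, 1.10),
    ("orbstack", "2 warm", 0.796, 0.97),
]

# startup ~= real - compute, rounded to two decimals as in the table.
startup = {(eng, run): round(real - comp, 2) for eng, run, comp, real in runs}
for key, value in startup.items():
    print(key, value)

# Ratios for the warm runs.
compute_ratio = 0.804 / 0.796
startup_ratio_rounded = startup[("elfuse", "2 warm")] / startup[("orbstack", "2 warm")]
startup_ratio_raw = (3.35 - 0.804) / (0.97 - 0.796)

print("compute ratio          %.2fx" % compute_ratio)
print("startup ratio, rounded %.1fx" % startup_ratio_rounded)
print("startup ratio, raw     %.1fx" % startup_ratio_raw)
```

The 15.0× figure comes from the rounded startup values (2.55 / 0.17); the unrounded differences give a slightly lower ratio, which does not change the conclusion.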

Syscall density (Python loop hammering syscalls)

import os, time
N_BASE = 1_000_000
N_READ = 100_000

def time_loop(label, fn, n):
    fn(min(n // 100, 10_000))   # warm-up
    t = time.perf_counter()
    fn(n)
    return label, time.perf_counter() - t, n

def baseline(n):
    for _ in range(n): pass

def getppid(n):
    g = os.getppid
    for _ in range(n): g()

def clock_ns(n):
    g = time.monotonic_ns
    for _ in range(n): g()

def urandom_read(n):
    fd = os.open("/dev/urandom", os.O_RDONLY)
    try:
        rd = os.read
        for _ in range(n): rd(fd, 1)
    finally:
        os.close(fd)

results = [
    time_loop("baseline (pass)",              baseline,     N_BASE),
    time_loop("getppid",                      getppid,      N_BASE),
    time_loop("clock_gettime (monotonic_ns)", clock_ns,     N_BASE),
    time_loop("/dev/urandom 1B read",         urandom_read, N_READ),
]
base_per = results[0][1] / results[0][2]
for label, secs, n in results:
    per = secs / n
    overhead = (per - base_per) * 1e6 if label != "baseline (pass)" else 0.0
    print("%-38s total=%.3fs n=%d per=%.3fus  syscall_overhead=%.3fus" %
          (label, secs, n, per * 1e6, overhead))

syscall_overhead strips the Python loop interpreter cost (measured
from the baseline band) so the residual is the pure trap+return
cost of a single syscall.

| Band | elfuse (μs/call) | orbstack (μs/call) | ratio |
|---|---|---|---|
| baseline (pass) | 0.007 | 0.007 | 1.0× |
| getppid | 0.960 | 0.091 | 10.5× |
| clock_gettime (monotonic_ns) | 1.006 | 0.018 | 55.9× |
| /dev/urandom 1B read | 1.704 | 0.210 | 8.1× |

getppid is the cleanest measurement: no kernel work, just trap +
return. elfuse pays roughly 1 μs per syscall versus ~0.1 μs native.
Rough HVF round-trip breakdown: vCPU state sync ~200 ns, Linux→macOS
semantics ~100 ns, the macOS syscall itself ~100 ns, errno + sync
back ~100 ns, HVF re-entry + ERET ~500 ns. This ~1 μs round trip is
the structural floor for any elfuse syscall path that traps: no
trapping syscall can be cheaper than this.
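
As a quick cross-check, the rough breakdown sums to the ~1 μs observed for getppid, and the ratio column follows from the per-call figures. The Python snippet below only restates numbers from this section; the component figures are the author's estimates, not separate measurements.

```python
# Estimated per-syscall HVF round-trip components, in nanoseconds.
breakdown_ns = {
    "vCPU state sync": 200,
    "Linux->macOS semantics": 100,
    "macOS syscall": 100,
    "errno + sync back": 100,
    "HVF re-entry + ERET": 500,
}
total_ns = sum(breakdown_ns.values())
print("estimated round trip: %d ns" % total_ns)

# Per-call figures (us) from the band table: (elfuse, orbstack).
bands = {
    "getppid": (0.960, 0.091),
    "clock_gettime (monotonic_ns)": (1.006, 0.018),
    "/dev/urandom 1B read": (1.704, 0.210),
}
ratios = {name: round(e / o, 1) for name, (e, o) in bands.items()}
for name, ratio in ratios.items():
    print("%-30s %.1fx" % (name, ratio))
```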

vDSO observation: time.monotonic_ns should hit the synthetic
vDSO under src/core/vdso.{c,h} and skip the trap (orbstack does, at
0.018 μs), but the measured 1.006 μs matches the trapping baseline.
elfuse's vDSO entry is not being picked up by glibc 2.41 in this
image. This is an existing optimization opportunity unrelated to the
scope of this PR; left untouched here so the patch series stays
focused on image-distribution and runtime correctness.

Wall-clock model

For a pure-CPU workload of compute time W:

elfuse_total   ≈ 2.5 s + W
orbstack_total ≈ 0.17 s + W

| W | elfuse | orbstack | ratio | scenario |
|---|---|---|---|---|
| 0.1 s | 2.6 s | 0.27 s | 9.6× | CLI one-shot |
| 1 s | 3.5 s | 1.17 s | 3.0× | short script |
| 10 s | 12.5 s | 10.17 s | 1.23× | medium task |
| 60 s | 62.5 s | 60.17 s | 1.04× | batch job |

elfuse is competitive for long-running workloads (where the constant
startup amortizes out) and a known tradeoff for short CLI one-shots
where startup dominates total wall.
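
The model is simple enough to evaluate directly. The Python sketch below reproduces the ratio column and also solves for the workload length at which the slowdown falls under a chosen threshold; the function names are illustrative, and the two startup constants are the measured values quoted above.

```python
ELFUSE_STARTUP_S = 2.5
ORBSTACK_STARTUP_S = 0.17


def ratio(w):
    """elfuse / orbstack wall-clock ratio for compute time w (seconds)."""
    return (ELFUSE_STARTUP_S + w) / (ORBSTACK_STARTUP_S + w)


def workload_for_ratio(r):
    """Compute time w at which the ratio equals r (requires r > 1).

    Solving (2.5 + w) / (0.17 + w) = r for w gives
    w = (2.5 - 0.17 * r) / (r - 1).
    """
    return (ELFUSE_STARTUP_S - ORBSTACK_STARTUP_S * r) / (r - 1)


for w in (0.1, 1, 10, 60):
    print("W=%5.1f s  ratio=%.2fx" % (w, ratio(w)))

# Workload length beyond which elfuse is within 10% of orbstack.
print("within 10%% after W = %.1f s" % workload_for_ratio(1.1))
```

Under this model the startup penalty falls below 10% of total wall time once the workload runs for a little over 23 seconds.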

Known limitations

  • fork() followed by execve() of a dynamically-linked ELF crashes
    in the child during dynamic-linker bring-up. This blocks Python's
    subprocess.run([...other_dynamic_binary...]), shell pipelines that
    spawn external binaries, and timeout(1). Single-process Python
    workloads, stdlib computation, and file I/O are unaffected.
  • Multi-arch image selection is hardcoded to linux/arm64. There is
    no --platform flag; cross-arch image support is out of scope for
    this PR.
  • pull progress uses CSI cursor-up + clear-line for in-place
    redraw. Terminal panes that ignore those escapes show stacking
    rows; set ELFUSE_OCI_PROGRESS=plain to disable the redraw and
    emit one summary line per blob instead.

Summary by cubic

Adds full OCI image lifecycle to elfuse: pull, inspect, unpack, clone, run, prune, rebuild-cache, and status. Improves pulls with parallel/resumable downloads, adds a content‑addressable store with GC and caches, and wires the runtime to execute images directly.

  • New Features

    • CLI: oci pull|inspect|unpack|clone|run|prune|rebuild-cache|status; pull adds progress and --refresh.
    • Registry: HTTPS via libcurl; bearer-token and Basic auth; custom CA; loopback‑gated --insecure; writes oci-layout and pins in index.json.
    • Inspect: offline manifest/index; shows runtime (Entrypoint/Cmd/Env/WorkingDir/User) and cross‑image dedup stats; status supports text/--json.
    • Unpack: tar reader (ustar + PAX), gzip/zstd decode, whiteout‑aware apply, case‑sensitive APFS sysroot, per‑run rootfs via clonefile(2).
    • Caches: per‑layer raw snapshots and ChainID stack snapshots; parallel blob fetch with HTTP Range resume.
    • Runtime: PATH resolver; image‑config User name/group lookup; inject /etc/{resolv.conf,hosts,hostname}; emulate /dev/{full,console}; add /proc cgroup/hostname/comm/statm; shared VM launcher.
    • Policy: podman/skopeo‑style policy.json plus registries.d overlay; merged with CLI flags.
    • Store ops: prune with --older-than/--keep-bytes; rebuild-cache; status reports blobs/layers/stacks.
    • Fixes: walk multi‑arch index to linux/arm64 leaf; accept root tar ./; cross‑volume unpack via copyfile(2) with clone fallback.
  • Migration

    • Pins moved to OCI index.json; store auto‑migrates from refs/ on open.
    • Layer cache marked schema v2; first open wipes legacy v1 entries (blobs/images untouched).
    • Vendors decode‑only zstd and cJSON; uses system zlib and libcurl.

Written for commit 5d6dbc7. Summary will update on new commits.

@cubic-dev-ai cubic-dev-ai Bot left a comment

11 issues found across 40 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/oci/pull.c">

<violation number="1" location="src/oci/pull.c:253">
P2: Error-path leak: `sub_resp` may be allocated but not freed when sub-manifest fetch fails before `have_sub` is set.</violation>
</file>

<file name="src/oci/media-type.c">

<violation number="1" location="src/oci/media-type.c:100">
P2: Media type parsing is case-sensitive, but media type type/subtype tokens are case-insensitive; valid values with different casing will be misclassified as unknown.</violation>
</file>

<file name="src/oci/ref.c">

<violation number="1" location="src/oci/ref.c:83">
P2: Repository-path validation incorrectly rejects valid names with repeated dashes (for example `my--repo`).</violation>

<violation number="2" location="src/oci/ref.c:356">
P2: `docker.io` default-namespace detection is case-sensitive, so mixed-case hostnames can skip the required `library/` prefix.</violation>
</file>

<file name="src/oci/fetch.c">

<violation number="1" location="src/oci/fetch.c:782">
P2: Manifest fetch skips bearer-challenge parsing when a token is already cached, so 401 responses from expired/stale tokens are not retried with a refreshed token.</violation>

<violation number="2" location="src/oci/fetch.c:945">
P2: Blob fetch also disables challenge parsing when a token is cached, preventing 401-triggered token refresh and causing avoidable pull failures.</violation>
</file>

<file name="src/oci/blob-store.c">

<violation number="1" location="src/oci/blob-store.c:354">
P2: The commit path is not crash-durable because it never fsyncs the destination directory after linking the blob into place.</violation>
</file>

<file name="src/oci/store.c">

<violation number="1" location="src/oci/store.c:285">
P2: Fsync the pin directory after `rename` to make tag->digest updates crash-safe; file fsync alone does not persist the directory entry change.</violation>
</file>

<file name="src/oci/manifest.c">

<violation number="1" location="src/oci/manifest.c:295">
P2: `schemaVersion` parsing can accept fractional JSON numbers because `valueint` is used without an integer round-trip check.</violation>

<violation number="2" location="src/oci/manifest.c:385">
P2: Layer descriptor memory is leaked on post-parse validation failures because `nlayers` is incremented too late.</violation>

<violation number="3" location="src/oci/manifest.c:481">
P2: Index descriptor memory leaks when platform parsing fails because `nentries` is incremented after the fallible parse.</violation>
</file>


Comment thread src/oci/pull.c Outdated
Comment thread src/oci/media-type.c
return OCI_MT_UNKNOWN;

for (size_t i = 0; i < MEDIA_TYPE_COUNT; i++) {
if (!strcmp(MEDIA_TYPES[i].name, buf))

P2: Media type parsing is case-sensitive, but media type type/subtype tokens are case-insensitive; valid values with different casing will be misclassified as unknown.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/oci/media-type.c, line 100:

<comment>Media type parsing is case-sensitive, but media type type/subtype tokens are case-insensitive; valid values with different casing will be misclassified as unknown.</comment>

<file context>
@@ -0,0 +1,189 @@
+        return OCI_MT_UNKNOWN;
+
+    for (size_t i = 0; i < MEDIA_TYPE_COUNT; i++) {
+        if (!strcmp(MEDIA_TYPES[i].name, buf))
+            return MEDIA_TYPES[i].kind;
+    }
</file context>
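
One way to address this finding is to normalize case before the table lookup. The sketch below is in Python rather than the project's C, purely to illustrate the idea; the table contents, the kind labels, and the decision to drop any `;parameter` suffix are assumptions, not the contents of `MEDIA_TYPES` in the patch.

```python
# Hypothetical lookup table keyed by lower-cased media type.
KNOWN_MEDIA_TYPES = {
    "application/vnd.oci.image.manifest.v1+json": "manifest",
    "application/vnd.oci.image.index.v1+json": "index",
    "application/vnd.oci.image.layer.v1.tar+gzip": "layer",
}


def classify(media_type):
    """Classify a media type, ignoring case in the type/subtype tokens."""
    # Drop any ";parameter" suffix, trim whitespace, then fold case.
    essence = media_type.split(";", 1)[0].strip().lower()
    return KNOWN_MEDIA_TYPES.get(essence, "unknown")


print(classify("application/vnd.oci.image.manifest.v1+json"))
print(classify("Application/VND.OCI.Image.Manifest.v1+JSON"))
print(classify("text/plain"))
```

In the C code the equivalent change would likely be a case-insensitive comparison such as `strcasecmp` in place of `strcmp`, or lower-casing `buf` once before the loop.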

Comment thread src/oci/ref.c
} else {
return false;
}
if (i >= len || !is_lower_alnum(s[i]))

P2: Repository-path validation incorrectly rejects valid names with repeated dashes (for example my--repo).

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/oci/ref.c, line 83:

<comment>Repository-path validation incorrectly rejects valid names with repeated dashes (for example `my--repo`).</comment>

<file context>
@@ -0,0 +1,429 @@
+        } else {
+            return false;
+        }
+        if (i >= len || !is_lower_alnum(s[i]))
+            return false;
+    }
</file context>
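
For reference when evaluating this finding, the repository-name grammar in the OCI distribution spec allows a run of dashes as a separator (`-+`), alongside `.`, `_`, and `__`. The Python sketch below encodes that grammar as I understand it; it is a reference check under that assumption, not the validator in `src/oci/ref.c`.

```python
import re

# One path component: alphanumeric runs joined by ".", "_", "__", or one
# or more "-". Components are joined by "/".
_COMPONENT = r"[a-z0-9]+(?:(?:\.|_|__|-+)[a-z0-9]+)*"
REPO_NAME_RE = re.compile(r"%s(?:/%s)*" % (_COMPONENT, _COMPONENT))


def is_valid_repo_name(name):
    """Return True if name matches the repository-name grammar above."""
    return REPO_NAME_RE.fullmatch(name) is not None


for candidate in ("library/python", "my--repo", "my-", "a___b", "My/Repo"):
    print("%-16s %s" % (candidate, is_valid_repo_name(candidate)))
```

Under this grammar `my--repo` is valid, while a trailing separator, three consecutive underscores, and upper-case letters are all rejected.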

Comment thread src/oci/ref.c Outdated
Comment thread src/oci/fetch.c Outdated
Comment thread src/oci/blob-store.c
return -1;
}

if (link(w->tmp_path, final_path) < 0) {

P2: The commit path is not crash-durable because it never fsyncs the destination directory after linking the blob into place.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/oci/blob-store.c, line 354:

<comment>The commit path is not crash-durable because it never fsyncs the destination directory after linking the blob into place.</comment>

<file context>
@@ -0,0 +1,399 @@
+        return -1;
+    }
+
+    if (link(w->tmp_path, final_path) < 0) {
+        if (errno != EEXIST) {
+            int saved = errno;
</file context>
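
The durability pattern this comment asks for can be sketched as follows, in Python for brevity. It assumes the temp file and the final path live in the same directory; `fsync_dir` and `commit_blob` are hypothetical helper names, not functions from `blob-store.c`.

```python
import os
import tempfile


def fsync_dir(path):
    """Flush a directory so a newly created entry survives a crash."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def commit_blob(tmp_path, final_path):
    """Link a fully written temp file into place, then persist the entry."""
    try:
        os.link(tmp_path, final_path)
    except FileExistsError:
        # Content-addressed store: an existing entry already holds the
        # same bytes, so this is treated as success.
        pass
    os.unlink(tmp_path)
    fsync_dir(os.path.dirname(final_path) or ".")


# Usage: stage a blob, flush its contents, then commit it.
with tempfile.TemporaryDirectory() as store:
    tmp = os.path.join(store, "blob.tmp")
    final = os.path.join(store, "blob")
    with open(tmp, "wb") as f:
        f.write(b"layer bytes")
        f.flush()
        os.fsync(f.fileno())  # file contents first, directory entry second
    commit_blob(tmp, final)
    committed = os.path.exists(final) and not os.path.exists(tmp)
    print("committed:", committed)
```

The same directory flush applies after a `rename`-based commit. If the temp file lives in a different directory than the destination, that directory would need its own flush as well.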

Comment thread src/oci/store.c
*err_msg = "close on pin tmp file failed";
return -1;
}
if (rename(tmp, path) < 0) {

P2: Fsync the pin directory after rename to make tag->digest updates crash-safe; file fsync alone does not persist the directory entry change.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/oci/store.c, line 285:

<comment>Fsync the pin directory after `rename` to make tag->digest updates crash-safe; file fsync alone does not persist the directory entry change.</comment>

<file context>
@@ -0,0 +1,360 @@
+            *err_msg = "close on pin tmp file failed";
+        return -1;
+    }
+    if (rename(tmp, path) < 0) {
+        int saved = errno;
+        unlink(tmp);
</file context>

Comment thread src/oci/manifest.c
Comment thread src/oci/manifest.c
Comment on lines +385 to +405
if (parse_descriptor(desc, &out->layers[out->nlayers], err_msg) < 0)
goto fail;
oci_media_type_t lmt = out->layers[out->nlayers].media_type;
if (!oci_media_type_is_layer(lmt)) {
set_parse_err(err_msg,
"manifest layer has non-layer media type");
goto fail;
}
if (oci_media_type_is_foreign(lmt)) {
set_parse_err(err_msg,
"manifest references foreign (nondistributable) "
"layer; not supported");
goto fail;
}
if (!oci_media_type_is_layer_supported(lmt)) {
set_parse_err(err_msg,
"manifest layer media type is not supported "
"(only tar / tar+gzip / tar+zstd)");
goto fail;
}
out->nlayers++;

P2: Layer descriptor memory is leaked on post-parse validation failures because nlayers is incremented too late.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/oci/manifest.c, line 385:

<comment>Layer descriptor memory is leaked on post-parse validation failures because `nlayers` is incremented too late.</comment>

<file context>
@@ -0,0 +1,707 @@
+            set_parse_err(err_msg, "manifest layer entry is not an object");
+            goto fail;
+        }
+        if (parse_descriptor(desc, &out->layers[out->nlayers], err_msg) < 0)
+            goto fail;
+        oci_media_type_t lmt = out->layers[out->nlayers].media_type;
</file context>
Suggested change
-        if (parse_descriptor(desc, &out->layers[out->nlayers], err_msg) < 0)
-            goto fail;
-        oci_media_type_t lmt = out->layers[out->nlayers].media_type;
-        if (!oci_media_type_is_layer(lmt)) {
-            set_parse_err(err_msg,
-                          "manifest layer has non-layer media type");
-            goto fail;
-        }
-        if (oci_media_type_is_foreign(lmt)) {
-            set_parse_err(err_msg,
-                          "manifest references foreign (nondistributable) "
-                          "layer; not supported");
-            goto fail;
-        }
-        if (!oci_media_type_is_layer_supported(lmt)) {
-            set_parse_err(err_msg,
-                          "manifest layer media type is not supported "
-                          "(only tar / tar+gzip / tar+zstd)");
-            goto fail;
-        }
-        out->nlayers++;
+        oci_descriptor_t *slot = &out->layers[out->nlayers];
+        if (parse_descriptor(desc, slot, err_msg) < 0)
+            goto fail;
+        out->nlayers++;
+        oci_media_type_t lmt = slot->media_type;
+        if (!oci_media_type_is_layer(lmt)) {
+            set_parse_err(err_msg,
+                          "manifest layer has non-layer media type");
+            goto fail;
+        }
+        if (oci_media_type_is_foreign(lmt)) {
+            set_parse_err(err_msg,
+                          "manifest references foreign (nondistributable) "
+                          "layer; not supported");
+            goto fail;
+        }
+        if (!oci_media_type_is_layer_supported(lmt)) {
+            set_parse_err(err_msg,
+                          "manifest layer media type is not supported "
+                          "(only tar / tar+gzip / tar+zstd)");
+            goto fail;
+        }

Comment thread src/oci/manifest.c
*err_msg = type_msg;
return -1;
}
*out = item->valueint;

P2: schemaVersion parsing can accept fractional JSON numbers because valueint is used without an integer round-trip check.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/oci/manifest.c, line 295:

<comment>`schemaVersion` parsing can accept fractional JSON numbers because `valueint` is used without an integer round-trip check.</comment>

<file context>
@@ -0,0 +1,707 @@
+            *err_msg = type_msg;
+        return -1;
+    }
+    *out = item->valueint;
+    return 0;
+}
</file context>
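
The check being requested amounts to rejecting any number whose integer form does not equal the parsed value. cJSON keeps both `valueint` and `valuedouble` for a number, so in C the test would likely compare the two. The Python sketch below shows the same idea with the standard `json` module; `parse_schema_version` is a hypothetical helper, not a function from the patch.

```python
import json


def parse_schema_version(manifest_text):
    """Return schemaVersion as an int, rejecting fractional values."""
    value = json.loads(manifest_text).get("schemaVersion")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool):
        raise ValueError("schemaVersion must be a number")
    if isinstance(value, int):
        return value
    # A float is accepted only if it round-trips through int unchanged.
    if isinstance(value, float) and value.is_integer():
        return int(value)
    raise ValueError("schemaVersion must be an integer")


print(parse_schema_version('{"schemaVersion": 2}'))

try:
    parse_schema_version('{"schemaVersion": 2.5}')
except ValueError as exc:
    print("rejected:", exc)
```

Whether a value written as `2.0` should be accepted is a policy choice; this sketch accepts it because it round-trips exactly.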

@Max042004 Max042004 changed the title from "Add elfuse oci subcommand for pulling and inspecting images" to "Add OCI image support: pull, unpack, run, prune, status, policy" on May 23, 2026

@cubic-dev-ai cubic-dev-ai Bot left a comment

3 issues found across 131 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/oci/media-type.c">

<violation number="1" location="src/oci/media-type.c:100">
P2: Media type parsing is case-sensitive, but media type type/subtype tokens are case-insensitive; valid values with different casing will be misclassified as unknown.</violation>
</file>

<file name="src/oci/ref.c">

<violation number="1" location="src/oci/ref.c:83">
P2: Repository-path validation incorrectly rejects valid names with repeated dashes (for example `my--repo`).</violation>
</file>

<file name="src/oci/fetch.c">

<violation number="1" location="src/oci/fetch.c:782">
P2: Manifest fetch skips bearer-challenge parsing when a token is already cached, so 401 responses from expired/stale tokens are not retried with a refreshed token.</violation>
</file>

<file name="src/oci/blob-store.c">

<violation number="1" location="src/oci/blob-store.c:354">
P2: The commit path is not crash-durable because it never fsyncs the destination directory after linking the blob into place.</violation>
</file>

<file name="src/oci/store.c">

<violation number="1" location="src/oci/store.c:285">
P2: Fsync the pin directory after `rename` to make tag->digest updates crash-safe; file fsync alone does not persist the directory entry change.</violation>
</file>

<file name="src/oci/manifest.c">

<violation number="1" location="src/oci/manifest.c:295">
P2: `schemaVersion` parsing can accept fractional JSON numbers because `valueint` is used without an integer round-trip check.</violation>

<violation number="2" location="src/oci/manifest.c:385">
P2: Layer descriptor memory is leaked on post-parse validation failures because `nlayers` is incremented too late.</violation>
</file>

<file name="docs/usage.md">

<violation number="1" location="docs/usage.md:135">
P2: Contradictory documentation for `--user`. The options table describes it as 'numeric only', but the User and WorkingDir section immediately below describes detailed symbolic-name resolution (accepting symbolic `name`, `name:group`, reading /etc/passwd and /etc/group). These cannot both be correct.</violation>
</file>

<file name="src/oci/inspect.h">

<violation number="1" location="src/oci/inspect.h:57">
P3: The `suppress_layer_reuse` comment is inverted and documents the opposite runtime behavior, which can cause callers to pass the wrong value.</violation>
</file>

<file name="externals/zstd/VENDORING.md">

<violation number="1" location="externals/zstd/VENDORING.md:12">
P3: The file references 'oci-roadmap.md', which does not exist in the codebase. Remove the broken reference or update it to point to the actual document containing the policy commitment.</violation>
</file>

Note: This PR contains a large number of files. cubic only reviews up to 100 files per PR, so some files may not have been reviewed. cubic prioritizes the most important files to review.

Comment thread docs/usage.md
| `-e KEY=VAL`, `--env KEY=VAL` | Set or replace one env var (repeatable) |
| `-e KEY`, `--env KEY` | Import `KEY` from the host environ (repeatable) |
| `-w DIR`, `--workdir DIR` | Override image WorkingDir |
| `-u UID[:GID]`, `--user UID[:GID]` | Override image User (numeric only) |

P2: Contradictory documentation for --user. The options table describes it as 'numeric only', but the User and WorkingDir section immediately below describes detailed symbolic-name resolution (accepting symbolic name, name:group, reading /etc/passwd and /etc/group). These cannot both be correct.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At docs/usage.md, line 135:

<comment>Contradictory documentation for `--user`. The options table describes it as 'numeric only', but the User and WorkingDir section immediately below describes detailed symbolic-name resolution (accepting symbolic `name`, `name:group`, reading /etc/passwd and /etc/group). These cannot both be correct.</comment>

<file context>
@@ -99,6 +99,179 @@ and memory access, and per-thread inspection. Implementation details, including
+| `-e KEY=VAL`, `--env KEY=VAL` | Set or replace one env var (repeatable) |
+| `-e KEY`, `--env KEY` | Import `KEY` from the host environ (repeatable) |
+| `-w DIR`, `--workdir DIR` | Override image WorkingDir |
+| `-u UID[:GID]`, `--user UID[:GID]` | Override image User (numeric only) |
+| `--keep` | Keep the per-run cloned rootfs after exit |
+| `--name NAME` | Reserved: deterministic clone-dir suffix (ignored today) |
</file context>
Suggested change
-| `-u UID[:GID]`, `--user UID[:GID]` | Override image User (numeric only) |
+| `-u UID[:GID]`, `--user UID[:GID]` | Override image User (supports numeric UID[:GID] or symbolic name[:group]) |

Comment thread src/oci/inspect.h
Comment on lines +57 to +62
/* When true (default), render a "layer reuse:" section after the
* manifest layer table. Setting this to false suppresses the section
* entirely (useful for tests that only want to verify the renderer
* baseline without dedup compute side-effects). The CLI never sets
* this to false.
*/

P3: The suppress_layer_reuse comment is inverted and documents the opposite runtime behavior, which can cause callers to pass the wrong value.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/oci/inspect.h, line 57:

<comment>The `suppress_layer_reuse` comment is inverted and documents the opposite runtime behavior, which can cause callers to pass the wrong value.</comment>

<file context>
@@ -45,9 +46,21 @@ typedef struct {
+     * convention. Pure information: dedup metrics never write to disk.
+     */
+    const char *volume_root;
+    /* When true (default), render a "layer reuse:" section after the
+     * manifest layer table. Setting this to false suppresses the section
+     * entirely (useful for tests that only want to verify the renderer
</file context>
Suggested change
-    /* When true (default), render a "layer reuse:" section after the
-     * manifest layer table. Setting this to false suppresses the section
-     * entirely (useful for tests that only want to verify the renderer
-     * baseline without dedup compute side-effects). The CLI never sets
-     * this to false.
-     */
+    /* When false (default), render a "layer reuse:" section after the
+     * manifest layer table. Setting this to true suppresses the section
+     * entirely (useful for tests that only want to verify the renderer
+     * baseline without dedup compute side-effects). The CLI never sets
+     * this to true.
+     */

Comment thread externals/zstd/VENDORING.md


## Why vendored, decode-only

`oci-roadmap.md` Q9 commits the OCI work to hand-rolled C: no Go, no Rust,

P3: The file references 'oci-roadmap.md', which does not exist in the codebase. Remove the broken reference or update it to point to the actual document containing the policy commitment.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At externals/zstd/VENDORING.md, line 12:

<comment>The file references 'oci-roadmap.md', which does not exist in the codebase. Remove the broken reference or update it to point to the actual document containing the policy commitment.</comment>

<file context>
@@ -0,0 +1,72 @@
+
+## Why vendored, decode-only
+
+`oci-roadmap.md` Q9 commits the OCI work to hand-rolled C: no Go, no Rust,
+no `cargo` / `go` in the build matrix. zstd is the only OCI-spec layer
+compression beyond gzip that has wide registry support, and the upstream
</file context>

Contributor

@jserv jserv left a comment

Rebase onto the latest main branch and squash/rework the commits into fewer, cleaner ones.

Max042004 added 4 commits May 23, 2026 22:36
Introduce the `elfuse oci` subcommand surface with the first two
operations needed to retrieve and read an OCI image without leaving
the local store:

- `oci ref` parsing -- host[:port]/repo[:tag|@digest], docker.io
  default-namespace handling
- SHA-256 digester + content-addressable blob store
  (sha256:<hex>/<tail> on-disk layout, tmp+rename commit)
- manifest, image-index, image-config parsers (cJSON-backed)
- HTTPS registry client (libcurl): anonymous fetch + bearer-token
  WWW-Authenticate challenge handling
- private-registry options: basic auth, custom CA bundle,
  loopback-only --insecure
- local pin store + `oci pull` pipeline driving the registry round
  trips (top-level fetch, index recurse, config fetch, layer fetch,
  pin write)
- offline `oci inspect` renderer that walks the local manifest tree
  without touching the network

Vendors externals/cjson (MIT, v1.7.18) for JSON parsing. Wires the
oci/ subdirectory into the build and adds five test-oci-*
native-host unit tests for the new modules.

Turn a pulled image into something `elfuse` can actually execute:

- vendor decode-only zstd v1.5.6 (compression / dictBuilder /
  legacy paths excluded; only oci/decompress.c includes the header)
- tar reader: ustar + GNU long-name records, used by layer apply
- decompression dispatch: gzip via system zlib, zstd via vendored
  decode-only build, dispatched by layer media type
- layer applier with whiteout-aware merge: typeflag '1' (hardlink),
  '2' (symlink), '5' (directory), `.wh.*` markers, symlink-escape
  containment
- per-image sysroot on a case-sensitive APFS sparsebundle
  (hdiutil-provisioned, image_layout v1)
- per-run rootfs via clonefile(2) on top of the sysroot
- `oci unpack` and `oci clone` subcommands that exercise the above
- `oci inspect` extended with the image-config runtime block
  (Entrypoint / Cmd / Env / WorkingDir / User)
- runspec resolver merging image-config defaults with CLI
  Entrypoint / Cmd / Env / WorkingDir / User overrides
- PATH resolver that walks the guest /usr/local/sbin..:/sbin chain
  inside the sysroot (no host PATH leakage)
- `elfuse_launch` extraction from main.c so the elfuse runtime can
  be reused by both legacy ./binary mode and the new `oci run`
- `oci run` subcommand that ties pull -> unpack -> clone -> launch
- `oci-layout` 1.0.0 marker at the store root
- migrate store pins from `refs/<name>` flat files to a single
  `index.json` (OCI image-layout 1.0); auto-migrate on store open
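
The whiteout-aware merge mentioned in the list above depends on recognizing `.wh.*` marker entries. The sketch below classifies a tar entry path using the marker names defined by the OCI image layer format (`.wh.<name>` removes `<name>` from lower layers, `.wh..wh..opq` marks the containing directory as opaque). The enum and function names are hypothetical and the PR's layer applier may structure this differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification; the names are illustrative only. */
enum wh_kind {
    WH_NONE,   /* ordinary entry */
    WH_REMOVE, /* ".wh.<name>": delete <name> from the lower layers */
    WH_OPAQUE  /* ".wh..wh..opq": hide all lower-layer children of the directory */
};

/*
 * Classify a tar entry path by its basename. For WH_REMOVE, *target points
 * at the name to delete (a pointer into `path`); otherwise it is NULL.
 */
static enum wh_kind classify_whiteout(const char *path, const char **target)
{
    assert(path != NULL && target != NULL);
    *target = NULL;

    const char *base = strrchr(path, '/');
    base = base ? base + 1 : path;

    /* Check the opaque marker first: it also starts with ".wh.". */
    if (strcmp(base, ".wh..wh..opq") == 0)
        return WH_OPAQUE;
    if (strncmp(base, ".wh.", 4) == 0 && base[4] != '\0') {
        *target = base + 4;
        return WH_REMOVE;
    }
    return WH_NONE;
}
```

An applier would typically act on the result before extracting anything else from the same directory, for example by unlinking `etc/passwd` when it sees `etc/.wh.passwd`.
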

Round out the store with garbage collection, caching, and a faster
pull path:

- origin sidecar attached to each unpacked image tree so the GC
  walker can attribute layer blobs back to their owning image
- root-set walker that joins image trees to blob digests
- mark-and-sweep `oci prune` with `--older-than` and `--keep-bytes`
- per-layer raw-tar snapshot cache (APFS clonefile) so re-unpacking
  the same layer reuses the previous extracted tree
- ChainID-keyed stack snapshot cache that materializes a full
  layer-stack tree in one clonefile when the chain has been seen
  before
- `layers/` schema marker v2 + auto-migration from legacy v1
  (legacy v1 entries wiped; blobs and image trees untouched)
- raw-tar layer apply mode used to populate the per-layer cache
- unpack orchestrator rewritten on raw + ChainID stack caches
- `oci rebuild-cache` for back-filling stack snapshots on stores
  that were created before the cache existed
- cross-image dedup metrics in `oci inspect` (layer-reuse %, bytes
  saved)
- `oci status` (text + `--json`) summarizing blobs / layers /
  stacks / pinned images
- `oci pull --refresh` to revalidate the top-level manifest against
  the registry without re-downloading unchanged layers
- parallel blob fetch via curl_multi
- HTTP Range resume for partial blob downloads
- per-blob progress callback + TTY / non-TTY renderers
- podman / skopeo-style `policy.json` schema and loader (default,
  per-transport, per-repository rules)
- `policy.json` plumbed into fetch and the `oci pull` CLI
- `registries.d/*` overlay merged with policy (per-registry
  insecure / ca_bundle / auth_file); CLI flags still win
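
The mark-and-sweep idea behind `oci prune` can be illustrated with an in-memory model. This is only a sketch under stated assumptions: `struct blob`, `struct image_root`, and `prune_unreferenced` are hypothetical, nothing is deleted from disk, and the `--older-than` / `--keep-bytes` filters are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory model of the store; not the PR's data structures. */
struct blob {
    const char *digest;
    size_t bytes;
    bool marked;  /* reachable from at least one root */
    bool removed; /* already swept */
};

struct image_root {
    const char *const *digests; /* blobs this image tree refers to */
    size_t count;
};

/* Returns the number of bytes reclaimed by sweeping unreferenced blobs. */
static size_t prune_unreferenced(struct blob *blobs, size_t n_blobs,
                                 const struct image_root *roots, size_t n_roots)
{
    assert(blobs != NULL || n_blobs == 0);
    assert(roots != NULL || n_roots == 0);

    /* Mark phase: every digest reachable from a root keeps its blob alive. */
    for (size_t b = 0; b < n_blobs; b++)
        blobs[b].marked = false;
    for (size_t r = 0; r < n_roots; r++)
        for (size_t d = 0; d < roots[r].count; d++)
            for (size_t b = 0; b < n_blobs; b++)
                if (strcmp(blobs[b].digest, roots[r].digests[d]) == 0)
                    blobs[b].marked = true;

    /* Sweep phase: anything left unmarked is garbage. */
    size_t freed = 0;
    for (size_t b = 0; b < n_blobs; b++) {
        if (!blobs[b].marked && !blobs[b].removed) {
            blobs[b].removed = true;
            freed += blobs[b].bytes;
        }
    }
    return freed;
}
```

A real implementation would derive the roots from the pinned images and unpacked image trees, and would unlink blob files during the sweep instead of setting a flag.
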

Make `oci run` work against real public images (alpine, busybox,
python, ruby, debian) and lock the surface down with end-to-end
fixtures.

Runtime surface:

- writable clone-rootfs DoD: the per-run rootfs is writable
  out of the box, so guests that mutate /tmp, /var, /run work
  unchanged
- runtime files injection: /etc/resolv.conf, /etc/hosts,
  /etc/hostname populated from the host into the clone-rootfs
- /dev/full and /dev/console emulation in the syscall layer
- /proc surface: cgroup, hostname, comm, statm entries that
  glibc startup and procps tooling read
- image-config `User` symbolic resolution: name and name:group
  forms looked up against the guest /etc/passwd and /etc/group
  before falling back to numeric
- `oci run` walks the image index to the linux/arm64 leaf manifest
  (Phase 3 fix; previously fed the top-level index to the
  config-loader and crashed on multi-arch images)
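
The symbolic `User` resolution described in the list above might look like the following sketch, which resolves `name`, `name:group`, and numeric forms against the text of the guest's `/etc/passwd` and `/etc/group`. All function names are hypothetical. Two behaviors are assumptions rather than facts from the PR: a symbolic user with no explicit group takes its primary gid from the passwd entry, and a purely numeric uid with no passwd entry gets gid 0.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Accept only non-empty, all-digit strings. */
static int parse_number(const char *s, unsigned long *out)
{
    if (*s == '\0')
        return -1;
    for (const char *p = s; *p; p++)
        if (!isdigit((unsigned char)*p))
            return -1;
    *out = strtoul(s, NULL, 10);
    return 0;
}

/* Find the line starting with "name:" and parse its numeric field `field`. */
static int db_lookup(const char *db, const char *name, int field,
                     unsigned long *out)
{
    size_t nlen = strlen(name);
    if (nlen == 0)
        return -1;
    for (const char *line = db; line && *line;) {
        const char *eol = strchr(line, '\n');
        if (strncmp(line, name, nlen) == 0 && line[nlen] == ':') {
            const char *p = line;
            for (int i = 0; i < field; i++) {
                p = strchr(p, ':');
                if (!p || (eol && p > eol))
                    return -1;
                p++;
            }
            if (!isdigit((unsigned char)*p))
                return -1;
            *out = strtoul(p, NULL, 10);
            return 0;
        }
        line = eol ? eol + 1 : NULL;
    }
    return -1;
}

/* Resolve "user[:group]" using passwd/group file contents from the guest. */
static int resolve_user(const char *spec, const char *passwd, const char *group,
                        unsigned long *uid, unsigned long *gid)
{
    assert(spec && passwd && group && uid && gid);
    char user[64];
    const char *colon = strchr(spec, ':');
    size_t ulen = colon ? (size_t)(colon - spec) : strlen(spec);
    if (ulen == 0 || ulen >= sizeof(user))
        return -1;
    memcpy(user, spec, ulen);
    user[ulen] = '\0';

    /* Symbolic lookup first, numeric fallback second. */
    if (db_lookup(passwd, user, 2, uid) == 0) {
        if (db_lookup(passwd, user, 3, gid) != 0)
            return -1;
    } else if (parse_number(user, uid) == 0) {
        *gid = 0; /* assumption: numeric uid without a passwd entry gets gid 0 */
    } else {
        return -1;
    }

    if (colon) {
        const char *grp = colon + 1;
        if (db_lookup(group, grp, 2, gid) != 0 && parse_number(grp, gid) != 0)
            return -1;
    }
    return 0;
}
```

The important design point is that both databases are read from inside the sysroot, so host accounts never influence the guest identity.
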

Bug fixes uncovered by cold-cache runs:

- layer apply no longer rejects the root tar entry "./"
- unpack stages files via copyfile(2) with COPYFILE_CLONE fallback
  so cross-volume unpack (store on internal SSD, sysroot on the
  APFS sparsebundle) succeeds
- tar reader handles PAX 'x' / 'g' extended-header `path` and
  `linkpath` records (busybox and python:alpine layers use them)
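
For readers unfamiliar with PAX extended headers: the payload of an 'x' or 'g' entry is a sequence of records of the form `<len> <key>=<value>\n`, where `<len>` is the decimal length of the whole record including the length digits and the trailing newline. The sketch below extracts one key such as `path` or `linkpath` from such a payload. `pax_get` is a hypothetical helper and is not taken from the PR's tar reader.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Look up `key` in a PAX extended-header payload of `len` bytes.
 * Copies the value into `out` and returns 0, or returns -1 when the key is
 * absent, the payload is malformed, or the value does not fit.
 */
static int pax_get(const char *data, size_t len, const char *key,
                   char *out, size_t out_sz)
{
    assert(data != NULL && key != NULL && out != NULL && out_sz > 0);
    size_t klen = strlen(key);
    size_t off = 0;
    int found = -1;

    while (off < len) {
        /* Parse the decimal record length. */
        size_t rec = 0, i = off;
        while (i < len && data[i] >= '0' && data[i] <= '9') {
            rec = rec * 10 + (size_t)(data[i] - '0');
            if (rec > len)
                return -1; /* longer than the payload: malformed */
            i++;
        }
        if (i == off || i >= len || data[i] != ' ')
            return -1;
        /* The record must hold digits, a space, and a final newline. */
        if (rec < (i - off) + 2 || rec > len - off ||
            data[off + rec - 1] != '\n')
            return -1;

        const char *kv = data + i + 1;
        size_t kv_len = off + rec - 1 - (i + 1); /* without the newline */
        if (kv_len > klen && memcmp(kv, key, klen) == 0 && kv[klen] == '=') {
            size_t vlen = kv_len - klen - 1;
            if (vlen >= out_sz)
                return -1;
            memcpy(out, kv + klen + 1, vlen);
            out[vlen] = '\0';
            found = 0; /* a later record overrides an earlier one */
        }
        off += rec;
    }
    return found;
}
```

When a `path` or `linkpath` record is present, a reader applies it in place of the name stored in the ustar header fields, which is how entries with names too long for those fields are represented.
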

Compat tests:

- `tests/test-oci-compat.sh` shell smoke (in-tree fixtures)
- `OCI_COMPAT_TEST=1` heavy mode that provisions a scratch
  sparsebundle and drives three fixtures end-to-end:
  alpine-shaped, busybox-shaped hardlink dispatch, two-layer
  whiteout
- `OCI_FETCH_ONLINE=1` alpine:3 end-to-end smoke (opt-in;
  requires network)

Setting `ELFUSE_OCI_PROGRESS=plain` in the environment disables the
in-place CSI redraw of pull progress for terminals that don't honor
cursor-up escapes (the issue surfaced on legacy Terminal.app panes).

Documentation: `docs/oci.md` Phase 4 runtime surface and
libc-adjacent envelope notes (what guests can / can't expect
from the synthetic /etc, /dev, /proc).
@sysprog21 sysprog21 deleted a comment from cubic-dev-ai Bot May 23, 2026
Contributor

@jserv jserv left a comment

Refine per review messages from cubic.

@cubic-dev-ai cubic-dev-ai Bot left a comment

3 issues found across 144 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="externals/cjson/VENDORING.md">

<violation number="1" location="externals/cjson/VENDORING.md:1">
P3: Incorrect release date for cJSON v1.7.18. The upstream release date is 2024-05-13, not 2024-07-30. This is a documentation inaccuracy.</violation>
</file>

<file name="src/oci/volume-list.c">

<violation number="1" location="src/oci/volume-list.c:114">
P2: Handle `readdir()` errors explicitly; otherwise a directory read failure is silently treated as EOF and can return an incomplete volume list.</violation>
</file>

<file name="src/oci/tar.h">

<violation number="1" location="src/oci/tar.h:40">
P2: Expose borrowed tar entry strings as `const char *` to prevent callers from mutating reader-owned memory.</violation>
</file>

Note: This PR contains a large number of files. cubic only reviews up to 100 files per PR, so some files may not have been reviewed. cubic prioritizes the most important files to review.

Comment thread src/oci/volume-list.c
size_t cap = 0;
struct dirent *de;
int rc = 0;
while ((de = readdir(dp)) != NULL) {

P2: Handle readdir() errors explicitly; otherwise a directory read failure is silently treated as EOF and can return an incomplete volume list.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/oci/volume-list.c, line 114:

<comment>Handle `readdir()` errors explicitly; otherwise a directory read failure is silently treated as EOF and can return an incomplete volume list.</comment>

<file context>
@@ -0,0 +1,161 @@
+    size_t cap = 0;
+    struct dirent *de;
+    int rc = 0;
+    while ((de = readdir(dp)) != NULL) {
+        const char *name = de->d_name;
+        if (!name_is_sha256_dir(name))
</file context>
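
One common way to address this kind of finding is the standard POSIX idiom of clearing `errno` before each `readdir()` call and inspecting it once the call returns `NULL`. The sketch below applies that idiom to a hypothetical `count_entries` helper. It is not the code from `src/oci/volume-list.c`, and the fix that lands in the PR may differ.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <string.h>

/*
 * Count directory entries other than "." and "..".
 * Returns the count, or -1 with errno set if the directory cannot be
 * opened or a read error interrupts the scan.
 */
static long count_entries(const char *dirpath)
{
    assert(dirpath != NULL);
    DIR *dp = opendir(dirpath);
    if (!dp)
        return -1;

    long n = 0;
    for (;;) {
        errno = 0; /* readdir() leaves errno untouched at end of directory */
        struct dirent *de = readdir(dp);
        if (!de)
            break;
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        n++;
    }

    int saved = errno; /* non-zero here means a read error, not EOF */
    closedir(dp);
    if (saved != 0) {
        errno = saved;
        return -1;
    }
    return n;
}
```

Saving `errno` before `closedir()` matters because `closedir()` can itself modify it, which would otherwise hide the original failure.
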

Comment thread src/oci/tar.h
* the next oci_tar_next call. Callers that need to keep either past
* the next iteration must duplicate the strings themselves.
*/
char *path;

P2: Expose borrowed tar entry strings as const char * to prevent callers from mutating reader-owned memory.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/oci/tar.h, line 40:

<comment>Expose borrowed tar entry strings as `const char *` to prevent callers from mutating reader-owned memory.</comment>

<file context>
@@ -0,0 +1,87 @@
+     * the next oci_tar_next call. Callers that need to keep either past
+     * the next iteration must duplicate the strings themselves.
+     */
+    char *path;
+    char *linkname;
+    uint64_t size;
</file context>
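
A small sketch of what the suggestion amounts to: the borrowed strings are exposed through `const char *`, and a caller that needs a string beyond the current iteration makes its own copy. The field names follow the quoted header, but `struct entry_view` and `entry_dup_path` are hypothetical and are not part of `src/oci/tar.h`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical read-only view of a tar entry. */
struct entry_view {
    const char *path;     /* borrowed: valid only until the next iteration */
    const char *linkname; /* borrowed; may be NULL */
    uint64_t size;
};

/* Return a heap copy that the caller owns and must free(), or NULL. */
static char *entry_dup_path(const struct entry_view *e)
{
    assert(e != NULL && e->path != NULL);
    size_t n = strlen(e->path) + 1;
    char *copy = malloc(n);
    if (copy)
        memcpy(copy, e->path, n);
    return copy;
}
```

With `const` on the fields, an accidental write such as `e->path[0] = '\0'` is rejected at compile time, while the reader remains free to reuse its internal buffer.
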

@@ -0,0 +1,35 @@
# Vendored cJSON

P3: Incorrect release date for cJSON v1.7.18. The upstream release date is 2024-05-13, not 2024-07-30. This is a documentation inaccuracy.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At externals/cjson/VENDORING.md:

<comment>Incorrect release date for cJSON v1.7.18. The upstream release date is 2024-05-13, not 2024-07-30. This is a documentation inaccuracy.</comment>

<file context>
@@ -0,0 +1,35 @@
+# Vendored cJSON
+
+This directory contains a vendored copy of [cJSON](https://github.com/DaveGamble/cJSON),
+the ultralightweight JSON parser written in ANSI C. cJSON ships as a single
+`.c` / `.h` pair and is dual-licensed under the MIT license (see `LICENSE`).
+
+## Why vendored
+
+`oci-roadmap.md` Q9 commits Phase 1 to hand-rolled C alongside the existing
</file context>

