fix(rest): cp box->host extracts single file to dest (POL-37)#633
fix(rest): cp box->host extracts single file to dest (POL-37)#633G4614 wants to merge 4 commits into
Conversation
… dir (POL-37) The REST copy_out extracted the response tar with archive.unpack(host_dst) unconditionally — i.e. it always treated host_dst as a target directory. For `boxlite cp box:/etc/hosts ./hosts` the server's tar carries the parent path entries (`./`, `./etc/`) along with `./etc/hosts`, so the client built `./hosts/etc/hosts` instead of writing the file's content to `./hosts`. The user reported this as "writes raw tar instead of the file." Now the client counts regular file entries (ignoring the synthetic directory entries the server interleaves) and, when the tar has exactly one regular file and the dest doesn't look like a directory, extracts that one entry's body to host_dst — matching `docker cp`'s single-file UX. Multi-file and trailing-slash cases keep the previous "unpack into directory" path with proper overwrite-refusal added. Also threads include_parent / follow_symlinks through the query string so a future server-side fix that honors them sends leaner tars that the existing single-file branch will pick up automatically. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
f56bb75 to
e95a4ec
Compare
…cker
The first cut of POL-37 inlined an extract_tar_bytes_to_path in
src/boxlite/src/rest/litebox.rs that paralleled
src/shared/src/tar.rs::unpack (the helper the local copy_out path
already uses via box_impl.rs:582). That meant two detection rules,
two extraction loops, two test suites, and two paths a future fix
has to remember to update. The PR description claimed "the client
still delegates to the shared tar unpacker and inherits its
detection logic" — it didn't.
This change makes that statement true.
Changes:
- shared::tar gains `unpack_bytes(bytes, dest, opts)` as the
sibling of the existing path-based `unpack`. Both go through a
new private `unpack_archive_blocking` helper that takes a
re-openable reader factory (`File::open(&path)` for unpack,
`Cursor::new(bytes.as_slice())` for unpack_bytes), so neither
public entry has to duplicate the FileToFile / IntoDirectory
branches.
- `detect_extraction_mode` now skips entries whose type is
`is_dir()` when counting. The whole reason the REST handler
needed its own detector was that shared counted every entry —
the docker-cp `./` + `./etc/` + `./etc/hosts` tar tripped
`count > 1` and got routed to IntoDirectory mode. Skipping
pure-directory entries lands the single regular file at the
user-supplied destination, matching docker semantics. Other
non-content entries (Symlink, hardlink Link, sparse, char/block,
fifo) DO still count toward `content_count`, so a tar carrying
Regular + Symlink stays in directory mode and the symlink is
not silently dropped — covering the previously-latent multi-
entry footgun the REST-only detector had.
- The FileToFile extraction path likewise tolerates synthetic
directory entries: it walks past `is_dir()` headers before
unpacking the first content entry. Confirmed via the detection
pass that there is exactly one such entry, so there's no
ambiguity about which one we land at `dest`.
- REST `litebox.rs::copy_out` becomes a 7-line
`boxlite_shared::tar::unpack_bytes(bytes.to_vec(),
host_dst.to_path_buf(), UnpackContext { overwrite, mkdir_parents:
true, force_directory: false }).await`. The 100-line inline
helper, its stale "Extract a tar archive to a host directory"
doc, and the unreachable "tar promised a single regular file
but the second pass found none" branch all go away.
- The four inline `tar_extract_tests` (single-file-to-dest,
trailing slash, multi-file, overwrite refusal) move to shared
as `unpack_bytes_*` tests where they actually exercise the
helper, plus a new `unpack_bytes_regular_plus_symlink_uses_
dir_mode_not_file_mode` to pin the symlink-not-dropped
behavior the old REST-side detector could not have caught.
Net diff for tests is positive — the shared coverage strictly
supersets the deleted inline coverage.
Local copy_out (src/boxlite/src/litebox/box_impl.rs:582) inherits
the same detection refinement automatically. shared::tar::pack
doesn't emit synthetic ancestor-dir entries for single-file packs
(verified via `append_path_with_name` at shared/src/tar.rs:76), so
local behavior is unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
The Codecov patch report on boxlite-ai#633 flagged 10 uncovered lines in shared/tar.rs; most were `?` propagation paths in the new `unpack_archive_blocking` core that fire only when the `open` closure returns `Err`. The path-based `unpack` is the only entry point whose closure can actually fail (the `unpack_bytes` Cursor never errors), so one test against a nonexistent tar path covers them all: - the `map_err` closure body (formatting the named error) - the `?` after `detect_extraction_mode` (line 161) - the `?` inside `detect_extraction_mode` calling `open()?` (line 277) Asserting on the error *message* (rather than just `is_err`) is the load-bearing piece: callers that don't control the tar location (e.g. local `copy_out` reading a tar staged by a peer process) need the operation name AND the missing path so the operator can act, not a generic spawn_blocking traceback. Patch coverage 96.0% → 96.5%. The remaining 5 uncovered lines are `tokio::JoinError` map_err arms (124 / 142 — spawn_blocking panic, not reachable without panic injection), the symmetric second-pass `open()?` failure inside FileToFile / IntoDirectory branches (188 / 238 — a path-side race or a Cursor that can't fail), and the malformed-tar-entry `continue` arm (282 — covering it ties the test to the `tar` crate's specific header-decode failure modes). Those are defensive paths that would require either panic injection or brittle malformed-tar fixtures to hit; covering them would trade one regression-guard for several Codecov-driven near-stubs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughThis PR refactors tar extraction into a shared, generic routine that accepts a configurable reader opener, enabling consistent extraction behavior across on-disk and in-memory archives. REST copy_out now threads CopyOptions through the download step and delegates to the shared unpack_bytes routine, replacing local tar handling. ChangesShared tar extraction and REST integration
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
gamnaansong seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/shared/src/tar.rs (1)
215-237:⚠️ Potential issue | 🟠 Major | ⚡ Quick winLogic error:
overwrite=falsecheck fires after creating the directory withmkdir_parents.When
destdoesn't exist,mkdir_parents=truecreates it at line 218. Then line 232 checksdest.exists() && !opts.overwrite, which is now true — causing an error for a directory we just created. This breaks themkdir_parents + overwrite=falsecombination forIntoDirectorymode.The check should occur before potentially creating the directory:
🐛 Proposed fix: reorder the overwrite check
ExtractionMode::IntoDirectory => { + if dest.exists() && !opts.overwrite { + return Err(BoxliteError::Storage(format!( + "destination {} exists and overwrite=false", + dest.display() + ))); + } if !dest.exists() { if opts.mkdir_parents { std::fs::create_dir_all(dest).map_err(|e| { BoxliteError::Storage(format!( "failed to create destination {}: {}", dest.display(), e )) })?; } else { return Err(BoxliteError::Storage(format!( "destination {} does not exist", dest.display() ))); } } - if dest.exists() && !opts.overwrite { - return Err(BoxliteError::Storage(format!( - "destination {} exists and overwrite=false", - dest.display() - ))); - } let mut archive = tar::Archive::new(open()?);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/shared/src/tar.rs` around lines 215 - 237, The overwrite check is done after creating the destination when handling ExtractionMode::IntoDirectory, causing mkdir_parents=true to create dest then fail if opts.overwrite is false; move the check that returns an error when dest.exists() && !opts.overwrite to occur before any call to std::fs::create_dir_all(dest) so that the code first refuses to proceed when an existing destination disallows overwrite (refer to ExtractionMode::IntoDirectory, dest, opts.mkdir_parents, opts.overwrite and the std::fs::create_dir_all call).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/shared/src/tar.rs`:
- Around line 215-237: The overwrite check is done after creating the
destination when handling ExtractionMode::IntoDirectory, causing
mkdir_parents=true to create dest then fail if opts.overwrite is false; move the
check that returns an error when dest.exists() && !opts.overwrite to occur
before any call to std::fs::create_dir_all(dest) so that the code first refuses
to proceed when an existing destination disallows overwrite (refer to
ExtractionMode::IntoDirectory, dest, opts.mkdir_parents, opts.overwrite and the
std::fs::create_dir_all call).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 0b518b0e-ae7b-40af-af80-9c2d0f0f9985
📒 Files selected for processing (2)
src/boxlite/src/rest/litebox.rssrc/shared/src/tar.rs
when REST
copy_outa regular file from box to outside, a dir created in host because of the tar logicTest plan
cargo test -p boxlite --lib --features rest tar_extract— 4 unit tests, in-memory tar construction:cp box:/etc/hosts ./hosts(tar =./+./etc/+./etc/hosts)./hosts/etc/hosts(directory tree)./hostsis a regular file, content =/etc/hostsfrom the boxcp box:/foo ./out/(trailing slash)cp box:/dir ./dir-copy(multi-file)--no-overwrite+ dst existsexists and overwrite=false, pre-existing content preservedTwo-side: temporarily restore the old
archive.unpackbody forextract_tar_bytes_to_path→single_regular_file_lands_at_dest_pathfails withdest must be a regular file, not a directory treeandoverwrite_false_refuses_existing_destfails withexpected overwrite-refusal error, got: failed to unpack— those are exactly the two POL-37 symptoms. Restore the fix: 4/4 pass.Summary by CodeRabbit
Bug Fixes
Tests