worker: fix two silent output-corruption bugs (absolute symlinks + zero-digest files)#2346
Merged
palfrey merged 6 commits intoMay 20, 2026
Conversation
…argets When the worker's work directory is populated via DirectoryCache, output paths can be absolute symlinks pointing into the cache directory (/var/.../directory_cache/...). The previous output collection code uploaded these as raw SymlinkInfo with absolute targets that are meaningless on the Bazel client, causing "No such file or directory" errors when the client materialised the action result. Detect absolute symlinks in output paths and resolve them: upload directory contents as Tree protos and file contents as regular files. Relative symlinks (intentionally created by the action) are still preserved as symlinks. Updates upload_dir_and_symlink_test to use a relative symlink (the previous /dev/null absolute symlink is now resolved, breaking the old assertion) and adds a new upload_absolute_symlink_resolves_contents regression test that verifies an out-of-tree payload reachable via an absolute symlink lands in CAS as a file. Ported from upstream TraceMachina/nativelink@0807153 (PR TraceMachina#2243). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
FilesystemStore::get_file_entry_for_digest previously returned a synthetic FileEntry for zero-digest blobs pointing to a content_path that never exists on disk (FilesystemStore deliberately never persists zero-byte files). Downstream worker output-materialisation code that took the prefetched hardlink path would try to hard_link from this non-existent source, fail, and either silently produce a missing file or fall back without enough context to diagnose the failure. This is the OUTPUT-materialisation companion to PR TraceMachina#2338's DirectoryCache zero-byte fix, which lives on the input-fetch path. The two changes are complementary: TraceMachina#2338 covers the directory cache short-circuit; this covers the FilesystemStore API and the running_actions_manager fallback. Changes: - FilesystemStore::get_file_entry_for_digest now returns Code::NotFound for zero digests instead of a synthetic FileEntry, forcing callers to materialise empty files via fs::create_file rather than hard_linking from a phantom source. - running_actions_manager::download_to_directory adds err_tip context on the zero-digest fs::create_file / write_all so a failure here is attributable to the empty-file path. - Updates the existing get_file_entry_for_zero_digest test to assert the new NotFound behaviour and renames it to get_file_entry_for_zero_digest_returns_not_found. - Adds download_to_directory_zero_digest_empty_file_test that exercises an empty file declared inside an input directory and verifies it materialises on disk with zero bytes. This test intentionally does NOT collide with PR TraceMachina#2338's DirectoryCache zero-byte test (which validates a different short-circuit path). Ported from upstream TraceMachina/nativelink@19d7e20 (PR TraceMachina#2243). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Audit of the two ported commits (e57b675b, b1dc3a1c) flagged two
correctness-relevant gaps. Both are now closed with strict assertions so
the underlying bugs cannot silently regress.
1. upload_absolute_symlink_to_directory_uploads_tree
The original e57b675b regression test covers absolute-symlink-to-file
but not absolute-symlink-to-directory, which exercises an entirely
separate code path (upload_directory + Tree proto serialisation). The
new test creates an out-of-tree directory containing inner.txt,
absolute-symlinks it into the work dir, runs the action, and:
- asserts output_directory_symlinks and output_file_symlinks are
BOTH empty (the symlink must NOT be preserved),
- asserts output_folders contains exactly one entry at the symlink
path, proving Tree-upload happened,
- walks the uploaded Tree, locates inner.txt, fetches its blob from
CAS and verifies the content. This proves the directory was
actually traversed and uploaded — not stubbed.
2. download_to_directory_zero_digest_empty_file_test (strengthened)
Extended the existing single-file test to cover:
- a second sibling zero-digest file (proves the path is not a
single-file fluke),
- a zero-digest file nested inside a subdirectory (proves the
recursive download_to_directory caller also handles NotFound from
get_file_entry_for_digest — not just the top-level invocation),
- strict per-file assertions: is_file, !is_symlink, len == 0, and
a read-back that confirms the file is actually empty rather than
a phantom dirent.
Verdict on the dropped checks:
- Dangling absolute symlink: source returns OutputType::None which is
the correct graceful behaviour, but writing a test for it requires
fabricating a broken symlink mid-action and is comparatively low
value; left uncovered intentionally.
- File mode on zero-digest materialisation: the worker uses
fs::create_file with no explicit mode, so the result follows umask
and would be brittle across CI environments. Length + type
assertions are the stronger invariant.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
palfrey
requested changes
May 19, 2026
Two fixes for the new tests in upload_dir_and_symlink_test and the absolute-symlink tests: 1. Restore the absolute-symlink case in upload_dir_and_symlink_test. The previous commit replaced `empty_sym -> /dev/null` (absolute) with `rel_sym -> dir1/file` (relative), dropping coverage for the absolute branch in that integration test. Re-add `empty_sym` pointing to an out-of-tree zero-byte payload we create explicitly. This exercises both the relative-symlink-preserved path (via `rel_sym`) and the new absolute-symlink-resolved path (via `empty_sym`) without relying on `/dev/null`'s character-device semantics. `empty_sym` now appears in `output_files` (resolved, zero bytes, sha256 = e3b0c44...) rather than `output_file_symlinks`. 2. Replace the factually-wrong "Windows does not support symlinks" comments on the two new tests with an accurate explanation: Windows supports symlinks via std::os::windows::fs::symlink_file / symlink_dir, but the fixtures use `ln -s`, which is the Unix-only reason these tests are gated to non-Windows targets.
…_test Previous iterations of this PR sidestepped `/dev/null` by swapping the absolute-symlink fixture for either a relative symlink (lost coverage) or an out-of-tree real empty file (added scaffolding to dodge a corner case we should just test directly). Restore the original `ln -s /dev/null empty_sym` fixture. The post-fix worker code resolves the absolute symlink via `fs::metadata` (follows), sees a character device (`is_dir() == false`), and falls into the "upload as file" branch. `upload_file` opens `/dev/null` and reads to EOF — `/dev/null`'s contract is that reads return 0 bytes immediately — producing the canonical sha256 empty digest (e3b0c44…) with size 0. That's well-defined, harmless behavior and a real-world Bazel pattern (rules sometimes use `ln -sf /dev/null x` to create empty outputs). Test now locks in this contract directly, no scaffolding required. Drops the `external_root` / `external_file` setup since it was only needed to avoid `/dev/null`.
palfrey
approved these changes
May 20, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Two correctness fixes on the worker's output-upload and input-materialization paths. Both are silent-corruption bugs — actions complete "successfully" but the output artifacts the client receives are wrong or missing.
What's fixed
Absolute symlinks in output upload (
running_actions_manager.rs::inner_upload_results)When a Bazel action emits an absolute symlink as one of its declared outputs (e.g.
output_link → /usr/lib/libfoo.soor→ /tmp/nativelink/cache/...), the pre-fix worker uploads the literal target path string to the AC. The client (or a different worker that downloads the AC result) follows that symlink on its own filesystem and finds nothing — the worker-local path doesn't exist on the requester.The fix branches on
Path::is_absolute()in the symlink handler. Absolute targets are resolved — directory targets get uploaded as Tree protos viaupload_directory+serialize_and_upload_message, file targets viaupload_file. Relative symlinks fall through to the existingupload_symlinkcodepath unchanged (relative symlinks travel correctly with the tree).This aligns the actual worker behavior with our advertised
symlink_absolute_path_strategy: Disallowedin CacheCapabilities.Zero-digest files missing from worker exec dirs (
FilesystemStore::get_file_entry_for_digest)The pre-fix function returned a synthetic
FileEntryfor the zero digest pointing at a content path that never exists on disk (zero-byte files aren't stored — nothing to store). Downstream hardlink calls failed silently — the zero-byte file went missing from the action's output tree. Swift's incremental compile produces empty.ofiles routinely; same for.swiftdocplaceholders.Fix: return
Code::NotFound. Callers must take the explicit empty-file path (fs::create_file+write_all(&[])). Companion change indownload_to_directorywraps the zero-digest branch witherr_tipso any future failures are diagnosable rather than silent.Provenance
Equivalent to upstream commits
08071533band19d7e2052from #2243, ported atomically to current main. A third upstream commit (cf801c2b, fixspawn_upload_to_remotemissing output dir blobs) was intentionally not ported because it patches a function that doesn't exist on main; that bug is latent until the deferred fast/slow upload split lands.Non-overlap with #2338
PR #2338 (
8051ca9e) fixed zero-byte handling in the DirectoryCache (input-fetch materialization path). This PR fixes the related bug in FilesystemStore + running_actions_manager (output-materialization + worker exec-dir path). Different stores, different code paths, different test files.Note on backward compatibility
Not a breaking change in the semver sense (no API surface change, no config surface change). However, the wire shape of
ActionResultchanges for actions that produce absolute symlinks: those entries now appear inoutput_files/output_folders(resolved content) rather thanoutput_file_symlinks(worker-local target). This brings actual behavior into compliance with our advertisedsymlink_absolute_path_strategy: Disallowed. Clients depending on the prior (contract-violating) behavior will see different output entries; the byte semantics are equivalent.Type of change
How Has This Been Tested?
cargo build -p nativelink-store -p nativelink-workercleancargo test -p nativelink-store --test filesystem_store_test— 22 passedcargo test -p nativelink-worker --test running_actions_manager_test— 33 passedcargo clippy -p nativelink-store --lib --tests -- -D warningsclean (2 pre-existing clippy errors on main in newly-mergedr2_store.rsconfirmed reproducible on plainorigin/main)cargo clippy -p nativelink-worker --lib -- -D warningscleancargo fmt --checkclean for both packagesNew tests:
upload_absolute_symlink_resolves_contents— symlink-to-file resolution branchupload_absolute_symlink_to_directory_uploads_tree— symlink-to-directory branch, asserts Tree proto upload + nested blob in CASdownload_to_directory_zero_digest_empty_file_test— extended with sibling + nested-in-subdir zero-digest files to exercise the recursive caller pathget_file_entry_for_zero_digest_returns_not_found— FilesystemStore API contractUpdated
upload_dir_and_symlink_test— retains theln -s /dev/null empty_symfixture; assertion updated to expect the now-resolved-to-empty-file behavior (canonical empty digeste3b0c44.../0).Checklist
bazel test //...passes locally (verified viacargoonly; bazel-aspect clippy clean for changed files per grep againstclippy.tomldisallowed-methods)This change is