Skip to content

Commit bff444d

Browse files
MarcusSorealheiserneestocclaude
authored
Bugfix race condition from combined mac opt #2362 (#2366)
* util: make permission walks symlink-safe The recursive permission walk `set_perms_recursive_impl` (driving both `set_readonly_recursive` and `set_dir_writable_recursive`) used `fs::metadata` (stat), which follows symlinks. On input trees containing symlinks - e.g. `.venv/bin/python3` produced by rules_python / rules_apple venv tooling - this had two failure modes: * A symlink to a directory reported `is_dir() == true`, so the walk recursed *through* the link, escaping the materialized tree or descending into an unrelated directory. * A symlink was passed to `set_permissions`; `chmod` follows symlinks, so it mutated the link's target. When the target did not exist (a dangling link - common when a venv points outside the action's input set) the `chmod` returned ENOENT and failed the entire walk. That ENOENT failure surfaced as `set_readonly_recursive` erroring inside `DirectoryCache::get_or_create`, which made `prepare_action_inputs` log "Directory cache failed, falling back to traditional download" and take the slow `download_to_directory` path. Fix: `set_perms_recursive_impl` now uses `symlink_metadata` (lstat) and returns early on symlink entries - it never chmods a symlink and never recurses through one. Regular files keep their existing read-only (0o555) treatment, so the CAS-hardlinked-inode hermeticity contract (PR #2347) is unchanged. `hardlink_directory_tree_recursive` already recreated symlinks as symlinks; its symlink branch is reordered ahead of the `is_dir()` / `is_file()` branches to make the symlink-first intent explicit and robust. Adds regression tests covering set-readonly, set-dir-writable, and hardlink/clone walks over a tree containing a symlink to an in-tree file, a dangling relative symlink, and a symlink to an in-tree directory, asserting each walk succeeds and the symlinks are preserved with their targets intact. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * worker: make directory-cache entries already-writable `DirectoryCache` locks each cache entry down with `set_readonly_recursive` after construction. Previously that helper made the entire entry tree mode 0o555 — directories included — so every materialization had to follow up with a separate `set_dir_writable_recursive` recursive chmod walk in `prepare_action_inputs` to re-add write permission to directories (Bazel actions declare outputs at paths nested inside input subdirectories). That post-walk is redundant work. Directories are not hardlink-shared between cache entries — only file content inodes are — so directory mode can safely be made writable once, at the cache entry, instead of on every materialization. `set_readonly_recursive` now locks a tree down as a cache entry by making only FILES read-only (0o555) and leaving DIRECTORIES writable (0o755). Both materialization paths then produce a directly-usable tree: - macOS `clonefile(2)` copies the source's modes verbatim, so the clone's directories are writable and its files read-only. - The Linux per-file hardlink walk creates fresh directories (writable) and hardlinks files (which keep the source inode's read-only mode). Files stay read-only on both paths, so the hermeticity contract and the CAS-hardlink shared-inode invariant (PR #2347) are preserved. With the materialized tree already correct, the `set_dir_writable_recursive` call is removed from `prepare_action_inputs`. `set_dir_writable_recursive` itself is unchanged and still used by the cache eviction cleanup path. Tests: - fs_util: `test_set_readonly_recursive` now also asserts directories stay writable; the macOS clonefile tests assert cloned subdirs are writable and that a nested output can be created with no `set_dir_writable_recursive` walk; `test_set_dir_writable_recursive_walks_nested_dirs` keeps covering the eviction-cleanup helper. - directory_cache: new `test_materialized_tree_dirs_writable_files_readonly` builds a nested tree and asserts that, after `get_or_create` on both the fresh-materialize and cache-hit paths, every directory is writable and every file is read-only, with no separate chmod walk. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * worker: hardlink CAS blobs in directory-cache construct `DirectoryCache::construct_directory` previously materialized every file by fetching the whole blob into RAM (`get_part_unchunked`) and writing a full copy (`fs::write`). For a cache that exists to avoid re-fetching from the CAS, this is the dominant cost on a miss. Switch the cache-entry file build to hardlink the FilesystemStore CAS blob directly into the cache entry — zero-copy, metadata-only — exactly the way `download_to_directory` already does on the fallback path: `populate_fast_store` then `get_file_entry_for_digest` / `get_file_path_locked` / `fs::hard_link`. Correctness: * A hardlinked CAS blob shares its inode with the CAS store and every other action that hardlinked the same blob, so it must never be chmod'd (the inode-corruption bug PR #2347 fixed). Executable files (`FileNode.is_executable`) therefore get their own private inode via fetch+write and are chmod'd 0o555 on that unshared copy — never hardlinked. * When the blob is not locally hardlinkable (the fast tier is not a FilesystemStore, or the blob is absent / evicted from it), the file falls back to fetch+write rather than failing the build. * Zero-byte files keep their existing direct-write special case. * The post-construction lockdown switches from `set_readonly_recursive` (which chmods files, and would corrupt the shared CAS inode) to `set_dir_writable_recursive`, which only touches directories. `DirectoryCache::new` now takes the worker's `Arc<FastSlowStore>` so it can reach `populate_fast_store` and downcast the fast tier to FilesystemStore. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * worker: drop the two redundant full-tree walks in directory-cache build After `construct_directory`, the cache-miss path walked the materialized tree twice more: `calculate_directory_size` (an `fs::metadata` per file) to compute the LRU size, and a recursive permission pass to normalize directory modes. Both are now folded into construction itself. Size: `construct_directory` returns the total tree size, accumulated from `FileNode.digest.size_bytes` in the `Directory` protos it already decodes. This is also more correct than the old filesystem walk — it counts each file once by its CAS size and never follows symlinks into possibly-shared or external targets. Symlinks contribute nothing. Directory mode: each cache-entry directory is chmod'd 0o755 the moment it is created (`create_dir_writable`), umask-independent. The directory is writable while it is populated and that is its stable final mode, so the separate post-construction `set_dir_writable_recursive` walk is gone. Cache-entry files are still never chmod'd here — they may be CAS-blob hardlinks (OPT #1) and mutating their mode would corrupt the shared inode. Reconciliation with PR #2357: that PR reworks `set_readonly_recursive` so the recursive walk leaves dirs 0o755 / files 0o555. This commit removes the directory-cache build's dependence on any such recursive walk entirely — modes are set at creation. Whichever lands second, the rebase is a straight delete of the now-unused call site; there is no semantic conflict because both converge on 0o755 directories, and #2357's file handling is irrelevant here since the cache build no longer touches file modes at all. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * worker: narrow the directory-cache lock and single-flight construction The cache write lock was held across syscall-heavy I/O, serializing every concurrent `get_or_create`: * On the cache-hit paths, `cache.write()` was held across the whole `hardlink_directory_tree` (clonefile / per-file hardlink) materialization. * `evict_lru` ran `set_dir_writable_recursive` + `remove_dir_all` on the evicted tree under the write lock during a cache miss. Lock narrowing: * `acquire_entry` / `release_entry` take the write lock only to bump and drop a `ref_count` pin and snapshot the entry path; the `hardlink_directory_tree` materialization runs fully unlocked. The pin is what makes this safe — `evict_lru` never selects an entry with `ref_count > 0`, so the cache tree cannot be deleted mid-hardlink. The newly constructed entry is likewise inserted pre-pinned (`ref_count: 1`) and unpinned only after its destination hardlink completes; otherwise a concurrent miss for an unrelated digest could evict the brand-new entry (its `last_access` is recent but it is the only unpinned one) while this caller is still hardlinking from it. * `evict_if_needed` / `evict_lru` are now pure in-memory: they select victims and remove them from the map under the lock, returning the victim paths. `dispatch_evictions` then performs the chmod + removal on a `background_spawn` task, off the lock. Single-flight: the existing per-digest construction mutex already ensures a digest is constructed once while N callers wait; this commit additionally unmaps the per-digest mutex (`forget_construction_lock`) once construction finishes so `construction_locks` no longer grows unbounded over the worker's lifetime. Unmapping is race-free: a waiter has already cloned the `Arc<Mutex>` before blocking, and a late arrival that creates a fresh mutex still re-checks the cache, finds the entry, and takes the fast hardlink path — never a redundant construct. `ref_count` / `CachedDirectoryMetadata` semantics are unchanged; the hit/miss return contract is unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * address merge interactions with read-only CAS * remove inode stat in test * update dependencies to 1.3.1 * worker: materialize executable inputs by hardlink to a created-once 0o555 variant (fix ETXTBSY) Materializing an executable input via a per-action `std::fs::copy` opened a writable fd in the worker's hot prepare path. Under fork-heavy concurrency a sibling action's forked child could inherit that fd, and a concurrent `execve` of the executable then failed with `ETXTBSY` ("Text file busy", os error 26) — seen on Linux RBE (k8) building rules_go's `builder`. macOS was largely shielded because its directory-cache path uses APFS `clonefile(2)` (a distinct COW inode per action), but the per-file `download_to_directory` fallback hardlinks on both platforms, so the regression spanned both. Fix (keep the hot path hardlink-only — no writable fd): - nativelink-store: add `FilesystemStore::get_executable_hardlink_source`. The CAS blob is read-only 0o444 and shared by hardlink, so it cannot carry +x and must never be chmod'd (#2347). This creates a per-digest 0o555 variant exactly once (single-flight), copy -> chmod -> fsync -> atomic rename, so the writer fd is closed before the inode is ever hardlinked or executed. Stored in a sibling `{content_path}.exec` dir (ignored by the content/temp scan + prune) and cleared on startup. On APFS the copy is itself a `clonefile`. - download_to_directory: executables now hardlink that shared 0o555 variant and non-executables hardlink the 0o444 CAS blob. A private copy is used only for the rare custom unix_mode / mtime case, applied to a private inode. The macOS `clonefile` materialization (`hardlink_directory_tree`, #2349) and the directory cache's executable handling are left untouched, preserving the macOS speedup. Test: executable_hardlink_source_created_once_and_readonly asserts the variant is 0o555, a separate inode from the 0o444 blob, stable across calls, leaves the blob untouched, and hardlinks into an executable. nativelink-store 243/0, nativelink-worker 88/0. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * filesystem_store: use nativelink spawn_blocking! macro (clippy disallowed_methods) tokio::task::spawn_blocking is banned by clippy.toml in favor of nativelink-util's spawn_blocking! macro (adds the tracing span + JoinHandleDropGuard). Fixes the -D clippy::disallowed-methods CI failure on get_executable_hardlink_source's executable-variant creation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * filesystem_store: gate executable-variant machinery to unix (fix Windows build) The executable 0o555 variant (and its single-flight map, variant path, .exec dir, and spawn_blocking copy) only exists to carry the unix executable bit and dodge the unix ETXTBSY race. On Windows it was dead code, failing the build under -D warnings (unused import spawn_blocking, never-read executable_locks, never-used executable_variant_path). Gate all of it (and the HashMap / Mutex / EXECUTABLE_DIR_SUFFIX it pulls in) behind #[cfg(unix)]; the existing #[cfg(not(unix))] get_executable_hardlink_source just hardlinks the CAS blob. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Ernesto Cambuston <e.cambuston@gmail.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 7e79dea commit bff444d

23 files changed

Lines changed: 1675 additions & 388 deletions

File tree

CHANGELOG.md

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3,6 +3,26 @@
33

44
All notable changes to this project will be documented in this file.
55

6+
## [1.3.1](https://github.com/TraceMachina/nativelink/compare/v1.3.0..v1.3.1) - 2026-05-23
7+
8+
9+
### 🐛 Bug Fixes
10+
11+
- make permission walks symlink-safe ([#2358](https://github.com/TraceMachina/nativelink/issues/2358)) - ([c40fd10](https://github.com/TraceMachina/nativelink/commit/c40fd10fdc9362d1ee0555626b59056d45308c12))
12+
- keep directory-cache materialized inputs read-only by storing CAS blobs read-only - ([5b31e1a](https://github.com/TraceMachina/nativelink/commit/5b31e1a19633dc649dcbb63792bf64a4e8592e56))
13+
14+
### ⚙️ Miscellaneous
15+
16+
- make directory-cache entries already-writable ([#2357](https://github.com/TraceMachina/nativelink/issues/2357)) - ([83f7be1](https://github.com/TraceMachina/nativelink/commit/83f7be1aa63bbf2a1f20451a56225804cae0dc46))
17+
- hardlink CAS blobs in directory-cache construct ([#2359](https://github.com/TraceMachina/nativelink/issues/2359)) - ([bbf086a](https://github.com/TraceMachina/nativelink/commit/bbf086ac0ebe7ae6c0b69b2ec794187c4a0e28cb))
18+
- drop the two redundant full-tree walks in directory-cache build ([#2359](https://github.com/TraceMachina/nativelink/issues/2359)) - ([7c18e3a](https://github.com/TraceMachina/nativelink/commit/7c18e3a2a89b9a6a0b0c03d4f6311bc169ce8e2a))
19+
- narrow the directory-cache lock and single-flight construction ([#2359](https://github.com/TraceMachina/nativelink/issues/2359)) - ([73764ad](https://github.com/TraceMachina/nativelink/commit/73764ad0ac2ebd767a362d304bc306dd83bd66db))
20+
21+
### 🧪 Testing & CI
22+
23+
- remove inode stat in test - ([b7c6c9f](https://github.com/TraceMachina/nativelink/commit/b7c6c9f6e758c6eda03ab9650bb6ab4f4c8bae55))
24+
- Add --fallback to all remaining nix CI commands ([#2360](https://github.com/TraceMachina/nativelink/issues/2360)) - ([590c514](https://github.com/TraceMachina/nativelink/commit/590c514ebae8105ebd567d655245bdca02e79f6c))
25+
626
## [1.3.0](https://github.com/TraceMachina/nativelink/compare/v1.2.0..v1.3.0) - 2026-05-21
727

828

Cargo.lock

Lines changed: 13 additions & 13 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

Cargo.toml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -10,7 +10,7 @@ resolver = "2"
1010
edition = "2024"
1111
name = "nativelink"
1212
rust-version = "1.93.1"
13-
version = "1.3.0"
13+
version = "1.3.1"
1414

1515
[profile.release]
1616
lto = true

MODULE.bazel

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
module(
22
name = "nativelink",
3-
version = "1.3.0",
3+
version = "1.3.1",
44
compatibility_level = 0,
55
)
66

nativelink-config/Cargo.toml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@ lints.workspace = true
44
[package]
55
edition = "2024"
66
name = "nativelink-config"
7-
version = "1.3.0"
7+
version = "1.3.1"
88

99
[dependencies]
1010
nativelink-error = { path = "../nativelink-error" }

nativelink-error/Cargo.toml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ autobins = false
77
autoexamples = false
88
edition = "2024"
99
name = "nativelink-error"
10-
version = "1.3.0"
10+
version = "1.3.1"
1111

1212
[dependencies]
1313
nativelink-metric = { path = "../nativelink-metric" }

nativelink-macro/Cargo.toml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@ lints.workspace = true
44
[package]
55
edition = "2024"
66
name = "nativelink-macro"
7-
version = "1.3.0"
7+
version = "1.3.1"
88

99
[lib]
1010
proc-macro = true

nativelink-metric/Cargo.toml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@ lints.workspace = true
44
[package]
55
edition = "2024"
66
name = "nativelink-metric"
7-
version = "1.3.0"
7+
version = "1.3.1"
88

99
[dependencies]
1010
nativelink-metric-macro-derive = { path = "nativelink-metric-macro-derive" }

nativelink-metric/nativelink-metric-macro-derive/Cargo.toml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
[package]
22
edition = "2024"
33
name = "nativelink-metric-macro-derive"
4-
version = "1.3.0"
4+
version = "1.3.1"
55

66
[lib]
77
proc-macro = true

nativelink-proto/Cargo.toml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22
[package]
33
edition = "2024"
44
name = "nativelink-proto"
5-
version = "1.3.0"
5+
version = "1.3.1"
66

77
[lib]
88
doctest = false # because some of the generated protos have things that look like doctests but break

0 commit comments

Comments
 (0)