Commit bff444d
* util: make permission walks symlink-safe
The recursive permission walk `set_perms_recursive_impl` (driving both
`set_readonly_recursive` and `set_dir_writable_recursive`) used
`fs::metadata` (stat), which follows symlinks. On input trees containing
symlinks - e.g. `.venv/bin/python3` produced by rules_python /
rules_apple venv tooling - this had two failure modes:
* A symlink to a directory reported `is_dir() == true`, so the walk
recursed *through* the link, escaping the materialized tree or
descending into an unrelated directory.
* A symlink was passed to `set_permissions`; `chmod` follows symlinks,
so it mutated the link's target. When the target did not exist (a
dangling link - common when a venv points outside the action's
input set) the `chmod` returned ENOENT and failed the entire walk.
That ENOENT failure surfaced as `set_readonly_recursive` erroring inside
`DirectoryCache::get_or_create`, which made `prepare_action_inputs` log
"Directory cache failed, falling back to traditional download" and take
the slow `download_to_directory` path.
Fix: `set_perms_recursive_impl` now uses `symlink_metadata` (lstat) and
returns early on symlink entries - it never chmods a symlink and never
recurses through one. Regular files keep their existing read-only
(0o555) treatment, so the CAS-hardlinked-inode hermeticity contract
(PR #2347) is unchanged.
`hardlink_directory_tree_recursive` already recreated symlinks as
symlinks; its symlink branch is reordered ahead of the `is_dir()` /
`is_file()` branches to make the symlink-first intent explicit and
robust.
Adds regression tests covering set-readonly, set-dir-writable, and
hardlink/clone walks over a tree containing a symlink to an in-tree
file, a dangling relative symlink, and a symlink to an in-tree
directory, asserting each walk succeeds and the symlinks are preserved
with their targets intact.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* worker: make directory-cache entries already-writable
`DirectoryCache` locks each cache entry down with `set_readonly_recursive`
after construction. Previously that helper made the entire entry tree mode
0o555 — directories included — so every materialization had to follow up
with a separate `set_dir_writable_recursive` recursive chmod walk in
`prepare_action_inputs` to re-add write permission to directories (Bazel
actions declare outputs at paths nested inside input subdirectories).
That post-walk is redundant work. Directories are not hardlink-shared
between cache entries — only file content inodes are — so directory mode
can safely be made writable once, at the cache entry, instead of on every
materialization.
`set_readonly_recursive` now locks a tree down as a cache entry by making
only FILES read-only (0o555) and leaving DIRECTORIES writable (0o755).
Both materialization paths then produce a directly-usable tree:
- macOS `clonefile(2)` copies the source's modes verbatim, so the clone's
directories are writable and its files read-only.
- The Linux per-file hardlink walk creates fresh directories (writable)
and hardlinks files (which keep the source inode's read-only mode).
Files stay read-only on both paths, so the hermeticity contract and the
CAS-hardlink shared-inode invariant (PR #2347) are preserved. With the
materialized tree already correct, the `set_dir_writable_recursive` call
is removed from `prepare_action_inputs`. `set_dir_writable_recursive`
itself is unchanged and still used by the cache eviction cleanup path.
Tests:
- fs_util: `test_set_readonly_recursive` now also asserts directories stay
writable; the macOS clonefile tests assert cloned subdirs are writable
and that a nested output can be created with no `set_dir_writable_recursive`
walk; `test_set_dir_writable_recursive_walks_nested_dirs` keeps covering
the eviction-cleanup helper.
- directory_cache: new `test_materialized_tree_dirs_writable_files_readonly`
builds a nested tree and asserts that, after `get_or_create` on both the
fresh-materialize and cache-hit paths, every directory is writable and
every file is read-only, with no separate chmod walk.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* worker: hardlink CAS blobs in directory-cache construct
`DirectoryCache::construct_directory` previously materialized every file by
fetching the whole blob into RAM (`get_part_unchunked`) and writing a full
copy (`fs::write`). For a cache that exists to avoid re-fetching from the
CAS, this is the dominant cost on a miss.
Switch the cache-entry file build to hardlink the FilesystemStore CAS blob
directly into the cache entry — zero-copy, metadata-only — exactly the way
`download_to_directory` already does on the fallback path:
`populate_fast_store` then `get_file_entry_for_digest` /
`get_file_path_locked` / `fs::hard_link`.
Correctness:
* A hardlinked CAS blob shares its inode with the CAS store and every
other action that hardlinked the same blob, so it must never be
chmod'd (the inode-corruption bug PR #2347 fixed). Executable files
(`FileNode.is_executable`) therefore get their own private inode via
fetch+write and are chmod'd 0o555 on that unshared copy — never
hardlinked.
* When the blob is not locally hardlinkable (the fast tier is not a
FilesystemStore, or the blob is absent / evicted from it), the file
falls back to fetch+write rather than failing the build.
* Zero-byte files keep their existing direct-write special case.
* The post-construction lockdown switches from `set_readonly_recursive`
(which chmods files, and would corrupt the shared CAS inode) to
`set_dir_writable_recursive`, which only touches directories.
`DirectoryCache::new` now takes the worker's `Arc<FastSlowStore>` so it can
reach `populate_fast_store` and downcast the fast tier to FilesystemStore.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* worker: drop the two redundant full-tree walks in directory-cache build
After `construct_directory`, the cache-miss path walked the materialized
tree twice more: `calculate_directory_size` (an `fs::metadata` per file)
to compute the LRU size, and a recursive permission pass to normalize
directory modes. Both are now folded into construction itself.
Size: `construct_directory` returns the total tree size, accumulated from
`FileNode.digest.size_bytes` in the `Directory` protos it already decodes.
This is also more correct than the old filesystem walk — it counts each
file once by its CAS size and never follows symlinks into possibly-shared
or external targets. Symlinks contribute nothing.
Directory mode: each cache-entry directory is chmod'd 0o755 the moment it
is created (`create_dir_writable`), umask-independent. The directory is
writable while it is populated and that is its stable final mode, so the
separate post-construction `set_dir_writable_recursive` walk is gone.
Cache-entry files are still never chmod'd here — they may be CAS-blob
hardlinks (OPT #1) and mutating their mode would corrupt the shared inode.
Reconciliation with PR #2357: that PR reworks `set_readonly_recursive` so
the recursive walk leaves dirs 0o755 / files 0o555. This commit removes
the directory-cache build's dependence on any such recursive walk
entirely — modes are set at creation. Whichever lands second, the rebase
is a straight delete of the now-unused call site; there is no semantic
conflict because both converge on 0o755 directories, and #2357's file
handling is irrelevant here since the cache build no longer touches file
modes at all.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* worker: narrow the directory-cache lock and single-flight construction
The cache write lock was held across syscall-heavy I/O, serializing every
concurrent `get_or_create`:
* On the cache-hit paths, `cache.write()` was held across the whole
`hardlink_directory_tree` (clonefile / per-file hardlink) materialization.
* `evict_lru` ran `set_dir_writable_recursive` + `remove_dir_all` on the
evicted tree under the write lock during a cache miss.
Lock narrowing:
* `acquire_entry` / `release_entry` take the write lock only to bump and
drop a `ref_count` pin and snapshot the entry path; the
`hardlink_directory_tree` materialization runs fully unlocked. The pin is
what makes this safe — `evict_lru` never selects an entry with
`ref_count > 0`, so the cache tree cannot be deleted mid-hardlink. The
newly constructed entry is likewise inserted pre-pinned (`ref_count: 1`)
and unpinned only after its destination hardlink completes; otherwise a
concurrent miss for an unrelated digest could evict the brand-new entry
(its `last_access` is recent but it is the only unpinned one) while this
caller is still hardlinking from it.
* `evict_if_needed` / `evict_lru` are now pure in-memory: they select
victims and remove them from the map under the lock, returning the
victim paths. `dispatch_evictions` then performs the chmod + removal on
a `background_spawn` task, off the lock.
Single-flight: the existing per-digest construction mutex already ensures a
digest is constructed once while N callers wait; this commit additionally
unmaps the per-digest mutex (`forget_construction_lock`) once construction
finishes so `construction_locks` no longer grows unbounded over the worker's
lifetime. Unmapping is race-free: a waiter has already cloned the `Arc<Mutex>`
before blocking, and a late arrival that creates a fresh mutex still re-checks
the cache, finds the entry, and takes the fast hardlink path — never a
redundant construct.
`ref_count` / `CachedDirectoryMetadata` semantics are unchanged; the
hit/miss return contract is unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* address merge interactions with read-only CAS
* remove inode stat in test
* update dependencies to 1.3.1
* worker: materialize executable inputs by hardlink to a created-once 0o555 variant (fix ETXTBSY)
Materializing an executable input via a per-action `std::fs::copy` opened a
writable fd in the worker's hot prepare path. Under fork-heavy concurrency a
sibling action's forked child could inherit that fd, and a concurrent `execve`
of the executable then failed with `ETXTBSY` ("Text file busy", os error 26) —
seen on Linux RBE (k8) building rules_go's `builder`. macOS was largely shielded
because its directory-cache path uses APFS `clonefile(2)` (a distinct COW inode
per action), but the per-file `download_to_directory` fallback hardlinks on both
platforms, so the regression spanned both.
Fix (keep the hot path hardlink-only — no writable fd):
- nativelink-store: add `FilesystemStore::get_executable_hardlink_source`. The
CAS blob is read-only 0o444 and shared by hardlink, so it cannot carry +x and
must never be chmod'd (#2347). This creates a per-digest 0o555 variant exactly
once (single-flight), copy -> chmod -> fsync -> atomic rename, so the writer fd
is closed before the inode is ever hardlinked or executed. Stored in a sibling
`{content_path}.exec` dir (ignored by the content/temp scan + prune) and
cleared on startup. On APFS the copy is itself a `clonefile`.
- download_to_directory: executables now hardlink that shared 0o555 variant and
non-executables hardlink the 0o444 CAS blob. A private copy is used only for
the rare custom unix_mode / mtime case, applied to a private inode.
The macOS `clonefile` materialization (`hardlink_directory_tree`, #2349) and the
directory cache's executable handling are left untouched, preserving the macOS
speedup.
Test: executable_hardlink_source_created_once_and_readonly asserts the variant is
0o555, a separate inode from the 0o444 blob, stable across calls, leaves the blob
untouched, and hardlinks into an executable. nativelink-store 243/0,
nativelink-worker 88/0.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* filesystem_store: use nativelink spawn_blocking! macro (clippy disallowed_methods)
tokio::task::spawn_blocking is banned by clippy.toml in favor of
nativelink-util's spawn_blocking! macro (adds the tracing span +
JoinHandleDropGuard). Fixes the -D clippy::disallowed-methods CI failure on
get_executable_hardlink_source's executable-variant creation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* filesystem_store: gate executable-variant machinery to unix (fix Windows build)
The executable 0o555 variant (and its single-flight map, variant path, .exec
dir, and spawn_blocking copy) only exists to carry the unix executable bit and
dodge the unix ETXTBSY race. On Windows it was dead code, failing the build
under -D warnings (unused import spawn_blocking, never-read executable_locks,
never-used executable_variant_path). Gate all of it (and the HashMap / Mutex /
EXECUTABLE_DIR_SUFFIX it pulls in) behind #[cfg(unix)]; the existing
#[cfg(not(unix))] get_executable_hardlink_source just hardlinks the CAS blob.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Ernesto Cambuston <e.cambuston@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 7e79dea commit bff444d
23 files changed
Lines changed: 1675 additions & 388 deletions
File tree
- nativelink-config
- nativelink-error
- nativelink-macro
- nativelink-metric
- nativelink-metric-macro-derive
- nativelink-proto
- nativelink-redis-tester
- nativelink-scheduler
- nativelink-service
- nativelink-store
- src
- tests
- nativelink-util
- src
- nativelink-worker
- src
- tests
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
3 | 3 | | |
4 | 4 | | |
5 | 5 | | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
6 | 26 | | |
7 | 27 | | |
8 | 28 | | |
| |||
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
10 | 10 | | |
11 | 11 | | |
12 | 12 | | |
13 | | - | |
| 13 | + | |
14 | 14 | | |
15 | 15 | | |
16 | 16 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
3 | | - | |
| 3 | + | |
4 | 4 | | |
5 | 5 | | |
6 | 6 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
4 | 4 | | |
5 | 5 | | |
6 | 6 | | |
7 | | - | |
| 7 | + | |
8 | 8 | | |
9 | 9 | | |
10 | 10 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
7 | 7 | | |
8 | 8 | | |
9 | 9 | | |
10 | | - | |
| 10 | + | |
11 | 11 | | |
12 | 12 | | |
13 | 13 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
4 | 4 | | |
5 | 5 | | |
6 | 6 | | |
7 | | - | |
| 7 | + | |
8 | 8 | | |
9 | 9 | | |
10 | 10 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
4 | 4 | | |
5 | 5 | | |
6 | 6 | | |
7 | | - | |
| 7 | + | |
8 | 8 | | |
9 | 9 | | |
10 | 10 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
3 | 3 | | |
4 | | - | |
| 4 | + | |
5 | 5 | | |
6 | 6 | | |
7 | 7 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2 | 2 | | |
3 | 3 | | |
4 | 4 | | |
5 | | - | |
| 5 | + | |
6 | 6 | | |
7 | 7 | | |
8 | 8 | | |
| |||
0 commit comments