Skip to content

Commit 2bc32f9

Browse files
erneestocclaude
andcommitted
worker: make directory-cache entries already-writable
`DirectoryCache` locks each cache entry down with `set_readonly_recursive` after construction. Previously that helper made the entire entry tree mode 0o555 — directories included — so every materialization had to follow up with a separate `set_dir_writable_recursive` recursive chmod walk in `prepare_action_inputs` to re-add write permission to directories (Bazel actions declare outputs at paths nested inside input subdirectories). That post-walk is redundant work. Directories are not hardlink-shared between cache entries — only file content inodes are — so directory mode can safely be made writable once, at the cache entry, instead of on every materialization. `set_readonly_recursive` now locks a tree down as a cache entry by making only FILES read-only (0o555) and leaving DIRECTORIES writable (0o755). Both materialization paths then produce a directly-usable tree: - macOS `clonefile(2)` copies the source's modes verbatim, so the clone's directories are writable and its files read-only. - The Linux per-file hardlink walk creates fresh directories (writable) and hardlinks files (which keep the source inode's read-only mode). Files stay read-only on both paths, so the hermeticity contract and the CAS-hardlink shared-inode invariant (PR TraceMachina#2347) are preserved. With the materialized tree already correct, the `set_dir_writable_recursive` call is removed from `prepare_action_inputs`. `set_dir_writable_recursive` itself is unchanged and still used by the cache eviction cleanup path. Tests: - fs_util: `test_set_readonly_recursive` now also asserts directories stay writable; the macOS clonefile tests assert cloned subdirs are writable and that a nested output can be created with no `set_dir_writable_recursive` walk; `test_set_dir_writable_recursive_walks_nested_dirs` keeps covering the eviction-cleanup helper. - directory_cache: new `test_materialized_tree_dirs_writable_files_readonly` builds a nested tree and asserts that, after `get_or_create` on both the fresh-materialize and cache-hit paths, every directory is writable and every file is read-only, with no separate chmod walk. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 00c07f1 commit 2bc32f9

3 files changed

Lines changed: 358 additions & 86 deletions

File tree

nativelink-util/src/fs_util.rs

Lines changed: 174 additions & 71 deletions
Original file line numberDiff line numberDiff line change
@@ -51,16 +51,23 @@ pub enum CloneMethod {
5151
/// # Platform Support
5252
/// - macOS: Tries APFS `clonefile(2)` first (O(1), copy-on-write). On failure
5353
/// (e.g., cross-volume EXDEV, or any unexpected errno) falls back to per-file
54-
/// `fs::hard_link`. After a successful clone, only the destination root is
55-
/// chmod'd to 0o755 so the worker can create the action's declared output
56-
/// files inside it. Existing entries inherit the source's read-only mode
57-
/// (0o555) — this matches the hermeticity contract
58-
/// enforced by Bazel's local sandbox and the REAPI `Action.output_files`
59-
/// semantics: actions can only write to declared outputs, never mutate
60-
/// inputs. The COW semantics of `clonefile(2)` mean any writes the worker
61-
/// does make to the destination do not affect the source.
54+
/// `fs::hard_link`. `clonefile(2)` copies the source's modes verbatim, so
55+
/// the destination's directory/file modes mirror the source. For a directory
56+
/// cache entry locked down by [`set_readonly_recursive`], that means
57+
/// directories are writable (0o755) and files are read-only (0o555): the
58+
/// worker can create the action's declared outputs at any nested path, but
59+
/// the hardlinked input files stay immutable. This matches the hermeticity
60+
/// contract enforced by Bazel's local sandbox and the REAPI
61+
/// `Action.output_files` semantics: actions can only write to declared
62+
/// outputs, never mutate inputs. The COW semantics of `clonefile(2)` mean
63+
/// any writes the worker does make to the destination do not affect the
64+
/// source. The destination root is additionally chmod'd to 0o755 as a
65+
/// defensive guarantee for callers that did not pre-mark the source.
6266
/// - Linux: Per-file `fs::hard_link` (directory hardlinks are not supported on
63-
/// ext4/btrfs without root). Always returns `CloneMethod::Hardlink`.
67+
/// ext4/btrfs without root). Directories at the destination are created
68+
/// fresh by this walk, so they are writable regardless of the source's
69+
/// directory modes; files are hardlinked and keep the source inode's mode.
70+
/// Always returns `CloneMethod::Hardlink`.
6471
/// - Windows: Per-file `fs::hard_link` (requires NTFS). Always returns
6572
/// `CloneMethod::Hardlink`.
6673
///
@@ -101,14 +108,15 @@ pub async fn hardlink_directory_tree(src_dir: &Path, dst_dir: &Path) -> Result<C
101108

102109
match try_clonefile(src_dir, dst_dir).await {
103110
Ok(()) => {
104-
// Only chmod the destination root so the worker can create
105-
// the action's declared output files inside it. Existing
106-
// entries (subdirs and files) inherit the source's
107-
// read-only mode (0o555) — that's the hermeticity
108-
// contract. Skipping the per-file chmod walk avoids an
109-
// O(N) syscall sweep that, on real Bazel SwiftCompile
110-
// shapes (~2000 inputs), accounts for ~46% of
111-
// materialization time.
111+
// `clonefile(2)` copies the source's modes verbatim. A
112+
// directory cache entry locked down by
113+
// `set_readonly_recursive` already has writable directories
114+
// (0o755) and read-only files (0o555), so the clone is
115+
// immediately usable: the worker can create declared outputs
116+
// at any nested path and the hardlinked inputs stay
117+
// immutable. No per-directory chmod walk is needed. The root
118+
// is still chmod'd here as a defensive guarantee for callers
119+
// that pass a source whose root was not pre-marked writable.
112120
chmod_dir_writable(dst_dir)
113121
.await
114122
.err_tip(|| "Failed to chmod cloned tree root")?;
@@ -188,11 +196,12 @@ async fn try_clonefile(src: &Path, dst: &Path) -> std::io::Result<()> {
188196

189197
/// Sets the directory `dir`'s mode to 0o755 so callers can create new
190198
/// entries inside it. Used after `clonefile(2)` on the materialized
191-
/// destination root: the clone inherits the source's read-only mode
192-
/// (0o555) but the worker needs to drop the action's declared output
193-
/// files into the root. Existing entries inside `dir` are intentionally
194-
/// left at their cloned read-only perms — that's the hermeticity
195-
/// contract.
199+
/// destination root as a defensive guarantee: a directory cache entry locked
200+
/// down by [`set_readonly_recursive`] already has writable directories, so
201+
/// for those callers this is a no-op, but it keeps `hardlink_directory_tree`
202+
/// correct for any source whose root was not pre-marked writable. Existing
203+
/// entries inside `dir` are intentionally left at their cloned perms — files
204+
/// stay read-only (the hermeticity contract), directories stay writable.
196205
#[cfg(target_os = "macos")]
197206
async fn chmod_dir_writable(dir: &Path) -> Result<(), Error> {
198207
use std::os::unix::fs::PermissionsExt;
@@ -279,15 +288,33 @@ fn hardlink_directory_tree_recursive<'a>(
279288
})
280289
}
281290

282-
/// Sets a directory tree to read-only recursively.
283-
/// This prevents actions from modifying cached directories.
291+
/// Locks down a directory tree as an immutable cache entry: every **file** is
292+
/// made read-only, every **directory** is left writable.
293+
///
294+
/// This is used by the worker's directory cache after it constructs a cache
295+
/// entry. Files must be read-only because they are hardlinked into the CAS
296+
/// (`FilesystemStore`) — keeping them immutable preserves the hermeticity
297+
/// contract (actions cannot mutate inputs) and avoids mutating the shared
298+
/// inode's mode for other in-flight actions.
299+
///
300+
/// Directories are deliberately left writable (0o755). Directories are *not*
301+
/// hardlink-shared between cache entries — only file content inodes are — so a
302+
/// writable directory mode is safe. Keeping cache-entry directories writable
303+
/// means the materialized destination tree (an APFS `clonefile(2)` clone,
304+
/// which copies modes verbatim, or a per-file hardlink walk, which creates
305+
/// fresh directories) already has writable directories. Bazel actions declare
306+
/// outputs at paths nested inside input subdirectories, so every directory in
307+
/// the materialized tree must be writable for the worker to create those
308+
/// outputs; doing it here, once per cache entry, removes the need for a
309+
/// separate per-materialization recursive chmod walk.
284310
///
285311
/// # Arguments
286-
/// * `dir` - Directory to make read-only
312+
/// * `dir` - Directory tree to lock down
287313
///
288314
/// # Platform Notes
289-
/// - Unix: Sets permissions to 0o555 (r-xr-xr-x)
290-
/// - Windows: Sets `FILE_ATTRIBUTE_READONLY`
315+
/// - Unix: files get 0o555 (r-xr-xr-x); directories get 0o755 (rwxr-xr-x).
316+
/// - Windows: files get `FILE_ATTRIBUTE_READONLY`; directories are left
317+
/// writable.
291318
pub async fn set_readonly_recursive(dir: &Path) -> Result<(), Error> {
292319
error_if!(!dir.exists(), "Directory does not exist: {}", dir.display());
293320

@@ -321,19 +348,43 @@ fn set_readonly_one_path(
321348
metadata: Metadata,
322349
) -> Pin<Box<dyn Future<Output = Result<(), Error>> + Send>> {
323350
Box::pin(async move {
324-
// Set the file/directory to read-only
351+
// Directories are left writable on purpose. They are not
352+
// hardlink-shared between cache entries — only file content inodes
353+
// are — so a writable directory mode cannot corrupt anything. Keeping
354+
// them writable means the materialized destination tree already
355+
// accepts the nested output files Bazel actions declare, with no
356+
// separate per-materialization chmod walk.
357+
if metadata.is_dir() {
358+
#[cfg(unix)]
359+
{
360+
use std::os::unix::fs::PermissionsExt;
361+
let mut perms = metadata.permissions();
362+
perms.set_mode(0o755);
363+
364+
fs::set_permissions(&path, perms)
365+
.await
366+
.err_tip(|| format!("Failed to set permissions for: {}", path.display()))?;
367+
}
368+
369+
// On Windows directories are already writable; clearing the
370+
// read-only attribute here would be a no-op, so leave them alone.
371+
372+
return Ok(());
373+
}
374+
375+
// Set the file to read-only.
325376
#[cfg(unix)]
326377
{
327378
use std::os::unix::fs::PermissionsExt;
328379
let mut perms = metadata.permissions();
329380

330-
// Both directories and files get r-xr-xr-x (0o555): read and
331-
// execute for everyone, write for no one. Files use 0o555 rather
332-
// than 0o444 so the execute bit survives on cached executables —
333-
// a stripped +x bit makes an action's interpreter or wrapper
334-
// script fail with EACCES once the tree is materialized into a
335-
// workspace. The write bit stays cleared, so the hermeticity
336-
// contract (inputs are immutable) is unchanged.
381+
// Files get r-xr-xr-x (0o555): read and execute for everyone,
382+
// write for no one. Files use 0o555 rather than 0o444 so the
383+
// execute bit survives on cached executables — a stripped +x bit
384+
// makes an action's interpreter or wrapper script fail with
385+
// EACCES once the tree is materialized into a workspace. The
386+
// write bit stays cleared, so the hermeticity contract (inputs
387+
// are immutable) is unchanged.
337388
perms.set_mode(0o555);
338389

339390
fs::set_permissions(&path, perms)
@@ -567,11 +618,12 @@ mod tests {
567618

568619
#[cfg(target_os = "macos")]
569620
#[nativelink_test("crate")]
570-
async fn test_clonefile_root_writable_inputs_readonly() -> Result<(), Error> {
621+
async fn test_clonefile_dirs_writable_files_readonly() -> Result<(), Error> {
571622
use std::os::unix::fs::PermissionsExt;
572623

573624
let (temp_dir, src_dir) = create_test_directory().await?;
574-
// Source mimics the directory cache: 0o555 dirs and files.
625+
// Source mimics a directory cache entry: writable dirs (0o755),
626+
// read-only files (0o555).
575627
set_readonly_recursive(&src_dir).await?;
576628

577629
let dst_dir = temp_dir.path().join("clone_dst");
@@ -582,20 +634,25 @@ mod tests {
582634
let root_mode = fs::metadata(&dst_dir).await?.permissions().mode() & 0o777;
583635
assert_eq!(root_mode, 0o755, "destination root must be writable");
584636

585-
// Subdir and existing file: stay read-only. Hermeticity contract —
586-
// inputs are not writable. Matches Bazel's local-sandbox model and
587-
// REAPI Action.output_files semantics: actions can only write to
588-
// declared outputs, not mutate inputs.
637+
// Nested subdir: writable too. `clonefile(2)` copies the source's
638+
// modes verbatim and the source's directories were left writable by
639+
// `set_readonly_recursive`. Bazel actions declare outputs at paths
640+
// nested inside input subdirectories, so every directory in the
641+
// materialized tree must be writable — no separate chmod walk needed.
589642
let dst_subdir_mode = fs::metadata(dst_dir.join("subdir"))
590643
.await?
591644
.permissions()
592645
.mode()
593646
& 0o777;
594647
assert_eq!(
595-
dst_subdir_mode, 0o555,
596-
"cloned subdirs must inherit source read-only mode"
648+
dst_subdir_mode, 0o755,
649+
"cloned subdirs must be writable so nested outputs can be created"
597650
);
598651

652+
// Existing file: stays read-only. Hermeticity contract — inputs are
653+
// not writable. Matches Bazel's local-sandbox model and REAPI
654+
// Action.output_files semantics: actions can only write to declared
655+
// outputs, not mutate inputs.
599656
let dst_file_mode = fs::metadata(dst_dir.join("file1.txt"))
600657
.await?
601658
.permissions()
@@ -606,15 +663,24 @@ mod tests {
606663
"cloned files must inherit source read-only mode"
607664
);
608665

609-
// Source untouched.
666+
// Source untouched: dirs writable, files read-only.
610667
let src_subdir_mode = fs::metadata(src_dir.join("subdir"))
611668
.await?
612669
.permissions()
613670
.mode()
614671
& 0o777;
615672
assert_eq!(
616-
src_subdir_mode, 0o555,
617-
"source dir should still be readonly after clone"
673+
src_subdir_mode, 0o755,
674+
"source dir should still be writable after clone"
675+
);
676+
let src_file_mode = fs::metadata(src_dir.join("file1.txt"))
677+
.await?
678+
.permissions()
679+
.mode()
680+
& 0o777;
681+
assert_eq!(
682+
src_file_mode, 0o555,
683+
"source file should still be read-only after clone"
618684
);
619685

620686
Ok(())
@@ -688,78 +754,115 @@ mod tests {
688754
Ok(())
689755
}
690756

691-
/// Regression test for the clonefile directory-cache path. Bazel actions
692-
/// declare outputs at paths nested inside input subdirectories, but
693-
/// `hardlink_directory_tree` chmods only the destination *root* writable
694-
/// — cloned subdirs keep the source's read-only mode (0o555). This test
695-
/// mirrors the two-step `prepare_action_inputs` performs after a cache
696-
/// hit (materialize, then `set_dir_writable_recursive`) and proves a
697-
/// nested output can be created only after the recursive walk.
757+
/// Bazel actions declare outputs at paths nested inside input
758+
/// subdirectories. Because `set_readonly_recursive` leaves directories
759+
/// writable and `clonefile(2)` copies modes verbatim, the materialized
760+
/// tree already accepts a nested output file with NO separate
761+
/// `set_dir_writable_recursive` walk — that is the redundant work this
762+
/// change removes from `prepare_action_inputs`.
698763
#[cfg(target_os = "macos")]
699764
#[nativelink_test("crate")]
700-
async fn test_clonefile_nested_output_after_dir_writable() -> Result<(), Error> {
765+
async fn test_clonefile_nested_output_without_dir_writable_walk() -> Result<(), Error> {
701766
use std::os::unix::fs::PermissionsExt;
702767

703768
let (temp_dir, src_dir) = create_test_directory().await?;
769+
// Lock the source down the way the directory cache does after
770+
// constructing a cache entry: writable dirs, read-only files.
704771
set_readonly_recursive(&src_dir).await?;
705772

706773
let dst_dir = temp_dir.path().join("clone_dst");
707774
hardlink_directory_tree(&src_dir, &dst_dir).await?;
708775

709-
// The clone leaves "subdir" at 0o555, so creating an output nested
710-
// inside it fails — this is the gap a root-only chmod ships.
776+
// Creating an output nested inside a cloned subdir succeeds straight
777+
// away — no recursive chmod walk. This is the post-condition that
778+
// lets `prepare_action_inputs` drop its `set_dir_writable_recursive`
779+
// call.
711780
let nested_output = dst_dir.join("subdir").join("nested_output.o");
712-
let err = fs::write(&nested_output, b"x")
713-
.await
714-
.expect_err("nested write must fail while the subdir is 0o555");
715-
assert_eq!(err.kind(), std::io::ErrorKind::PermissionDenied);
716-
717-
// This is what prepare_action_inputs does after a cache hit: make
718-
// every directory in the materialized tree writable.
719-
set_dir_writable_recursive(&dst_dir).await?;
720-
721781
fs::write(&nested_output, b"action output").await?;
722782
assert_eq!(fs::read(&nested_output).await?, b"action output");
723783

724-
// Files inside the tree stay read-only — hermeticity holds.
784+
// Files inside the tree stay read-only — hermeticity holds, and the
785+
// CAS-hardlink inode invariant is preserved.
725786
let file_mode = fs::metadata(dst_dir.join("subdir").join("file2.txt"))
726787
.await?
727788
.permissions()
728789
.mode()
729790
& 0o777;
730791
assert_eq!(file_mode, 0o555, "input files must remain read-only");
731792

793+
// A write to an input file still fails — actions cannot mutate inputs.
794+
let err = fs::write(dst_dir.join("subdir").join("file2.txt"), b"mutated")
795+
.await
796+
.expect_err("input file write must fail (file is 0o555)");
797+
assert_eq!(err.kind(), std::io::ErrorKind::PermissionDenied);
798+
732799
Ok(())
733800
}
734801

802+
/// `set_readonly_recursive` locks a tree down as a cache entry: every
803+
/// file is made read-only, every directory is left writable. Directories
804+
/// stay writable because they are not hardlink-shared between cache
805+
/// entries, and a writable directory mode lets the materialized
806+
/// destination tree accept nested action outputs without a separate
807+
/// chmod walk.
735808
#[nativelink_test("crate")]
736809
async fn test_set_readonly_recursive() -> Result<(), Error> {
737810
let (_temp_dir, test_dir) = create_test_directory().await?;
738811

739812
set_readonly_recursive(&test_dir).await?;
740813

741-
// Verify files are read-only
814+
// Files are read-only.
742815
let metadata = fs::metadata(test_dir.join("file1.txt")).await?;
743816
assert!(metadata.permissions().readonly());
744817

745818
let metadata = fs::metadata(test_dir.join("subdir/file2.txt")).await?;
746819
assert!(metadata.permissions().readonly());
747820

821+
// Directories are left writable — root and every nested subdir.
822+
#[cfg(unix)]
823+
{
824+
use std::os::unix::fs::PermissionsExt;
825+
for dir in [test_dir.clone(), test_dir.join("subdir")] {
826+
let mode = fs::metadata(&dir).await?.permissions().mode() & 0o777;
827+
assert_eq!(mode, 0o755, "{} must stay writable", dir.display());
828+
}
829+
}
830+
#[cfg(windows)]
831+
{
832+
// On Windows directories carry no read-only attribute that would
833+
// block creating children; assert they are not marked read-only.
834+
for dir in [test_dir.clone(), test_dir.join("subdir")] {
835+
assert!(
836+
!fs::metadata(&dir).await?.permissions().readonly(),
837+
"{} must stay writable",
838+
dir.display()
839+
);
840+
}
841+
}
842+
748843
Ok(())
749844
}
750845

751846
/// `set_dir_writable_recursive` must make *every* directory in a tree
752-
/// writable — including nested subdirs — so actions can create outputs
753-
/// at nested declared-output paths. Files are left read-only because
754-
/// they may share a CAS inode via hardlink.
847+
/// writable — including nested subdirs — so the eviction cleanup path can
848+
/// `remove_dir_all` a cache entry. Files are left read-only because they
849+
/// may share a CAS inode via hardlink. This walk runs on already-read-only
850+
/// directory trees too, so the test first sets every file read-only with
851+
/// `set_readonly_recursive`.
755852
#[cfg(unix)]
756853
#[nativelink_test("crate")]
757854
async fn test_set_dir_writable_recursive_walks_nested_dirs() -> Result<(), Error> {
758855
use std::os::unix::fs::PermissionsExt;
759856

760857
let (_temp_dir, test_dir) = create_test_directory().await?;
761-
// Lock the tree down the way the directory cache does post-construction.
858+
// Lock files down, then explicitly force every directory read-only so
859+
// the walk has real work to do (the directory cache leaves dirs
860+
// writable, but the eviction path must cope with any mode).
762861
set_readonly_recursive(&test_dir).await?;
862+
for dir in [test_dir.clone(), test_dir.join("subdir")] {
863+
fs::set_permissions(&dir, std::fs::Permissions::from_mode(0o555)).await?;
864+
}
865+
763866
set_dir_writable_recursive(&test_dir).await?;
764867

765868
// Every directory — the root and the nested subdir — must be writable.

0 commit comments

Comments
 (0)