Redefine CARGO_TARGET_DIR to be only an artifacts directory #14125

Open
@kornelski

Description

For design, see #14125 (comment)

For tracking the artifact-dir side of this, see #6790

Documentation: https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#build-dir

Testing instructions: #14125 (comment)

Implementation

Stabilizing this would also resolve

Decisions

  • cargo clean will clean both target-dir and build-dir
  • Skipping including a CARGO_BUILD_DIR shortcut for CARGO_BUILD_BUILD_DIR (like CARGO_TARGET_DIR is a shortcut for CARGO_BUILD_TARGET_DIR)

artifact-dir files:

build-dir files (intermediate artifacts, build state, caches):

  • build
    • "pre and non uplifted" binary executables (i.e. bins for examples that contain the hash in the name, bins for benches, proc macros, build scripts)
    • other depinfo files (generated by rustc, fingerprint, etc. See https://github.com/rust-lang/cargo/blob/master/src/cargo/core/compiler/fingerprint/mod.rs#L164)
    • rlibs and debug info from dependencies
    • build script OUT_DIR
    • output from proc macros (previously stored in target/build)
    • incremental build output from rustc
    • fingerprint files used by Cargo for rebuild detection
    • Cache of rustc invocations (.rustc_info.json)
  • cargo package's scratchpad used for the verify step
  • CARGO_TARGET_TMPDIR files (see rationale for this here)
  • reports generated from cargo report like future-incompat-report

Open questions

Deferred

  • {workspace-manifest-hash} also hashes the cargo version
    • Excludes release channel information, including which nightly
    • Allows easier cleanup on rustup update (Garbage collect whole target/ #13136)
    • Makes very clear that the hash's meaning is unstable
    • Blocked on Garbage collect whole target/ #13136 (maybe more?) to reduce the need for third-party clean up tools which will need to know the path to every build-dir for a workspace
  • In templates, {{ is an escaped {
    • With us erroring on {, we have the room to allow escaping in the future
    • We'd like escaping support to be based on use cases and not speculatively developed as { in paths is rare and this isn't a general path
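To make the deferred escaping idea concrete, here is a minimal sketch of a template expander in which `{{` stands for a literal `{`. It is not Cargo's implementation: the `expand` function, the `TemplateError` type, and the treatment of `}}` as an escaped `}` are all assumptions made for illustration, and the variable value in the usage note is a made-up placeholder.

```rust
use std::collections::HashMap;

/// Errors a hypothetical build-dir template expander could report.
#[derive(Debug, PartialEq)]
pub enum TemplateError {
    /// A `{` was opened but never closed.
    Unclosed,
    /// A `}` appeared without a matching `{`.
    UnexpectedClose,
    /// `{name}` referred to a variable that is not defined.
    UnknownVariable(String),
}

/// Expands `{name}` placeholders using `vars`.
/// `{{` yields a literal `{`; `}}` yielding `}` is an assumption of this sketch.
pub fn expand(template: &str, vars: &HashMap<&str, &str>) -> Result<String, TemplateError> {
    let mut out = String::new();
    let mut chars = template.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            // Escaped braces: consume the second brace and emit one literal.
            '{' if chars.peek() == Some(&'{') => {
                chars.next();
                out.push('{');
            }
            '}' if chars.peek() == Some(&'}') => {
                chars.next();
                out.push('}');
            }
            // Start of a placeholder: read up to the closing brace.
            '{' => {
                let mut name = String::new();
                loop {
                    match chars.next() {
                        Some('}') => break,
                        Some(ch) => name.push(ch),
                        None => return Err(TemplateError::Unclosed),
                    }
                }
                match vars.get(name.as_str()) {
                    Some(value) => out.push_str(value),
                    None => return Err(TemplateError::UnknownVariable(name)),
                }
            }
            // A lone closing brace is rejected rather than passed through.
            '}' => return Err(TemplateError::UnexpectedClose),
            other => out.push(other),
        }
    }
    Ok(out)
}
```

With `workspace-manifest-hash` mapped to a placeholder value such as `abc123`, `expand("cache/{workspace-manifest-hash}", &vars)` would give `cache/abc123`, while `expand("a{{b}}c", &vars)` would give `a{b}c`. Rejecting a stray brace today is what leaves room to add this kind of escaping later without changing the meaning of existing templates.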

Original Issue:


Problem

There are a couple of issues with the CARGO_TARGET_DIR that are seemingly in conflict with each other:

  1. Multiple locations of target dirs complicate excluding them from backups and full-disk search, cleaning up temp files, moving temp files to dedicated partitions or off slow network drives and container mounts, etc. Users don't like that the target dir is huge, and multiple instances of it add up to a lot of disk space. Users would prefer a central location to ease management of the temp files, and also to dedupe/reuse dependencies across many projects.

  2. People (and tools) are relying on a relative ./target directory being present to copy or run built files out of there. Additionally, users may not want to configure a shared CARGO_TARGET_DIR due to risk of file name conflicts between projects.

However, the dilemma between 1 and 2 exists only because Cargo uses CARGO_TARGET_DIR for two different roles:

  1. A cache for all intermediate build products (a place where crates.io crates are built, where compiler-private temp files are) which aren't project-specific, and/or files that users don't need to access directly.
  2. A location for user-facing final build products (artifacts) that users expect to be there and need to access.

Proposed Solution

So to satisfy both uses, I suggest changing how we think about the role of CARGO_TARGET_DIR. Instead of deciding where to put the same huge, all-purpose, mixed CARGO_TARGET_DIR, think about how to deduplicate and slim down CARGO_TARGET_DIR, and move everything non-user-facing out of it.

Instead of merging or sharding CARGO_TARGET_DIR as-is with all of its current content, and adding --artifact-dir as a separate place where final products are copied to, make CARGO_TARGET_DIR itself the artifact dir (without copying).

As long as CARGO_TARGET_DIR is the place for all of the build files, of all crates including all the crates.io and local builds, with all the caches and all the temp junk, it is going to be a problematic large directory that needs to be managed. But if the purpose of the ./target dir were changed to be only for user-facing files (files that users can name, and would access via the ./target path themselves), then this directory would be relatively small, with a good reason to stay workspace-relative.

What isn't an intermediate build product? (and should stay in ./target)

  • linked (and stripped) binaries of the current workspace, including binaries for the examples,
  • libraries of the current workspace as .a/.so, where lib.crate-type calls for them. Possibly .rlib/.rmeta in the future if there's a stable ABI.
  • linked binaries for tests and benches of the current workspace (to make it easy to launch them under a debugger/profiler, and so they can use relative file paths to read workspace assets).
  • debug symbols for all of the above.
  • .d files for all of the above (so that IDEs and other build systems know when to rebuild the artifacts).
  • if Cargo adds some "staging" directory (a non-private OUT_DIR for build.rs, see Allow build scripts to stage final artifacts #13663), then for build scripts belonging to the current workspace it would be inside ./target as well.

So generally: files that users build intentionally and may want to access directly (run themselves, or package up for distribution), and files that users may need to configure their IDE and debugger to find inside the project.

Crates in [patch.crates-io] with a path are a gray area, and might also have their artifacts included in the ./target dir (but in some way that avoids clobbering the workspace's files).

What isn't a final build product, and doesn't belong to ./target:

  • anything related to building crates from crates.io, or any other registry (packages with source = "registry+…")
  • all .fingerprint and incremental dir content of all crates. These are implementation details of the compiler, and nobody should be accessing these directly via ./target/….
  • .o files. Users are not supposed to use them directly either (Rust has static libs for this).
  • proc macro libs. They're not useful without rustc present.

All of these should be built in some other shared build cache dir (one that is not inside CARGO_TARGET_DIR), configurable by a new option/env var.
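As a rough sketch of the split described by the two lists above, the function below maps an output kind to the directory it would live in. The `OutputKind` and `Location` enums and the `location` function are hypothetical names invented for this example, not Cargo types. The sketch follows the lists in this original proposal only, and it does not model the [patch.crates-io] gray area.

```rust
/// Where a build output would live under the proposed split.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Location {
    /// User-facing directory (`./target` in the proposal).
    ArtifactDir,
    /// Shared cache for intermediate products, outside CARGO_TARGET_DIR.
    BuildDir,
}

/// Illustrative output kinds, named after the bullets in the proposal.
#[derive(Debug, Clone, Copy)]
pub enum OutputKind {
    LinkedBinary,      // bins and examples
    NativeLibrary,     // .a / .so requested via lib.crate-type
    TestOrBenchBinary, // linked test and bench executables
    DebugSymbols,
    DepInfo,           // .d files
    Fingerprint,       // .fingerprint content
    Incremental,       // incremental compilation state
    ObjectFile,        // .o files
    ProcMacroLib,
}

/// Decides where an output goes. `from_workspace` is false for anything
/// built from a registry package, which always lands in the build dir.
pub fn location(kind: OutputKind, from_workspace: bool) -> Location {
    if !from_workspace {
        return Location::BuildDir;
    }
    match kind {
        OutputKind::LinkedBinary
        | OutputKind::NativeLibrary
        | OutputKind::TestOrBenchBinary
        | OutputKind::DebugSymbols
        | OutputKind::DepInfo => Location::ArtifactDir,
        OutputKind::Fingerprint
        | OutputKind::Incremental
        | OutputKind::ObjectFile
        | OutputKind::ProcMacroLib => Location::BuildDir,
    }
}
```

For example, `location(OutputKind::LinkedBinary, true)` is `Location::ArtifactDir`, while the same kind built from a registry dependency (`from_workspace == false`) is `Location::BuildDir`.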

Registry dependencies would get unique paths derived from rustc version + package IDs + enabled features (so that different crates using different features don't invalidate each other's caches all the time). This would enable sharing built crates.io dependencies across all projects for the same local user, without also causing local workspaces to clobber each other's CARGO_TARGET_DIR/profile/product paths. Temp directories for local projects would need some hashed paths in the shared build/temp dir too.
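One way to picture such a path is as a hash over the three inputs named above. The sketch below is only an illustration: `cache_dir_name` is a hypothetical helper, the input strings are placeholders, and the standard library's `DefaultHasher` is used purely for brevity. A real implementation would need a hash whose output is stable across toolchain releases.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Derives a directory name from rustc version, package ID, and enabled
/// features. Features are sorted and deduplicated first so that the order
/// in which they were requested does not change the resulting name.
pub fn cache_dir_name(rustc_version: &str, package_id: &str, features: &[&str]) -> String {
    let mut sorted: Vec<&str> = features.to_vec();
    sorted.sort_unstable();
    sorted.dedup();

    // Assumption: DefaultHasher stands in for a properly stable hasher.
    let mut hasher = DefaultHasher::new();
    rustc_version.hash(&mut hasher);
    package_id.hash(&mut hasher);
    sorted.hash(&mut hasher);

    // 64-bit hash rendered as 16 lowercase hex digits.
    format!("{:016x}", hasher.finish())
}
```

Two projects that build the same package with the same compiler and the same feature set would compute the same name and could share the cached output, while a project enabling an extra feature would get a different name and therefore its own directory.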

Advantages

  • Such a split removes about 90% of the weight from ./target dirs (for cargo itself, it makes ./target/debug with binaries and tests take 415MB, instead of 4.2GB). This makes cleanup of all the scattered target dirs less of a pressing problem.
  • The ./target keeps relatively few files, and removes high-frequency-churning files out of it, which makes it less of a problem for real-time disk indexing (like search and backups on macOS).
  • I/O latency of ./target stops being critical for build speeds, unlike I/O of the incremental cache and rewrites of thousands of .o files. It becomes feasible to have project directory on a network drive without overriding CARGO_TARGET_DIR (network filesystems are used by non-Linux systems where tools like Vagrant and Docker have to run full-fat VMs, and can't cheaply share the file system).
  • It makes ./target contain only workspace-unique files, which makes it justified for every workspace to have one.
  • It enables moving registry deps to a shared build directory, without the side effect of local projects overwriting each other's files. Sharing of dependencies matches users' expectation that the same dependencies shouldn't be redundantly rebuilt for each local project.
  • It's almost entirely backwards compatible. Users can get the benefits without breaking their existing workflows, post-build scripts, and integrations. It doesn't invalidate documentation/books/tutorials that refer to target/release/exe etc.
  • It could be the default behavior, so it could benefit all users without the friction of adding --artifact-dir or .cargo/config.

Notes

No response

Labels

  • A-caching: Area: caching of dependencies, repositories, and build artifacts
  • C-feature-request: Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted`
  • C-tracking-issue: Category: A tracking issue for something unstable.
  • S-needs-mentor: Status: Issue or feature is accepted, but needs a team member to commit to helping and reviewing.
  • Z-out-dir: Nightly: --out-dir
  • call-for-testing: Marks issues that require broader testing from the community, e.g. before stabilization.
