fix: create ephemeral workspace for git source #13689

Open
wants to merge 2 commits into
base: master
4 changes: 4 additions & 0 deletions src/cargo/core/workspace.rs
@@ -283,6 +283,10 @@ impl<'gctx> Workspace<'gctx> {
ws.member_ids.insert(id);
ws.default_members.push(ws.current_manifest.clone());
ws.set_resolve_behavior();
// The find_root function is used here to traverse the directory tree and locate the root of the workspace.
Member

A bit worried about this change. The in-memory property isn't held for ephemeral workspaces after this patch. It is also a bit fragile because the workspace discovery logic is only guarded by the other place, which is not immediately clear and hard to track.

I wonder if we really need this line here. git_install_reads_workspace_manifest seems to fail if we don't do find_root, but does cargo ever use profile or anything from there? Things we might want to figure out:

  • What kind of workspace support does cargo install --git currently provide?
  • Can we skip this and still find all members and inheritable fields?

Member Author

but does cargo ever use profile or anything from there?

After this PR, we can only check the manifest for errors. Before that, I believe we could use any information defined by the workspace, for example lints.

  • What kind of workspace support does cargo install --git currently provide?

Before this PR, we loaded the whole workspace and created the workspace from the root manifest.
After this PR, we create the workspace from the specific package and no longer use information from the root manifest. We only search for the root to check whether the manifests contain errors.

  • Can we skip this and still find all members and inheritable fields?

After this PR, we only check the manifests; we no longer respect workspace information.

For example, if we run cargo install --git https://github.com/hi-rustin/test-cargo-git-install bin-tool:

Before this change:

test-cargo-git-install on  master via 🦀 v1.77.2 cargo install --git https://github.com/hi-rustin/test-cargo-git-install bin-tool 
    Updating git repository `https://github.com/hi-rustin/test-cargo-git-install`
warning: virtual workspace defaulting to `resolver = "1"` despite one or more workspace members being on edition 2021 which implies `resolver = "2"`
note: to keep the current resolver, specify `workspace.resolver = "1"` in the workspace root's manifest
note: to use the edition 2021 resolver, specify `workspace.resolver = "2"` in the workspace root's manifest
note: for more details see https://doc.rust-lang.org/cargo/reference/resolver.html#resolver-versions
  Installing bin-tool v0.1.0 (https://github.com/hi-rustin/test-cargo-git-install#d7a594e5)
   Compiling lib1 v0.1.0 (/Users/hi-rustin/.cargo/git/checkouts/test-cargo-git-install-fae8728302b9b2da/d7a594e/lib1)
   Compiling bin-tool v0.1.0 (/Users/hi-rustin/.cargo/git/checkouts/test-cargo-git-install-fae8728302b9b2da/d7a594e/bin-tool)
error: usage of an `unsafe` block
 --> bin-tool/src/main.rs:2:5
  |
2 | /     unsafe {
3 | |         println!("Hello, world!");
4 | |     }
  | |_____^
  |
  = note: requested on the command line with `-F unsafe-code`

warning: unnecessary `unsafe` block
 --> bin-tool/src/main.rs:2:5
  |
2 |     unsafe {
  |     ^^^^^^ unnecessary `unsafe` block
  |
  = note: `#[warn(unused_unsafe)]` on by default

warning: `bin-tool` (bin "bin-tool") generated 1 warning
error: could not compile `bin-tool` (bin "bin-tool") due to 1 previous error; 1 warning emitted
error: failed to compile `bin-tool v0.1.0 (https://github.com/hi-rustin/test-cargo-git-install#d7a594e5)`, intermediate artifacts can be found at `/var/folders/j1/7l6snwpx6svgqxh79bz_d27m0000gn/T/cargo-installbO7SHc`.
To reuse those artifacts with a future compilation, set the environment variable `CARGO_TARGET_DIR` to that path.

After this change:

test-cargo-git-install on  master via 🦀 v1.77.2 ../cargo/target/debug/cargo install --git  https://github.com/hi-rustin/test-cargo-git-install bin-tool

    Updating git repository `https://github.com/hi-rustin/test-cargo-git-install`
  Installing bin-tool v0.1.0 (https://github.com/hi-rustin/test-cargo-git-install#d7a594e5)
   Compiling lib1 v0.1.0 (https://github.com/hi-rustin/test-cargo-git-install#d7a594e5)
   Compiling bin-tool v0.1.0 (https://github.com/hi-rustin/test-cargo-git-install#d7a594e5)
    Finished `release` profile [optimized] target(s) in 2.57s
  Installing /Users/hi-rustin/.cargo/bin/bin-tool
   Installed package `bin-tool v0.1.0 (https://github.com/hi-rustin/test-cargo-git-install#d7a594e5)` (executable `bin-tool`)

Member Author

From a workspace information perspective, I think this PR introduces some regression.

Contributor

From a workspace information perspective, I think this PR introduces some regression.

workspace is a feature which should not be there as it complicates all optimizations? it is only there because CARGO_TARGET_DIR does not work very well, isn't it?

Member

Workspace support is already there, so to move forward we first need to figure out the feature set cargo install --git offers. However, I agree with soloturn that we may need to step back a bit and re-evaluate the bug.

Both workspace initialization and CARGO_TARGET_DIR are symptoms of the bug. From my previous investigation, the bug is in workspace: members are collected as if they came from a PathSource, hence cargo install recognizes them as path dependencies. In contrast, when used as a git dependency, workspace members are marked as coming from a GitSource. I wonder if we should mark the entire workspace with the same source as the root one for cargo install. Thinking of nested packages rust-lang/rfcs#3452, if we had that today, we might also need this, otherwise nested package will be marked as from local path and never get updated.

Member Author

workspace is a feature which should not be there as it complicates all optimizations? it is only there because CARGO_TARGET_DIR does not work very well, isn't it?

I think it's hard to say that the workspace feature complicates all optimizations. As my example in https://github.com/hi-rustin/test-cargo-git-install shows, respecting workspace information helps us avoid inconsistency.

The problem is not caused by the CARGO_TARGET_DIR. If users set a CARGO_TARGET_DIR, then all compilation would happen in the CARGO_TARGET_DIR.

The problem here is, as @weihanglo said, that we cannot rebuild the binary after the git repository is updated.

Member Author

I wonder if we should mark the entire workspace with the same source as the root one for cargo install. Thinking of nested packages rust-lang/rfcs#3452, if we had that today, we might also need this, otherwise nested package will be marked as from local path and never get updated.

That sounds like a reasonable solution, I will take a look.

Member Author

I wonder if we should mark the entire workspace with the same source as the root one for cargo install

I've looked at how we calculate the fingerprint of the path member.

pub fn path_args(ws: &Workspace<'_>, unit: &Unit) -> (PathBuf, PathBuf) {
    let ws_root = ws.root();
    let src = match unit.target.src_path() {
        TargetSourcePath::Path(path) => path.to_path_buf(),
        TargetSourcePath::Metabuild => unit.pkg.manifest().metabuild_path(ws.target_dir()),
    };
    assert!(src.is_absolute());
    if unit.pkg.package_id().source_id().is_path() {
        if let Ok(path) = src.strip_prefix(ws_root) {
            return (path.to_path_buf(), ws_root.to_path_buf());
        }
    }
    (src, unit.pkg.root().to_path_buf())
}

It seems we always try to track the workspace as much as possible.
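
To make the branch in path_args concrete, here is a minimal standalone sketch using only std::path. The function split_src and its is_path_source flag are hypothetical stand-ins for the Cargo internals quoted above, not Cargo APIs, and the paths are made up. The sketch also omits the original's is_absolute assertion.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for `path_args`.
/// Returns (source path handed to the compiler, working directory).
fn split_src(
    ws_root: &Path,
    pkg_root: &Path,
    src: &Path,
    is_path_source: bool,
) -> (PathBuf, PathBuf) {
    if is_path_source {
        // Path packages inside the workspace are addressed relative to the
        // workspace root.
        if let Ok(rel) = src.strip_prefix(ws_root) {
            return (rel.to_path_buf(), ws_root.to_path_buf());
        }
    }
    // Everything else keeps its absolute path and runs from the package root.
    (src.to_path_buf(), pkg_root.to_path_buf())
}

fn main() {
    // Illustrative paths only.
    let ws_root = Path::new("/checkout/repo");
    let pkg_root = Path::new("/checkout/repo/bin1");
    let src = Path::new("/checkout/repo/bin1/src/main.rs");

    let (path, cwd) = split_src(ws_root, pkg_root, src, true);
    println!("path source:     {} (cwd {})", path.display(), cwd.display());

    let (path, cwd) = split_src(ws_root, pkg_root, src, false);
    println!("non-path source: {} (cwd {})", path.display(), cwd.display());
}
```

In this toy model, the same file is seen as a workspace-relative path when the package is tagged as a path source, and as an absolute path under the package root otherwise.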

So I am not sure what you mean by mark the entire workspace with the same source. Can you explain it a little bit?

Member

@weihanglo, Apr 16, 2024

When we read a dependency from a Git source, no matter whether it is a single package or a workspace member, it always returns a PackageId with SourceKind::Git. Under the hood it is actually a PathSource associated with a Git SourceId. This helps Cargo track the actual source.

let source_id = self
.source_id
.with_git_precise(Some(actual_rev.to_string()));
let path_source = PathSource::new_recursive(&checkout_path, source_id, self.gctx);
self.path_source = Some(path_source);

However, in cargo install we use the workspace to find members, so we lose that stickiness.
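
The idea of marking every member with the root's source can be sketched with a toy model. All names here (SourceKind, PkgId, retag_with_root_source) are hypothetical simplifications for illustration; they are not Cargo's real SourceId or PackageId API, and the sketch assumes that workspace discovery tags members as path packages, as described in this thread.

```rust
/// Hypothetical, simplified source tag (not Cargo's `SourceKind`).
#[derive(Clone, Debug, PartialEq, Eq)]
enum SourceKind {
    Path,
    Git { precise_rev: String },
}

/// Hypothetical, simplified package identity (not Cargo's `PackageId`).
#[derive(Clone, Debug, PartialEq, Eq)]
struct PkgId {
    name: String,
    source: SourceKind,
}

/// Give every discovered member the same source tag as the root package.
fn retag_with_root_source(root: &PkgId, members: &[PkgId]) -> Vec<PkgId> {
    members
        .iter()
        .map(|m| PkgId {
            name: m.name.clone(),
            source: root.source.clone(),
        })
        .collect()
}

fn main() {
    let root = PkgId {
        name: "root".to_string(),
        source: SourceKind::Git { precise_rev: "rev-a".to_string() },
    };
    // Assumption: discovery through the workspace yields path-tagged members.
    let members = vec![PkgId { name: "bin1".to_string(), source: SourceKind::Path }];

    let retagged = retag_with_root_source(&root, &members);
    println!("{:?}", retagged);
}
```

With the revision carried in the identity, the same member checked out at two different revisions no longer compares equal in this model, which is the property the rebuild depends on.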

// Despite being ephemeral, we still need to validate all the manifests in the workspace,
// which is what `find_root` helps us achieve here.
ws.find_root(ws.current_manifest.clone().as_path())?;
Ok(ws)
}

2 changes: 1 addition & 1 deletion src/cargo/ops/cargo_install.rs
@@ -816,7 +816,7 @@ fn make_ws_rustc_target<'gctx>(
source_id: &SourceId,
pkg: Package,
) -> CargoResult<(Workspace<'gctx>, Rustc, String)> {
- let mut ws = if source_id.is_git() || source_id.is_path() {
+ let mut ws = if source_id.is_path() {
Workspace::new(pkg.manifest_path(), gctx)?
} else {
Workspace::ephemeral(pkg, gctx, None, false)?
70 changes: 70 additions & 0 deletions tests/testsuite/install.rs
@@ -2064,6 +2064,76 @@ fn git_install_reads_workspace_manifest() {
.run();
}

#[cargo_test]
fn git_install_the_same_bin_twice_with_different_rev() {
let project = git::new("foo", |project| {
project
.file(
"Cargo.toml",
r#"
[workspace]
members = ["bin1"]
"#,
)
.file("bin1/Cargo.toml", &basic_manifest("bin1", "0.1.0"))
.file(
"bin1/src/main.rs",
r#"fn main() { println!("Hello, world!"); }"#,
)
});
let repository = git2::Repository::open(&project.root()).unwrap();
let first_rev = repository.revparse_single("HEAD").unwrap().id().to_string();

// Change the main.rs.
fs::write(
project.root().join("bin1/src/main.rs"),
r#"fn main() { println!("Hello, world! 2"); }"#,
)
.expect("failed to write file");
git::commit(&repository);
let second_rev = repository.revparse_single("HEAD").unwrap().id().to_string();

// Set up a temp target directory.
let temp_dir = paths::root().join("temp-target");
cargo_process(&format!(
"install --git {} --rev {} bin1 --target-dir {}",
project.url().to_string(),
second_rev,
temp_dir.display()
))
.with_stderr(
"\
[UPDATING] git repository [..]
[INSTALLING] bin1 v0.1.0 [..]
[COMPILING] bin1 v0.1.0 [..]
[FINISHED] [..]
[INSTALLING] [..]home/.cargo/bin/bin1[..]
[INSTALLED] package `bin1 [..]
[WARNING] be sure to add [..]
",
)
.run();

cargo_process(&format!(
Member

Should we have a comment or two pointing out that we want to observe a rebuild?
Or we could actually do a shasum comparison like the original did?
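
One way to approach the comparison idea, sketched with only the standard library. The helper content_hash is hypothetical, and it uses DefaultHasher as a stand-in for a real SHA digest, so it is only suitable for an equality check inside a test. The file name and contents are illustrative; a real test would read the installed binary after each cargo install run.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::Hasher;

/// Hypothetical helper: a non-cryptographic hash of a file's bytes.
fn content_hash(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(bytes);
    hasher.finish()
}

fn main() -> std::io::Result<()> {
    // Stand-in for the installed binary path.
    let dir = std::env::temp_dir().join("bin-hash-sketch");
    fs::create_dir_all(&dir)?;
    let bin = dir.join("bin1");

    // Pretend this is the result of the first install.
    fs::write(&bin, b"binary built from second_rev")?;
    let before = content_hash(&fs::read(&bin)?);

    // Pretend this is the result of the second install.
    fs::write(&bin, b"binary built from first_rev")?;
    let after = content_hash(&fs::read(&bin)?);

    // A rebuild from different sources should change the installed bytes.
    assert_ne!(before, after, "binary was not replaced");
    Ok(())
}
```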

"install --git {} --rev {} bin1 --target-dir {}",
project.url().to_string(),
first_rev,
temp_dir.display()
))
.with_stderr(
"\
[UPDATING] git repository [..]
[INSTALLING] bin1 v0.1.0 [..]
[COMPILING] bin1 v0.1.0 [..]
[FINISHED] [..]
[REPLACING] [..]home/.cargo/bin/bin1[..]
[REPLACED] package `bin1 [..]
[WARNING] be sure to add [..]
",
)
.run();
}

#[cargo_test]
fn install_git_with_symlink_home() {
// Ensure that `cargo install` with a git repo is OK when CARGO_HOME is a