Skip to content

feat: disambiguate duplicate crates across workspaces#41

Merged
brndnmtthws merged 3 commits intomainfrom
hardening
Oct 1, 2025
Merged

feat: disambiguate duplicate crates across workspaces#41
brndnmtthws merged 3 commits intomainfrom
hardening

Conversation

@brndnmtthws
Copy link
Collaborator

  • replace analyzer's crate→workspace map with canonical multi-maps and expose crate-path lookups
  • resolve workspace dependency edges using metadata paths instead of substring heuristics
  • reindex ripple analysis on stable crate ids and update reports/tests for duplicate names
  • refresh README guidance

- replace analyzer's crate→workspace map with canonical multi-maps and
  expose crate-path lookups
- resolve workspace dependency edges using metadata paths instead of
  substring heuristics
- reindex ripple analysis on stable crate ids and update reports/tests
  for duplicate names
- refresh README guidance
@brndnmtthws brndnmtthws requested a review from Copilot October 1, 2025 19:17
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors cross-workspace dependency resolution to disambiguate duplicate crate names across multiple workspaces and to rely on canonical crate paths instead of substring heuristics. It updates graph building, affected analysis, executors, docs, and tests accordingly.

  • Replace name→workspace map with canonical multi-maps and crate-path lookups
  • Path-aware edge resolution in dependency graph; remove heuristic scanning of Cargo.toml workspace deps
  • Rework affected analysis to use stable CrateId (name + path) and update reporting/tests for duplicates

Reviewed Changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tests/integration_test.rs Update test to new graph builder signature (additional path-based maps).
src/lib.rs Update documentation examples to new API parameters.
src/graph/renderer.rs Simplify error propagation using ? on writeln_out! calls.
src/graph/builder.rs Introduce path-based dependency resolution; remove workspace dep cache; add resolution context; update API.
src/executors/graph.rs Update executor to pass new analyzer maps to graph builder.
src/executors/deps.rs Update executor to pass new analyzer maps to graph builder.
src/executors/check.rs Update executor to pass new analyzer maps to graph builder.
src/executors/analyze.rs Update executor to pass new analyzer maps to graph builder.
src/executors/affected.rs Update to new AffectedAnalysis signature; refine JSON/human outputs to handle duplicate names.
src/common.rs Minor test assertion tweak for path comparison.
src/commands/deps.rs Adjust types for compatibility with new crate→workspaces mapping.
src/commands/affected.rs Major refactor: introduce CrateId, build crate-path and crate-workspace indices, path-based dependency connection, and updated reporting/tests.
src/analyzer/dependency_classifier.rs Capture path and workspace flags on dependencies.
src/analyzer/analyzer_impl.rs Add canonical multi-maps; extend Dependency/Builder; canonicalize and index crate paths/workspaces; update discovery pipeline.
README.md Document disambiguation behavior and path-based resolution benefits.

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines 165 to 174
let crate_path = member
.path()
.canonicalize()
.unwrap_or_else(|_| member.path().clone());
let Some(from_id) = crate_path_index.get(&crate_path).cloned() else {
continue;
};
let &from_idx = crate_node_indices
.get(&from_id)
.expect("crate node must exist for analyzed member");
Copy link

Copilot AI Oct 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

member.path() has already been canonicalized in WorkspaceAnalyzer::merge_results; re-canonicalizing each member here adds unnecessary IO overhead. Use the stored canonical path directly: let crate_path = member.path().to_path_buf();

Copilot uses AI. Check for mistakes.
Comment on lines 341 to 358
let mut best_match: Option<(usize, CrateId)> = None;

for (crate_path, crate_id) in &self.crate_path_index {
let match_path = (canonical.starts_with(crate_path)
|| abs_file.starts_with(crate_path))
.then_some(crate_path);

if let Some(path) = match_path {
let match_len = path.as_os_str().len();
match &best_match {
None => best_match = Some((match_len, crate_id.clone())),
Some((best_len, _)) if match_len > *best_len => {
best_match = Some((match_len, crate_id.clone()))
}
_ => {}
}
}
}
Copy link

Copilot AI Oct 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Using OsStr::len() as a proxy for 'most specific path' can mis-rank candidates (byte length isn't a good specificity metric across platforms). Prefer comparing path depth via components().count(), e.g., let match_len = path.components().count();, to choose the deepest matching crate root.

Copilot uses AI. Check for mistakes.
Comment on lines 234 to 240
if targets.is_empty() {
for (crate_path, ws_path) in ctx.crate_path_to_workspace.iter() {
if canonical.starts_with(crate_path) || crate_path.starts_with(&canonical) {
targets.insert(ws_path.clone());
}
}
}
Copy link

Copilot AI Oct 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reverse prefix check crate_path.starts_with(&canonical) can spuriously match many unrelated crates when canonical points to a higher-level directory, producing incorrect cross-workspace edges. Because crate_path_to_workspace keys are canonical crate roots, restrict this to canonical.starts_with(crate_path) to only match files under a crate's directory.

Copilot uses AI. Check for mistakes.
Comment on lines 234 to 240
if targets.is_empty() {
for (crate_path, ws_path) in ctx.crate_path_to_workspace.iter() {
if canonical.starts_with(crate_path) || crate_path.starts_with(&canonical) {
targets.insert(ws_path.clone());
}
}
}
Copy link

Copilot AI Oct 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] This O(N) scan over all crate_path_to_workspace entries runs per dependency when earlier lookups fail and could be costly in large monorepos. Consider eliminating the full scan by relying on the dep.name()-scoped candidate sets (ctx.crate_to_paths) only, or pre-index crate roots by prefix (e.g., a trie or grouping by workspace) to reduce the search space.

Suggested change
if targets.is_empty() {
for (crate_path, ws_path) in ctx.crate_path_to_workspace.iter() {
if canonical.starts_with(crate_path) || crate_path.starts_with(&canonical) {
targets.insert(ws_path.clone());
}
}
}
// Removed O(N) scan over all crate_path_to_workspace entries to improve performance in large monorepos.

Copilot uses AI. Check for mistakes.
@brndnmtthws brndnmtthws merged commit 9e7ee01 into main Oct 1, 2025
15 checks passed
@brndnmtthws brndnmtthws deleted the hardening branch October 1, 2025 19:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants