feat: disambiguate duplicate crates across workspaces#41
Conversation
brndnmtthws
commented
Oct 1, 2025
- replace analyzer's crate→workspace map with canonical multi-maps and expose crate-path lookups
- resolve workspace dependency edges using metadata paths instead of substring heuristics
- reindex ripple analysis on stable crate ids and update reports/tests for duplicate names
- refresh README guidance
- replace analyzer's crate→workspace map with canonical multi-maps and expose crate-path lookups - resolve workspace dependency edges using metadata paths instead of substring heuristics - reindex ripple analysis on stable crate ids and update reports/tests for duplicate names - refresh README guidance
There was a problem hiding this comment.
Pull Request Overview
This PR refactors cross-workspace dependency resolution to disambiguate duplicate crate names across multiple workspaces and to rely on canonical crate paths instead of substring heuristics. It updates graph building, affected analysis, executors, docs, and tests accordingly.
- Replace name→workspace map with canonical multi-maps and crate-path lookups
- Path-aware edge resolution in dependency graph; remove heuristic scanning of Cargo.toml workspace deps
- Rework affected analysis to use stable CrateId (name + path) and update reporting/tests for duplicates
Reviewed Changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/integration_test.rs | Update test to new graph builder signature (additional path-based maps). |
| src/lib.rs | Update documentation examples to new API parameters. |
| src/graph/renderer.rs | Simplify error propagation using ? on writeln_out! calls. |
| src/graph/builder.rs | Introduce path-based dependency resolution; remove workspace dep cache; add resolution context; update API. |
| src/executors/graph.rs | Update executor to pass new analyzer maps to graph builder. |
| src/executors/deps.rs | Update executor to pass new analyzer maps to graph builder. |
| src/executors/check.rs | Update executor to pass new analyzer maps to graph builder. |
| src/executors/analyze.rs | Update executor to pass new analyzer maps to graph builder. |
| src/executors/affected.rs | Update to new AffectedAnalysis signature; refine JSON/human outputs to handle duplicate names. |
| src/common.rs | Minor test assertion tweak for path comparison. |
| src/commands/deps.rs | Adjust types for compatibility with new crate→workspaces mapping. |
| src/commands/affected.rs | Major refactor: introduce CrateId, build crate-path and crate-workspace indices, path-based dependency connection, and updated reporting/tests. |
| src/analyzer/dependency_classifier.rs | Capture path and workspace flags on dependencies. |
| src/analyzer/analyzer_impl.rs | Add canonical multi-maps; extend Dependency/Builder; canonicalize and index crate paths/workspaces; update discovery pipeline. |
| README.md | Document disambiguation behavior and path-based resolution benefits. |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
src/commands/affected.rs
Outdated
| let crate_path = member | ||
| .path() | ||
| .canonicalize() | ||
| .unwrap_or_else(|_| member.path().clone()); | ||
| let Some(from_id) = crate_path_index.get(&crate_path).cloned() else { | ||
| continue; | ||
| }; | ||
| let &from_idx = crate_node_indices | ||
| .get(&from_id) | ||
| .expect("crate node must exist for analyzed member"); |
There was a problem hiding this comment.
member.path() has already been canonicalized in WorkspaceAnalyzer::merge_results; re-canonicalizing each member here adds unnecessary IO overhead. Use the stored canonical path directly: let crate_path = member.path().to_path_buf();
| let mut best_match: Option<(usize, CrateId)> = None; | ||
|
|
||
| for (crate_path, crate_id) in &self.crate_path_index { | ||
| let match_path = (canonical.starts_with(crate_path) | ||
| || abs_file.starts_with(crate_path)) | ||
| .then_some(crate_path); | ||
|
|
||
| if let Some(path) = match_path { | ||
| let match_len = path.as_os_str().len(); | ||
| match &best_match { | ||
| None => best_match = Some((match_len, crate_id.clone())), | ||
| Some((best_len, _)) if match_len > *best_len => { | ||
| best_match = Some((match_len, crate_id.clone())) | ||
| } | ||
| _ => {} | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
[nitpick] Using OsStr::len() as a proxy for 'most specific path' can mis-rank candidates (byte length isn't a good specificity metric across platforms). Prefer comparing path depth via components().count(), e.g., let match_len = path.components().count();, to choose the deepest matching crate root.
src/graph/builder.rs
Outdated
| if targets.is_empty() { | ||
| for (crate_path, ws_path) in ctx.crate_path_to_workspace.iter() { | ||
| if canonical.starts_with(crate_path) || crate_path.starts_with(&canonical) { | ||
| targets.insert(ws_path.clone()); | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
The reverse prefix check crate_path.starts_with(&canonical) can spuriously match many unrelated crates when canonical points to a higher-level directory, producing incorrect cross-workspace edges. Because crate_path_to_workspace keys are canonical crate roots, restrict this to canonical.starts_with(crate_path) to only match files under a crate's directory.
src/graph/builder.rs
Outdated
| if targets.is_empty() { | ||
| for (crate_path, ws_path) in ctx.crate_path_to_workspace.iter() { | ||
| if canonical.starts_with(crate_path) || crate_path.starts_with(&canonical) { | ||
| targets.insert(ws_path.clone()); | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
[nitpick] This O(N) scan over all crate_path_to_workspace entries runs per dependency when earlier lookups fail and could be costly in large monorepos. Consider eliminating the full scan by relying on the dep.name()-scoped candidate sets (ctx.crate_to_paths) only, or pre-index crate roots by prefix (e.g., a trie or grouping by workspace) to reduce the search space.
| if targets.is_empty() { | |
| for (crate_path, ws_path) in ctx.crate_path_to_workspace.iter() { | |
| if canonical.starts_with(crate_path) || crate_path.starts_with(&canonical) { | |
| targets.insert(ws_path.clone()); | |
| } | |
| } | |
| } | |
| // Removed O(N) scan over all crate_path_to_workspace entries to improve performance in large monorepos. |