[analyze] too many roots causing some confusion in large repos #24182
Description
Summary
As you can perhaps tell from this issue and #24058, I am finding great value in this tool and think it's very close to providing an effective way to answer "what does my change implicate?" for a given set of changes.
Problem
Currently, ruff scans the project for all directories containing an __init__.py file (packages) and adds them to src_roots, which are searched when constructing the analyze graph.
In our codebase (14M LoC, 100k files, 12k __init__.py files (😢)), this caused both unexpected edges in the graph and unexpected performance issues.
Example:
services/api/start.py
import graphql
... rest of file
This creates an edge to internal/linter/for/graphql.py, even though there is no way for the two files to "see" each other at runtime. Note that files like these could legitimately see each other if, for example, they were using uv workspaces.
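To make the failure mode concrete, here is a minimal Python sketch of a root-based resolver. It is not ruff's actual resolution code: the `resolve` and `package_dirs` helpers, the in-memory file set, and the `configured_roots` value are all hypothetical and only model the example above.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical file set modelled on the example above.
FILES = {
    "services/api/__init__.py",
    "services/api/start.py",
    "internal/linter/for/__init__.py",
    "internal/linter/for/graphql.py",
}


def package_dirs(files: set) -> list:
    """Every directory that contains an __init__.py (the current behaviour)."""
    return sorted(
        {str(PurePosixPath(f).parent) for f in files if f.endswith("/__init__.py")}
    )


def resolve(module: str, roots: list, files: set) -> Optional[str]:
    """Return the first file under any root that `import <module>` could map to."""
    rel = module.replace(".", "/")
    for root in roots:
        for candidate in (f"{root}/{rel}.py", f"{root}/{rel}/__init__.py"):
            if candidate in files:
                return candidate
    return None


# Current behaviour: every package directory becomes a search root.
all_package_roots = package_dirs(FILES)

# Proposed behaviour: only explicitly configured roots (assumed value).
configured_roots = ["services"]

bad_edge = resolve("graphql", all_package_roots, FILES)
no_edge = resolve("graphql", configured_roots, FILES)
print(bad_edge)
print(no_edge)
```

With every package directory used as a root, `import graphql` matches the unrelated linter file; with only the configured root, no first-party match is found, so no edge is created.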
Proposed Solution
I have put up a PR (#24183) with my proposal. In short:
- src_roots collection: Instead of adding all package roots, collect src paths from discovered configs using resolver.settings(). This uses ruff's hierarchical config discovery: each file's settings come from its closest pyproject.toml, and only explicitly configured src paths are used for resolution.
- module_path computation: Fix the path calculation in lib.rs to use package.parent() as the src_root when computing the module path. This ensures relative imports resolve correctly (the package directory must be included in the module path, not used as the strip prefix).
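The effect of the module_path change can be illustrated with a small Python sketch. This is not the Rust code in lib.rs; `module_path`, `resolve_relative`, and the file paths are hypothetical and only show why the strip prefix matters for relative imports.

```python
from pathlib import PurePosixPath


def module_path(file: str, src_root: str) -> list:
    """Module components of `file` after stripping `src_root` as a prefix."""
    rel = PurePosixPath(file).relative_to(src_root).with_suffix("")
    parts = list(rel.parts)
    if parts and parts[-1] == "__init__":
        parts.pop()  # a package's __init__.py is named by its directory
    return parts


def resolve_relative(current: list, level: int, name: str) -> list:
    """Resolve `from <level dots><name> import ...` inside a non-__init__ module."""
    if level > len(current):
        raise ValueError("relative import climbs above the module path")
    base = current[: len(current) - level]
    return base + (name.split(".") if name else [])


package = PurePosixPath("services/api")   # hypothetical package root
file = "services/api/handlers/users.py"   # hypothetical module inside it

# Stripping the package directory itself drops the package name.
stripped_at_package = module_path(file, str(package))
# Stripping package.parent() keeps the package name in the module path.
stripped_at_parent = module_path(file, str(package.parent))

# `from ..models import X` inside handlers/users.py
wrong = resolve_relative(stripped_at_package, 2, "models")
right = resolve_relative(stripped_at_parent, 2, "models")
print(stripped_at_package, wrong)
print(stripped_at_parent, right)
```

When the package directory is used as the strip prefix, the relative import loses the `api` component; using the parent keeps it, so the import resolves to `api.models` as intended.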
With these 2 changes applied to our jumbo repo (not necessarily a representative sample of project types, but a useful data point):
- All the "bad edges" described above are gone from our graph
- As an added benefit, removing the thousands of extra src_roots drastically improves resolution time in large projects
  - Before: ~27 seconds
  - After: ~3.3 seconds (~8x faster)
Version
No response