Simplify cargo purls#988
Conversation
|
new revision with more commits and with suggested improvements |
7d89d2a to
20f7092
Compare
| subpath=str(package_dir.path.relative_to(package_dir.root)), | ||
| ).to_component() | ||
| ) | ||
| elif pkg_name in local_packages.keys(): |
There was a problem hiding this comment.
Non-blocking observation: having a is_local_package() defined somewhere would have slightly improved the flow. Returning it from _find_local_packaged() (properly renamed) could be done. Bending the rules a little and agreeing on the helper to return either an empty string or a path could result in something like this:
elif subpath := is_local_package(pcg_name):
...
subpath=subpath,
...Improve parsing of Cargo.toml, Cargo.lock and .cargo/config.toml files by separating their logic into a distinct function. This enhances code clarity without unnecessary comments and removes duplication. Change type of `version` field from `str` to `Optional[str]` to silence mypy error. The class is used for "main package" component, which can be a virtual workspace without defined package name and version in Cargo.toml. The version is None in that case. In the future patches, I will revert this line back to `str` type and move the logic for "main package" component away from this class. Signed-off-by: Michal Šoltis <msoltis@redhat.com>
When resolving a cargo project, we only parse the top level Cargo.toml. If a project is a virtual workspace, all nested packages can have their own version. But the actual workspace can have its version too. See https://doc.rust-lang.org/cargo/reference/workspaces.html#the-package-table for more details. The current implementation is wrong, since the code assumes we parse a nested Cargo.toml of any workspace package and expect workspace metadata to be present there as well. It is basically an unreachable code. Run generate test data nox session to update SBOM components based on the changes in this commit. Signed-off-by: Michal Šoltis <msoltis@redhat.com>
SBOM components and config template are not connected to each other -> move fetching dependenceis logic (cargo vendor) to a separate function. Also, move verifying presence of Cargo.lock from SBOM component resolution that needs to be called before prefetching. Update function headers to reflect the changes (e.g. unused attribute). Signed-off-by: Michal Šoltis <msoltis@redhat.com>
The `cargo vendor` command flags were clarified to explain their purpose, especially --no-delete which is critical for pip integration. The docstring was updated to better describe the function's purpose and the expected output. Signed-off-by: Michal Šoltis <msoltis@redhat.com>
After splitting this function in previous patches, the new name better reflects what it does. Signed-off-by: Michal Šoltis <msoltis@redhat.com>
WalkthroughThe changes refactor and extend the Cargo package manager logic to better distinguish local, workspace, and external dependencies, introducing a new Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant fetch_cargo_source
participant _fetch_dependencies
participant _use_vendored_sources
participant _generate_sbom_components
User->>fetch_cargo_source: Request
fetch_cargo_source->>_fetch_dependencies: Run cargo vendor
fetch_cargo_source->>_use_vendored_sources: Update Cargo config
fetch_cargo_source->>_generate_sbom_components: Parse Cargo.lock and Cargo.toml
_generate_sbom_components-->>fetch_cargo_source: List[Component]
fetch_cargo_source-->>User: RequestOutput (SBOM, etc.)
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (5)
🚧 Files skipped from review as they are similar to previous changes (5)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
hermeto/core/package_managers/cargo/main.py (1)
305-305: Simplify dictionary membership checkRemove the unnecessary
.keys()call for cleaner, more idiomatic Python code.- elif pkg_name in local_packages.keys(): + elif pkg_name in local_packages:
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
hermeto/core/package_managers/cargo/main.py(5 hunks)tests/integration/test_data/cargo_mixed_git_crate_dependency/bom.json(2 hunks)tests/integration/test_data/legacy_entrypoint_e2e_test/bom.json(2 hunks)tests/integration/test_data/pip_multiple_packages/bom.json(53 hunks)tests/unit/package_managers/cargo/test_main.py(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
tests/unit/package_managers/cargo/test_main.py (3)
hermeto/core/package_managers/cargo/main.py (4)
CargoPackage(43-69)_resolve_main_package(150-166)purl(52-65)purl(82-98)hermeto/core/rooted_path.py (1)
RootedPath(13-122)tests/unit/conftest.py (1)
rooted_tmp_path(31-33)
🪛 Ruff (0.12.2)
hermeto/core/package_managers/cargo/main.py
305-305: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Red Hat Konflux / on-pull-request
- GitHub Check: Build container image and run integration tests on it
🔇 Additional comments (10)
hermeto/core/package_managers/cargo/main.py (6)
9-10: LGTM!The new imports are appropriate for the added type annotations and URL parsing functionality.
44-44: Well-implemented purl generation for git sources!The enhanced purl property correctly extracts commit IDs from git URLs and formats them as vcs_url qualifiers, aligning with the purl specification.
Also applies to: 51-65
72-103: Good separation of concerns with LocalCargoPackage!The new dataclass effectively handles local and workspace dependencies, with appropriate handling of optional VCS URLs and subpaths.
123-142: Clear and well-documented dependency fetching!The function properly extracts the cargo vendor logic with excellent documentation of the command flags, especially the critical --no-delete flag for pip integration.
150-167: Excellent support for virtual workspace packages!The function now correctly handles both standard packages and virtual workspaces, with appropriate fallbacks for missing package information.
276-326: Well-structured SBOM component generation!The function effectively differentiates between main, local, and external packages, with proper error handling for scenarios like missing .git directories.
tests/integration/test_data/pip_multiple_packages/bom.json (1)
12-12: Consistent purl simplification across all Cargo packages!The removal of the redundant registry source parameter from all Cargo package purls aligns with the new simplified purl format while preserving the essential checksum information.
Also applies to: 24-24, 36-36, 48-48, 60-60, 72-72, 84-84, 96-96, 108-108, 168-168, 180-180, 192-192, 204-204, 216-216, 228-228, 240-240, 252-252, 264-264, 276-276, 288-288, 300-300, 312-312, 324-324, 336-336, 348-348, 360-360, 372-372, 384-384, 396-396, 408-408, 420-420, 432-432, 444-444, 456-456, 468-468, 480-480, 492-492, 504-504, 516-516, 528-528, 540-540, 552-552, 564-564, 576-576, 588-588, 600-600, 612-612, 624-624, 636-636, 648-648, 660-660, 672-672, 684-684
tests/integration/test_data/cargo_mixed_git_crate_dependency/bom.json (1)
12-12: Correct purl format updates!The changes properly reflect the new purl generation logic:
- Registry source removed for crates.io packages
- Git sources now use the
vcs_urlqualifier with@separator for commit referencesAlso applies to: 36-36
tests/integration/test_data/legacy_entrypoint_e2e_test/bom.json (1)
12-12: Consistent purl format updates!The changes match the pattern in other BOM files, correctly implementing the new purl format.
Also applies to: 36-36
tests/unit/package_managers/cargo/test_main.py (1)
1-96: Comprehensive test coverage for the refactored Cargo functionality!The tests effectively cover:
- Standard package resolution with explicit name/version
- Virtual workspace handling with and without workspace versions
- Purl generation for registry, git, and simple packages
Good use of parameterized tests and clear test structure.
Improve the handling of cargo packages by separating local dependencies from registry/git ones. This change introduces a new LocalCargoPackage class to better represent local dependencies and workspace packages. The SBOM generation logic is updated to properly distinguish between different package types and include relevant metadata like VCS URL and subpath. The `vcs_url` qualifier was incorrectly set only for main packages while dependencies from git sources were missing this qualifier. This caused inconsistent purl generation. We now derive `vcs_url` directly from git source URLs for dependencies by parsing and reconstructing the repository URL with its ref. The main package still uses the project's VCS URL when available. VCS URL format is defined in PURL specification: https://github.com/package-url/purl-spec/blob/main/PURL-SPECIFICATION.rst Remove non-existent source field from PURL qualifiers. Signed-off-by: Michal Šoltis <msoltis@redhat.com>
Maintainers will complete the following section
Note: if the contribution is external (not from an organization member), the CI
pipeline will not run automatically. After verifying that the CI is safe to run:
/ok-to-test(as is the standard for Pipelines as Code)
Summary by CodeRabbit
Refactor
Bug Fixes
Tests