-
Notifications
You must be signed in to change notification settings - Fork 35
fix: ensure %2F in purl names is handled properly #2156
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Reviewer's GuideFixes handling of percent-encoded slashes in purl names by tightening PackageUrl construction/formatting and extending SBOM and Purl tests, while updating csaf and packageurl dependencies to patched git branches that include the encoding fix. Sequence diagram for updated PackageUrl construction in OSV translate and split_namesequenceDiagram
participant Caller
participant Ecosystem
participant Translate as translate
participant PackageUrl
Caller->>Translate: translate(ecosystem, name)
Translate->>Ecosystem: ecosystem.ty_for(name)
Ecosystem-->>Translate: ty
alt Maven-style name with group and artifact
Translate->>Translate: split_name(name, "maven", ":")
alt name contains separator
Translate->>Translate: join all but last segment as namespace
Translate->>Translate: take last segment as simple_name
Translate->>PackageUrl: new("maven", simple_name)
PackageUrl-->>Translate: Result<PackageUrl, Error>
alt Ok(package_url)
Translate->>PackageUrl: with_namespace(namespace)
PackageUrl-->>Translate: Result<&mut PackageUrl, Error>
alt Ok(mut_ref)
Translate->>PackageUrl: add_qualifier("repository_url", repo)
PackageUrl-->>Translate: Result<&mut PackageUrl, Error>
Translate-->>Caller: Some(PackageUrl)
else Err
Translate-->>Caller: None (namespace error propagated)
end
else Err
Translate-->>Caller: None (creation error)
end
else no separator
Translate-->>Caller: None
end
else other ecosystems or simple names
Translate->>PackageUrl: new(ty, name)
PackageUrl-->>Translate: Result<PackageUrl, Error>
alt Ok(package_url)
Translate-->>Caller: Some(PackageUrl)
else Err
Translate-->>Caller: None
end
end
Class diagram for updated Purl formatting and PackageUrl constructionclassDiagram
class Purl {
+ty: String
+name: String
+namespace: Option~String~
+version: Option~String~
+qualifiers: HashMap~String, String~
+from_str(input: &str) Result~Purl, Error~
+to_string() String
}
class PurlDisplayImpl {
+fmt(f: &mut Formatter) Result~(), fmt::Error~
}
class PurlDebugImpl {
+fmt(f: &mut Formatter) Result~(), fmt::Error~
}
class PackageUrl~'a~ {
+new(ty: &str, name: &str) Result~PackageUrl~'a~, Error~
+with_namespace(namespace: &str) Result~&mut PackageUrl~'a~, Error~
+with_version(version: &str) Result~&mut PackageUrl~'a~, Error~
+add_qualifier(key: &str, value: &str) Result~&mut PackageUrl~'a~, Error~
+to_string() String
}
Purl ..> PurlDisplayImpl : implements Display
Purl ..> PurlDebugImpl : implements Debug
PurlDisplayImpl ..> PackageUrl~'a~ : constructs and configures
%% Updated behavior:
%% - Errors from PackageUrl::new, with_namespace, with_version, add_qualifier
%% are now propagated as fmt::Error in PurlDisplayImpl.fmt
File-Level Changes
Assessment against linked issues
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes and found some issues that need to be addressed.
- In
split_name, the newwith_namespace(namespace).unwrap()reintroduces a potential panic path; consider propagating the error (e.g., viaand_thenor?in aResult-returning helper) instead of unwrapping. - Cargo.toml now declares
packageurltwice (once with a version and once as a git dependency); it would be clearer to consolidate this into a single dependency specification or use[patch]so it’s obvious which source is authoritative.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `split_name`, the new `with_namespace(namespace).unwrap()` reintroduces a potential panic path; consider propagating the error (e.g., via `and_then` or `?` in a `Result`-returning helper) instead of unwrapping.
- Cargo.toml now declares `packageurl` twice (once with a version and once as a git dependency); it would be clearer to consolidate this into a single dependency specification or use `[patch]` so it’s obvious which source is authoritative.
## Individual Comments
### Comment 1
<location> `common/src/purl.rs:241` </location>
<code_context>
let mut purl = PackageUrl::new(&self.ty, &self.name).map_err(|_| fmt::Error)?;
if let Some(ns) = &self.namespace {
- purl.with_namespace(ns);
+ purl.with_namespace(ns).map_err(|_| std::fmt::Error {})?;
}
if let Some(version) = &self.version {
</code_context>
<issue_to_address>
**issue (bug_risk):** Use `fmt::Error` directly instead of constructing `std::fmt::Error {}`
`fmt::Error` is a unit struct, so `std::fmt::Error {}` is invalid and won’t compile. Instead, either `use std::fmt::Error;` and write `map_err(|_| Error)?`, or follow the existing pattern and use `map_err(|_| fmt::Error)?`. Apply the same fix to the `with_version` call as well.
</issue_to_address>
### Comment 2
<location> `common/src/purl.rs:244` </location>
<code_context>
}
if let Some(version) = &self.version {
- purl.with_version(version);
+ purl.with_version(version).map_err(|_| std::fmt::Error {})?;
}
for (key, value) in &self.qualifiers {
</code_context>
<issue_to_address>
**issue (bug_risk):** Fix construction of `fmt::Error` on `with_version` error mapping
As with `with_namespace`, `std::fmt::Error {}` isn’t a valid constructor here and will not compile. Use `map_err(|_| fmt::Error)?` (or an equivalent mapping) so the error conversion is valid and consistent.
</issue_to_address>
### Comment 3
<location> `modules/ingestor/src/service/advisory/osv/translate.rs:63` </location>
<code_context>
PackageUrl::new(ty, name)
.map(|mut purl| {
- purl.with_namespace(namespace);
+ purl.with_namespace(namespace).unwrap();
purl
})
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Avoid `unwrap()` on `with_namespace` to prevent panics and keep error handling consistent
In `translate` you now propagate `with_namespace` errors with `?`, but here `unwrap()` reintroduces a panic on malformed namespaces. Please handle the error instead (e.g., propagate it, or map the failure to `None` for this helper) to stay consistent and avoid panics.
</issue_to_address>
### Comment 4
<location> `modules/fundamental/src/sbom/endpoints/test.rs:1727-1739` </location>
<code_context>
+
+ // Assert that both packages (generic & huggingface) return the same SBOM
+ // First huggingface...
+ let uri = format!(
+ "/api/v2/sbom/by-package?purl={}",
+ encode("pkg:huggingface/ibm-granite/[email protected]")
+ );
+ let req = TestRequest::get().uri(&uri).to_request();
+ let response: Value = app.call_and_read_body_json(req).await;
+ assert_eq!(
</code_context>
<issue_to_address>
**suggestion (testing):** Add a direct assertion that the huggingface request can be made with an already-encoded slash as well
This block covers the case where a literal `/` in the huggingface purl is translated to `%2F` in the generic purl. Another real-world edge case is when clients pre-encode the `/` in the huggingface purl itself (e.g. `pkg:huggingface/ibm-granite%[email protected]`).
Please add a third request that uses a huggingface purl with `%2F` in the name (still wrapped in `encode(...)` for the query parameter) and asserts it resolves to the same generic SBOM id. That will verify the endpoint behaves correctly whether or not the client pre-encodes the slash in the name segment.
```suggestion
// Assert that both packages (generic & huggingface) return the same SBOM
// First huggingface with a literal '/' in the name...
let uri = format!(
"/api/v2/sbom/by-package?purl={}",
encode("pkg:huggingface/ibm-granite/[email protected]")
);
let req = TestRequest::get().uri(&uri).to_request();
let response: Value = app.call_and_read_body_json(req).await;
assert_eq!(
response["items"][0]["described_by"][0]["id"],
"pkg:generic/ibm-granite%[email protected]"
);
// Then huggingface with a pre-encoded '/' in the name...
let uri = format!(
"/api/v2/sbom/by-package?purl={}",
encode("pkg:huggingface/ibm-granite%[email protected]")
);
let req = TestRequest::get().uri(&uri).to_request();
let response: Value = app.call_and_read_body_json(req).await;
assert_eq!(
response["items"][0]["described_by"][0]["id"],
"pkg:generic/ibm-granite%[email protected]"
);
// And then generic...
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2156 +/- ##
=======================================
Coverage ? 68.20%
=======================================
Files ? 376
Lines ? 21190
Branches ? 21190
=======================================
Hits ? 14453
Misses ? 5863
Partials ? 874 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
66039c2 to
01b1d43
Compare
|
@sourcery-ai review |
|
I'll release the |
I wouldn't think so. Only those with %2F in their purl names, and it's only through silently failing actions within the UI that anyone would even notice, at which point one could simply re-upload the SBOM in question. |
Can the formulate a SQL query finding such SBOMs?
Assuming that the user still has the original document. |
I'm not sure that's worth the effort. It would take me days to come up with that. :) And I've only seen it in the AIBOM's -- and I honestly think the %2F's in them is a mistake.
They can always download the original and re-ingest, right? |
e9c234d to
f520b0c
Compare
Cargo.toml
Outdated
| csaf = { git = "https://github.com/trustification/csaf-rs", rev = "17620a225744b4a18845d4f7bf63354e01109b91" } | ||
| csaf = { git = "https://github.com/trustification/csaf-rs", branch = "purl-name-encoding-fix" } | ||
| # required due to https://github.com/scm-rs/packageurl.rs/issues/28 | ||
| packageurl = { git = "https://github.com/scm-rs/packageurl.rs", rev = "d24f20f3d3c0242d88687119a5353dbc681eac2e" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC the version with the change is released. We should use it. If something is missing, let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Latest release is Sep 9: https://github.com/scm-rs/packageurl.rs/releases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, that was missing. Fixed that: https://github.com/scm-rs/packageurl.rs/releases/tag/v0.6.0
| #cpe = { git = "https://github.com/ctron/cpe-rs", rev = "c3c05e637f6eff7dd4933c2f56d070ee2ddfb44b" } | ||
| # required due to https://github.com/voteblake/csaf-rs/pull/29 | ||
| csaf = { git = "https://github.com/trustification/csaf-rs", rev = "17620a225744b4a18845d4f7bf63354e01109b91" } | ||
| csaf = { git = "https://github.com/trustification/csaf-rs", branch = "purl-name-encoding-fix" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we keep patching it, we should move it over to the scm-rs organization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, but I'd prefer to adopt https://github.com/csaf-rs/csaf instead.
This fixes guacsec#2146 and therefore indirectly resolves the downstream issue, https://issues.redhat.com/browse/TC-3090 It relies on unpublished forks of two interdependent crates: csaf-rs and packageurl.rs Ultimately, it'd be swell if we could instead depend on something less dead and more supported like https://github.com/csaf-rs/csaf
This fixes #2146 and therefore indirectly resolves the downstream issue, https://issues.redhat.com/browse/TC-3090
It relies on unpublished forks of two interdependent crates: csaf-rs and packageurl.rs
Ultimately, it'd be swell if we could instead depend on something less dead and more supported like https://github.com/csaf-rs/csaf
Summary by Sourcery
Handle percent-encoded slashes in package URLs and align SBOM and advisory translation logic with updated packageurl behavior.
Bug Fixes:
Enhancements:
Build:
Tests: