Based on a Slack discussion, I figured it'd be worth recording some things I've found in looking at the existing BB1 JLLs for GeneralMetadata.jl and how they compare with the BB2 descriptive TOML. It's a huge improvement! But I think there are three places where I have thoughts with respect to my recent security work:
- The easy one: I'd like to have a record of what build dependency versions were actually used when building a particular release, not just the `compat` bounds that constrained it. E.g., a recent Clang_jll build is missing this.
- I'd also like a record of the build script and build system. This is a bit more meta-meta, but it's helpful to know more details about how the artifacts were generated. E.g., a link to the originating Yggdrasil `build_tarballs.jl` and some information about the build stack itself.
- The hard one: what does `src_version` actually mean? Or more pointedly, what's the "src_name" that it's implicitly versioning? How can I distinguish Swift vs Swift or Atlas vs Atlas?
  - There exist BB1 Yggdrasil recipes that combine multiple projects at multiple distinct version numbers. Two examples I've found are MPItrampoline and Perl. It's more common (but still quite rare) to have different sources/versions per architecture (e.g., Git), and that works nicely with the `[[builds]]` array.
  - What I think I actually want is to have all the `[[build.source]]`s themselves identified and versioned if at all possible, and this includes patches! It's hard to describe what "identification" and "versioning" actually mean, though.
    - For identification, often the project repository or homepage URLs can be used to anchor the namespace. There are also the ontologies defined by https://repology.org. The latter can then be used to connect to other identifiers like CPEs in the NVD and/or the vendor/product in the EUVD.
    - For versioning, you can sometimes extract a version number from a download URL or a tag on a git commit, but these are both highly heuristic. You really want the version number exactly as the upstream project would use it to report a vulnerability. It's also sometimes possible for projects to report git ranges for their vulns (e.g., in the OSV spec), but that's quite rare in practice.
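
To make the "highly heuristic" point concrete, here's a rough Python sketch of the kind of URL/tag version guessing I mean. Everything in it is an assumption for illustration: `guess_version` and the regex are hypothetical (not something BB1, BB2, or GeneralMetadata.jl actually does), and the `example.org` URLs are made up.

```python
import re
from typing import Optional

# Hypothetical pattern: a dotted numeric version such as "1.2.3", optionally
# prefixed with "v". The lookbehind rejects digits that are glued to a
# preceding letter, digit, or dot (e.g. the "2" in "libfoo2").
_VERSION_RE = re.compile(r"(?<![0-9A-Za-z.])v?(\d+(?:\.\d+)+)")


def guess_version(text: str) -> Optional[str]:
    """Guess a version from a download URL or git tag; None if nothing matches.

    This is only a heuristic: it takes the last dotted number it finds, which
    may or may not be the version upstream would use in a vulnerability report.
    """
    matches = _VERSION_RE.findall(text)
    return matches[-1] if matches else None


if __name__ == "__main__":
    candidates = [
        "https://example.org/downloads/foo-1.2.3.tar.gz",        # plain tarball name
        "v5.36.0",                                               # git tag with a "v" prefix
        "https://example.org/libfoo2/libfoo2-0.9.tar.gz",        # digit in the project name
        "https://example.org/foo/archive/release-2_4.tar.gz",    # underscore-separated version
    ]
    for candidate in candidates:
        print(candidate, "->", guess_version(candidate))
```

The first three inputs happen to work, but the underscore-separated one comes back as `None`, and none of this tells you which project the number belongs to. That's why I'd rather have the version (and the identity it's versioning) recorded explicitly per source than reverse-engineered after the fact.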