Support CycloneDX #3

Open
xaker00 wants to merge 4 commits into main from feat/process-cyclonedx

Conversation

@xaker00 xaker00 commented Oct 1, 2025

  • create pydantic models for CycloneDX v1.6 and SPDX v2.3
  • use uv for running without venv
  • remove outdated test file

Using uv is recommended but not required. The versions in requirements.txt are known to work.

@xaker00 xaker00 requested a review from Copilot October 1, 2025 18:36

Copilot AI left a comment


Pull Request Overview

Adds CycloneDX (v1.6) support alongside existing SPDX (v2.3) by introducing Pydantic models and refactoring SBOM parsing and merging logic. Also updates usage instructions to optionally run via uv and removes an outdated test file.

  • Introduces typed models for SPDX and CycloneDX and unified FDARecord export format
  • Refactors merge/dedup logic (list-based FDARecord processing) and Excel export
  • Removes prior merge test; README updated with supported formats and uv usage

Reviewed Changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| gen_sbom.py | Major refactor: adds data models, new merge/deduplicate logic, Excel export rewrite, CycloneDX support |
| README.md | Documents supported formats and uv-based execution |
| test_gen_sbom.py | Removed legacy merge test (no replacement added) |

@xaker00 xaker00 requested a review from Copilot October 1, 2025 19:08

Copilot AI left a comment


Pull Request Overview

Copilot reviewed 3 out of 4 changed files in this pull request and generated 3 comments.



- use `packaging.version` to compare versions
- minor fixes
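
For illustration, a minimal sketch of why `packaging.version` matters for this kind of comparison: plain string ordering ranks "1.9.0" above "1.10.0". The `newer_version` helper below is hypothetical and is not the PR's `newer` function; falling back to string ordering when a version cannot be parsed is an assumption, not something the PR states.

```python
from packaging.version import InvalidVersion, Version


def newer_version(a: str, b: str) -> str:
    """Return whichever version string is newer (hypothetical helper)."""
    try:
        return a if Version(a) >= Version(b) else b
    except InvalidVersion:
        # Assumption: fall back to plain string ordering when either
        # string is not a valid PEP 440 version.
        return max(a, b)


# Plain string ordering compares character by character and gets this wrong.
assert max("1.9.0", "1.10.0") == "1.9.0"

print(newer_version("1.9.0", "1.10.0"))  # prints 1.10.0
```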
@xaker00 xaker00 force-pushed the feat/process-cyclonedx branch from dfb4631 to 7c91e20 Compare October 1, 2025 19:21
VCPKG is in the process of better supporting SBOMs. In the meantime,
this is the next best solution. This allows mapping the packages to CPE
strings. There is a manual map in vcpkg.yml. Not very nice, but better
than nothing.
Mostly AI generated. Trivy and Syft/Grype are better tools. Use this
only as a last resort. There is a reason entire projects and paid
products exist to solve this problem. It is not trivial. You
can get around 80% with this tool, but the last 20% is very difficult.

@joshuatz joshuatz left a comment


@xaker00 Left some comments, just in case you were looking for feedback on this

import sys
import time

import nvdlib

I don't see this package in the requirements.txt file (I'm aware this isn't technically necessary if this file is run as a script, but this breaks standard tooling on the IDE side of things if you are only using the requirements file to install into a virtual environment).

Side note: Rather than trying to keep both systems going - old-school requirements.txt and uv / pyproject dependency management style - I would just fully commit and switch the whole repo over to uv. I think, especially with internal tools, we should take advantage of opportunities like this to switch over to new tools whenever possible.
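
For a single-file tool, one option that keeps dependencies next to the code is PEP 723 inline script metadata, which `uv run` reads when executing a script. A minimal sketch, assuming gen_sbom.py is run as a standalone script; the package list and the Python floor below are illustrative guesses based on the libraries mentioned in this conversation, not the PR's actual pinned requirements.

```python
# /// script
# requires-python = ">=3.11"  # assumption: not stated in the PR
# dependencies = [
#     "click",
#     "nvdlib",
#     "packaging",
#     "pydantic",
# ]
# ///
```

With a header like this at the top of the script, `uv run gen_sbom.py` resolves the listed packages into a temporary environment, so no separate requirements.txt install step is needed for running it.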

Comment on lines +317 to +331
@click.command()
@click.argument(
    "input_directory_path",
    type=click.Path(exists=True, file_okay=False, path_type=Path),
)
@click.argument("output_file_path", type=click.Path(dir_okay=False, path_type=Path))
@click.option("--verbose", is_flag=True, help="Enable verbose logging.")
@click.option("--author-name", type=str, default=None, help="Override the Author Name.")
@click.option("--vcpkg", is_flag=True, help="Combine VCPKG SBOMs.")
@click.option(
    "--spdx-output-file",
    type=click.Path(dir_okay=False, path_type=Path),
    default=None,
    help="Output combined SPDX SBOM file path (for VCPKG only).",
)

I'm glad to see the removal of argparse!

Btw, just because I think you might be interested in exploring this in the future - a library I like even more than click is typer. It essentially re-uses the type-annotations as validation (kinda similar to pydantic), so it cuts way down on boilerplate.

Comment on lines +226 to +235
class FDARecord(BaseModel):
    """FDA required fields."""

    author: str
    timestamp: str
    supplier: str = "Open-source software"
    name: str
    version: str
    unique_identifier: str
    relationship: Literal["Is contained by"] = "Is contained by"

For future maintainability, it would be nice to have a docstring here noting where these are coming from (e.g., Based on the ___ guidance, [link]).

Comment on lines +263 to +268
    """Merge two SBOMs, keeping the newest version of each package."""
    records = {(r.name, r.supplier): r for r in sbom1}
    for r in sbom2:
        key = (r.name, r.supplier)
        if key in records:
            records[key] = newer(records[key], r)

Hmm... from a security perspective, I'm not sure about the soundness of this.

In general, if you had to pick one, wouldn't it make more sense to use the oldest version as the record instead of the newest? I.e., [insert_thing] is only as strong as its weakest link - if we are scanning a combined sbom for vulns, I would think that the worst offenders would come from being out-of-date, rather than newly introduced vulnerabilities.

That being said, I also don't think we should just be using one or the other. CycloneDX supports declaring version ranges instead of a single version, and although I'm not sure about SPDX, I know you can use parent-child / linkage to declare the same dependency twice, with each entry belonging to a different part of the application.
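
One way to avoid picking a single winner could be to deduplicate on (name, supplier, version) so that every distinct version survives the merge. A minimal sketch under stated assumptions: `Record` is a hypothetical stand-in for a subset of `FDARecord`, and `merge_keep_all_versions` is an illustrative name, not a function in this PR.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for a subset of the PR's FDARecord fields."""

    name: str
    supplier: str
    version: str


def merge_keep_all_versions(sbom1: list[Record], sbom2: list[Record]) -> list[Record]:
    """Merge two SBOMs, keeping every distinct version of each package."""
    by_package: dict[tuple[str, str], dict[str, Record]] = defaultdict(dict)
    for record in [*sbom1, *sbom2]:
        # setdefault keeps the first record seen for a given version,
        # so exact duplicates collapse while differing versions remain.
        by_package[(record.name, record.supplier)].setdefault(record.version, record)
    return [r for versions in by_package.values() for r in versions.values()]


merged = merge_keep_all_versions(
    [Record("zlib", "Open-source software", "1.2.11")],
    [
        Record("zlib", "Open-source software", "1.3.1"),
        Record("zlib", "Open-source software", "1.2.11"),
    ],
)
print([r.version for r in merged])  # prints ['1.2.11', '1.3.1']
```

Keeping both entries means a vulnerability scan over the combined SBOM still sees the older version, which addresses the weakest-link concern without discarding the newer one.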
