
Commit dbe6f61

feat!(ISV-6786): consume new data structure for generating image SBOMs
1 parent 6b19c10 commit dbe6f61

23 files changed

Lines changed: 377 additions & 536 deletions

docs/sboms/oci_image.md

Lines changed: 9 additions & 25 deletions
@@ -9,14 +9,9 @@ generated by Hermeto (previously known as Cachi2), with the requirement that
 at least one SBOM is provided in total. It combines these SBOMs
 and takes them as a context of the built image.

-The script also parses a JSON-ified Dockerfile of the image, parses its
-content and determines which base images were used to build the image.
-It identifies builder images as well as a _parent image_ which is the
-latest image in the Dockerfile (or the base image for the stage identified
-by the build target).
-
-Additionally, you can also supply additional builder images on top of those
-already parsed from the Dockerfile.
+The script uses buildprobe (see [capo](https://github.com/konflux-ci/capo) for
+details) to determine container content and which base images were used to
+build the image.

 All provided SBOMs must be in the same specification! This script does not
 support combining SPDX and CycloneDX SBOMs.
@@ -28,25 +23,15 @@ mobster --verbose generate oci-image \
     --from-syft tests/sbom/test_merge_data/cyclonedx/syft-sboms/pip-e2e-test.bom.json \
     --from-syft tests/sbom/test_merge_data/cyclonedx/syft-sboms/ubi-micro.bom.json \
     --from-hermeto tests/sbom/test_merge_data/cyclonedx/cachi2.bom.json \
-    --image-pullspec quay.io/foobar/examplecontainer:v10 \
-    --image-digest sha256:1 \
-    --parsed-dockerfile-path tests/data/dockerfiles/somewhat_believable_sample/parsed.json \
-    --dockerfile-target build \
-    --additional-base-image quay.io/ubi9:latest@sha256:123456789012345678901234567789012
+    --metadata-path tests/data/dockerfiles/somewhat_believable_sample/metadata.yaml
 ```

 ## List of arguments
 - `--from-syft` -- points to an SBOM file (in a JSON format) created by Syft, can be used multiple times
 - `--from-hermeto` -- points to an SBOM file (in a JSON format) created by Hermeto
 - `--image-pullspec` -- the pullspec of the image processed in the format `<registry>/<repository>:<tag>`
 - `--image-digest` -- the digest of the image processed in the format `sha256:<digest value>`
-- `--parsed-dockerfile-path` -- points to a dockerfile processed by `dockerfile-json`
-- `--base-image-digest-file` -- points to a file with digests for images used in Dockerfile.
-if omitted, the references will be fetched via `oras`. The expected format of the file is
-`<registry>/<repository>:<tag> <registry>/<repository>:<tag>@sha256:<digest>`
-- `--dockerfile-target` -- if a build target was used for multi-stage build, use this argument to specify the build target
-- `--additional-base-images` -- optionally add references to other build images outside the parsed Dockerfile.
-expects the format `<registry>/<repository>:<tag>@sha256:<digest value>`
+- `--metadata-path` -- points to Dockerfile/Containerfile metadata processed by `buildprobe`
 - `--contextualize` -- Allows SBOM contextualization (see [Contextual SBOM](#contextual-sbom))
 - `--output` -- where to save the SBOM. prints it to STDOUT if this is not specified
 - `--skip-validation` -- skips validation of the SBOM
@@ -57,7 +42,7 @@ To build an SBOM with only the OCI image, you will need to run several tools to
 get prerequisite files to use in the `mobster generate` command. These two tools are:

 * Syft (https://github.com/anchore/syft): for initial scanning and SBOM generation
-* dockerfile-json (https://github.com/keilerkonzept/dockerfile-json): for
+* buildprobe (provided by capo at https://github.com/konflux-ci/capo): for
 generating a human-readable Dockerfile/Containerfile manifest

 Download and install them before this process.
@@ -71,11 +56,11 @@ Assuming you have the Containerfile, and the OCI image stored in a repository
 syft scan quay.io/konflux-ci/mobster:latest --output spdx-json > syft.json
 ```

-2. Use dockerfile-json to generate a machine-readable Dockerfile/Containerfile
+2. Use buildprobe to generate a machine-readable Dockerfile/Containerfile
 description for the tool to use:

 ```sh
-dockerfile-json ./Containerfile > containerfile.json
+buildprobe buildah --tag="quay.io/your-image:latest" --target="" --containerfile="Containerfile" > metadata.yaml
 ```

 3. Run `mobster generate` with the prerequisite files and the same OCI image
@@ -87,8 +72,7 @@ mobster generate \
     --output full-sbom.json \
     oci-image \
     --from-syft syft.json \
-    --parsed-dockerfile-path containerfile.json \
-    --image-pullspec quay.io/konflux-ci/mobster:latest
+    --metadata-path metadata.yaml
 ```

 Once the command is complete, you should see the full Mobster SBOM in

poetry.lock

Lines changed: 4 additions & 4 deletions
Some generated files are not rendered by default.

pyproject.toml

Lines changed: 1 addition & 0 deletions
@@ -36,6 +36,7 @@ dependencies = [
     "aiofiles (>=24.1.0,<25.0.0)",
     "httpx (>=0.28.1,<0.29.0)",
     "aioboto3 (>=15.2.0,<15.3.0)",
+    "pyyaml (>=6.0.3,<7.0.0)",
 ]

 [project.urls]

src/mobster/cli.py

Lines changed: 3 additions & 32 deletions
@@ -17,7 +17,7 @@
     product,
 )
 from mobster.cmd.upload import upload
-from mobster.image import ARTIFACT_PATTERN, PULLSPEC_PATTERN
+from mobster.image import PULLSPEC_PATTERN
 from mobster.release import ReleaseId

@@ -97,13 +97,6 @@ def validated_pullspec(value: str) -> str:
         )
         return value

-    def validated_additional_reference(value: str) -> str:
-        assert re.match(ARTIFACT_PATTERN, value), (
-            "Additional references must be in the format "
-            "<registry>/<repository>:<tag>@sha256:<digest>"
-        )
-        return value
-
     oci_image_parser = subparsers.add_parser(
         "oci-image", help="Generate an SBOM document for OCI image"
     )
@@ -130,31 +123,9 @@ def validated_additional_reference(value: str) -> str:
         help="Image digest for the OCI image in the format sha256:<digest>",
     )
     oci_image_parser.add_argument(
-        "--parsed-dockerfile-path",
+        "--metadata-path",
         type=Path,
-        help="Path to the parsed Dockerfile file",
-    )
-    oci_image_parser.add_argument(
-        "--base-image-digest-file",
-        type=Path,
-        help="Path to the file containing references "
-        "to images in the Dockerfile and their digests. "
-        "Expected format: "
-        "`<registry>/<repository>:<tag> <registry>/<repository>:<tag>@sha256:<digest>`",
-    )
-    oci_image_parser.add_argument(
-        "--dockerfile-target",
-        type=str,
-        help="The name of the build target from the Dockerfile",
-        default=None,
-    )
-    oci_image_parser.add_argument(
-        "--additional-base-image",
-        type=validated_additional_reference,
-        action="append",
-        default=[],
-        help="Base (builder) image to add, can be specified multiple times. "
-        "Expects the format <registry>/<repository>:<tag>@sha256:<digest value>",
+        help="Path to a metadata file generated by buildprobe.",
     )
     oci_image_parser.add_argument(
         "--arch",
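
Note on the `--metadata-path` option: the following is a minimal, standalone sketch of how an option declared with `type=Path` and no default behaves under `argparse`. The parser name `oci-image-sketch` and the helper `build_parser` are hypothetical and not part of mobster; only the option name, type, and help text are taken from the diff. It illustrates why downstream code can test `metadata_path is None` to detect that the flag was omitted.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a stand-in parser holding only the option added in this commit."""
    parser = argparse.ArgumentParser(prog="oci-image-sketch")  # hypothetical name
    parser.add_argument(
        "--metadata-path",
        type=Path,  # argparse converts the raw string into a pathlib.Path
        help="Path to a metadata file generated by buildprobe.",
    )
    return parser


# With the flag present, the value arrives as a Path object.
args = build_parser().parse_args(["--metadata-path", "metadata.yaml"])
print(args.metadata_path)

# Without the flag, argparse falls back to the implicit default of None.
no_args = build_parser().parse_args([])
print(no_args.metadata_path)
```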

src/mobster/cmd/generate/base.py

Lines changed: 1 addition & 0 deletions
@@ -20,6 +20,7 @@ def __init__(self, *args: Any, **kwargs: Any) -> None:
         super().__init__(*args, **kwargs)

         self._content: Any = None
+        self._metadata: Any = None

     @property
     def content(self) -> Any:

src/mobster/cmd/generate/oci_image/__init__.py

Lines changed: 51 additions & 43 deletions
@@ -9,6 +9,7 @@
 from pathlib import Path
 from typing import Any

+import yaml
 from cyclonedx.exception import CycloneDxException
 from spdx_tools.spdx.jsonschema.document_converter import DocumentConverter
 from spdx_tools.spdx.model.document import Document
@@ -20,11 +21,10 @@
 from mobster.cmd.generate.base import GenerateCommandWithOutputTypeSelector
 from mobster.cmd.generate.oci_image.add_image import extend_sbom_with_image_reference
 from mobster.cmd.generate.oci_image.base_images_dockerfile import (
-    extend_sbom_with_base_images_from_dockerfile,
-    get_base_images_refs_from_dockerfile,
+    extend_sbom_with_base_images,
     get_digest_for_image_ref,
-    get_image_objects_from_file,
 )
+from mobster.cmd.generate.oci_image.buildprobe import SBOMMetadata
 from mobster.cmd.generate.oci_image.contextual_sbom.builder import (
     BuilderContextualizationError,
     BuilderPkgMetadata,
@@ -48,7 +48,7 @@
 from mobster.image import Image
 from mobster.log import log_elapsed
 from mobster.sbom.merge import merge_sboms
-from mobster.utils import identify_arch, load_sbom_from_json
+from mobster.utils import load_sbom_from_json

 logging.captureWarnings(True)  # CDX validation uses `warn()`
 LOGGER = logging.getLogger(__name__)
@@ -99,6 +99,15 @@ async def _load_and_filter_hermeto_sbom(self) -> dict[str, Any]:
         arch = self.cli_args.arch or mobster.utils.identify_arch()
         return filter_hermeto_sbom_by_arch(hermeto_sbom, arch)

+    def _load_metadata(self) -> None:
+        """
+        Load a metadata file from the --metadata-path argument into
+        self._metadata.
+        """
+        with open(self.cli_args.metadata_path, encoding="utf-8") as metadata_file:
+            raw_metadata = yaml.safe_load(metadata_file)
+        self._metadata = SBOMMetadata.from_dict(raw_metadata)
+
     async def _handle_bom_inputs(
         self,
     ) -> dict[str, Any]:
@@ -113,13 +122,19 @@ async def _handle_bom_inputs(
             self.cli_args.from_hermeto is None
             and self.cli_args.from_syft is None
             and self.cli_args.image_pullspec is None
+            and self.cli_args.metadata_path is None
         ):
             raise ArgumentError(
                 None,
-                "At least one of --from-syft, --from-hermeto or --image-pullspec"
-                " must be provided",
+                "At least one of --from-syft, --from-hermeto, --image-pullspec, "
+                "or --metadata-path must be provided",
             )

+        if self.cli_args.metadata_path is not None:
+            self._load_metadata()
+            # if we don't have an sbom provided to us, use syft to generate it
+            if self.cli_args.from_syft is None and self.cli_args.from_hermeto is None:
+                return await syft.scan_image(self._metadata.image.pullspec)
         if self.cli_args.from_syft is not None:
             # Merging Syft & Hermeto SBOMs
             if len(self.cli_args.from_syft) > 1 or self.cli_args.from_hermeto:
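
Note on the input handling in this hunk: the decision it encodes can be sketched in isolation as below. This is an illustration only. The function `select_sbom_source` and its return strings are hypothetical, and the real method goes on to load, scan, or merge SBOMs rather than return a label. The sketch assumes the four attributes shown in the diff exist on the parsed arguments.

```python
from argparse import ArgumentError, Namespace


def select_sbom_source(args: Namespace) -> str:
    """Return a label for the branch the input check would take (hypothetical)."""
    if (
        args.from_hermeto is None
        and args.from_syft is None
        and args.image_pullspec is None
        and args.metadata_path is None
    ):
        # Same condition as the diff: no usable input was supplied at all.
        raise ArgumentError(
            None,
            "At least one of --from-syft, --from-hermeto, --image-pullspec, "
            "or --metadata-path must be provided",
        )
    if (
        args.metadata_path is not None
        and args.from_syft is None
        and args.from_hermeto is None
    ):
        # Metadata only: the image named in the metadata is scanned.
        return "scan-image-from-metadata"
    # Otherwise the remaining branches (not reproduced here) handle the inputs.
    return "use-provided-inputs"


only_metadata = Namespace(
    from_hermeto=None, from_syft=None, image_pullspec=None, metadata_path="metadata.yaml"
)
print(select_sbom_source(only_metadata))
```

One consequence worth noting: when both `--metadata-path` and `--from-syft` are given, the metadata is still loaded, but the supplied Syft SBOM is used instead of a fresh scan.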
@@ -228,7 +243,7 @@ async def _assess_and_dispatch_contextual_workflow(
         (non-modified) SBOM is furtherly processed by mobster.
         Args:
             component_sbom_doc: The component SBOM created for this image.
-            base_images_refs: List of references from the parsed Dockerfile.
+            base_images_refs: List of references from the build.
             image_arch: CPU architecture of this image.

         Returns:
@@ -264,11 +279,12 @@ async def execute(self) -> Any:
         """
         LOGGER.debug("Generating SBOM document for OCI image")

+        # Get/merge the raw SBOM
         merged_sbom_dict = await self._handle_bom_inputs()
         sbom: Document | CycloneDX1BomWrapper
-        image_arch = identify_arch()
+        image_arch = self.cli_args.arch or mobster.utils.identify_arch()

-        # Parsing into objects
+        # Parse into objects
         if merged_sbom_dict.get("bomFormat") == "CycloneDX":
             if self.cli_args.contextualize:
                 raise ArgumentError(
@@ -280,9 +296,32 @@ async def execute(self) -> Any:
         else:
             raise ValueError("Unknown SBOM Format!")

-        # Extending with image reference
-        if self.cli_args.image_pullspec:
-            image_arch = self.cli_args.arch or mobster.utils.identify_arch()
+        base_images_refs = []
+        base_images_map: dict[str, Image] = {}
+
+        # Extend with image reference
+        if self.cli_args.metadata_path:
+            image = Image.from_image_index_url_and_digest(
+                self._metadata.image.pullspec,
+                self._metadata.image.digest,
+                arch=image_arch,
+            )
+            await extend_sbom_with_image_reference(sbom, image, False)
+            for base_image_data in self._metadata.base_images:
+                base_image = Image.from_image_index_url_and_digest(
+                    base_image_data.pullspec,
+                    base_image_data.digest,
+                )
+                base_images_refs.append(base_image_data.pullspec)
+                base_images_map[base_image_data.pullspec] = base_image
+            await extend_sbom_with_base_images(sbom, base_images_refs, base_images_map)
+            for extra_image_data in self._metadata.extra_images:
+                extra_image = Image.from_image_index_url_and_digest(
+                    extra_image_data.pullspec,
+                    extra_image_data.digest,
+                )
+                await extend_sbom_with_image_reference(sbom, extra_image, True)
+        elif self.cli_args.image_pullspec:
             if not self.cli_args.image_digest:
                 LOGGER.info(
                     "Provided pullspec but not digest."
@@ -308,37 +347,6 @@ async def execute(self) -> Any:
                 "Provided image digest but no pullspec. The digest value is ignored."
             )

-        base_images_refs = []
-        base_images_map: dict[str, Image] = {}
-
-        # Extending with base images references from a dockerfile
-        if self.cli_args.parsed_dockerfile_path:
-            with open(
-                self.cli_args.parsed_dockerfile_path, encoding="utf-8"
-            ) as parsed_dockerfile_io:
-                parsed_dockerfile = json.load(parsed_dockerfile_io)
-
-            base_images_refs = await get_base_images_refs_from_dockerfile(
-                parsed_dockerfile, self.cli_args.dockerfile_target
-            )
-
-            if self.cli_args.base_image_digest_file:
-                LOGGER.debug(
-                    "Supplied pre-parsed image digest file, will operate offline."
-                )
-                base_images_map = await get_image_objects_from_file(
-                    self.cli_args.base_image_digest_file
-                )
-            await extend_sbom_with_base_images_from_dockerfile(
-                sbom, base_images_refs, base_images_map
-            )
-
-        # Extending with additional base images
-        for image_ref in self.cli_args.additional_base_image:
-            image_object = Image.from_oci_artifact_reference(image_ref)
-            await extend_sbom_with_image_reference(
-                sbom, image_object, is_builder_image=True
-            )
         with log_elapsed("Contextual workflow", logging.INFO):
             contextual_sbom = await self._assess_and_dispatch_contextual_workflow(
                 sbom, base_images_refs, base_images_map, image_arch