diff --git a/docs/Plugin-SBOM.md b/docs/Plugin-SBOM.md new file mode 100644 index 000000000..680f0cf70 --- /dev/null +++ b/docs/Plugin-SBOM.md @@ -0,0 +1,320 @@ +--- +layout: default +title: Plugin SBOM Generator +--- + +This plugin generates a Software Bill of Materials (SBOM) in CycloneDX format for packages built with Mock. The SBOM provides detailed information about the build environment, source files, and resulting packages, optimized for security use cases. + +## Features + +* Generates SBOMs in CycloneDX 1.5 format (JSON) and SPDX 2.3 format +* Deep Chroot Integration: + * Uses the target distribution's own `rpm` binary via `doChroot` for metadata extraction, so package metadata is read by the same RPM version that produced it, regardless of the host distribution. + * Maps paths between the chroot and host environments. +* Captures detailed information about: + * Source files and patches from spec files, with a regex-based fallback for spec files that fail strict parsing (e.g., legacy syntax). + * Binary RPM metadata with standard PURL and CPE identifiers. + * Complete build toolchain packages with per-package GPG signature metadata. + * Runtime dependencies. + * File hashes (SHA-256). +* Optimized Performance: Consolidates file listing and metadata extraction into a single pass. +* Writes the SBOM to the build results directory. +* Compatible with security scanners (Grype, Trivy, Snyk). 
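The chroot-to-host path mapping mentioned in the feature list can be illustrated with a small sketch. This is not the plugin's code: `host_to_chroot_path` and its parameters are hypothetical names chosen for the example, and the sketch only assumes that the chroot lives under a single root directory on the host.

```python
import os


def host_to_chroot_path(rootdir, host_path):
    """Hypothetical helper: map an absolute host path to the path seen inside the chroot.

    Paths outside the chroot root are returned unchanged.
    """
    root = os.path.normpath(rootdir)
    path = os.path.normpath(host_path)
    if path == root:
        # The chroot root itself is "/" from inside the chroot.
        return "/"
    # Compare on a directory boundary so that "/srv/root-old" is not
    # mistaken for something inside "/srv/root".
    if path.startswith(root + os.sep):
        return path[len(root):]
    return host_path
```

Matching on a directory boundary rather than a plain string prefix is the one design choice worth noting here: it keeps sibling directories that merely share a name prefix with the chroot root from being rewritten.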
+ +## Usage + +### Basic Usage + +The simplest way to use the SBOM generator is to enable it for a single build: + +```bash +# Build a package and generate SBOM +mock --enable-plugin=sbom_generator --rebuild package.src.rpm + +# Or build from an existing SRPM +mock --enable-plugin=sbom_generator --rebuild ~/rpmbuild/SRPMS/package-1.0-1.fc42.src.rpm + +# Specify a chroot configuration +mock --enable-plugin=sbom_generator --rebuild package.src.rpm -r rocky-9-x86_64 +``` + +After the build completes, the SBOM will be available in the build results directory. + +### Viewing and Analyzing the SBOM + +The generated SBOM can be analyzed using various tools: + +```bash +# View basic SBOM information +jq '.metadata.component' sbom.cyclonedx.json +jq '.components | length' sbom.cyclonedx.json +jq '.dependencies | length' sbom.cyclonedx.json + +# List all built packages +jq '.components[] | select(.type == "library") | {name, version, purl}' sbom.cyclonedx.json + +# List source files used in the build +jq '.components[] | select(.properties[]?.name == "mock:source:type") | {name, hashes}' sbom.cyclonedx.json + +# View runtime dependencies for a specific package +jq '.dependencies[] | select(.ref | contains("httpd"))' sbom.cyclonedx.json +``` + +### Using with Security Scanners + +The SBOM can be used directly with security vulnerability scanners: + +```bash + +# Scan with SBOM Auditor +sbom-auditor sbom.cyclonedx.json + +# Scan with Grype +grype sbom:./sbom.cyclonedx.json + +# Scan with Trivy +trivy sbom sbom.cyclonedx.json + +# Export to other formats if needed +syft convert sbom.cyclonedx.json -o spdx-json > sbom.spdx.json +``` + +## Configuration + +### Enabling the Plugin + +The plugin is disabled by default. 
You can enable it in several ways: + +**Option 1: Command line (recommended for one-off builds)** +```bash +mock --enable-plugin=sbom_generator --rebuild package.src.rpm +``` + +**Option 2: Configuration file (for persistent enablement)** + +Add to your Mock configuration file (e.g., `/etc/mock/fedora-rawhide-x86_64.cfg`): + +```python +config_opts['plugin_conf']['sbom_generator_enable'] = True +config_opts['plugin_conf']['sbom_generator_opts'] = { + 'generate_sbom': True +} +``` + +**Option 3: User configuration** + +Add to `~/.config/mock/mock.cfg`: + +```python +config_opts['plugin_conf']['sbom_generator_enable'] = True +``` + +### Configuration Options + +The plugin supports several configuration options to control SBOM generation: + +```python +config_opts['plugin_conf']['sbom_generator_opts'] = { + 'generate_sbom': True, # Enable SBOM generation (default: True) + 'include_file_components': True, # Include file-level components (default: True) + 'include_file_dependencies': False, # Include file-to-package dependencies (default: False) + 'include_debug_files': False, # Include debug files in file components (default: False) + 'include_man_pages': True, # Include man pages in file components (default: True) + 'include_toolchain_dependencies': False, # Include build toolchain in dependencies (default: False) +} +``` + +**Configuration Options Explained:** + +- `include_file_components`: When enabled, creates individual file components for each file in built packages, including hashes, permissions, and ownership information. +- `include_file_dependencies`: When enabled, creates dependency relationships showing which files belong to which packages. +- `include_debug_files`: When enabled, includes debug files (`.debug`, files in `/usr/lib/debug`) in file components; they are filtered out by default. +- `include_man_pages`: When enabled (the default), includes man pages in file components; set to `False` to filter them out. 
+- `include_toolchain_dependencies`: Adds build toolchain packages to the dependencies array (useful for complete build provenance, but can make dependency graphs very large). + +## Output + +The plugin generates a file named `--.sbom` (for CycloneDX) or `--.spdx.json` (for SPDX) in the build results directory. The SBOM includes: + +* CycloneDX/SPDX document metadata + * Build timestamp + * Tool information (Mock SBOM Generator) + * Mock-specific build properties (host, distribution, chroot, config) + * RPM header metadata surfaced at the document level (buildhost, buildtime, source RPM, group, epoch, distribution, manufacture/vendor) +* Components array containing: + * Built packages (type: "library" or "application") + * Package name, version, and PURL + * CPE identifiers for vulnerability matching + * License information plus RPM summary as description + * RPM file SHA-256 hash + * Vendor, packager, buildhost, buildtime, source RPM, group, epoch, distribution metadata + * Upstream/project URLs and source RPM links via `externalReferences` + * GPG signature details + * Note: Source tarballs and patches are represented as separate file components in the components array with their own BOM refs for traceability + * Build toolchain packages (type: "library") + * All packages installed in the build environment + * Signature information + * Marked with `mock:role: "build-toolchain"` property + * Source files (type: "file") + * Source and patch files from spec + * SHA-256 hashes + * Signature information if available +* Dependencies array + * Runtime dependencies for built packages (libraries/RPMs the package depends on) + * Dependency relationships modeled using bom-refs + * Note: Source code relationships are represented in component properties and the components array, not in the dependencies section (source code is a build input, not a runtime dependency) + +## Example SBOM Structure + +```json +{ + "bomFormat": "CycloneDX", + "specVersion": "1.5", + 
"serialNumber": "urn:uuid:...", + "version": 1, + "metadata": { + "timestamp": "2024-01-19T15:20:00Z", + "tools": [ + { + "vendor": "Mock", + "name": "mock-sbom-generator", + "version": "1.0" + } + ], + "properties": [ + { "name": "mock:build:host", "value": "build.example.com" }, + { "name": "mock:build:distribution", "value": "Fedora 42" }, + { "name": "mock:build:chroot", "value": "/var/lib/mock/fedora-42-x86_64/root" }, + { "name": "mock:rpm:buildhost", "value": "builder.fedora.example.org" }, + { "name": "mock:rpm:buildtime", "value": "2024-01-19T15:15:00+00:00" }, + { "name": "mock:rpm:sourcerpm", "value": "package-name-1.0-1.fc42.src.rpm" }, + { "name": "mock:rpm:group", "value": "System Environment/Libraries" }, + { "name": "mock:rpm:epoch", "value": "1" } + ], + "manufacture": { + "name": "Fedora Project" + }, + "component": { + "type": "application", + "name": "package-name", + "version": "1.0-1.fc42", + "bom-ref": "build-output:package-name", + "description": "Package summary (build output containing 3 package(s))", + "licenses": [ + { + "license": { + "id": "MIT" + } + } + ], + "externalReferences": [ + { "type": "distribution", "url": "package-name-1.0-1.fc42.src.rpm" }, + { "type": "website", "url": "https://example.com/package-name" } + ] + } + }, + "components": [ + { + "type": "library", + "bom-ref": "pkg:rpm/fedora/package-name@1.0-1.fc42?arch=x86_64", + "name": "package-name", + "version": "1.0-1.fc42", + "purl": "pkg:rpm/fedora/package-name@1.0-1.fc42?arch=x86_64", + "externalReferences": [ + { + "type": "other", + "comment": "CPE 2.3", + "url": "cpe:2.3:a:fedora:package-name:1.0:*:*:*:*:*:*:*:*" + }, + { + "type": "website", + "url": "https://src.fedoraproject.org/rpms/package-name" + }, + { + "type": "distribution", + "url": "package-name-1.0-1.fc42.src.rpm" + } + ], + "licenses": [ + { + "license": { + "id": "MIT" + } + } + ], + "hashes": [ + { + "alg": "SHA-256", + "content": "..." 
+ } + ], + "properties": [ + { + "name": "mock:rpm:vendor", + "value": "Fedora Project" + }, + { + "name": "mock:rpm:buildhost", + "value": "builder.fedora.example.org" + }, + { + "name": "mock:rpm:buildtime", + "value": "2024-01-19T15:15:00+00:00" + }, + { + "name": "mock:rpm:sourcerpm", + "value": "package-name-1.0-1.fc42.src.rpm" + }, + { + "name": "mock:signature:type", + "value": "GPG" + } + ] + } + ], + "dependencies": [ + { + "ref": "pkg:rpm/fedora/package-name@1.0-1.fc42?arch=x86_64", + "dependsOn": [ + "pkg:rpm/fedora/glibc@2.38-1.fc42" + ] + } + ] +} +``` + +## Security Tool Compatibility + +The generated CycloneDX SBOM is compatible with popular security scanners: + +* **Grype**: `grype sbom:./sbom.cyclonedx.json` +* **Trivy**: `trivy sbom sbom.cyclonedx.json` +* **Snyk**: Supports CycloneDX format for vulnerability scanning + +The SBOM includes PURL (Package URL) and CPE identifiers for accurate vulnerability matching. + +## Requirements + +* Python 3.x +* Access to the build environment for package information +* Native `rpm` and `specfile` libraries (recommended) + +## Notes + +* The plugin runs in the `postbuild` hook, after the build completes. +* SBOM generation is skipped if no RPM, source RPM, or spec file is found. +* **Hybrid Analysis**: Uses `doChroot` to analyze artifacts within the buildroot (ensuring compatibility with target RPM versions) and host tools for artifacts already exported to the `result/` directory. +* **Resilient Parsing**: Includes a regex-based fallback for spec files that fail strict parsing by the `specfile` library (e.g., legacy `%patchN` syntax). +* **PURL format**: `pkg:rpm/{distro}/{package}@{version}?arch={arch}`. Architecture is always separated into a qualifier, never baked into the version string. +* Mock-specific metadata is stored in properties with the `mock:` prefix. 
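As a rough illustration of the PURL format described in the notes above, the following sketch assembles such an identifier. It is not the plugin's `generate_purl` implementation: `make_rpm_purl` is a hypothetical helper, and the percent-encoding and lowercasing of the distribution name are assumptions made for the example.

```python
from urllib.parse import quote


def make_rpm_purl(distro, name, version, arch=None):
    """Hypothetical helper: build pkg:rpm/{distro}/{package}@{version}?arch={arch}."""
    # Percent-encode each part so characters such as '+' do not break the identifier.
    purl = "pkg:rpm/{}/{}@{}".format(
        quote(distro.lower(), safe=""),
        quote(name, safe=""),
        quote(version, safe=""),
    )
    # Architecture goes into a qualifier and is never appended to the version.
    if arch:
        purl += "?arch=" + quote(arch, safe="")
    return purl


print(make_rpm_purl("fedora", "package-name", "1.0-1.fc42", "x86_64"))
```

Keeping the architecture in a qualifier means the same name and version can be compared across architectures without parsing the version string.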
+ +## Competitive Advantages + +This SBOM generator leverages Mock's unique build environment visibility: + +* **Complete Build Toolchain**: Captures every package installed in the build chroot, not just declared dependencies +* **Build-Time Provenance**: Records the exact build environment, including tool versions and signatures +* **RPM-Native Intelligence**: Deep integration with RPM metadata, spec files, and package signatures +* **Reproducible Build Context**: Complete build environment fingerprinting for reproducibility verification + +Available since version 6.7. \ No newline at end of file diff --git a/mock/py/mockbuild/buildroot.py b/mock/py/mockbuild/buildroot.py index 2952519d8..42fd92931 100644 --- a/mock/py/mockbuild/buildroot.py +++ b/mock/py/mockbuild/buildroot.py @@ -195,6 +195,17 @@ def make_chroot_path(self, *paths): new_path = os.path.join(new_path, path) return new_path + def from_chroot_path(self, host_path): + """Convert an absolute host path into the corresponding path inside the build chroot.""" + if not self.rootdir: + return host_path + # Match on a path boundary so sibling directories sharing the prefix are not rewritten + root = self.rootdir.rstrip("/") + if host_path == root or host_path.startswith(root + "/"): + # Strip the chroot prefix; the chroot root itself maps to "/" + return host_path[len(root):] or "/" + return host_path + @traceLog() def initialize(self, prebuild=False): """ diff --git a/mock/py/mockbuild/config.py b/mock/py/mockbuild/config.py index c4830e47f..a16e8ab0f 100644 --- a/mock/py/mockbuild/config.py +++ b/mock/py/mockbuild/config.py @@ -33,7 +33,7 @@ 'lvm_root', 'compress_logs', 'sign', 'pm_request', 'hw_info', 'procenv', 'showrc', 'rpkg_preprocessor', 'rpmautospec', 'buildroot_lock', 'export_buildroot_image', - 'unbreq', 'expand_spec', 'system_monitor'] + 'unbreq', 'expand_spec', 'sbom_generator', 'system_monitor'] def nspawn_supported(): """Detect some situations where the systemd-nspawn chroot code won't work""" @@ -264,6 +264,16 @@ def setup_default_config_opts(): 'expand_spec_opts': { 'rpmspec_opts': [], }, + 
'sbom_generator_enable': False, + 'sbom_generator_opts': { + 'generate_sbom': True, + 'include_file_components': True, + 'include_file_dependencies': False, + 'include_debug_files': False, + 'include_man_pages': True, + 'include_source_dependencies': True, + 'include_toolchain_dependencies': False, + }, 'system_monitor_enable': False, 'system_monitor_opts': { 'interval' : 2 diff --git a/mock/py/mockbuild/plugins/sbom_cyclonedx.py b/mock/py/mockbuild/plugins/sbom_cyclonedx.py new file mode 100644 index 000000000..f8366a18b --- /dev/null +++ b/mock/py/mockbuild/plugins/sbom_cyclonedx.py @@ -0,0 +1,941 @@ +# -*- coding: utf-8 -*- +# vim:expandtab:autoindent:tabstop=4:shiftwidth=4:filetype=python:textwidth=0: +# License: GPL2 or later see COPYING +# Written by Scott R. Shinn +# Copyright (C) 2026, Atomicorp, Inc. + +import os +import re +import uuid +from datetime import datetime, timezone + +""" +CycloneDX generation functions for the SBOM generator plugin. +""" + + +class CycloneDxGenerator: + """Helper class for generating CycloneDX documents.""" + + def __init__(self, rpm_helper, buildroot, conf=None): + self.rpm_helper = rpm_helper + self.buildroot = buildroot + self.conf = conf or {} + + # Configuration options for file-level dependencies and filtering + self.include_file_dependencies = self.conf.get("include_file_dependencies", False) + self.include_file_components = self.conf.get("include_file_components", True) + self.include_debug_files = self.conf.get("include_debug_files", False) + self.include_man_pages = self.conf.get("include_man_pages", True) + self.include_toolchain_dependencies = self.conf.get( + "include_toolchain_dependencies", False + ) + + def create_built_package_component( + self, rpm_path, distro_obj, _source_components=None + ): + """Creates a CycloneDX component for a built RPM package.""" + package_data = self.rpm_helper.get_rpm_metadata(rpm_path) + if not package_data: + self.buildroot.root_log.debug(f"[SBOM] FAILED to get metadata for 
{rpm_path}, skipping component") + return None + + package_name = package_data.get("name") + version = package_data.get("version") + release = package_data.get("release") + arch = package_data.get("arch") + + # Combine version and release + full_version = f"{version}-{release}" if release else version + + # Generate PURL and bom-ref + purl = self.rpm_helper.generate_purl(package_name, full_version, distro_obj, arch) + bom_ref = purl + + # Determine component type (application vs library) + component_type = "library" + + component = { + "type": component_type, + "bom-ref": bom_ref, + "name": package_name, + "version": full_version, + "purl": purl + } + + # Add external references (CPE) + vendor = package_data.get("vendor") + cpe = self.rpm_helper.generate_cpe(package_name, version, vendor=vendor) + if cpe: + component["externalReferences"] = [ + { + "type": "other", + "comment": "CPE 2.3", + "url": cpe + } + ] + + # Add license information + license_str = package_data.get("license") + if license_str and license_str != "(none)": + component["licenses"] = [{"expression": license_str}] + + # Add supplier information (from Packager field) + packager = package_data.get("packager") + if packager and packager != "(none)": + component["supplier"] = {"name": packager} + + # Add properties for RPM metadata + properties = [] + + properties.append({ + "name": "mock:rpm:filename", + "value": os.path.basename(rpm_path) + }) + + vendor = package_data.get("vendor") + if vendor and vendor != "(none)": + properties.append({"name": "mock:rpm:vendor", "value": vendor}) + + packager = package_data.get("packager") + if packager and packager != "(none)": + properties.append({"name": "mock:rpm:packager", "value": packager}) + + buildhost = package_data.get("buildhost") + if buildhost and buildhost != "(none)": + properties.append({"name": "mock:rpm:buildhost", "value": buildhost}) + + buildtime_iso = self.format_epoch_timestamp(package_data.get("buildtime")) + if buildtime_iso: + 
properties.append({"name": "mock:rpm:buildtime", "value": buildtime_iso}) + + group = package_data.get("group") + if group and group != "(none)": + properties.append({"name": "mock:rpm:group", "value": group}) + + epoch_val = package_data.get("epoch") + if epoch_val and epoch_val != "(none)": + properties.append({"name": "mock:rpm:epoch", "value": epoch_val}) + + distribution = package_data.get("distribution") + if distribution and distribution != "(none)": + properties.append({"name": "mock:rpm:distribution", "value": distribution}) + + url = package_data.get("url") + if url and url != "(none)": + component["externalReferences"] = component.get("externalReferences", []) + component["externalReferences"].append({"type": "website", "url": url}) + + summary = package_data.get("summary") + if summary and summary != "(none)": + component["description"] = summary + + # Add GPG signature information if available + signature = self.rpm_helper.get_rpm_signature(rpm_path) + if signature: + # Parse signature info + sig_props = self.parse_signature_to_properties(signature) + properties.extend(sig_props) + + if properties: + component["properties"] = properties + + return component + + def parse_signature_to_properties(self, signature_string): + """Parses RPM signature string into CycloneDX properties.""" + properties = [] + if not signature_string or signature_string == "(none)": + return properties + + properties.append({"name": "mock:signature:type", "value": "GPG"}) + + if "RSA/SHA256" in signature_string: + properties.append({"name": "mock:signature:algorithm", "value": "RSA/SHA256"}) + elif "DSA/SHA1" in signature_string: + properties.append({"name": "mock:signature:algorithm", "value": "DSA/SHA1"}) + elif "ECDSA/SHA256" in signature_string: + properties.append({"name": "mock:signature:algorithm", "value": "ECDSA/SHA256"}) + elif "Ed25519/SHA256" in signature_string: + properties.append({"name": "mock:signature:algorithm", "value": "Ed25519/SHA256"}) + + key_id_match = 
re.search(r'Key ID ([0-9a-fA-F]+)', signature_string) + if key_id_match: + properties.append({"name": "mock:signature:key", "value": key_id_match.group(1)}) + + date_match = re.search( + r'([A-Za-z]{3} [A-Za-z]{3}\s+\d{1,2} \d{2}:\d{2}:\d{2} \d{4})', signature_string + ) + if date_match: + properties.append({"name": "mock:signature:date", "value": date_match.group(1)}) + + properties.append({"name": "mock:signature:raw", "value": signature_string}) + return properties + + def signature_info_to_properties(self, signature_info): + """Converts signature info dict to CycloneDX properties.""" + properties = [] + sig_type = signature_info.get("signature_type", "unsigned") + properties.append({"name": "mock:signature:type", "value": sig_type}) + + if ( + sig_type not in ('unsigned', 'unknown') and + 'missing key' not in sig_type and + 'BAD' not in sig_type + ): + algorithm = signature_info.get("signature_algorithm") + if algorithm: + properties.append({"name": "mock:signature:algorithm", "value": algorithm}) + + key_id = signature_info.get("signature_key") + if key_id: + properties.append({"name": "mock:signature:key", "value": key_id}) + + sig_date = signature_info.get("signature_date") + if sig_date: + properties.append({"name": "mock:signature:date", "value": sig_date}) + + sig_valid = signature_info.get("signature_valid", False) + properties.append({"name": "mock:signature:valid", "value": str(sig_valid).lower()}) + + raw_data = signature_info.get("raw_signature_data") + if raw_data: + properties.append({"name": "mock:signature:raw", "value": raw_data}) + + return properties + + def create_cyclonedx_document(self): + """Initializes the base CycloneDX JSON structure.""" + return { + "bomFormat": "CycloneDX", + "specVersion": "1.5", + "serialNumber": f"urn:uuid:{uuid.uuid4()}", + "version": 1, + "metadata": {}, + "components": [], + "dependencies": [] + } + + def generate_bom_ref(self, package_name, version, _component_type="package"): + """Generates a stable bom-ref ID 
based on package name and version.""" + safe_name = re.sub(r'[^a-zA-Z0-9.-]', '-', package_name) + safe_version = re.sub(r'[^a-zA-Z0-9.-]', '-', version) + return f"build-output:{safe_name}-{safe_version}" + + def generate_file_bom_ref(self, package_name, package_version, file_path): + """Generates a unique but stable bom-ref for a file.""" + safe_name = re.sub(r'[^a-zA-Z0-9.-]', '-', package_name) + safe_version = re.sub(r'[^a-zA-Z0-9.-]', '-', package_version) + safe_path = re.sub(r'[^a-zA-Z0-9.-]', '-', file_path.lstrip('/')) + return f"file:{safe_name}-{safe_version}:{safe_path}" + + def add_source_components(self, _bom, source_files): + """Adds source files (from spec) to the components list.""" + source_components = [] + source_component_entries = [] + for src_file in source_files: + file_comp = self.create_source_file_component(src_file) + _bom["components"].append(file_comp) + source_components.append(file_comp) + source_component_entries.append({ + "filename": src_file["filename"], + "bom-ref": file_comp["bom-ref"] + }) + return source_components, source_component_entries + + def create_source_file_component(self, source_file): + """Creates a CycloneDX component for a source file.""" + filename = source_file["filename"] + sha256 = source_file.get("sha256") + sig = source_file.get("digital_signature") + + safe_name = re.sub(r'[^a-zA-Z0-9.-]', '-', filename) + hash_suffix = sha256[:8] if sha256 else "unknown" + bom_ref = f"source-file:{safe_name}-{hash_suffix}" + + comp = { + "type": "file", + "bom-ref": bom_ref, + "name": filename, + "properties": [ + {"name": "mock:source:type", "value": "patch" if self.is_patch_file(filename) else "archive"} + ] + } + if sha256: + comp["hashes"] = [{"alg": "SHA-256", "content": sha256}] + if sig: + comp["properties"].append({"name": "mock:signature:status", "value": sig}) + + return comp + + def is_patch_file(self, filename): + """Determines if a file is a patch file based on common extensions.""" + patch_extensions = 
['.patch', '.diff'] + return any(filename.lower().endswith(ext) for ext in patch_extensions) + + def format_epoch_timestamp(self, epoch_value): + """Converts an epoch integer to an ISO 8601 timestamp string.""" + try: + val_int = int(epoch_value) + dt = datetime.fromtimestamp(val_int, timezone.utc) + return dt.isoformat() + except (ValueError, TypeError): + return "" + + def append_source_properties(self, properties, source_entries): + """Appends source and patch references to a component's properties.""" + for i, src in enumerate(source_entries): + filename = src["filename"] + prop_name = f"mock:source:patch{i}" if self.is_patch_file(filename) else f"mock:source:file{i}" + properties.append({ + "name": prop_name, + "value": src["bom-ref"] + }) + + def get_source_file_bom_refs(self, _package_name, source_files): + """Returns a list of bom-refs for source files.""" + refs = [] + for src in source_files: + filename = src["filename"] + sha256 = src.get("sha256") + safe_name = re.sub(r'[^a-zA-Z0-9.-]', '-', filename) + hash_suffix = sha256[:8] if sha256 else "unknown" + bom_ref = f"source-file:{safe_name}-{hash_suffix}" + refs.append(bom_ref) + return refs + + def get_iso_timestamp(self): + """Returns the current UTC time in ISO 8601 format.""" + return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ") + + def create_dependency(self, bom_ref, dependencies, component_map, distro_obj): + """Creates a dependency entry mapping raw requires to parsed bom-refs.""" + dep_entry = { + "ref": bom_ref, + "dependsOn": [] + } + for raw_dep in dependencies: + target_ref = self.dependency_to_bom_ref(raw_dep, component_map, distro_obj) + if target_ref and target_ref not in dep_entry["dependsOn"] and target_ref != bom_ref: + dep_entry["dependsOn"].append(target_ref) + + return dep_entry if dep_entry["dependsOn"] else None + + def dependency_to_bom_ref(self, dependency_string, component_map, _distro): + """ + Attempts to map a raw RPM dependency string (e.g., 'libc.so.6', 'bash 
>= 4.0') + to a concrete bom-ref in the component_map. + """ + if not dependency_string: + return None + + clean_dep = dependency_string.strip() + + if " " in clean_dep: + # Handle forms like 'bash >= 5.0' -> just look for 'bash' + pkg_name = clean_dep.split()[0].strip() + # If the requirement is a package name we know about + if pkg_name in component_map: + return component_map[pkg_name] + + # Sometimes dependencies look like 'config(bash) = 5.0' + if clean_dep.startswith("config(") and ")" in clean_dep: + inner_name = clean_dep[7:clean_dep.find(")")] + if inner_name in component_map: + return component_map[inner_name] + + else: + # Handle raw names like 'bash' or 'libc.so.6' + if clean_dep in component_map: + return component_map[clean_dep] + + # Check if any component *provides* this string (this is an approximation, + # true resolution requires full RPM capability mapping which is slow) + # For now, we rely on the direct package name match which covers 80% of cases. + + return None + def process_built_packages(self, bom, rpm_files, build_dir, distro_id, + source_component_entries, build_subject_name, + build_toolchain_packages, toolchain_bom_refs): + """Processes binary RPMs and creates structured CycloneDX components and dependencies.""" + built_package_bom_refs = [] + all_built_components = [] + component_map = {} + primary_rpm_metadata = None + + # Build component map from toolchain packages + for toolchain_pkg in build_toolchain_packages: + pkg_name = toolchain_pkg.get("name") + pkg_version = toolchain_pkg.get("version") + if pkg_name and pkg_version: + purl = self.rpm_helper.generate_purl(pkg_name, pkg_version, distro_id) + component_map[pkg_name.lower()] = purl + + for rpm_file in rpm_files: + rpm_path = os.path.join(build_dir, rpm_file) + component = self.create_built_package_component( + rpm_path, distro_id, source_component_entries + ) + if not component: + continue + + bom_ref = component.get("bom-ref") + package_name = component.get("name") + 
package_version = component.get("version") + + if bom_ref: + built_package_bom_refs.append(bom_ref) + if package_name: + component_map[package_name.lower()] = bom_ref + + bom["components"].append(component) + + # Determine primary RPM metadata + if not primary_rpm_metadata: + if not package_name or 'debuginfo' not in package_name.lower(): + primary_rpm_metadata = self.rpm_helper.get_rpm_metadata(rpm_path) + else: + current_name = primary_rpm_metadata.get('name', '').lower() + is_current_debuginfo = 'debuginfo' in current_name + should_replace = False + if (is_current_debuginfo and package_name and + 'debuginfo' not in package_name.lower()): + should_replace = True + elif (build_subject_name and package_name and + package_name.lower() == build_subject_name.lower()): + should_replace = True + + if should_replace: + self.buildroot.root_log.debug(f"[SBOM] Selecting {package_name} as primary metadata source") + primary_rpm_metadata = self.rpm_helper.get_rpm_metadata(rpm_path) + + # File components + if package_name and package_version and self.include_file_components: + # Extract CPE and GPG info from the component to pass to files + rpm_cpe = None + for ext_ref in component.get("externalReferences", []): + if ext_ref.get("comment") == "CPE 2.3": + rpm_cpe = ext_ref.get("url") + + rpm_gpg = None + for prop in component.get("properties", []): + if prop.get("name") == "mock:signature:key": + rpm_gpg = prop.get("value") + + file_components = self.create_file_components( + rpm_path, package_name, package_version, + rpm_cpe=rpm_cpe, rpm_gpg=rpm_gpg + ) + + if file_components: + if "components" not in component: + component["components"] = [] + + for file_comp in file_components: + # Set scope to required for all files in the produced RPM + file_comp["scope"] = "required" + component["components"].append(file_comp) + + if self.should_include_file_dependency(file_comp.get("name", "")): + bom["dependencies"].append({ + "ref": file_comp["bom-ref"], + "dependsOn": [bom_ref] + }) 
+ + # Sort file components alphabetically + component["components"].sort(key=lambda x: x.get("name", "")) + + # Dependencies + dependencies = self.rpm_helper.get_rpm_dependencies(rpm_path) or [] + runtime_dependency = self.create_dependency( + bom_ref, dependencies, component_map, distro_id + ) + + all_depends_on = [] + if runtime_dependency and runtime_dependency.get("dependsOn"): + all_depends_on.extend(runtime_dependency.get("dependsOn")) + + if self.include_toolchain_dependencies and toolchain_bom_refs: + for t_ref in toolchain_bom_refs: + if t_ref not in all_depends_on: + all_depends_on.append(t_ref) + + all_depends_on = sorted(list(set(all_depends_on))) + if all_depends_on: + bom["dependencies"].append({"ref": bom_ref, "dependsOn": all_depends_on}) + elif runtime_dependency: + bom["dependencies"].append(runtime_dependency) + + all_built_components.append(component) + + return built_package_bom_refs, primary_rpm_metadata, all_built_components + + # pylint: disable=too-many-arguments,too-many-locals,too-many-branches,too-many-statements,too-many-positional-arguments + + def finalize_bom_metadata(self, bom, primary_rpm_metadata, built_package_bom_refs, + build_subject_name, build_subject_version, + build_subject_release, distro_id, spec_metadata=None): + """Finalizes BOM metadata, sets the primary component, and adds RPM properties.""" + # Add BuildRequires and Requires from spec if available + if spec_metadata: + metadata_props = [] + build_reqs = spec_metadata.get("build_requires", []) + if build_reqs: + metadata_props.append({ + "name": "mock:spec:build_requires", + "value": ",".join(build_reqs) + }) + + reqs = spec_metadata.get("requires", []) + if reqs: + metadata_props.append({ + "name": "mock:spec:requires", + "value": ",".join(reqs) + }) + + if metadata_props: + bom["metadata"]["properties"] = bom["metadata"].get("properties", []) + bom["metadata"]["properties"].extend(metadata_props) + + if primary_rpm_metadata: + if "properties" not in bom["metadata"]: 
+ bom["metadata"]["properties"] = [] + rpm_props = bom["metadata"]["properties"] + for key, prop_name in [("buildhost", "mock:rpm:buildhost"), + ("buildtime", "mock:rpm:buildtime"), + ("group", "mock:rpm:group"), + ("epoch", "mock:rpm:epoch"), + ("distribution", "mock:rpm:distribution")]: + val = primary_rpm_metadata.get(key) + if val and val != "(none)" and (key != "epoch" or val.strip()): + rpm_props.append({"name": prop_name, "value": val}) + + vendor = primary_rpm_metadata.get("vendor") + if vendor and vendor != "(none)": + bom["metadata"]["manufacturer"] = {"name": vendor} + bom["metadata"]["authors"] = [{"name": vendor}] + + packager = primary_rpm_metadata.get("packager") + if packager and packager != "(none)": + bom["metadata"]["supplier"] = {"name": packager} + + if built_package_bom_refs: + if len(built_package_bom_refs) == 1: + primary_ref = built_package_bom_refs[0] + primary_component = next((c for c in bom["components"] + if c.get("bom-ref") == primary_ref), None) + if primary_component: + component_obj = { + "type": primary_component.get("type", "application"), + "name": primary_component.get("name"), + "version": primary_component.get("version"), + "bom-ref": primary_ref, + "purl": primary_component.get("purl") + } + if primary_component.get("description"): + component_obj["description"] = primary_component.get("description") + elif primary_rpm_metadata: + summary = primary_rpm_metadata.get("summary") + if summary and summary != "(none)": + component_obj["description"] = summary + + external_refs = [] + if primary_rpm_metadata: + sourcerpm = primary_rpm_metadata.get("sourcerpm") + if sourcerpm and sourcerpm != "(none)": + external_refs.append({"type": "distribution", "url": sourcerpm}) + url = primary_rpm_metadata.get("url") + if url and url != "(none)": + external_refs.append({"type": "website", "url": url}) + if external_refs: + component_obj["externalReferences"] = external_refs + + if primary_component.get("licenses"): + component_obj["licenses"] 
= primary_component.get("licenses") + elif primary_rpm_metadata: + lic = primary_rpm_metadata.get("license") + if lic and lic != "(none)": + component_obj["licenses"] = [{"expression": lic}] + bom["metadata"]["component"] = component_obj + else: + first_pkg = next((c for c in bom["components"] + if c.get("bom-ref") == built_package_bom_refs[0]), None) + if first_pkg: + aggregate_name = build_subject_name or first_pkg.get("name", "unknown") + aggregate_version = None + if build_subject_version and build_subject_release: + aggregate_version = f"{build_subject_version}-{build_subject_release}" + elif primary_rpm_metadata: + v = primary_rpm_metadata.get("version") + r = primary_rpm_metadata.get("release") + if v and r: + aggregate_version = f"{v}-{r}" + if not aggregate_version: + aggregate_version = first_pkg.get("version", "unknown") + + description = ( + f"Build output containing {len(built_package_bom_refs)} package(s)" + ) + if primary_rpm_metadata: + summary = primary_rpm_metadata.get("summary") + if summary and summary != "(none)": + description = f"{summary} ({description})" + + component_obj = { + "type": "application", + "name": aggregate_name, + "version": aggregate_version, + "bom-ref": f"build-output:{aggregate_name}", + "description": description + } + if primary_rpm_metadata: + lic = primary_rpm_metadata.get("license") + if lic and lic != "(none)": + component_obj["licenses"] = [{"expression": lic}] + elif spec_metadata and spec_metadata.get("license"): + component_obj["licenses"] = [{"expression": spec_metadata["license"]}] + + if aggregate_name and aggregate_version: + component_obj["purl"] = self.rpm_helper.generate_purl( + aggregate_name, aggregate_version, distro_id + ) + bom["metadata"]["component"] = component_obj + + # pylint: disable=too-many-locals,too-many-branches,too-many-statements + + def finalize_dependencies(self, bom, source_component_entries, + build_toolchain_packages, distro_id, + built_package_bom_refs, toolchain_bom_refs, + 
spec_metadata=None, + source_components=None, + toolchain_components=None, + all_built_components=None): + """Finalizes BOM dependencies, linking primary package to hierarchical grouping components + and implementing nested component composition.""" + # Find primary component ref (metadata.component or first built package) + primary_ref = None + if bom.get("metadata") and bom["metadata"].get("component"): + primary_ref = bom["metadata"]["component"].get("bom-ref") + + if not primary_ref: + return + + # Create virtual grouping references + inputs_ref = "build:inputs" + toolchain_ref = "build:toolchain" + outputs_ref = "build:outputs" + + # Prepare grouping components + inputs_group = { + "type": "application", + "bom-ref": inputs_ref, + "name": "Build Inputs", + "description": "Source code and patches used for the build", + "properties": [{"name": "mock:type", "value": "grouping-node"}] + } + if source_components: + inputs_group["components"] = sorted(source_components, key=lambda x: x.get("name", "")) + + toolchain_group = { + "type": "application", + "bom-ref": toolchain_ref, + "name": "Build Toolchain", + "description": "Packages and tools used to perform the build", + "scope": "excluded", # Tools are not part of the runtime payload + "properties": [{"name": "mock:type", "value": "grouping-node"}] + } + if toolchain_components: + # Group toolchain components by their GPG Key ID + signer_groups = {} + pkg_map = {p.get("name"): p for p in build_toolchain_packages} + + for comp in toolchain_components: + comp["scope"] = "excluded" + pkg_info = pkg_map.get(comp.get("name")) + sig_info = pkg_info.get("digital_signature", {}) if pkg_info else {} + key_id = sig_info.get("signature_key", "unsigned") + + # Attach signature properties to the individual package component + if sig_info: + sig_props = self.signature_info_to_properties(sig_info) + comp["properties"] = comp.get("properties", []) + comp["properties"].extend([p for p in sig_props if p["name"] != 
"mock:signature:raw"]) + + if key_id not in signer_groups: + # Create group properties - common only to the signer + group_props = [ + {"name": "mock:role", "value": "build-toolchain"}, + {"name": "mock:type", "value": "signer-group"}, + {"name": "mock:signature:key", "value": key_id} + ] + + signer_groups[key_id] = { + "type": "application", + "bom-ref": f"signer:{key_id}", + "name": f"Packages signed by {key_id}" if key_id != "unsigned" else "Unsigned Packages", + "scope": "excluded", + "properties": group_props, + "components": [] + } + signer_groups[key_id]["components"].append(comp) + + # Add signer groups as children of toolchain_group + sorted_groups = sorted( + list(signer_groups.values()), + key=lambda x: x.get("name", "") + ) + for group in sorted_groups: + group["components"].sort(key=lambda x: x.get("name", "")) + + toolchain_group["components"] = sorted_groups + + outputs_group = { + "type": "application", + "bom-ref": outputs_ref, + "name": "RPM Contents", + "description": "RPM packages and their contained files produced by the build", + "scope": "required", + "properties": [{"name": "mock:type", "value": "grouping-node"}] + } + if all_built_components: + outputs_group["components"] = sorted(all_built_components, key=lambda x: x.get("name", "")) + + # Nest groups into the primary component + primary_comp = bom["metadata"]["component"] + primary_comp["components"] = [inputs_group, toolchain_group, outputs_group] + # Sort metadata components alphabetically + primary_comp["components"].sort(key=lambda x: x.get("name", "")) + + # 1. Primary component depends on the three groups + bom["dependencies"].append({ + "ref": primary_ref, + "dependsOn": sorted([inputs_ref, toolchain_ref, outputs_ref]) + }) + + # 2. 
Build Inputs Group -> Source components + input_deps = [] + for entry in source_component_entries: + if entry.get("bom-ref"): + input_deps.append(entry["bom-ref"]) + + if input_deps: + bom["dependencies"].append({ + "ref": inputs_ref, + "dependsOn": sorted(list(set(input_deps))) + }) + + # 3. Build Toolchain Group -> Signer Groups + signer_refs = [g["bom-ref"] for g in toolchain_group.get("components", [])] + if signer_refs: + bom["dependencies"].append({ + "ref": toolchain_ref, + "dependsOn": sorted(signer_refs) + }) + + # 3b. Signer Groups -> Individual packages + # "components" is only set when toolchain components exist, so use .get() to avoid a KeyError + for group in toolchain_group.get("components", []): + pkg_refs = [c["bom-ref"] for c in group["components"]] + bom["dependencies"].append({ + "ref": group["bom-ref"], + "dependsOn": sorted(pkg_refs) + }) + + # 4. RPM Contents Group -> Built RPMs (Packages) + if built_package_bom_refs: + bom["dependencies"].append({ + "ref": outputs_ref, + "dependsOn": sorted(list(set(built_package_bom_refs))) + }) + + + + def create_toolchain_component(self, toolchain_pkg, distro_obj): + """Creates a CycloneDX component for a build toolchain package.""" + package_name = toolchain_pkg.get("name") + version = toolchain_pkg.get("version") + + if not package_name or not version: + return None + + # Generate PURL and bom-ref + purl = self.rpm_helper.generate_purl(package_name, version, distro_obj, arch=toolchain_pkg.get("arch")) + bom_ref = purl + + component = { + "type": "library", + "bom-ref": bom_ref, + "name": package_name, + "version": version, + "purl": purl + } + + # Add checksum - REMOVED per user request to only have hashes for files contained in RPM + # (This follows the rule that only the 'RPM Contents' section should have hashes) + # checksum = toolchain_pkg.get("checksum") + # if checksum and checksum != "error" and not checksum.startswith("error"): + # if len(checksum) == 64: + # alg = "SHA-256" + # elif len(checksum) == 40: + # alg = "SHA-1" + # else: + # alg = "SHA-256" + # component["hashes"] = [{"alg": alg, 
"content": checksum}] + + # Add CPE + cpe = toolchain_pkg.get("cpe") + if cpe: + component["externalReferences"] = [ + { + "type": "other", + "comment": "CPE 2.3", + "url": cpe + } + ] + + # Add license + license_str = toolchain_pkg.get("licenseDeclared") + if license_str and license_str != "(none)": + component["licenses"] = [ + { + "expression": license_str + } + ] + + # Add properties + properties = [] + + # Add build date if available + signature_info = toolchain_pkg.get("digital_signature", {}) + build_date = signature_info.get("build_date") + if build_date: + properties.append({ + "name": "mock:build:date", + "value": build_date + }) + + if properties: + component["properties"] = properties + + return component + + + def create_file_components(self, rpm_path, package_name, package_version, + rpm_cpe=None, rpm_gpg=None): + """Creates file components for all files in an RPM package.""" + if not self.include_file_components: + return [] + + file_info = self.rpm_helper.get_rpm_file_info(rpm_path) + if not file_info: + return [] + + file_list = sorted(file_info.keys()) + + file_components = [] + for file_path in file_list: + if not file_path or not file_path.strip(): + continue + + # Skip debug files (debuginfo trees and *.debug payloads) unless configured otherwise + if not self.include_debug_files and ( + "/usr/lib/debug/" in file_path + or "/usr/src/debug/" in file_path + or file_path.endswith(".debug") + ): + self.buildroot.root_log.debug(f"[SBOM] Filtering debug file: {file_path}") + continue + + # Skip man pages unless configured otherwise + if not self.include_man_pages and ("/usr/share/man/" in file_path): + self.buildroot.root_log.debug(f"[SBOM] Filtering man page: {file_path}") + continue + + file_data = file_info.get(file_path, {}) + file_hash = file_data.get("hash") + algo_id = file_data.get("algo") + + bom_ref = self.generate_file_bom_ref(package_name, package_version, file_path) + component = { + "type": "file", + "bom-ref": bom_ref, + "name": 
file_path + } + + # Add hash if available with detected algorithm + if file_hash: + # Map RPM algo ID to CycloneDX algo name + # 8: SHA-256, 10: SHA-512, 1: MD5, 2: SHA-1 + algo_map = { + 8: "SHA-256", + 10: "SHA-512", + 1: "MD5", + 2: "SHA-1", + 9: "SHA-384", + 11: "SHA-224" + } + alg_name = algo_map.get(algo_id, "SHA-256") + + component["hashes"] = [ + { + "alg": alg_name, + "content": file_hash + } + ] + + # Add properties for file metadata + properties = [] + if file_data.get("permissions"): + properties.append({ + "name": "mock:file:permissions", + "value": file_data["permissions"] + }) + if file_data.get("owner"): + properties.append({ + "name": "mock:file:owner", + "value": file_data["owner"] + }) + if file_data.get("group"): + properties.append({ + "name": "mock:file:group", + "value": file_data["group"] + }) + + if rpm_cpe: + properties.append({ + "name": "mock:package:cpe", + "value": rpm_cpe + }) + if rpm_gpg: + properties.append({ + "name": "mock:package:gpg:key", + "value": rpm_gpg + }) + + if properties: + component["properties"] = properties + + file_components.append(component) + + return file_components + + + def should_include_file_dependency(self, file_path): + """Determine if a file should have a dependency entry.""" + if not self.include_file_dependencies: + return False + + # Filter out debug files if configured + if not self.include_debug_files: + if '/usr/lib/debug/' in file_path or file_path.endswith('.debug'): + return False + + # Filter out man pages if configured + if not self.include_man_pages: + if ( + '/usr/share/man/' in file_path or + (file_path.endswith('.gz') and '/man' in file_path) + ): + return False + + return True + diff --git a/mock/py/mockbuild/plugins/sbom_generator.py b/mock/py/mockbuild/plugins/sbom_generator.py new file mode 100644 index 000000000..c506cb018 --- /dev/null +++ b/mock/py/mockbuild/plugins/sbom_generator.py @@ -0,0 +1,732 @@ +# -*- coding: utf-8 -*- +# 
vim:expandtab:autoindent:tabstop=4:shiftwidth=4:filetype=python:textwidth=0: +# License: GPL2 or later see COPYING +# Written by Scott R. Shinn +# Copyright (C) 2026, Atomicorp, Inc. +"""Mock plugin for generating CycloneDX SBOMs from built RPM packages.""" + +from mockbuild.plugins.sbom_utils import RpmQueryHelper +from mockbuild.plugins.sbom_spdx import SpdxGenerator +from mockbuild.plugins.sbom_cyclonedx import CycloneDxGenerator +import os +import json +import subprocess +import socket +import traceback +from datetime import datetime, timezone + + + + + +from mockbuild.trace_decorator import traceLog + +# pylint: disable=invalid-name +requires_api_version = "1.1" # Ensure compatibility with mock API +# pylint: enable=invalid-name + +# Plugin entry point +@traceLog() +def init(plugins, conf, buildroot): + """Initializes the SBOM generator plugin.""" + # Ensure configuration exists for the plugin + if "type" in conf and conf["type"] not in ("cyclonedx", "spdx"): + # We only support cyclonedx and spdx for now + buildroot.root_log.warning( + f"SBOM generator type '{conf['type']}' not supported, defaulting to 'cyclonedx'" + ) + conf["type"] = "cyclonedx" + + SBOMGenerator(plugins, conf, buildroot) + +class SBOMGenerator: + """Generates SBOM for the built packages.""" + # pylint: disable=too-few-public-methods,too-many-instance-attributes + @traceLog() + def __init__(self, plugins, conf, buildroot): + + self.buildroot = buildroot + self.conf = conf + self.rpm_helper = RpmQueryHelper(self.buildroot) + self.spdx_gen = SpdxGenerator(self.rpm_helper, self.buildroot, conf=self.conf) + self.cdx_gen = CycloneDxGenerator(self.rpm_helper, self.buildroot, conf=self.conf) + self.state = buildroot.state + self.rootdir = buildroot.rootdir + self.builddir = buildroot.builddir + self.sbom_enabled = self.conf.get('generate_sbom', True) + self.sbom_type = self.conf.get('type', 'cyclonedx') + self.sbom_done = False + + # Configuration options for file-level dependencies and filtering 
+ self.include_file_dependencies = self.conf.get('include_file_dependencies', False) + self.include_file_components = self.conf.get('include_file_components', True) + self.include_debug_files = self.conf.get('include_debug_files', False) + self.include_man_pages = self.conf.get('include_man_pages', True) + self.include_source_dependencies = self.conf.get('include_source_dependencies', True) + self.include_toolchain_dependencies = self.conf.get('include_toolchain_dependencies', False) + + self.prebuild_source_files = [] + self.prebuild_spec_metadata = {} + + plugins.add_hook("prebuild", self._capture_prebuild_state) + plugins.add_hook("postbuild", self._generate_sbom_post_build_hook) + + @traceLog() + def _capture_prebuild_state(self): + """Captures pristine source artifacts before the build begins.""" + + self.buildroot.root_log.debug("Capturing pre-build state from SPECS and SOURCES") + + # Look for spec file in the build directory + specs_dir = os.path.join(self.buildroot.rootdir, "builddir/build/SPECS") + try: + if os.path.exists(specs_dir): + for file in os.listdir(specs_dir): + if file.endswith('.spec'): + spec_file = os.path.join(specs_dir, file) + self.buildroot.root_log.debug(f"Parsing spec file for pre-build state: {spec_file}") + metadata, sources = self.rpm_helper.parse_spec_file(spec_file) + self.prebuild_spec_metadata = metadata + self.prebuild_source_files = sources + break + else: + self.buildroot.root_log.debug("SPECS directory does not exist for pre-build capture.") + except Exception as e: + self.buildroot.root_log.debug(f"Failed to capture pre-build state: {e}") + + def _create_metadata(self): + """Creates CycloneDX metadata object with Mock-specific build information.""" + metadata = { + "timestamp": datetime.now(timezone.utc).isoformat(), + "tools": [ + { + "vendor": "Mock", + "name": "mock-sbom-generator", + "version": self.buildroot.config.get('version', 'unknown') + } + ], + "lifecycles": [ + { + "phase": "build" + } + ], + "licenses": [ + { 
+ "license": { + "id": "CC0-1.0" + } + } + ], + "properties": [] + } + + # Add Mock-specific build metadata as properties + properties = metadata["properties"] + + # Add SBOM completeness declaration + properties.append({ + "name": "sbom:completeness", + "value": "complete" + }) + + properties.append({ + "name": "mock:build:host", + "value": socket.gethostname() + }) + + distro_name = self.rpm_helper.get_distribution() + if distro_name: + properties.append({ + "name": "mock:build:distribution", + "value": distro_name + }) + + # Add chroot information if available + if hasattr(self.buildroot, 'rootdir') and self.buildroot.rootdir: + properties.append({ + "name": "mock:build:chroot", + "value": self.buildroot.rootdir + }) + + # Add Mock config if available + if hasattr(self.buildroot, 'config') and self.buildroot.config: + config = self.buildroot.config + config_name = config.get('config_path', 'unknown') + properties.append({ + "name": "mock:build:config", + "value": config_name + }) + + # Capture network isolation and access status + online = config.get('online', True) + properties.append({ + "name": "mock:build:network:online", + "value": str(online).lower() + }) + + rpm_net = config.get('rpmbuild_networking', False) + properties.append({ + "name": "mock:build:network:rpmbuild", + "value": str(rpm_net).lower() + }) + + isolation = config.get('isolation') + if isolation: + properties.append({ + "name": "mock:build:isolation", + "value": str(isolation) + }) + + use_nspawn = config.get('use_nspawn') + if use_nspawn is not None: + properties.append({ + "name": "mock:build:nspawn", + "value": str(use_nspawn).lower() + }) + + hardening_props = self._collect_build_hardening_properties() + if hardening_props: + properties.extend(hardening_props) + + return metadata + + def _evaluate_rpm_macro(self, macro): + """Evaluate an RPM macro inside the buildroot (falling back to host).""" + cmd = ["rpm", "--eval", macro] + # Prefer evaluating inside the chroot to capture 
build-specific settings + if hasattr(self.buildroot, "doChroot"): + try: + output, _ = self.buildroot.doChroot( + cmd, + shell=False, + returnOutput=True, + printOutput=False, + ) + if output: + return output.strip() + except Exception as exc: # pylint: disable=broad-except + self.buildroot.root_log.debug( + f"Warning: failed to eval macro {macro} in chroot: {exc}" + ) + try: + result = subprocess.run( + cmd, + stdout=subprocess.PIPE, + stderr=subprocess.PIPE, + check=True, + text=True, + ) + return result.stdout.strip() + # OSError also covers a missing rpm binary on the host (FileNotFoundError) + except (subprocess.CalledProcessError, OSError) as exc: + self.buildroot.root_log.debug(f"Warning: failed to eval macro {macro}: {exc}") + return "" + + def _read_file_from_chroot(self, relative_path): + """ + Read a file from inside the buildroot. + Returns the file content as a string or empty string on failure. + """ + chroot_path = os.path.join(self.buildroot.rootdir, relative_path.lstrip("/")) + try: + with open(chroot_path, "r", encoding="utf-8", errors="ignore") as handle: + return handle.read().strip() + except OSError: + # Direct read failed; fall back to reading through the chroot below + pass + if hasattr(self.buildroot, "doChroot"): + try: + output, _ = self.buildroot.doChroot( + ["cat", relative_path], + shell=False, + returnOutput=True, + printOutput=False, + ) + return output.strip() + except Exception: # pylint: disable=broad-except + return "" + return "" + + def _collect_build_hardening_properties(self): + """ + Capture key compiler/linker macro settings that influence hardening + (FORTIFY, PIE, RELRO, LTO, etc.) and expose them as SBOM properties. 
+ """ + macro_queries = { + "build:hardening:optflags": "%{?optflags}", + "build:hardening:hardening_cflags": "%{?_hardening_cflags}", + "build:hardening:global_cflags": "%{?__global_cflags}", + "build:hardening:global_ldflags": "%{?__global_ldflags}", + "build:hardening:build_ldflags": "%{?build_ldflags}", + } + + properties = [] + macro_values = {} + for prop_name, macro in macro_queries.items(): + value = self._evaluate_rpm_macro(macro) + macro_values[prop_name] = value + if value: + properties.append({ + "name": prop_name, + "value": value + }) + + cflags_combined = " ".join( + filter( + None, + [ + macro_values.get("build:hardening:optflags"), + macro_values.get("build:hardening:hardening_cflags"), + macro_values.get("build:hardening:global_cflags"), + ], + ) + ).lower() + ldflags_combined = " ".join( + filter( + None, + [ + macro_values.get("build:hardening:global_ldflags"), + macro_values.get("build:hardening:build_ldflags"), + ], + ) + ).lower() + flag_union = f"{cflags_combined} {ldflags_combined}" + + def _contains_flag(flag): + return flag in flag_union if flag_union else False + + feature_map = { + "build:hardening:fortify_enabled": any( + token in flag_union + for token in ["-d_fortify_source", "_fortify_source="] + ), + "build:hardening:pie_enabled": any( + token in flag_union for token in ["-fpie", "-pie"] + ), + "build:hardening:relro_enabled": any( + token in flag_union + for token in ["-z relro", "-z now", "-wl,-z,relro", "-wl,-z,now"] + ), + "build:hardening:lto_enabled": _contains_flag("-flto"), + } + for name, enabled in feature_map.items(): + properties.append({ + "name": name, + "value": "true" if enabled else "false" + }) + + fips_value = self._read_file_from_chroot("/proc/sys/crypto/fips_enabled") + if fips_value != "": + properties.append({ + "name": "build:hardening:fips_enabled", + "value": "true" if fips_value.strip() == "1" else "false" + }) + + return properties + + def _find_build_artifacts(self, build_dir): + """Locates RPMs, source 
RPMs, and spec files in the build directory.""" + rpm_files = [] + src_rpm_files = [] + spec_file = None + + # Use os.scandir for better performance + try: + with os.scandir(build_dir) as entries: + for entry in entries: + if not entry.is_file(): + continue + if entry.name.endswith('.src.rpm'): + src_rpm_files.append(entry.name) + elif entry.name.endswith('.rpm'): + rpm_files.append(entry.name) + except OSError as e: + self.buildroot.root_log.debug(f"Failed to scan build directory {build_dir}: {e}") + + # Look for spec file in the chroot build directory + build_build_dir = os.path.join(self.buildroot.rootdir, "builddir/build") + if os.path.exists(build_build_dir): + try: + for root, _dirs, files in os.walk(build_build_dir): + for file in files: + if file.endswith('.spec'): + spec_file = os.path.join(root, file) + break + if spec_file: + break + except OSError as e: + self.buildroot.root_log.debug( + f"Failed to scan chroot build dir {build_build_dir}: {e}" + ) + + return rpm_files, src_rpm_files, spec_file + + def _get_build_subject_metadata(self, spec_file, src_rpm_files, build_dir): + """Determines the build subject metadata (name, version, release).""" + build_subject_name = None + build_subject_version = None + build_subject_release = None + source_files = [] + spec_metadata = {} + + if hasattr(self, 'prebuild_spec_metadata') and self.prebuild_spec_metadata: + spec_metadata = self.prebuild_spec_metadata + source_files = self.prebuild_source_files + build_subject_name = spec_metadata.get("name") + build_subject_version = spec_metadata.get("version") + build_subject_release = spec_metadata.get("release") + elif spec_file: + spec_metadata, parsed_sources = self.rpm_helper.parse_spec_file(spec_file) + if spec_metadata: + build_subject_name = spec_metadata.get("name") + build_subject_version = spec_metadata.get("version") + build_subject_release = spec_metadata.get("release") + if parsed_sources: + source_files = parsed_sources + + if src_rpm_files: + srpm_path = 
os.path.join(build_dir, src_rpm_files[0]) + srpm_metadata = self.rpm_helper.get_rpm_metadata(srpm_path) + if srpm_metadata: + if not build_subject_name: + build_subject_name = srpm_metadata.get("name") + if not build_subject_version: + build_subject_version = srpm_metadata.get("version") + if not build_subject_release: + build_subject_release = srpm_metadata.get("release") + + if not source_files: + # Extract metadata for source files from source RPM without full extraction + source_files = self.rpm_helper.extract_source_files_from_srpm(srpm_path) + + # Record the source RPM itself as an input artifact + srpm_name = src_rpm_files[0] + srpm_sig = self.rpm_helper.get_rpm_signature(srpm_path) + srpm_hash = self.rpm_helper.hash_file(srpm_path) + # Add to the beginning of the list for visibility + source_files.insert(0, { + "filename": srpm_name, + "sha256": srpm_hash, + "digital_signature": srpm_sig, + "source_type": "source_rpm" + }) + + return ( + spec_metadata, build_subject_name, build_subject_version, + build_subject_release, source_files + ) + + def _add_toolchain_components(self, _bom, build_toolchain_packages, distro_id): + """Adds toolchain components to the BOM and returns their components and bom-refs.""" + toolchain_components = [] + toolchain_bom_refs = [] + for toolchain_pkg in build_toolchain_packages: + component = self.cdx_gen.create_toolchain_component(toolchain_pkg, distro_id) + if component: + bom_ref = component.get("bom-ref") + if bom_ref: + toolchain_bom_refs.append(bom_ref) + toolchain_components.append(component) + return toolchain_components, toolchain_bom_refs + + @traceLog() + # pylint: disable=too-many-locals + def _generate_sbom_post_build_hook(self): + """Plugin hook called after the build is complete.""" + self.buildroot.root_log.debug("[SBOM] Starting post-build SBOM generation") + if self.sbom_done or not self.sbom_enabled: + return + + state_text = f"Generating {self.sbom_type.upper()} SBOM for built packages v1.0" + 
self.state.start(state_text) + + try: + build_dir = self.buildroot.resultdir + rpm_files, src_rpm_files, spec_file = self._find_build_artifacts(build_dir) + + if not rpm_files and not src_rpm_files and not spec_file: + self.buildroot.root_log.debug( + "No RPM, source RPM, or spec file found for SBOM generation." + ) + return + + # Get build subject metadata + ( + spec_metadata, build_subject_name, build_subject_version, + build_subject_release, source_files + ) = self._get_build_subject_metadata(spec_file, src_rpm_files, build_dir) + + if not build_subject_name or not build_subject_version or not build_subject_release: + self.buildroot.root_log.debug("[SBOM] Cannot generate SBOM - build metadata incomplete") + return + + # Gather common data + distro_id = self.rpm_helper.detect_chroot_distribution() or "unknown" + build_toolchain_packages = self.rpm_helper.get_build_toolchain_packages() + + # Dispatch based on type + if self.sbom_type == "spdx": + sbom_filename = ( + f"{build_subject_name}-{build_subject_version}-{build_subject_release}.spdx.json" + ) + out_file = os.path.join(self.buildroot.resultdir, sbom_filename) + + # Collect hardening flags + hardening_props = self._collect_build_hardening_properties() + + doc = self.spdx_gen.generate_spdx_document( + build_subject_name, build_subject_version, build_subject_release, + build_dir, rpm_files, source_files, + build_toolchain_packages, distro_id, + spec_metadata=spec_metadata, hardening_props=hardening_props + ) + + with open(out_file, "w", encoding="utf-8") as f: + json.dump(doc, f, indent=2) + + self.buildroot.root_log.debug(f"SPDX SBOM successfully written to: {out_file}") + + else: + # Default: CycloneDX + sbom_filename = ( + f"{build_subject_name}-{build_subject_version}-{build_subject_release}.sbom" + ) + out_file = os.path.join(self.buildroot.resultdir, sbom_filename) + + # Create CycloneDX document + bom = self.cdx_gen.create_cyclonedx_document() + + # Add source and toolchain components + 
source_components, source_component_entries = self.cdx_gen.add_source_components(bom, source_files) + toolchain_components, toolchain_bom_refs = self._add_toolchain_components( + bom, build_toolchain_packages, distro_id + ) + + # Process binary RPMs and convert to components + ( + built_package_bom_refs, primary_rpm_metadata, all_built_components + ) = self.cdx_gen.process_built_packages( + bom, rpm_files + src_rpm_files, build_dir, distro_id, source_component_entries, + build_subject_name, build_toolchain_packages, toolchain_bom_refs + ) + + # Add RPM-specific metadata and finalize dependencies + self.cdx_gen.finalize_bom_metadata(bom, primary_rpm_metadata, built_package_bom_refs, + build_subject_name, build_subject_version, + build_subject_release, distro_id, + spec_metadata=spec_metadata) + self.cdx_gen.finalize_dependencies(bom, source_component_entries, + build_toolchain_packages, distro_id, + built_package_bom_refs, toolchain_bom_refs, + spec_metadata=spec_metadata, + source_components=source_components, + toolchain_components=toolchain_components, + all_built_components=all_built_components) + + # Write CycloneDX BOM + with open(out_file, "w", encoding="utf-8") as f: + json.dump(bom, f, indent=2) + + self.buildroot.root_log.debug(f"CycloneDX SBOM successfully written to: {out_file}") + + # pylint: disable=broad-exception-caught + except Exception as e: + # Report the failure at warning level and keep the traceback in the build log + self.buildroot.root_log.warning(f"[SBOM] SBOM generation failed: {e}") + self.buildroot.root_log.debug(traceback.format_exc()) + finally: + self.sbom_done = True + self.state.finish(state_text) + + # pylint: disable=too-many-arguments,too-many-locals,too-many-branches,too-many-statements,too-many-positional-arguments + def _create_built_package_component( + self, rpm_path, distro_obj, _source_components=None + ): + """Creates a CycloneDX component for a built RPM package.""" + package_data = self.rpm_helper.get_rpm_metadata(rpm_path) + if not package_data: + return None + + package_name = package_data.get("name") + 
version = package_data.get("version") + release = package_data.get("release") + arch = package_data.get("arch") + + # Combine version and release + full_version = f"{version}-{release}" if release else version + + # Generate PURL and bom-ref + purl = self.rpm_helper.generate_purl(package_name, full_version, distro_obj, arch) + bom_ref = purl + + # Determine component type (application vs library) + # Most RPMs are libraries, but we could check for executables + component_type = "library" + + component = { + "type": component_type, + "bom-ref": bom_ref, + "name": package_name, + "version": full_version, + "purl": purl + } + + # Add external references (CPE) + vendor = package_data.get("vendor") + cpe = self.rpm_helper.generate_cpe(package_name, version, vendor=vendor) + if cpe: + component["externalReferences"] = [ + { + "type": "other", + "comment": "CPE 2.3", + "url": cpe + } + ] + + # Add hierarchical grouping for "RPM Contents" + + # Add hash of RPM file - REMOVED per user request to only have hashes for files contained in RPM + # or if needed for PURL integrity, but we'll prioritize the "only" constraint. 
+ # rpm_hash = package_data.get("sha256") + # if not rpm_hash or rpm_hash == "(none)": + # rpm_hash = self.rpm_helper.hash_file(rpm_path) + + # if rpm_hash: + # component["hashes"] = [ + # { + # "alg": "SHA-256", + # "content": rpm_hash + # } + # ] + + # Add license information + license_str = package_data.get("license") + if license_str and license_str != "(none)": + component["licenses"] = [ + { + "expression": license_str + } + ] + + # Add supplier information (from Packager field) + packager = package_data.get("packager") + if packager and packager != "(none)": + component["supplier"] = { + "name": packager + } + + # Add properties for RPM metadata + properties = [] + + properties.append({ + "name": "mock:rpm:filename", + "value": os.path.basename(rpm_path) + }) + + vendor = package_data.get("vendor") + if vendor and vendor != "(none)": + properties.append({ + "name": "mock:rpm:vendor", + "value": vendor + }) + + packager = package_data.get("packager") + if packager and packager != "(none)": + properties.append({ + "name": "mock:rpm:packager", + "value": packager + }) + + buildhost = package_data.get("buildhost") + if buildhost and buildhost != "(none)": + properties.append({ + "name": "mock:rpm:buildhost", + "value": buildhost + }) + + buildtime_iso = self.cdx_gen.format_epoch_timestamp(package_data.get("buildtime")) + if buildtime_iso: + properties.append({ + "name": "mock:rpm:buildtime", + "value": buildtime_iso + }) + + group = package_data.get("group") + if group and group != "(none)": + properties.append({ + "name": "mock:rpm:group", + "value": group + }) + + epoch_val = package_data.get("epoch") + if epoch_val and epoch_val != "(none)": + properties.append({ + "name": "mock:rpm:epoch", + "value": epoch_val + }) + + distribution = package_data.get("distribution") + if distribution and distribution != "(none)": + properties.append({ + "name": "mock:rpm:distribution", + "value": distribution + }) + + url = package_data.get("url") + if url and url != 
"(none)": + component["externalReferences"] = component.get("externalReferences", []) + component["externalReferences"].append({ + "type": "website", + "url": url + }) + + summary = package_data.get("summary") + if summary and summary != "(none)": + component["description"] = summary + + # Add GPG signature information if available + signature = self.rpm_helper.get_rpm_signature(rpm_path) + if signature: + # Parse signature info + sig_props = self.cdx_gen.parse_signature_to_properties(signature) + properties.extend(sig_props) + + # Note: Source/patch file relationships are represented in component properties + # (mock:source:files, mock:source:refs, mock:patch:files, mock:patch:refs) + # but are removed from individual package components to reduce noise. + # Source code relationships are still available in the components array. + + if properties: + component["properties"] = properties + + # Add external reference for source RPM if available + sourcerpm = package_data.get("sourcerpm") + if sourcerpm and sourcerpm != "(none)": + component["externalReferences"] = component.get("externalReferences", []) + component["externalReferences"].append({ + "type": "distribution", + "url": sourcerpm + }) + + return component + + def get_file_signature(self, file_path): + """Attempts to detect if a file has a digital signature.""" + try: + # Check for .asc signature file + asc_file = file_path + ".asc" + if os.path.isfile(asc_file): + return "GPG signature file exists: " + os.path.basename(asc_file) + + # Check for .sig signature file + sig_file = file_path + ".sig" + if os.path.isfile(sig_file): + return "GPG signature file exists: " + os.path.basename(sig_file) + + # Check if the file itself is a signature + if file_path.endswith('.asc') or file_path.endswith('.sig'): + return "File is a signature file" + + return None + except OSError as e: + self.buildroot.root_log.debug(f"Failed to check signature for {file_path}: {e}") + return None + + diff --git 
a/mock/py/mockbuild/plugins/sbom_spdx.py b/mock/py/mockbuild/plugins/sbom_spdx.py new file mode 100644 index 000000000..b507930ad --- /dev/null +++ b/mock/py/mockbuild/plugins/sbom_spdx.py @@ -0,0 +1,409 @@ +# -*- coding: utf-8 -*- +# vim:expandtab:autoindent:tabstop=4:shiftwidth=4:filetype=python:textwidth=0: +# License: GPL2 or later see COPYING +# Written by Scott R. Shinn +# Copyright (C) 2026, Atomicorp, Inc. +""" +SPDX generation functions for the SBOM generator plugin. +""" + +import os +import re +import uuid +from datetime import datetime, timezone + + + +# pylint: disable=too-many-instance-attributes +class SpdxGenerator: + """Helper class for generating SPDX documents.""" + + def __init__(self, rpm_helper, buildroot, conf=None): + self.rpm_helper = rpm_helper + self.buildroot = buildroot + self.conf = conf or {} + + # Configuration options for file-level dependencies and filtering + self.include_file_dependencies = self.conf.get("include_file_dependencies", False) + self.include_file_components = self.conf.get("include_file_components", True) + self.include_debug_files = self.conf.get("include_debug_files", False) + self.include_man_pages = self.conf.get("include_man_pages", True) + self.include_toolchain_dependencies = self.conf.get( + "include_toolchain_dependencies", False + ) + + # pylint: disable=too-many-locals,too-many-branches,too-many-statements,too-many-arguments,too-many-positional-arguments + def generate_spdx_document(self, name, version, release, build_dir, rpm_files, + source_files, build_toolchain_packages, distro_id, + spec_metadata=None, hardening_props=None): + """Generates the full SPDX document using hierarchical grouping and enhanced metadata.""" + doc_spdx_id = "SPDXRef-DOCUMENT" + creation_time = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ") + + # 1. 
Initialize Document + document = { + "spdxVersion": "SPDX-2.3", + "dataLicense": "CC0-1.0", + "SPDXID": doc_spdx_id, + "name": f"SBOM for {name}-{version}-{release}", + "documentNamespace": f"http://spdx.org/spdxdocs/{name}-{version}-{release}-{uuid.uuid4()}", + "creationInfo": { + "creators": [ + "Tool: mock-sbom-generator-1.0", + "Organization: Atomicorp" + ], + "created": creation_time + }, + "packages": [], + "files": [], + "relationships": [] + } + + # 1.5 Add Spec Metadata and Hardening Props to Document Comment + doc_metadata = [] + if spec_metadata: + build_reqs = spec_metadata.get("build_requires", []) + if build_reqs: + doc_metadata.append(f"Build-Requires: {', '.join(build_reqs)}") + reqs = spec_metadata.get("requires", []) + if reqs: + doc_metadata.append(f"Requires: {', '.join(reqs)}") + + # Hardening flags + if hardening_props: + for prop in hardening_props: + doc_metadata.append(f"{prop['name']}: {prop['value']}") + + if doc_metadata: + document["comment"] = " | ".join(doc_metadata) + + # Virtual Grouping Refs + inputs_ref = "SPDXRef-Build-Inputs" + toolchain_ref = "SPDXRef-Build-Toolchain" + outputs_ref = "SPDXRef-RPM-Contents" + + # 2. Add Grouping Packages (Represented as virtual packages) + document["packages"].extend([ + { + "name": "Build Inputs", + "SPDXID": inputs_ref, + "downloadLocation": "NOASSERTION", + "filesAnalyzed": False, + "comment": "Grouping node for source files and patches used in the build." + }, + { + "name": "Build Toolchain", + "SPDXID": toolchain_ref, + "downloadLocation": "NOASSERTION", + "filesAnalyzed": False, + "comment": "Grouping node for packages and tools used to perform the build." + }, + { + "name": "RPM Contents", + "SPDXID": outputs_ref, + "downloadLocation": "NOASSERTION", + "filesAnalyzed": False, + "comment": "Grouping node for RPM packages and their contained files produced by the build." 
+ } + ]) + + # Core relationships for the grouped architecture + document["relationships"].extend([ + {"spdxElementId": doc_spdx_id, "relatedSpdxElement": inputs_ref, "relationshipType": "CONTAINS"}, + {"spdxElementId": doc_spdx_id, "relatedSpdxElement": toolchain_ref, "relationshipType": "CONTAINS"}, + {"spdxElementId": doc_spdx_id, "relatedSpdxElement": outputs_ref, "relationshipType": "CONTAINS"} + ]) + + # 3. Process Source Files (Inputs) + for src_file in source_files: + spdx_file = self.create_spdx_file(src_file) + if spdx_file: + document["files"].append(spdx_file) + document["relationships"].append({ + "spdxElementId": inputs_ref, + "relatedSpdxElement": spdx_file["SPDXID"], + "relationshipType": "CONTAINS" + }) + + # 4. Process Build Toolchain (Grouped by Signer) + signer_groups = {} + for tc_pkg in build_toolchain_packages: + sig_info = tc_pkg.get("digital_signature", {}) + key_id = sig_info.get("signature_key", "unsigned") + + if key_id not in signer_groups: + safe_key = re.sub(r'[^a-zA-Z0-9.-]', '-', key_id) + signer_ref = f"SPDXRef-Signer-{safe_key}" + signer_pkg = { + "name": f"Packages signed by {key_id}" if key_id != "unsigned" else "Unsigned Packages", + "SPDXID": signer_ref, + "downloadLocation": "NOASSERTION", + "filesAnalyzed": False, + "comment": f"Grouping for build toolchain packages signed with GPG key {key_id}." + } + document["packages"].append(signer_pkg) + document["relationships"].append({ + "spdxElementId": toolchain_ref, + "relatedSpdxElement": signer_ref, + "relationshipType": "DEPENDS_ON" + }) + signer_groups[key_id] = signer_ref + + spdx_pkg = self.create_spdx_package_from_dict(tc_pkg) + if spdx_pkg: + document["packages"].append(spdx_pkg) + document["relationships"].append({ + "spdxElementId": signer_groups[key_id], + "relatedSpdxElement": spdx_pkg["SPDXID"], + "relationshipType": "DEPENDS_ON" + }) + + # 5. 
Process Build Artifacts (Outputs) + all_built_packages = [] + + for rpm_file in rpm_files: + rpm_path = os.path.join(build_dir, rpm_file) + spdx_pkg = self.create_spdx_package_from_rpm(rpm_path, distro_id) + if spdx_pkg: + all_built_packages.append((spdx_pkg, rpm_path)) + document["packages"].append(spdx_pkg) + document["relationships"].append({ + "spdxElementId": outputs_ref, + "relatedSpdxElement": spdx_pkg["SPDXID"], + "relationshipType": "DEPENDS_ON" + }) + + # Add file components if enabled + if self.include_file_components: + file_spdx_objs = self.create_file_components(rpm_path, spdx_pkg["SPDXID"]) + for file_obj in file_spdx_objs: + document["files"].append(file_obj) + document["relationships"].append({ + "spdxElementId": spdx_pkg["SPDXID"], + "relatedSpdxElement": file_obj["SPDXID"], + "relationshipType": "CONTAINS" + }) + + # 6. Select Primary Package for DESCRIBES relationship + if all_built_packages: + # Logic: Avoid debuginfo, prefer exact name match + primary_pkg_ref = self._select_primary_package(all_built_packages, name) + document["relationships"].append({ + "spdxElementId": doc_spdx_id, + "relatedSpdxElement": primary_pkg_ref, + "relationshipType": "DESCRIBES" + }) + + return document + + def _select_primary_package(self, pkg_tuples, subject_name): + """Selects the most suitable primary package from the list of built RPMs.""" + # tuples are (spdx_pkg, rpm_path) + candidates = [t for t in pkg_tuples if "debuginfo" not in t[0]["name"].lower()] + if not candidates: + candidates = pkg_tuples + + # Prefer exact name match + for pkg, _ in candidates: + if pkg["name"].lower() == subject_name.lower(): + return pkg["SPDXID"] + + # Fallback to the first non-debuginfo candidate + return candidates[0][0]["SPDXID"] + + # pylint: disable=too-many-locals,too-many-branches,too-many-statements + def create_spdx_package_from_rpm(self, rpm_path, distro_obj): + """Creates an SPDX Package from an RPM file, including all header metadata.""" + pkg_data = 
self.rpm_helper.get_rpm_metadata(rpm_path) + if not pkg_data: + self.buildroot.root_log.debug(f"[SBOM] FAILED to get metadata for {rpm_path}, skipping SPDX package") + return None + + name = pkg_data.get("name") + version = pkg_data.get("version") + release = pkg_data.get("release") + full_version = f"{version}-{release}" if release else version + + safe_name = re.sub(r'[^a-zA-Z0-9.-]', '-', name) + safe_ver = re.sub(r'[^a-zA-Z0-9.-]', '-', full_version) + spdx_id = f"SPDXRef-Package-{safe_name}-{safe_ver}" + + # SPDX Package Structure + package = { + "name": name, + "SPDXID": spdx_id, + "versionInfo": full_version, + "downloadLocation": "NOASSERTION", + "filesAnalyzed": self.include_file_components, + "supplier": "NOASSERTION", + "homepage": "NOASSERTION" + } + + # Map RPM Header Fields to SPDX pkg fields or comments + lic = pkg_data.get("license") + if lic and lic != "(none)": + package["licenseDeclared"] = lic + else: + package["licenseDeclared"] = "NOASSERTION" + package["licenseConcluded"] = "NOASSERTION" + package["copyrightText"] = "NOASSERTION" + + url = pkg_data.get("url") + if url and url != "(none)": + package["homepage"] = url + + packager = pkg_data.get("packager") + if packager and packager != "(none)": + package["supplier"] = f"Person: {packager}" + + # Store additional RPM metadata in a comment block + metadata_fields = [] + for key, label in [("vendor", "Vendor"), ("buildhost", "Build Host"), + ("group", "Group"), ("epoch", "Epoch"), + ("distribution", "Distribution"), ("arch", "Architecture")]: + val = pkg_data.get(key) + if val and val != "(none)": + metadata_fields.append(f"{label}: {val}") + + buildtime = pkg_data.get("buildtime") + if buildtime: + try: + dt = datetime.fromtimestamp(int(buildtime), timezone.utc) + metadata_fields.append(f"Build Time: {dt.isoformat()}") + except (ValueError, TypeError): + pass + + # GPG Signature Information + signature = self.rpm_helper.get_rpm_signature(rpm_path) + if signature: + metadata_fields.append(f"GPG 
Signature: {signature}") + + if metadata_fields: + package["comment"] = " | ".join(metadata_fields) + + # Checksums + rpm_hash = pkg_data.get("sha256") + if not rpm_hash or rpm_hash == "(none)": + rpm_hash = self.rpm_helper.hash_file(rpm_path) + + if rpm_hash: + package["checksums"] = [{"algorithm": "SHA256", "checksumValue": rpm_hash}] + + # External References (CPE and PURL) + external_refs = [] + vendor = pkg_data.get("vendor") + cpe = self.rpm_helper.generate_cpe(name, version, vendor=vendor) + if cpe: + external_refs.append({ + "referenceCategory": "SECURITY", + "referenceType": "cpe23Type", + "referenceLocator": cpe + }) + + purl = self.rpm_helper.generate_purl(name, full_version, distro_obj, pkg_data.get("arch")) + if purl: + external_refs.append({ + "referenceCategory": "PACKAGE-MANAGER", + "referenceType": "purl", + "referenceLocator": purl + }) + + if external_refs: + package["externalRefs"] = external_refs + + return package + + def create_spdx_package_from_dict(self, pkg_data): + """Creates an SPDX Package from a dictionary (e.g. 
toolchain).""" + name = pkg_data.get("name") + version = pkg_data.get("version") + if not name or not version: + self.buildroot.root_log.debug( + "[SBOM] Skipping toolchain package due to missing name/version" + ) + return None + + safe_name = re.sub(r'[^a-zA-Z0-9.-]', '-', name) + safe_ver = re.sub(r'[^a-zA-Z0-9.-]', '-', version) + spdx_id = f"SPDXRef-Package-{safe_name}-{safe_ver}" + + package = { + "name": name, + "SPDXID": spdx_id, + "versionInfo": version, + "downloadLocation": "NOASSERTION", + "filesAnalyzed": False, + "supplier": "NOASSERTION" + } + + lic = pkg_data.get("licenseDeclared") + if lic and lic != "(none)": + package["licenseDeclared"] = lic + else: + package["licenseDeclared"] = "NOASSERTION" + package["licenseConcluded"] = "NOASSERTION" + + # Checksums - REMOVED per user request to only have hashes for files contained in RPM + # (Follows CycloneDX parity where external toolchain hashes are omitted) + # checksum = pkg_data.get("checksum") + # if checksum and not checksum.startswith("error"): + # alg = "SHA256" if len(checksum) == 64 else "SHA1" + # package["checksums"] = [{ + # "algorithm": alg, + # "checksumValue": checksum + # }] + + return package + + def create_spdx_file(self, file_data, parent_pkg_id=None): + """Creates an SPDX File from file metadata.""" + filename = file_data.get("filename") + if not filename: + return None + + safe_name = re.sub(r'[^a-zA-Z0-9.-]', '-', filename) + # Use a more unique ID if parent is provided + if parent_pkg_id: + parent_suffix = parent_pkg_id.split("-")[-1] + spdx_id = f"SPDXRef-File-{safe_name}-{parent_suffix}" + else: + spdx_id = f"SPDXRef-File-{safe_name}" + + file_obj = { + "fileName": f"./{filename}", + "SPDXID": spdx_id, + "licenseConcluded": "NOASSERTION", + "copyrightText": "NOASSERTION" + } + + sha256 = file_data.get("sha256") + if sha256: + file_obj["checksums"] = [{"algorithm": "SHA256", "checksumValue": sha256}] + + # Store GPG flag as a comment if present + if 
file_data.get("digital_signature"): + file_obj["comment"] = f"Signature Status: {file_data['digital_signature']}" + + return file_obj + + def create_file_components(self, rpm_path, parent_spdx_id): + """Extracts file list from an RPM and creates SPDX File objects.""" + file_info = self.rpm_helper.get_rpm_file_info(rpm_path) or {} + spdx_files = [] + + for filename in sorted(file_info.keys()): + f_data = file_info[filename] + # Ensure filename is in the data dict for create_spdx_file + f_data["filename"] = filename + + # Filtering logic (man pages, debug files) + if not self.include_debug_files and (".build-id" in filename or ".debug" in filename): + continue + if not self.include_man_pages and ("/usr/share/man" in filename or "/usr/share/info" in filename): + continue + + f_obj = self.create_spdx_file(f_data, parent_pkg_id=parent_spdx_id) + if f_obj: + spdx_files.append(f_obj) + + return spdx_files diff --git a/mock/py/mockbuild/plugins/sbom_utils.py b/mock/py/mockbuild/plugins/sbom_utils.py new file mode 100644 index 000000000..b1642d720 --- /dev/null +++ b/mock/py/mockbuild/plugins/sbom_utils.py @@ -0,0 +1,756 @@ +# -*- coding: utf-8 -*- +# vim:expandtab:autoindent:tabstop=4:shiftwidth=4:filetype=python:textwidth=0: +# License: GPL2 or later see COPYING +# Written by Scott R. Shinn +# Copyright (C) 2026, Atomicorp, Inc. + +import os +import re +import subprocess +import hashlib +import traceback +import rpm +from datetime import datetime, timezone + +""" +Utility functions for the SBOM generator plugin. 
+""" + + +class RpmQueryHelper: + # pylint: disable=broad-exception-caught + """Helper class for querying RPM metadata.""" + + def __init__(self, buildroot): + """Initializes the helper with a buildroot for doChroot access.""" + self.buildroot = buildroot + + def _from_chroot_path(self, path): + """Standardizes from_chroot_path as a fallback for older mock versions.""" + if hasattr(self.buildroot, 'from_chroot_path'): + return self.buildroot.from_chroot_path(path) + + # Fallback implementation + rootdir = getattr(self.buildroot, 'rootdir', None) + if not rootdir: + return path + if path.startswith(rootdir): + rel_path = path[len(rootdir):] + if not rel_path.startswith("/"): + rel_path = "/" + rel_path + return rel_path + return path + + def _resolve_chroot_path(self, rpm_path): + """Resolves a host RPM path to its equivalent path inside the chroot if possible.""" + # Check if it's already a chroot path (from_chroot_path returns a path for any file in rootdir) + chroot_path = self._from_chroot_path(rpm_path) + if not chroot_path: + return None + + # Check if it's in the resultdir. 
If so, it should be in /builddir/build/RPMS + if rpm_path.startswith(self.buildroot.resultdir): + filename = os.path.basename(rpm_path) + # Search in common build directory structures inside the chroot + search_paths = [ + "/builddir/build/RPMS", + "/builddir/build/RPMS/x86_64", + "/builddir/build/RPMS/noarch", + "/builddir/build/SRPMS", + "/builddir/build/SOURCES" + ] + for search_path in search_paths: + candidate = os.path.join(search_path, filename) + # Verify existence via doChroot + cmd = ["ls", candidate] + try: + res, _ = self.buildroot.doChroot(cmd, shell=False, returnOutput=True, printOutput=False) + if res and candidate in res: + return candidate + except Exception: + pass + return None + + return chroot_path + + def generate_purl(self, package_name, version, distro_obj=None, arch=None): + """Generates a Package URL (PURL) for an RPM package.""" + # pkg:rpm/fedora/curl@7.50.3-1.fc25?arch=i386&distro=fedora-25 + # We simplify to pkg:rpm/distro/name@version?arch=arch + clean_name = re.sub(r'[^a-zA-Z0-9.-]', '-', package_name) + purl = f"pkg:rpm/{distro_obj}/{clean_name}@{version}" + if arch: + purl += f"?arch={arch}" + return purl + + def generate_cpe(self, package_name, package_version, vendor=None): + """Generates a CPE identifier for a package.""" + # CPE format: cpe:2.3:a:{vendor}:{product}:{version}:*:*:*:*:*:*:*:* + + # Default vendor if not provided + if not vendor or vendor == "(none)": + vendor = "unknown" + + # Clean up vendor name for CPE + vendor = re.sub(r'[^a-zA-Z0-9._-]', '_', vendor.lower()) + + # Clean up package name for CPE + product = re.sub(r'[^a-zA-Z0-9._-]', '_', package_name.lower()) + + # Clean up version for CPE (remove release part if present) + version = package_version + if '-' in version: + version = version.split('-')[0] # Remove release part + + # Generate CPE + cpe = f"cpe:2.3:a:{vendor}:{product}:{version}:*:*:*:*:*:*:*:*" + return cpe + + + def _parse_signature_data(self, sig_data, signature_info): + """Parses the raw 
signature string and updates the signature_info dict.""" + if sig_data and sig_data != "(none)" and sig_data != "": + signature_info["signature_type"] = "GPG" + signature_info["signature_valid"] = True + + # Parse signature line like: + # "RSA/SHA256, Fri 08 Nov 2024 ... Key ID ..." + if "RSA/SHA256" in sig_data: + signature_info["signature_algorithm"] = "RSA/SHA256" + elif "DSA/SHA1" in sig_data: + signature_info["signature_algorithm"] = "DSA/SHA1" + elif "ECDSA/SHA256" in sig_data: + signature_info["signature_algorithm"] = "ECDSA/SHA256" + elif "Ed25519/SHA256" in sig_data: + signature_info["signature_algorithm"] = "Ed25519/SHA256" + + # Extract key ID + if "Key ID" in sig_data: + key_id_match = re.search(r'Key ID ([0-9a-fA-F]+)', sig_data) + if key_id_match: + signature_info["signature_key"] = key_id_match.group(1) + + # Extract date - handle various time formats including EST/EDT + date_match = re.search( + r'([A-Za-z]{3} [A-Za-z]{3}\s+\d{1,2} \d{2}:\d{2}:\d{2} \d{4})', + sig_data + ) + if date_match: + signature_info["signature_date"] = date_match.group(1) + else: + signature_info["signature_type"] = "unsigned" + signature_info["signature_valid"] = False + + def get_rpm_metadata(self, rpm_path): + """Extracts metadata from an RPM file. 
+ Uses doChroot if the file is within the chroot to ensure compatibility.""" + if not os.path.isfile(rpm_path): + self.buildroot.root_log.debug(f"RPM file not found: {rpm_path}") + return {} + + # Try to resolve to a chroot path to prioritize chroot-native analysis + chroot_path = self._resolve_chroot_path(rpm_path) + if chroot_path: + self.buildroot.root_log.debug(f"[SBOM] Using chroot-native rpm for: {chroot_path}") + return self._get_rpm_metadata_chroot(chroot_path) + + # Fallback to host-native bindings + self.buildroot.root_log.debug(f"[SBOM] Using host-native analysis for: {rpm_path}") + return self._get_rpm_metadata_native(rpm_path) + + def _get_rpm_metadata_chroot(self, chroot_rpm_path): + """Extracts metadata using rpm -qp inside the chroot.""" + fields = { + "name": "%{NAME}", "version": "%{VERSION}", "release": "%{RELEASE}", + "arch": "%{ARCH}", "epoch": "%{EPOCH}", "summary": "%{SUMMARY}", + "license": "%{LICENSE}", "vendor": "%{VENDOR}", "url": "%{URL}", + "packager": "%{PACKAGER}", "buildtime": "%{BUILDTIME}", + "buildhost": "%{BUILDHOST}", "sourcerpm": "%{SOURCERPM}", + "group": "%{GROUP}", "distribution": "%{DISTRIBUTION}", + "sha256": "%{SHA256HEADER}" + } + + metadata = {} + try: + query = "|".join(fields.values()) + cmd = ["rpm", "-qp", "--queryformat", query, chroot_rpm_path] + output, _ = self.buildroot.doChroot( + cmd, shell=False, returnOutput=True, printOutput=False + ) + + if output: + parts = output.split("|") + for i, field_name in enumerate(fields.keys()): + if i < len(parts): + val = parts[i].strip() + if field_name == "epoch" and (not val or val == "(none)"): + val = "0" + metadata[field_name] = val + return metadata + except Exception as e: + self.buildroot.root_log.debug(f"Failed to extract metadata via doChroot for {chroot_rpm_path}: {e}") + return {} + + def _get_rpm_metadata_native(self, rpm_path): + """Extracts metadata using native host bindings (fallback).""" + # pylint: disable=no-member + try: + ts = rpm.TransactionSet() + 
with open(rpm_path, "rb") as f: + hdr = ts.hdrFromFdno(f.fileno()) + + tag_map = { + "name": rpm.RPMTAG_NAME, "version": rpm.RPMTAG_VERSION, + "release": rpm.RPMTAG_RELEASE, "arch": rpm.RPMTAG_ARCH, + "epoch": rpm.RPMTAG_EPOCH, "summary": rpm.RPMTAG_SUMMARY, + "license": rpm.RPMTAG_LICENSE, "vendor": rpm.RPMTAG_VENDOR, + "url": rpm.RPMTAG_URL, "packager": rpm.RPMTAG_PACKAGER, + "buildtime": rpm.RPMTAG_BUILDTIME, "buildhost": rpm.RPMTAG_BUILDHOST, + "sourcerpm": rpm.RPMTAG_SOURCERPM, "group": rpm.RPMTAG_GROUP, + "distribution": rpm.RPMTAG_DISTRIBUTION, "sha256": rpm.RPMTAG_SHA256HEADER + } + + metadata = {} + for field_name, tag in tag_map.items(): + value = hdr[tag] + if field_name == "epoch" and value is None: + value = "0" + elif value is None: + value = "" + elif isinstance(value, bytes): + value = value.decode('utf-8', errors='replace') + metadata[field_name] = str(value) + return metadata + except Exception: + self.buildroot.root_log.debug(f"Failed to extract metadata via native bindings for {rpm_path}") + return {} + + + + def get_rpm_file_info(self, rpm_path): + """Extracts file hashes, ownership, and permissions from an RPM file.""" + chroot_path = self._resolve_chroot_path(rpm_path) + if chroot_path: + self.buildroot.root_log.debug(f"[SBOM] Using chroot-native file info for: {chroot_path}") + return self._get_rpm_file_info_chroot(chroot_path) + + self.buildroot.root_log.debug(f"[SBOM] Using host-native file info for: {rpm_path}") + return self._get_rpm_file_info_native(rpm_path) + + def _get_rpm_file_info_chroot(self, chroot_rpm_path): + """Extracts file info using rpm -qp inside the chroot.""" + file_info = {} + try: + # Query format for files: path|hash|mode|user|group + qf = "[%{FILENAMES}|%{FILEDIGESTS}|%{FILEMODES:octal}|%{FILEUSERNAME}|%{FILEGROUPNAME}\\n]" + cmd = ["rpm", "-qp", "--queryformat", qf, chroot_rpm_path] + output, _ = self.buildroot.doChroot( + cmd, shell=False, returnOutput=True, printOutput=False + ) + + # Detect digest algorithm from 
header + cmd_algo = ["rpm", "-qp", "--queryformat", "%{FILEDIGESTALGO}", chroot_rpm_path] + algo_out, _ = self.buildroot.doChroot( + cmd_algo, shell=False, returnOutput=True, printOutput=False + ) + try: + algo = int(algo_out.strip()) if algo_out and algo_out.strip() else 8 + except ValueError: + algo = 8 + + for line in output.splitlines(): + parts = line.split("|") + if len(parts) >= 5: + filename = parts[0] + file_info[filename] = { + "hash": parts[1] if parts[1] and parts[1] != "(none)" else None, + "algo": algo, + "permissions": parts[2], + "owner": parts[3], + "group": parts[4] + } + return file_info + except Exception as e: + self.buildroot.root_log.debug(f"Failed to get file info via doChroot for {chroot_rpm_path}: {e}") + return {} + + def _get_rpm_file_info_native(self, rpm_path): + """Extracts file information using native host bindings (fallback).""" + # pylint: disable=no-member + file_info = {} + try: + ts = rpm.TransactionSet() + # pylint: disable=protected-access + ts.setVSFlags(rpm._RPMVSF_NOSIGNATURES | rpm._RPMVSF_NODIGESTS) + with open(rpm_path, "rb") as f: + hdr = ts.hdrFromFdno(f.fileno()) + + basenames = hdr[rpm.RPMTAG_BASENAMES] + dirnames = hdr[rpm.RPMTAG_DIRNAMES] + dirindexes = hdr[rpm.RPMTAG_DIRINDEXES] + filedigests = hdr[rpm.RPMTAG_FILEDIGESTS] + filemodes = hdr[rpm.RPMTAG_FILEMODES] + fileusernames = hdr[rpm.RPMTAG_FILEUSERNAME] + filegroupnames = hdr[rpm.RPMTAG_FILEGROUPNAME] + + try: + algo = hdr[rpm.RPMTAG_FILEDIGESTALGO] + except (KeyError, IndexError): + algo = 8 + + file_info = {} + for i, basename in enumerate(basenames): + dirname = dirnames[dirindexes[i]] + if isinstance(dirname, bytes): + dirname = dirname.decode('utf-8', 'replace') + if isinstance(basename, bytes): + basename = basename.decode('utf-8', 'replace') + filename = os.path.join(dirname, basename) + + digest = filedigests[i] + if isinstance(digest, bytes): + digest = digest.decode('utf-8') + + file_info[filename] = { + "hash": digest if digest else None, + "algo": 
algo, + "permissions": f"0{filemodes[i]:o}", + "owner": fileusernames[i].decode('utf-8', 'replace') if isinstance(fileusernames[i], bytes) else fileusernames[i], + "group": filegroupnames[i].decode('utf-8', 'replace') if isinstance(filegroupnames[i], bytes) else filegroupnames[i] + } + return file_info + except Exception: + return {} + + def get_rpm_dependencies(self, rpm_path): + """Extracts the list of dependencies from an RPM file.""" + chroot_path = self._resolve_chroot_path(rpm_path) + if chroot_path: + self.buildroot.root_log.debug(f"[SBOM] Using chroot-native dependencies for: {chroot_path}") + return self._get_rpm_dependencies_chroot(chroot_path) + + self.buildroot.root_log.debug(f"[SBOM] Using host-native dependencies for: {rpm_path}") + return self._get_rpm_dependencies_native(rpm_path) + + def _get_rpm_dependencies_chroot(self, chroot_rpm_path): + """Extracts dependencies using rpm -qpR inside the chroot.""" + try: + cmd = ["rpm", "-qpR", chroot_rpm_path] + output, _ = self.buildroot.doChroot( + cmd, shell=False, returnOutput=True, printOutput=False + ) + return output.splitlines() if output else [] + except Exception: + return [] + + def _get_rpm_dependencies_native(self, rpm_path): + """Extracts dependencies using native host bindings (fallback).""" + # pylint: disable=no-member + try: + ts = rpm.TransactionSet() + with open(rpm_path, "rb") as f: + hdr = ts.hdrFromFdno(f.fileno()) + + requirements = hdr[rpm.RPMTAG_REQUIRENAME] + if not requirements: + return [] + + return [r.decode('utf-8', 'replace') if isinstance(r, bytes) else str(r) for r in requirements] + except Exception: # pylint: disable=broad-exception-caught + self.buildroot.root_log.debug(f"Failed to extract dependencies via native bindings for {rpm_path}") + return [] + + def get_rpm_signature(self, rpm_path): + """Extracts the GPG signature of an RPM file.""" + chroot_path = self._resolve_chroot_path(rpm_path) + if chroot_path: + self.buildroot.root_log.debug(f"[SBOM] Using chroot-native 
signature query for: {chroot_path}") + return self._get_rpm_signature_chroot(chroot_path) + + self.buildroot.root_log.debug(f"[SBOM] Using host-native signature query for: {rpm_path}") + return self._get_rpm_signature_host(rpm_path) + + def _get_rpm_signature_chroot(self, chroot_rpm_path): + """Extracts signature using rpm inside the chroot.""" + try: + # Try to get it via queryformat first + cmd = ["rpm", "-qp", "--queryformat", "%{SIGPGP:pgpsig} %{SIGGPG:pgpsig}", chroot_rpm_path] + output, _ = self.buildroot.doChroot( + cmd, shell=False, returnOutput=True, printOutput=False + ) + sig = output.strip() if output else "" + if sig and sig != "(none) (none)" and sig != "(none)": + return sig.replace("(none)", "").strip() + + # Fallback to rpm -qip + cmd = ["rpm", "-qip", chroot_rpm_path] + output, _ = self.buildroot.doChroot( + cmd, shell=False, returnOutput=True, printOutput=False + ) + if output: + for line in output.splitlines(): + if "Signature" in line and ":" in line: + sig_val = line.split(":", 1)[1].strip() + if sig_val and sig_val != "(none)": + return sig_val + return None + except Exception: + return None + + def _get_rpm_signature_host(self, rpm_path): + """Extracts signature using host tools (fallback).""" + try: + # Query format for signatures + cmd = ["rpm", "-qp", "--queryformat", "%{SIGPGP:pgpsig} %{SIGGPG:pgpsig}", rpm_path] + result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=True, text=True) + sig = result.stdout.strip() + if sig and sig != "(none) (none)" and sig != "(none)": + return sig.replace("(none)", "").strip() + + # Second try via -qip + cmd = ["rpm", "-qip", rpm_path] + result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=True, text=True) + for line in result.stdout.splitlines(): + if "Signature" in line and ":" in line: + sig_val = line.split(":", 1)[1].strip() + if sig_val and sig_val != "(none)": + return sig_val + return None + except Exception: + return None + + + def 
hash_file(self, file_path): + """Calculates the SHA256 hash of a file.""" + sha256 = hashlib.sha256() + try: + with open(file_path, "rb") as f: + for chunk in iter(lambda: f.read(4096), b""): + sha256.update(chunk) + return sha256.hexdigest() + # pylint: disable=broad-exception-caught + except Exception as e: + self.buildroot.root_log.debug(f"Failed to hash file {file_path}: {e}") + return None + + def extract_source_files_from_srpm(self, src_rpm_path): + """Extracts metadata for source files from a source RPM without full extraction.""" + # pylint: disable=no-member + self.buildroot.root_log.debug(f"Extracting source metadata from source RPM: {src_rpm_path}") + source_files = [] + if not os.path.isfile(src_rpm_path): + return source_files + try: + ts = rpm.TransactionSet() + with open(src_rpm_path, "rb") as f: + hdr = ts.hdrFromFdno(f.fileno()) + + basenames = hdr[rpm.RPMTAG_BASENAMES] + digests = hdr[rpm.RPMTAG_FILEDIGESTS] + + # Create a set for quick lookup of signature files + file_set = set(basenames) + + for filename, sha256 in zip(basenames, digests): + if filename.endswith(".spec"): + continue + + signature = None + if filename.endswith(".asc") or filename.endswith(".sig"): + signature = "File is a signature file" + else: + for ext in [".asc", ".sig"]: + if filename + ext in file_set: + signature = f"GPG signature file exists: {filename}{ext}" + break + + source_files.append({ + "filename": filename, + "sha256": sha256, + "digital_signature": signature + }) + except Exception as e: + self.buildroot.root_log.debug(f"Failed to extract source metadata from {src_rpm_path}: {e}") + + return source_files + + + + def parse_spec_file(self, spec_path): + """Parses a spec file to extract metadata and source/patch files using the specfile library.""" + self.buildroot.root_log.debug(f"[SBOM] Parsing spec file: {spec_path}") + + sources = [] + metadata = { + "name": "", + "version": "", + "release": "", + "license": "", + "build_requires": [], + "requires": [] + } + + 
if not os.path.isfile(spec_path): + self.buildroot.root_log.debug(f"Spec file not found: {spec_path}") + return metadata, sources + try: + chroot_spec_path = self._from_chroot_path(spec_path) or spec_path + # Use rpmspec --parse inside the build chroot to ensure macro expansion + # matches the build environment exactly. + cmd = ["rpmspec", "--parse", chroot_spec_path] + result, _ = self.buildroot.doChroot( + cmd, shell=False, returnOutput=True, printOutput=False + ) + + if not result: + # If doChroot returned empty, try reading local spec as fallback + try: + with open(spec_path, 'r', encoding='utf-8') as f: + result = f.read() + except Exception: + return metadata, sources + + try: + from specfile import Specfile + # Use specfile to parse the expanded content + spec = Specfile(content=result, sourcedir=os.path.dirname(spec_path)) + + + # Extract canonical metadata + metadata.update({ + "name": spec.expanded_name, + "version": spec.expanded_version, + "release": spec.expanded_release, + "license": spec.expanded_license, + }) + + # Extract BuildRequires and Requires from headers + try: + br = spec.rpm_spec.sourceHeader[rpm.RPMTAG_REQUIRENAME] + metadata["build_requires"] = [ + r.decode('utf-8', 'replace') if isinstance(r, bytes) else str(r) + for r in br + ] if br else [] + except (AttributeError, KeyError): + metadata["build_requires"] = [] + + try: + r = spec.rpm_spec.packages[0].header[rpm.RPMTAG_REQUIRENAME] + metadata["requires"] = [ + req.decode('utf-8', 'replace') if isinstance(req, bytes) else str(req) + for req in r + ] if r else [] + except (AttributeError, KeyError, IndexError): + metadata["requires"] = [] + + # Extract both sources and patches from the spec object model + all_locs = [] + with spec.sources() as spec_sources: + all_locs.extend(s.location for s in spec_sources if s.location) + with spec.patches() as spec_patches: + all_locs.extend(p.location for p in spec_patches if p.location) + + for loc in all_locs: + filename, _, hash_value = 
loc.partition('#') + actual_filename = os.path.basename(filename) + build_dir = os.path.dirname(spec_path) + sources_dir = os.path.join(os.path.dirname(build_dir), "SOURCES") + file_path = os.path.join(sources_dir, actual_filename) + + actual_hash = None + if os.path.isfile(file_path): + actual_hash = self.rpm_helper.hash_file(file_path) + elif hash_value: + actual_hash = hash_value + + signature = ( + self.get_file_signature(file_path) if os.path.isfile(file_path) else None + ) + + sources.append({ + "filename": actual_filename, + "sha256": actual_hash, + "digital_signature": signature + }) + self.buildroot.root_log.debug(f"Extracted metadata {metadata} and {len(sources)} source/patch files from spec") + + # Double check we actually got metadata + if not metadata.get("name"): + raise ValueError("Empty metadata from Specfile") + + except Exception as e: + self.buildroot.root_log.debug(f"[SBOM] FALLBACK: Specfile library failed for {spec_path}, trying regex: {e}") + + # Ensure result is a string for regex + content = str(result) if result else "" + + # Fallback to simple regex parsing of the expanded result + name_match = (re.search(r'^Name:\s+(.+)$', content, re.MULTILINE) or + re.search(r'^name\s*:\s*(.+)$', content, re.IGNORECASE | re.MULTILINE)) + version_match = (re.search(r'^Version:\s+(.+)$', content, re.MULTILINE) or + re.search(r'^version\s*:\s*(.+)$', content, re.IGNORECASE | re.MULTILINE)) + release_match = (re.search(r'^Release:\s+(.+)$', content, re.MULTILINE) or + re.search(r'^release\s*:\s*(.+)$', content, re.IGNORECASE | re.MULTILINE)) + license_match = (re.search(r'^License:\s+(.+)$', content, re.MULTILINE) or + re.search(r'^license\s*:\s*(.+)$', content, re.IGNORECASE | re.MULTILINE)) + + metadata["name"] = name_match.group(1).strip() if name_match else "" + metadata["version"] = version_match.group(1).strip() if version_match else "" + metadata["release"] = release_match.group(1).strip() if release_match else "" + metadata["license"] = 
license_match.group(1).strip() if license_match else "" + + # Simple source/patch extraction from expanded spec + source_matches = re.finditer(r'^(Source|Patch)\d*:\s+(.+)$', content, re.MULTILINE) + for sm in source_matches: + loc = sm.group(2).strip() + filename = os.path.basename(loc.partition('#')[0]) + # Avoid duplicates + if not any(s['filename'] == filename for s in sources): + sources.append({ + "filename": filename, + "sha256": None, + "digital_signature": None + }) + except Exception as e: + self.buildroot.root_log.debug(f"Failed to parse spec file {spec_path}: {e}") + self.buildroot.root_log.debug(traceback.format_exc()) + + return metadata, sources + + def detect_chroot_distribution(self): + """Detects the distribution ID (e.g., 'fedora', 'centos', 'rhel') from inside the chroot.""" + try: + import distro + try: + distro_id = distro.LinuxDistribution(root_dir=self.buildroot.rootdir).id() + except (TypeError, AttributeError): + # Fallback for older python-distro versions (<1.6.0) + os_release = os.path.join(self.buildroot.rootdir, "etc/os-release") + distro_id = "unknown" + if os.path.isfile(os_release): + with open(os_release, 'r', encoding='utf-8') as f: + for line in f: + if line.startswith("ID="): + distro_id = line.split("=", 1)[1].strip().strip('"').strip("'") + break + + if distro_id: + return distro_id.lower() + return "unknown" + except Exception as e: + self.buildroot.root_log.debug(f"Failed to detect chroot distribution: {e}") + return "unknown" + + def get_build_toolchain_packages(self): + """Returns the list of packages installed in the build toolchain + with detailed signature information collected in a single batch query.""" + try: + fields = [ + "%{NAME}", "%{VERSION}-%{RELEASE}", "%{ARCH}", "%{LICENSE}", + "%{BUILDTIME}", "%{RSAHEADER:pgpsig}", "%{DSAHEADER:pgpsig}", + "%{SIGGPG:pgpsig}", "%{SIGPGP:pgpsig}", "%{SHA256HEADER}", + "%{SOURCERPM}" + ] + query = "|".join(fields) + "\n" + cmd = ["rpm", "-qa", "--qf", query] + output, _ = self.buildroot.doChroot( + cmd, shell=False, 
returnOutput=True, printOutput=False + ) + + packages = [] + cpe_vendor_default = self.detect_chroot_distribution() or "unknown" + + for line in output.splitlines(): + parts = line.split("|") + if len(parts) < 6: + continue + + package_name = parts[0].strip() + package_version = parts[1].strip() + package_arch = parts[2].strip() + package_license = parts[3].strip() + build_time = parts[4].strip() + + raw_sig = None + for sig_candidate in parts[5:9]: + sig_candidate = sig_candidate.strip() + if sig_candidate and sig_candidate != "(none)": + raw_sig = sig_candidate + break + + package_checksum = parts[9].strip() if len(parts) > 9 else None + if package_checksum == "(none)": + package_checksum = None + + source_rpm = parts[10].strip() if len(parts) > 10 else None + if source_rpm == "(none)": + source_rpm = None + + if ( + package_name.startswith('gpg-pubkey') or + package_name == '(none)' or + not package_name + ): + continue + + digital_signature = { + "signature_type": "unsigned", + "signature_key": None, + "signature_date": None, + "signature_algorithm": None, + "signature_valid": False, + "raw_signature_data": raw_sig, + "build_date": None + } + + if raw_sig: + self._parse_signature_data(raw_sig, digital_signature) + + if build_time and build_time.isdigit(): + try: + dt = datetime.fromtimestamp(int(build_time), tz=timezone.utc) + digital_signature["build_date"] = dt.isoformat() + except (ValueError, TypeError, OverflowError): + pass + + cpe = self.generate_cpe(package_name, package_version, vendor=cpe_vendor_default) + + packages.append({ + "name": package_name, + "version": package_version, + "arch": package_arch, + "licenseDeclared": package_license, + "digital_signature": digital_signature, + "sourcerpm": source_rpm, + "cpe": cpe, + "checksum": package_checksum + }) + + self.buildroot.root_log.debug(f"Found {len(packages)} build toolchain packages") + return packages + except Exception as e: + self.buildroot.root_log.debug(f"Failed to get build environment 
packages: {e}") + return [] + + def get_distribution(self): + """Detects the distribution from the chroot environment (human readable).""" + try: + os_release = os.path.join(self.buildroot.rootdir, "etc/os-release") + distro_name = "Unknown" + version = "" + if os.path.isfile(os_release): + with open(os_release, 'r', encoding='utf-8') as f: + for line in f: + if line.startswith("NAME="): + distro_name = line.strip().split("=", 1)[1].strip("\"'") + elif line.startswith("VERSION_ID="): + version = line.strip().split("=", 1)[1].strip("\"'") + if distro_name and version: + return f"{distro_name} {version}" + return distro_name or "Unknown" + except OSError as e: + return f"Unknown ({e})" + + + + diff --git a/mock/tests/test_from_chroot_path.py b/mock/tests/test_from_chroot_path.py new file mode 100644 index 000000000..d922255ca --- /dev/null +++ b/mock/tests/test_from_chroot_path.py @@ -0,0 +1,67 @@ +""" Tests for from_chroot_path in buildroot.py """ + +import pytest +from unittest.mock import MagicMock +from mockbuild import buildroot + +def test_from_chroot_path(): + """ test from_chroot_path method """ + config = MagicMock() + uid_manager = MagicMock() + state = MagicMock() + plugins = MagicMock() + + # Mock config and rootdir + config_dict = { + 'root': 'fedora-rawhide-x86_64', + 'basedir': '/var/lib/mock', + 'rootdir': '/var/lib/mock/fedora-rawhide-x86_64/root', + 'resultdir': 'results', + 'chroothome': '/builddir', + 'cache_topdir': '/var/cache/mock', + 'plugin_conf': {'selinux_enable': False}, + 'chrootuid': 1000, + 'chrootuser': 'mockbuild', + 'chrootgid': 1000, + 'chrootgroup': 'mock', + 'environment': {}, + 'use_buildroot_image': False, + 'buildroot_image': None, + 'buildroot_image_skip_pull': False, + 'buildroot_image_keep_getting': False, + 'additional_packages': [], + 'version': '1.0', + 'files': {}, + 'extra_chroot_dirs': [], + 'macros': {}, + 'package_manager': 'dnf', + 'tar_binary': 'tar', + 'image_fallback': True, + 'nspawn_args': [], + 'rpm_command': 'rpm', + 'unique-ext': 
'none' + } + config.__getitem__.side_effect = lambda key: config_dict.get(key) + config.__contains__.side_effect = lambda key: key in config_dict + config.get.side_effect = lambda key, default=None: config_dict.get(key, default) + + # Initialize Buildroot + br = buildroot.Buildroot(config, uid_manager, state, plugins) + br.rootdir = "/var/lib/mock/fedora-rawhide-x86_64/root" + + # Test cases + host_path = "/var/lib/mock/fedora-rawhide-x86_64/root/builddir/build/SPECS/test.spec" + expected_chroot_path = "/builddir/build/SPECS/test.spec" + assert br.from_chroot_path(host_path) == expected_chroot_path + + # Test path not in rootdir + other_path = "/tmp/test.spec" + assert br.from_chroot_path(other_path) == other_path + + # Test rootdir without trailing slash + br.rootdir = "/myroot" + assert br.from_chroot_path("/myroot/etc/passwd") == "/etc/passwd" + + # Test rootdir with trailing slash (should handle it gracefully) + br.rootdir = "/myroot/" + assert br.from_chroot_path("/myroot/etc/passwd") == "/etc/passwd" diff --git a/releng/release-notes-next/sbom-generator.feature b/releng/release-notes-next/sbom-generator.feature new file mode 100644 index 000000000..a1b6f04a2 --- /dev/null +++ b/releng/release-notes-next/sbom-generator.feature @@ -0,0 +1,10 @@ +The new SBOM generator plugin provides comprehensive visibility into the build +environment by capturing the **complete build toolchain** installed in the +chroot, including per-package GPG signatures and vendor metadata. It establishes +full audit traceability by linking built RPMs with their original source +tarballs and patches, including SHA-256 hashes. Supporting both CycloneDX 1.5 +and SPDX 2.3 formats, the plugin leverages a chroot-native analysis model to +ensure high metadata accuracy for cross-distribution builds and compatibility +with modern security scanners, File Integrity Monitoring (FIM), and supply chain +forensic analysis. +
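
The host-to-chroot path mapping that `test_from_chroot_path` exercises can be sketched as a standalone function. This is a minimal sketch inferred only from the assertions in that test, not the actual `Buildroot.from_chroot_path` implementation; the name `host_to_chroot_path` and its two-argument signature are hypothetical.

```python
def host_to_chroot_path(rootdir, host_path):
    """Map a host path under rootdir to the path seen inside the chroot.

    Hypothetical helper: behavior is inferred from test_from_chroot_path,
    not copied from mockbuild.
    """
    # Normalize so "/myroot" and "/myroot/" behave identically.
    root = rootdir.rstrip("/")
    if not root:
        # rootdir was "/" (or empty): host and chroot paths coincide.
        return host_path
    if host_path == root:
        # The chroot root directory itself maps to "/".
        return "/"
    # Require a separator after the prefix so "/myrootfoo" is not
    # mistaken for a path inside "/myroot".
    if host_path.startswith(root + "/"):
        return host_path[len(root):]
    # Paths outside the chroot are returned unchanged, as the test expects.
    return host_path


if __name__ == "__main__":
    print(host_to_chroot_path("/myroot/", "/myroot/etc/passwd"))
```

Matching on `root + "/"` rather than on the bare prefix is the design choice worth noting: a plain `startswith(root)` check would also rewrite sibling directories that merely share the prefix.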