The EMBA book ‐ Chapter 5: SBOM and vulnerability aggregation
In our journey through EMBA so far, we've seen how it meticulously extracts all the hidden parts from a firmware image (Chapter 1: Firmware Extraction Layer). Then, we learned how it thoroughly scrutinizes these extracted files without running them (Chapter 2: The Analysis Core). Finally, we explored how EMBA virtually "boots up" the firmware to observe its live behavior and uncover runtime issues (Chapter 3: Dynamic Analysis with user-mode Emulation and Chapter 4: Booting up the system via System emulation).
By now, EMBA has gathered a mountain of data: a list of every file found, software versions detected, potentially weak configurations, and more. But what do you do with all this raw information? How do you get a clear, organized picture of what's inside the firmware and what security risks it might face?
This is precisely where EMBA's Software Bill of Materials (SBOM) & Vulnerability Aggregation layer comes into play! Imagine you've finished baking a complicated cake. You have all the ingredients laid out, you've mixed and baked them, and now you want to present a clear "recipe card" listing everything that went into it, along with any potential "allergy warnings" for those ingredients. This chapter will explain how EMBA creates such a detailed "recipe card" for your firmware, highlighting any known security "allergies."
Our central goal in this chapter is to understand how EMBA takes all the fragmented information it has collected and organizes it into a comprehensive inventory of software components (an SBOM), then cross-references this inventory with vast databases of known security flaws to identify and present potential vulnerabilities.
At its core, a Software Bill of Materials (SBOM) is like an "ingredients list" for software. Just as a food product lists all its components (flour, sugar, eggs), an SBOM lists every single software component, library, and module that makes up a piece of software.
Why is this important for firmware?
- Transparency: It tells you exactly what software is inside your device, even if the manufacturer doesn't publicly disclose it.
- Security Posture: If you know the ingredients, you can check for known problems with those ingredients.
- Compliance: Many regulations and industry standards are now requiring SBOMs for better supply chain security.
An SBOM typically includes:
| Field | Analogy | Example in Firmware |
|---|---|---|
| Name | Ingredient Name | `busybox` |
| Version | Quantity/Specific Type | `1.30.1` |
| Supplier | Manufacturer | busybox.net |
| License | Allergen Information | GPL-2.0 |
| Dependencies | Sub-ingredients | `busybox` might depend on `glibc` |
| Hashes | Unique Barcode | MD5, SHA256 of the binary |
| Paths | Where it's found in the package | `/bin/busybox`, `/usr/local/bin/busybox` |
Having an SBOM is crucial for supply chain security. If a vulnerability is discovered in OpenSSL version 1.0.2, you can quickly check your SBOMs to see if any of your devices use that specific version.
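As a rough illustration of that lookup, the sketch below searches a flat component inventory for an exact name and version. The `name,version,path` CSV layout, the sample rows, and the `find_component` helper are assumptions made for this example; they are not an EMBA format.

```bash
# Hypothetical helper: print the path of every inventory row that matches
# a component name and version exactly.
# Assumed inventory layout (not an EMBA format): name,version,path
find_component() {
  # Appending "" forces a string comparison, so "1.10" never equals "1.1"
  awk -F ',' -v name="$1" -v vers="$2" \
    '($1 "") == (name "") && ($2 "") == (vers "") { print $3 }' "$3"
}

# Illustrative inventory for a single device
INVENTORY="$(mktemp)"
cat > "${INVENTORY}" <<'EOF'
busybox,1.30.1,/bin/busybox
openssl,1.0.2,/usr/lib/libssl.so.1.0.0
openssl,1.1.1,/opt/vendor/lib/libssl.so.1.1
EOF

find_component "openssl" "1.0.2" "${INVENTORY}"
```

With the sample rows above, only the path of the OpenSSL 1.0.2 library is printed. A real SBOM would be queried through its JSON structure instead of a CSV file.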
Once we have our "ingredients list" (the SBOM), the next crucial step is to check each ingredient for known security flaws. This process is called Vulnerability Aggregation. It's like taking your cake's ingredient list and cross-referencing it with a massive database of food allergies or recalls.
Key concepts in vulnerability aggregation:
- CVEs (Common Vulnerabilities and Exposures): These are unique identifiers for publicly disclosed cybersecurity vulnerabilities (e.g., CVE-2023-12345). Each CVE describes a specific flaw in a specific piece of software.
- Exploit Databases: These are collections of publicly available "proof-of-concept" code or actual exploit tools that demonstrate how to take advantage of a vulnerability (e.g., Exploit-DB, Metasploit). EMBA checks these to see if an identified CVE has readily available attack code.
- EPSS (Exploit Prediction Scoring System): A data-driven score that estimates the likelihood of a vulnerability being exploited in the wild.
- Verification: This is EMBA's special sauce! Instead of just saying "this version has CVE-X," EMBA tries to verify if the vulnerable feature or code path is actually present in the specific binary found in the firmware. This significantly reduces false positives.
- VEX (Vulnerability Exploitability eXchange): A VEX document goes a step further than just listing vulnerabilities. It provides a clear statement about whether a vulnerability identified in a component actually affects the product or system in question. For example, a library might have a CVE, but if the firmware uses it in a way that avoids the vulnerable function, the VEX can state that the product is "not affected" by it.
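To make the interplay of these signals more concrete, here is a small sketch that orders findings by EPSS score (highest first) and marks entries with a known public exploit. The semicolon-separated `CVE;EPSS;exploit` layout, the placeholder CVE IDs, the scores, and the `rank_findings` helper are all invented for illustration; EMBA's own reports are structured differently.

```bash
# Hypothetical helper: sort findings by EPSS score (highest first) and flag
# entries that have a public exploit.
# Assumed input layout: CVE-ID;EPSS score (0..1);public exploit (yes/no)
rank_findings() {
  LC_ALL=C sort -t ';' -k2,2nr "$1" | awk -F ';' '{
    flag = ($3 == "yes") ? " [public exploit]" : ""
    printf "%s EPSS=%s%s\n", $1, $2, flag
  }'
}

# Placeholder findings with made-up scores
FINDINGS="$(mktemp)"
cat > "${FINDINGS}" <<'EOF'
CVE-0000-0001;0.02;no
CVE-0000-0002;0.91;yes
CVE-0000-0003;0.40;no
EOF

rank_findings "${FINDINGS}"
```

Sorting on a single score is a deliberate simplification. In practice the verification result and the VEX statement would also influence which findings deserve attention first.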
As a user, you don't need special commands for this phase. Once you run EMBA, after the extraction, static, and dynamic analysis steps are complete, EMBA automatically proceeds to build the SBOM and aggregate vulnerabilities. It's an integrated part of the overall analysis pipeline.
You'll see output similar to this:
```
[*] SBOM - main package SBOM environment
[+] Debian packages SBOM results
[*] Analyzing 10 Python pipfile.lock archives:
[+] Python pipfile.lock SBOM results
[*] Generating final VEX vulnerability json ...
[+] VEX data in json format is available
[+] CycloneDX SBOM with VEX data in JSON format is ready
```
This indicates EMBA is actively identifying packages (like Debian or Python Pipfile locks) to build the SBOM, and then generating VEX (Vulnerability Exploitability eXchange) data and a final CycloneDX SBOM.
Let's break down some key aspects of how EMBA builds the SBOM and aggregates vulnerability data:
EMBA has many specialized "sub-modules" dedicated to finding software components from different sources. This is handled primarily by S08_main_package_sbom.sh and its sub-modules. It's like having different experts looking for different kinds of "ingredients" in your extracted firmware.
- Package Managers: If the firmware uses a Linux distribution like Debian, OpenWrt, or Alpine, EMBA checks their package management databases (e.g., `/var/lib/dpkg/status` for Debian, `/var/lib/opkg/info/*.control` for OpenWrt, `/lib/apk/db/installed` for Alpine). These files list installed software, their versions, and sometimes even their dependencies and licenses.

  For example, to parse Alpine APK packages, EMBA might look into `.PKGINFO` files after extracting a `.apk` archive:

  ```bash
  # Simplified from modules/S08_main_package_sbom_modules/S08_submodule_alpine_apk_package_parser.sh
  # After extracting an .apk file to /tmp/apk
  lAPP_NAME=$(grep '^pkgname = ' "${TMP_DIR}"/apk/.PKGINFO || true)
  lAPP_NAME=${lAPP_NAME/pkgname\ =\ }
  # ... more parsing for version, license, etc.
  echo "Found Alpine package: Name=${lAPP_NAME}, Version=${lAPP_VERS}"
  ```

  This snippet shows how EMBA extracts details like the package name from a specific file (`.PKGINFO`) found within an Alpine `.apk` package.

- Programming Language Lock/Requirement Files: EMBA also looks for files common in software development projects that list dependencies, such as Python's `requirements.txt` or `Pipfile.lock`, Node.js's `package-lock.json`, PHP's `composer.lock`, or Rust's `Cargo.lock`.

  Here's a simplified example of parsing a Python `requirements.txt`:

  ```bash
  # Simplified from modules/S08_main_package_sbom_modules/S08_submodule_python_requirements_parser.sh
  # Reads a line like "requests==2.28.1"
  if [[ "${lRES_ENTRY}" == *"=="* ]]; then
    lAPP_NAME=${lRES_ENTRY/==*}
    lAPP_VERS=${lRES_ENTRY/*==}
  fi
  echo "Python requirement: Name=${lAPP_NAME}, Version=${lAPP_VERS}"
  ```

  This demonstrates how EMBA extracts the package name and version from a line in a `requirements.txt` file.

- Metadata in Binary Files: For Windows executables, EMBA can sometimes extract software names and versions from metadata embedded in the binary (like EXIF data in image files, but for executables).

  ```bash
  # Simplified from modules/S08_main_package_sbom_modules/S08_submodule_windows_exifparser.sh
  # Extracts product name and version from a Windows EXE using exiftool
  exiftool "${lEXE_ARCHIVE}" > "${lEXIF_LOG}"
  lAPP_NAME=$(grep "Product Name" "${lEXIF_LOG}" || true)
  lAPP_NAME=${lAPP_NAME/*:\ }
  lAPP_VERS=$(grep "Product Version Number" "${lEXIF_LOG}" || true)
  lAPP_VERS=${lAPP_VERS/*:\ }
  ```

  `exiftool` is a utility that can read various metadata, including that found in Windows executables.

- Generic Version Detection: As seen in Chapter 2: Static Analysis Core, EMBA's `S09_firmware_base_version_check.sh` and `S115_usermode_emulator.sh` (user-mode emulation) directly identify software versions from binaries (like BusyBox or OpenSSL) by looking for specific strings or running the binary. These findings also contribute to the SBOM.
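The package-database lookup can be sketched for a Debian-style status file as follows. This is not EMBA's parser: the `list_dpkg_packages` helper and the sample stanzas are assumptions for illustration, and only the `Package`, `Status`, and `Version` fields are read.

```bash
# Hypothetical helper: print "name version" for every package that a
# dpkg status file marks as installed.
list_dpkg_packages() {
  awk '
    /^Package: / { name = $2 }
    /^Version: / { vers = $2 }
    /^Status: /  { installed = ($0 ~ / installed$/) }
    # A blank line ends a stanza: report it and reset the state
    /^$/ { if (name != "" && installed) print name, vers
           name = ""; vers = ""; installed = 0 }
    # The last stanza may not be followed by a blank line
    END  { if (name != "" && installed) print name, vers }
  ' "$1"
}

# Illustrative status file with one removed package
STATUS_FILE="$(mktemp)"
cat > "${STATUS_FILE}" <<'EOF'
Package: busybox
Status: install ok installed
Version: 1:1.30.1-4

Package: telnetd
Status: deinstall ok config-files
Version: 0.17-41

Package: libc6
Status: install ok installed
Version: 2.31-0ubuntu9.9
EOF

list_dpkg_packages "${STATUS_FILE}"
```

For the sample file this prints the `busybox` and `libc6` entries and skips `telnetd`, whose status shows that only configuration files remain.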
Once a component's details are extracted, helper functions (helpers/helpers_emba_sbom_helpers.sh) standardize the data and store it as individual JSON files in EMBA's internal SBOM_LOG_PATH directory.
```bash
# Simplified from helpers/helpers_emba_sbom_helpers.sh
# Function to build a JSON fragment for a component
build_sbom_json_component_arr() {
  local lPACKAGING_SYSTEM="${1:-}" # e.g., "debian_pkg_mgmt"
  local lAPP_NAME="${2:-}"         # e.g., "libc6"
  local lAPP_VERS="${3:-}"         # e.g., "2.31-0ubuntu9.9"
  # ... other parameters like license, maintainer, description

  # Construct a unique ID for this component
  SBOM_COMP_BOM_REF="$(uuidgen)"

  # Use 'jo' (JSON output) to create a JSON object
  jo -n -- \
    type="library" \
    name="${lAPP_NAME}" \
    version="${lAPP_VERS}" \
    group="${lPACKAGING_SYSTEM}" \
    bom-ref="${SBOM_COMP_BOM_REF}" \
    properties="$(jo -a "${PROPERTIES_JSON_ARR[@]}")" \
    hashes="$(jo -a "${HASHES_ARR[@]}")" \
    > "${SBOM_LOG_PATH}/${lPACKAGING_SYSTEM}_${lAPP_NAME}_${SBOM_COMP_BOM_REF}.json"
}
```

The function `build_sbom_json_component_arr` is central to creating the individual JSON entries for each identified software component. It takes the parsed details (name, version, packaging system) and organizes them into a structured format, also incorporating calculated file hashes and other "properties" collected during analysis.
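To show roughly what such a fragment looks like, the following sketch assembles one component object with `printf` instead of `jo`, so that it runs without extra tools. The helpers `json_escape` and `print_component_json`, the argument order, and the `ref-0001` identifier are hypothetical; the field names follow the CycloneDX component layout, and the exact fields EMBA writes may differ.

```bash
# Escape backslashes and double quotes so a value can sit inside a JSON string.
# (Control characters are not handled; this is only a sketch.)
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: print one CycloneDX-style component object.
# Arguments: packaging system, name, version, bom-ref, path to the binary
print_component_json() {
  # SHA-256 of the file that the component was identified from
  sha256="$(sha256sum "$5" | cut -d ' ' -f 1)"
  printf '{"type":"library","name":"%s","version":"%s","group":"%s","bom-ref":"%s","hashes":[{"alg":"SHA-256","content":"%s"}]}\n' \
    "$(json_escape "$2")" "$(json_escape "$3")" "$(json_escape "$1")" \
    "$(json_escape "$4")" "${sha256}"
}

# Illustrative call with a throwaway file standing in for the binary
SAMPLE_BIN="$(mktemp)"
printf 'demo' > "${SAMPLE_BIN}"
print_component_json "debian_pkg_mgmt" "libc6" "2.31-0ubuntu9.9" "ref-0001" "${SAMPLE_BIN}"
```

A dedicated JSON tool such as `jo` remains the more robust choice, because it handles quoting and nested arrays without hand-written escaping.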
After all individual components are identified, the F15_cyclonedx_sbom.sh module takes all these small JSON fragments and merges them into one comprehensive SBOM document, typically in the widely-used CycloneDX format.
```bash
# Simplified from modules/F15_cyclonedx_sbom.sh
# Aggregates all individual component JSONs into a single CycloneDX SBOM file
# ...
echo -n "[" > "${SBOM_LOG_PATH}/sbom_components_tmp.json"
lFIRST_ENTRY=1
for lCOMP_FILE in "${lCOMP_FILES_ARR[@]}"; do
  # Adds a comma before every entry except the first, so no trailing comma remains
  if [[ "${lFIRST_ENTRY}" -eq 0 ]]; then
    echo -n "," >> "${SBOM_LOG_PATH}/sbom_components_tmp.json"
  fi
  lFIRST_ENTRY=0
  # Reads each component's JSON file and appends it to a temporary file
  cat "${lCOMP_FILE}" >> "${SBOM_LOG_PATH}/sbom_components_tmp.json"
done
echo -n "]" >> "${SBOM_LOG_PATH}/sbom_components_tmp.json"
# ...
# Finally, constructs the full SBOM with metadata, components, and dependencies
jo -p -n -- \
  \$schema="http://cyclonedx.org/schema/bom-1.5.schema.json" \
  bomFormat="CycloneDX" \
  specVersion="1.5" \
  components=:"${lSBOM_LOG_FILE}_components.json" \
  dependencies=:"${lSBOM_LOG_FILE}_dependencies.json" \
  vulnerabilities="[]" \
  > "${lSBOM_LOG_FILE}.json"
```

This simplified code snippet shows the high-level process of `F15_cyclonedx_sbom.sh`: it gathers all the individual component JSONs (generated by the S08 sub-modules) and combines them into a single, well-formed CycloneDX SBOM JSON file. It also sets up the basic SBOM metadata.
The following screenshots are taken from a typical SBOM:


The F17_cve_bin_tool.sh module is the central orchestrator for vulnerability aggregation. It uses the generated SBOM as its starting point.
- CVE Lookup: For every component identified in the SBOM, `F17` uses `cve-bin-tool` (an external tool integrated by EMBA) to query known vulnerability databases (like the National Vulnerability Database - NVD) based on the component's name and version.

  ```bash
  # Simplified from modules/F17_cve_bin_tool.sh
  # Iterates through SBOM entries and runs cve-bin-tool
  python3 "${lCVE_BIN_TOOL}" -i "${LOG_PATH_MODULE}/${lBOM_REF}.tmp.csv" \
    --disable-version-check --offline -f csv \
    -o "${LOG_PATH_MODULE}/${lBOM_REF}_${lPRODUCT_NAME}_${lVERS}" || true
  ```

  This `python3` command invokes `cve-bin-tool`, telling it to read component information from a temporary CSV file (created from the SBOM entry) and output found vulnerabilities to a new CSV file. The `--offline` flag means it uses locally cached vulnerability databases.

- Exploit Information: For each CVE found, EMBA then checks various exploit databases to see if public exploit code or a Proof-of-Concept (PoC) exists. This includes:
  - Exploit-DB: A public archive of exploits and shellcode.
  - Metasploit Framework: A popular penetration testing framework with many exploit modules.
  - CISA Known Exploited Vulnerabilities (KEV): A catalog maintained by the U.S. Cybersecurity and Infrastructure Security Agency for vulnerabilities known to be actively exploited.
  - Packetstormsecurity & Snyk: Other public sources for PoCs and advisories.
  - Routersploit: A framework specifically for embedded device exploitation.

  EMBA regularly updates its local copies of these exploit databases using helper scripts (e.g., `helpers/known_exploited_vulns_update.sh`, `helpers/metasploit_db_update.sh`, `helpers/packet_storm_crawler.sh`, `helpers/snyk_crawler.sh`).

  The `tear_down_cve_threader` function within `F17` performs these checks for each CVE:

  ```bash
  # Simplified from modules/F17_cve_bin_tool.sh (tear_down_cve_threader function)
  # Check if exploit exists in Exploit-DB
  mapfile -t lEXPLOIT_AVAIL_EDB_ARR < <(cve_searchsploit "${lCVE_ID}" 2>/dev/null || true)
  if [[ " ${lEXPLOIT_AVAIL_EDB_ARR[*]} " =~ "Exploit DB Id:" ]]; then
    lEXPLOIT="Exploit (EDB ID: ${lEXPLOIT_ID})"
  fi
  # Check if exploit exists in Metasploit
  mapfile -t lEXPLOIT_AVAIL_MSF_ARR < <(grep -E "${lCVE_ID}$" "${MSF_DB_PATH}" 2>/dev/null || true)
  if [[ ${#lEXPLOIT_AVAIL_MSF_ARR[@]} -gt 0 ]]; then
    lEXPLOIT+=" / MSF: ${lEXPLOIT_NAME}"
  fi
  # ... similar checks for Packetstorm, Snyk, Routersploit, KEV
  ```

  This code illustrates how EMBA searches its local copies of exploit databases (`MSF_DB_PATH`, etc.) for the identified `lCVE_ID`, for example with `grep`, to determine if a public exploit exists.

- EPSS (Exploit Prediction Scoring System): EMBA also fetches EPSS scores, which indicate the likelihood of a vulnerability being exploited in the wild.

  ```bash
  # Simplified from modules/F17_cve_bin_tool.sh (get_epss_data function)
  get_epss_data() {
    local lCVE_ID="${1:-}"
    local lCVE_YEAR="$(echo "${lCVE_ID}" | cut -d '-' -f2)"
    lCVE_EPSS_PATH="${EPSS_DATA_PATH}/CVE_${lCVE_YEAR}_EPSS.csv"
    if [[ -f "${lCVE_EPSS_PATH}" ]]; then
      lEPSS_DATA=$(grep "^${lCVE_ID};" "${lCVE_EPSS_PATH}" || true)
      lEPSS_EPSS=$(echo "${lEPSS_DATA}" | cut -d ';' -f2) # The EPSS score
      # ... further processing for percentage
    fi
    echo "${lEPSS_EPSS}" # Returns the EPSS score
  }
  ```

  This function reads locally stored EPSS data (which is updated by EMBA's internal helper scripts) to provide a probabilistic score for a given CVE.

- Vulnerability Verification (Reducing False Positives): A key differentiator for EMBA is its ability to verify if a CVE truly affects the firmware. Simply finding a CVE for a software version doesn't mean it's exploitable in this specific firmware. EMBA does this through modules like:

  - `S26_kernel_vuln_verifier.sh`: For Linux kernels, this module checks if vulnerable kernel functions are actually present and used in the firmware's kernel binary by analyzing kernel symbols or the kernel's configuration (`.config` file).

    ```bash
    # Simplified from modules/S26_kernel_vuln_verifier.sh (symbol_verifier function)
    # Checks if a vulnerable function (from a CVE description) is present
    # as an exported symbol in the kernel's compiled binary or modules.
    for lCHUNK_FILE in "${LOG_PATH_MODULE}"/symbols_uniq.split.* ; do
      if grep -q -f "${lCHUNK_FILE}" "${lKERNEL_DIR}/${lK_PATH}" ; then
        echo "Vulnerability ${lCVE} verified via exported symbol in ${lK_PATH}"
        lVULN_FOUND=1
        break
      fi
    done
    ```

    This snippet demonstrates how EMBA searches for a vulnerable function's "symbol" (like a function name) within the kernel's compiled code. If found, it increases confidence that the vulnerability is indeed present.

  - `S118_busybox_verifier.sh`: For BusyBox, this module goes a step further. BusyBox is a collection of many small utilities (called "applets"). A CVE might affect BusyBox in general, but only if a specific applet (e.g., `telnetd`) is compiled into this firmware. EMBA identifies which applets are actually present (from static analysis or user-mode emulation) and only flags CVEs relevant to those active applets.

    ```bash
    # Simplified from modules/S118_busybox_verifier.sh (busybox_vuln_testing_threader function)
    # Checks if a CVE's summary mentions an applet that is actually present
    for lBB_APPLET in "${BB_VERIFIED_APPLETS[@]}"; do
      if [[ "${lSUMMARY}" == *" ${lBB_APPLET} "* ]]; then
        echo "Verified BusyBox vulnerability ${lCVE} - applet ${lBB_APPLET}"
        # Log this as a *verified* vulnerability
      fi
    done
    ```

    This shows how EMBA checks if an identified applet (like `telnetd`) is mentioned in the summary of a BusyBox CVE. If so, it confirms that this specific vulnerability is relevant to the firmware's BusyBox configuration.

Finally, all this aggregated and verified vulnerability data is incorporated back into the SBOM, usually as a VEX document, providing a holistic view of the firmware's security posture.
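Returning to the BusyBox applet check: the simplified `S118` loop matches an applet name only when it is surrounded by spaces. The sketch below illustrates the same idea with a whole-word match that also tolerates punctuation next to the name. It is a variation written for this chapter, not EMBA's code; the `summary_mentions_applet` helper and the sample summary text are invented.

```bash
# Hypothetical helper: succeed if the CVE summary mentions the applet as a
# whole word. Punctuation is replaced by spaces first, so "telnetd." matches,
# while hyphens are kept for applet names such as "run-parts".
summary_mentions_applet() {
  normalized=" $(printf '%s' "$1" | tr -c 'A-Za-z0-9_-' ' ') "
  case "${normalized}" in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Invented summary text, checked against two applets assumed to be present
SUMMARY="A flaw in telnetd, as shipped in an example release, allows remote attackers to crash the service."
for applet in wget telnetd; do
  if summary_mentions_applet "${SUMMARY}" "${applet}"; then
    echo "summary mentions present applet: ${applet}"
  fi
done
```

Matching on summary text stays a heuristic either way: a summary can name an applet without the vulnerable code path being reachable, which is why such results raise confidence instead of proving exploitability.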
The SBOM & Vulnerability Aggregation functionality is the cornerstone of EMBA's security analysis, transforming raw findings into actionable intelligence. By meticulously building a "nutrition label" (SBOM) for your firmware and cross-referencing it with "recall notices" (vulnerabilities), EMBA provides unparalleled transparency into your device's software supply chain. Its advanced verification capabilities help you cut through the noise, focusing only on the vulnerabilities that truly affect your firmware.
With EMBA's detailed analysis complete, the final step is to present all these findings in a clear, concise, and user-friendly manner. In the next chapter, Reporting & User Experience, you will discover how EMBA compiles all its discoveries into comprehensive reports that make sense for both technical and non-technical stakeholders.
EMBA - firmware security scanning at its best