Description
Overview
Ubuntu provides a tool called Chisel, which lets developers strip unnecessary files from an image and keep only the minimum set of files required.
This makes it difficult for existing scanners, which rely on manifest data, to accurately determine which packages are on the container image.
Chisel, however, provides a manifest format that records exactly which packages (and which files belonging to each package) exist on the image: https://documentation.ubuntu.com/chisel/en/latest/reference/manifest/
Details
To support the Chisel manifest file, we need to create a new Extractor Plugin. See https://github.com/google/osv-scalibr/blob/main/docs/new_extractor.md for a guide on how the current extractor interface works and tips on implementing a new extractor.
Once an extractor is created and working, we can enable it in osv-scanner for container scanning, and it should automatically pipe through the manifest data as inventories.
Open questions:
- Is there an easy way to identify manifest files from the file name and path? Looking at the documentation, it seems like we can just find all the `.wall` files under `/var/lib/chisel/`. Is this correct?
- Will there be manifest files with overlapping packages that need to be deduplicated? E.g. multiple packages in one `.wall` file and multiple packages in another, both referring to some of the same packages.
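To make the two questions concrete, here is a minimal Go sketch of what the answers might look like if both turn out to be "yes". It assumes manifests are any `.wall` file under `/var/lib/chisel/`, which is exactly the unconfirmed part, and that a package is uniquely identified by its name and version. The names `isChiselManifest`, `dedupePackages`, and `Package` are hypothetical and not part of osv-scalibr.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// chiselManifestDir is the directory named in the open question above.
// ASSUMPTION: whether every manifest lives here is still unconfirmed.
const chiselManifestDir = "/var/lib/chisel/"

// Package is a hypothetical minimal record: just a name and a version.
type Package struct {
	Name    string
	Version string
}

// isChiselManifest reports whether p looks like a Chisel manifest, i.e. a
// file with the .wall extension somewhere under chiselManifestDir.
// Paths with or without a leading slash are accepted.
func isChiselManifest(p string) bool {
	p = path.Clean("/" + p)
	return strings.HasPrefix(p, chiselManifestDir) && path.Ext(p) == ".wall"
}

// dedupePackages drops repeated name+version pairs, keeping first-seen order.
// This would only matter if several manifests can list the same package.
func dedupePackages(pkgs []Package) []Package {
	seen := make(map[Package]bool, len(pkgs))
	out := make([]Package, 0, len(pkgs))
	for _, p := range pkgs {
		if seen[p] {
			continue
		}
		seen[p] = true
		out = append(out, p)
	}
	return out
}

func main() {
	fmt.Println(isChiselManifest("var/lib/chisel/manifest.wall"))
	merged := dedupePackages([]Package{
		{Name: "libc6", Version: "2.39"},
		{Name: "libc6", Version: "2.39"},
		{Name: "zlib1g", Version: "1.3"},
	})
	fmt.Println(len(merged))
}
```

If manifests can only ever appear directly inside the directory, the prefix check could be tightened to compare `path.Dir(p)` instead.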
Implementation Notes:
Note
It looks like no new dependencies are needed, as we already import https://github.com/klauspost/compress required for zstd decompression.
As a start, we can just extract the package names and versions (which I believe map onto the Ubuntu ecosystem in OSV), and in the future add metadata about which files each package owns.
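As a rough illustration of that first step, the sketch below pulls package names and versions out of manifest content that has already been zstd-decompressed. It assumes the decompressed file is one JSON object per line and that package entries carry `"kind":"package"` along with `name` and `version` fields. These field names are my reading of the linked manifest reference and should be verified against it. `extractPackages`, `wallEntry`, and `Package` are hypothetical names, and only the standard library is used so the snippet runs on its own.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
)

// Package is a hypothetical minimal record: just a name and a version.
type Package struct {
	Name    string
	Version string
}

// wallEntry holds only the fields this sketch reads from each line.
// ASSUMPTION: the field names follow the Chisel manifest reference.
type wallEntry struct {
	Kind    string `json:"kind"`
	Name    string `json:"name"`
	Version string `json:"version"`
}

// extractPackages scans decompressed manifest bytes line by line and
// returns every entry whose kind is "package". The header line and all
// other entry kinds are skipped because their kind does not match.
func extractPackages(data []byte) ([]Package, error) {
	var pkgs []Package
	sc := bufio.NewScanner(bytes.NewReader(data))
	// Allow lines longer than the default 64 KiB token limit.
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for sc.Scan() {
		line := bytes.TrimSpace(sc.Bytes())
		if len(line) == 0 {
			continue
		}
		var e wallEntry
		if err := json.Unmarshal(line, &e); err != nil {
			return nil, fmt.Errorf("parsing manifest line: %w", err)
		}
		if e.Kind == "package" && e.Name != "" {
			pkgs = append(pkgs, Package{Name: e.Name, Version: e.Version})
		}
	}
	if err := sc.Err(); err != nil {
		return nil, err
	}
	return pkgs, nil
}

func main() {
	// Made-up sample content, not taken from a real image.
	sample := []byte(`{"jsonwall":"1.0","schema":"1.0","count":3}
{"kind":"package","name":"libc6","version":"2.39-0ubuntu8","arch":"amd64"}
{"kind":"slice","name":"libc6_libs"}
`)
	pkgs, err := extractPackages(sample)
	if err != nil {
		panic(err)
	}
	for _, p := range pkgs {
		fmt.Printf("%s %s\n", p.Name, p.Version)
	}
}
```

In a real extractor the input would come from a zstd decoder wrapped around the file handle rather than an in-memory byte slice, and the per-file path entries could later be read from the same scan loop.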