Is your feature request related to a problem? Please describe.
There is no shared error vocabulary, no defined interface contract for metadata extraction, no concrete implementation exercised against actual firmware files, and no up-front validation of firmware uploads. Without these foundations, the later phases of the extraction pipeline have nothing solid to build on. Firmware images that carry no fwtool JSON trailer (e.g. sunxi targets) are silently unhandled, and there is no resource guard preventing a malformed image from expanding into an arbitrarily large buffer during decompression, which leaves Celery workers exposed to OOM kills.
Describe the solution you'd like
1. Exception Hierarchy
ExtractionError as the base, with UnsupportedImageError (nothing extractable found, triggers fallback or clean failure) and DecompressionLimitExceeded as subclasses. These live in extractors/exceptions.py, separate from the upgrade-related exceptions in exceptions.py.
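A minimal sketch of the hierarchy described above. The class names come from this proposal; the docstrings are illustrative wording only.

```python
class ExtractionError(Exception):
    """Base class for every metadata extraction failure."""


class UnsupportedImageError(ExtractionError):
    """Nothing extractable was found in the image."""


class DecompressionLimitExceeded(ExtractionError):
    """A configured size or ratio limit was exceeded."""
```

Because both subclasses derive from ExtractionError, a caller that only needs a clean failure can catch the base class alone.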
2. BaseMetadataExtractor ABC
A public extract() orchestrator that calls extract_from_image() and falls back to extract_from_dtb() on ExtractionError. If extract_from_image() raises UnsupportedImageError, it is re-raised immediately and no fallback is attempted. extract_from_image() is abstract. extract_from_dtb() raises UnsupportedImageError by default, making DTB support opt-in. The base class stays technology-agnostic: no subprocess calls, no binary imports. It defines the normalized return dict as the binding contract for all downstream code:
{
"model": str,
"compatible": list[str],
"target": str,
"version": str,
"compat_version": str,
"source": str, # "fwtool" | "dtb"
}
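The orchestration rules above could be sketched as follows. The constructor taking an image_path is an assumption made for illustration; the proposal does not specify how the extractor receives the file. The exception classes are redefined here only to keep the snippet self-contained.

```python
from abc import ABC, abstractmethod


class ExtractionError(Exception):
    pass


class UnsupportedImageError(ExtractionError):
    pass


class BaseMetadataExtractor(ABC):
    def __init__(self, image_path):
        # Assumption: the extractor is constructed with a filesystem path.
        self.image_path = image_path

    def extract(self):
        try:
            return self.extract_from_image()
        except UnsupportedImageError:
            # Must be caught before ExtractionError (its parent class),
            # otherwise the fallback below would swallow it.
            raise
        except ExtractionError:
            return self.extract_from_dtb()

    @abstractmethod
    def extract_from_image(self):
        """Return the normalized dict or raise ExtractionError."""

    def extract_from_dtb(self):
        # DTB support is opt-in: subclasses override this method.
        raise UnsupportedImageError("DTB extraction is not supported")
```

The order of the except clauses matters: since UnsupportedImageError subclasses ExtractionError, reversing them would route unsupported images into the DTB fallback.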
3. metadata_extractor_class seam on the upgrader hierarchy
None on the base upgrader, OpenWrtMetadataExtractor on OpenWrt. This isolates firmware-family logic: adding a new family requires only subclassing BaseMetadataExtractor, and the task and admin layers are never touched.
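A rough illustration of the seam. The Upgrader class and the get_extractor() helper are hypothetical stand-ins (the real base upgrader and the calling task code are not shown in this issue), and the extractor classes are reduced to stubs.

```python
class BaseMetadataExtractor:
    # Stub standing in for the ABC described in section 2.
    def __init__(self, image_path):
        self.image_path = image_path


class OpenWrtMetadataExtractor(BaseMetadataExtractor):
    pass


class Upgrader:
    # Hypothetical name for the base upgrader: no extractor by default.
    metadata_extractor_class = None


class OpenWrt(Upgrader):
    metadata_extractor_class = OpenWrtMetadataExtractor


def get_extractor(upgrader_class, image_path):
    """Hypothetical helper: build the extractor for an upgrader, if any."""
    extractor_class = upgrader_class.metadata_extractor_class
    if extractor_class is None:
        return None
    return extractor_class(image_path)
```

With this shape, the caller only ever looks at metadata_extractor_class, so it needs no knowledge of which firmware family it is dealing with.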
4. OpenWrtMetadataExtractor: fwtool fast path
_run_command(): subprocess wrapper enforcing a strict 30-second timeout, decoding stdout with errors="replace" to handle non-UTF-8 binary garbage, raising ExtractionError on non-zero exit codes or TimeoutExpired.
_extract_from_fwtool(): invokes fwtool -q -i - <image_path>, parses the JSON trailer, and returns the normalized dict. All fields accessed via .get() to tolerate schema variations across OpenWrt versions 18.06–24.10. Raises ExtractionError on missing or malformed JSON so extract() can trigger the fallback cleanly.
_parse_supported_devices(): handles the compat_version difference introduced during the swconfig→DSA migration. If compat_version != "1.0", board identifiers are read from new_supported_devices instead of supported_devices.
_detect_image_type(): called before fwtool to identify types that will never have a fwtool trailer: x86 disk images (.img, .vdi, .vmdk suffix) and armsr targets raise UnsupportedImageError immediately.
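Two of the helpers above, sketched as standalone functions under stated assumptions: the proposal defines them as methods, a missing compat_version is treated here as "1.0", and a failure to launch the binary (OSError) is also mapped to ExtractionError. Neither of the last two behaviours is stated in this issue.

```python
import subprocess


class ExtractionError(Exception):
    pass


def run_command(args, timeout=30):
    """Run a command and return its stdout as text, or raise ExtractionError."""
    try:
        result = subprocess.run(args, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired as exc:
        raise ExtractionError(f"command timed out after {timeout}s") from exc
    except OSError as exc:
        # Assumption: a missing or non-executable binary is an extraction error.
        raise ExtractionError(f"command could not be started: {exc}") from exc
    if result.returncode != 0:
        raise ExtractionError(f"command exited with code {result.returncode}")
    # errors="replace" keeps undecodable bytes from raising UnicodeDecodeError.
    return result.stdout.decode("utf-8", errors="replace")


def parse_supported_devices(metadata):
    """Pick the board identifier list according to compat_version."""
    # Assumption: an absent compat_version behaves like "1.0".
    compat_version = metadata.get("compat_version", "1.0")
    if compat_version != "1.0":
        key = "new_supported_devices"
    else:
        key = "supported_devices"
    return list(metadata.get(key) or [])
```

Using .get() with an "or []" guard means a trailer that lacks either list still yields an empty list rather than a KeyError or a None.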
5. OpenWrtMetadataExtractor: DTB fallback path and OOM protection
_check_limits(): validates raw file size against a configurable cap (OPENWISP_FIRMWARE_UPGRADER_MAX_KERNEL_BYTES, default 256 MB) and raises DecompressionLimitExceeded before any decompression begins. During gzip decompression, chunk-by-chunk reading enforces OPENWISP_FIRMWARE_UPGRADER_MAX_DECOMPRESSED_BYTES (default 512 MB) and OPENWISP_FIRMWARE_UPGRADER_MAX_DECOMPRESSED_RATIO (default 100×) to catch compression bombs before they expand.
extract_from_dtb(): calls _check_limits() first, then attempts kernel decompression sequentially across gzip, xz, lzma, bz2, and lz4, stopping at the first that succeeds. Strips uImage headers before decompression. Locates the DTB within the decompressed kernel (FIT image scan or raw magic search), parses it with fdt, and returns the normalized dict with "source": "dtb". Raises UnsupportedImageError if no compression format succeeds or no DTB is found.
extract() override: runs fwtool first; on success, optionally enriches an empty compatible list from the DTB path without changing "source". On ExtractionError, falls back to DTB. Re-raises UnsupportedImageError from either path.
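The chunk-by-chunk guard for the gzip case might look like the sketch below. bounded_gunzip() is a hypothetical helper name, the 64 KiB chunk size is arbitrary, and the module-level constants stand in for the OPENWISP_FIRMWARE_UPGRADER_MAX_DECOMPRESSED_BYTES and OPENWISP_FIRMWARE_UPGRADER_MAX_DECOMPRESSED_RATIO settings.

```python
import gzip
import io


class ExtractionError(Exception):
    pass


class DecompressionLimitExceeded(ExtractionError):
    pass


MAX_DECOMPRESSED_BYTES = 512 * 1024 * 1024  # stand-in for the 512 MB default
MAX_DECOMPRESSED_RATIO = 100  # stand-in for the 100x default
CHUNK_SIZE = 64 * 1024  # arbitrary choice for this sketch


def bounded_gunzip(
    data,
    max_bytes=MAX_DECOMPRESSED_BYTES,
    max_ratio=MAX_DECOMPRESSED_RATIO,
    chunk_size=CHUNK_SIZE,
):
    """Decompress gzip data, aborting once a size or ratio limit is crossed."""
    compressed_size = max(len(data), 1)
    out = bytearray()
    with gzip.GzipFile(fileobj=io.BytesIO(data)) as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            out += chunk
            # Limits are checked after each chunk, so the buffer can exceed
            # a limit by at most one chunk before the error is raised.
            if len(out) > max_bytes:
                raise DecompressionLimitExceeded("decompressed size limit exceeded")
            if len(out) > max_ratio * compressed_size:
                raise DecompressionLimitExceeded("decompression ratio limit exceeded")
    return bytes(out)
```

Reading in fixed-size chunks is what keeps memory bounded: a single gzip.decompress() call would materialize the entire output before any limit could be checked.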
6. Pre-upload validation on FirmwareImage via clean()
_validate_file_header(): reads the first 16 bytes and rejects known non-firmware magic bytes (JPEG \xff\xd8\xff, PDF %PDF, PNG \x89PNG, ZIP PK\x03\x04, ELF \x7fELF) with a translated ValidationError({"file": ...}). Fails gracefully on IOError and on missing file.
_validate_rootfs(): rejects filenames ending in -rootfs.img with a translated ValidationError({"file": ...}).
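The two checks reduce to a prefix match and a suffix match. The sketch below keeps them as plain functions with hypothetical names so it runs without Django; in the proposal they live on FirmwareImage and raise a translated ValidationError({"file": ...}) instead of returning a value.

```python
# Magic bytes listed in this proposal, mapped to a label for error messages.
REJECTED_MAGIC = {
    b"\xff\xd8\xff": "JPEG",
    b"%PDF": "PDF",
    b"\x89PNG": "PNG",
    b"PK\x03\x04": "ZIP",
    b"\x7fELF": "ELF",
}


def detect_rejected_type(header):
    """Return the label of a known non-firmware format, or None."""
    for magic, label in REJECTED_MAGIC.items():
        if header.startswith(magic):
            return label
    return None


def is_rootfs_filename(filename):
    """True for filenames ending in -rootfs.img."""
    return filename.endswith("-rootfs.img")
```

For example, detect_rejected_type() would be fed the first 16 bytes read from the uploaded file, and a non-None result would become the ValidationError message.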