Description
We should check the extracted file size during chunk parsing to avoid filling up the disk when extracting malicious nested archives.
Samples can be found here: https://www.bamsoftware.com/hacks/zipbomb/
For zip bombs, a check can be implemented in is_valid:
def is_valid(self, file: io.BufferedIOBase) -> bool:
    has_encrypted_files = False
    try:
        with zipfile.ZipFile(file) as zip:  # type: ignore
            for zipinfo in zip.infolist():
                if zipinfo.flag_bits & ENCRYPTED_FLAG:
                    has_encrypted_files = True
            if has_encrypted_files:
                logger.warning("There are encrypted files in the ZIP")
        return True
    except (zipfile.BadZipFile, UnicodeDecodeError, ValueError):
        return False
Something similar to this could work:
with zipfile.ZipFile(file) as zip:  # type: ignore
    extracted_size = sum(e.file_size for e in zip.infolist())
    if extracted_size > SOME_CONSTANT:
        return False  # bail out
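As a self-contained illustration of the size check sketched above, the snippet below sums the uncompressed sizes declared in the ZIP central directory and compares the total against a limit. The names MAX_EXTRACTED_SIZE, declared_extracted_size, exceeds_size_limit and build_sample_zip are hypothetical and not part of the existing handler, and the 100 MiB limit is only a placeholder for SOME_CONSTANT. Treat it as a rough sketch rather than the final implementation.

```python
import io
import zipfile

# Hypothetical limit standing in for SOME_CONSTANT; the real value is undecided.
MAX_EXTRACTED_SIZE = 100 * 1024 * 1024  # 100 MiB


def declared_extracted_size(file) -> int:
    """Sum the uncompressed sizes declared for every entry in the archive."""
    with zipfile.ZipFile(file) as archive:
        return sum(entry.file_size for entry in archive.infolist())


def exceeds_size_limit(file, limit: int = MAX_EXTRACTED_SIZE) -> bool:
    """Return True if the declared extracted size is larger than the limit."""
    return declared_extracted_size(file) > limit


def build_sample_zip(payload_size: int) -> io.BytesIO:
    """Build an in-memory ZIP with one highly compressible member (demo only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("zeros.bin", b"\0" * payload_size)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    sample = build_sample_zip(1024 * 1024)  # 1 MiB of zeros, compresses very well
    print(declared_extracted_size(sample))  # 1048576
    sample.seek(0)
    print(exceeds_size_limit(sample, limit=512 * 1024))  # True
```

Note that file_size is metadata read from the archive, so a crafted file could declare a smaller size than it really expands to. This check would catch archives that honestly declare a huge total, but it probably needs to be paired with a limit enforced during extraction.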
I'll check whether similar behavior (i.e. "let's fill the whole disk") can be triggered with other formats.