Commit ae2ce97
fix(processing): adapt is_padding to fix potential MemoryError
If an unknown chunk is larger than the RAM available on the system where unblob runs, the previous is_padding implementation could raise MemoryError, because it loaded the entire chunk into memory at once. Fixed by iterating over the unknown chunk with iterate_file and using all(), so we stop as soon as we encounter a differing byte.
1 parent c428feb commit ae2ce97

File tree

1 file changed: +8 −1 lines changed

python/unblob/processing.py

+8 −1
```diff
@@ -462,7 +462,14 @@ def _iterate_directory(self, extract_dirs, processed_paths):
 
 
 def is_padding(file: File, chunk: UnknownChunk):
-    return len(set(file[chunk.start_offset : chunk.end_offset])) == 1
+    first_byte = file[chunk.start_offset]
+    return all(
+        current_byte == first_byte
+        for chunk in iterate_file(
+            file, chunk.start_offset, chunk.end_offset - chunk.start_offset
+        )
+        for current_byte in chunk
+    )
 
 
 def process_patterns(
```
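The pattern the commit applies, reading the range in fixed-size buffers and short-circuiting with all(), can be sketched as a standalone example. The iterate_file helper below is a simplified stand-in assumed for illustration; unblob's real iterate_file, File, and UnknownChunk types differ:

```python
import io

def iterate_file(file, start_offset, size, buffer_size=64 * 1024):
    """Yield bytes in [start_offset, start_offset + size) as fixed-size chunks.

    Simplified stand-in for unblob's helper: only buffer_size bytes are
    resident at a time, so memory use is bounded regardless of chunk size.
    """
    file.seek(start_offset)
    remaining = size
    while remaining > 0:
        data = file.read(min(buffer_size, remaining))
        if not data:  # unexpected EOF
            break
        remaining -= len(data)
        yield data

def is_padding(file, start_offset, end_offset):
    """True if every byte in the range equals the first byte.

    all() stops consuming the generator at the first mismatching byte,
    so a non-padding chunk is rejected without scanning to its end.
    """
    file.seek(start_offset)
    first = file.read(1)
    if not first:
        return False
    first_byte = first[0]
    return all(
        b == first_byte
        for chunk in iterate_file(file, start_offset, end_offset - start_offset)
        for b in chunk
    )

padding = io.BytesIO(b"\xff" * 1_000_000)
print(is_padding(padding, 0, 1_000_000))      # → True
mixed = io.BytesIO(b"\xff" * 500 + b"\x00")
print(is_padding(mixed, 0, 501))              # → False
```

Compared with the old len(set(...)) == 1 check, this never materializes the whole range, and the nested generator lets all() bail out on the first non-matching byte instead of deduplicating every byte.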
