Hash is unexpectedly None for empty files #69

Open
@stefan6419846

Description

Generating a hash for an empty file always returns None. This is undocumented, differs from how the usual hashing implementations behave (they return the digest of the empty input), and contradicts the SPDX standard.

Example:

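from commoncode.hash import sha1
from hashlib import sha1 as sha1_hashlib
from tempfile import NamedTemporaryFile

with NamedTemporaryFile() as temporary_file:
    # Write nothing, so the file on disk stays empty.
    temporary_file.write(b'')
    temporary_file.flush()
    temporary_file.seek(0)
    # commoncode: prints None for the empty file (the behavior reported here).
    print(sha1(location=temporary_file.name))
    # hashlib: prints the SHA-1 of empty input,
    # da39a3ee5e6b4b0d3255bfef95601890afd80709.
    # Note: the keyword is usedforsecurity (Python 3.9+).
    print(sha1_hashlib(temporary_file.read(), usedforsecurity=False).hexdigest())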

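The reason seems to be that the line

self.h = msg and hmodule(msg).digest()[: self.digest_size] or None

does not check msg is not None but effectively bool(msg), which is False for empty input as well as for None. An empty message therefore falls through to the trailing "or None", just like a missing one.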
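To illustrate the truthiness pitfall in isolation, here is a minimal sketch that uses only hashlib. The function name digest_or_none_buggy is hypothetical; it mirrors the shape of the quoted line and is not commoncode's actual code.

```python
import hashlib


def digest_or_none_buggy(msg):
    # Same `msg and ... or None` shape as the quoted line (hypothetical helper).
    # An empty bytes object is falsy, so it short-circuits to None.
    return msg and hashlib.sha1(msg).digest() or None


# Non-empty input is hashed as expected.
print(digest_or_none_buggy(b"abc") == hashlib.sha1(b"abc").digest())  # True

# Empty input and missing input are indistinguishable: both give None.
print(digest_or_none_buggy(b""))    # None
print(digest_or_none_buggy(None))   # None

# hashlib itself returns a well-defined digest for empty input.
print(hashlib.sha1(b"").hexdigest())  # da39a3ee5e6b4b0d3255bfef95601890afd80709
```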

Replacing the line with

            self.h = msg is not None and hmodule(msg).digest()[:self.digest_size] or None

(as well as replacing the same pattern in sha1_git_hasher) seems to fix this issue.
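As a rough sketch of the same idea, the check can also be written as a conditional expression, which avoids the and/or idiom altogether. SketchHasher below is a hypothetical stand-in written for illustration only; its constructor arguments and hexdigest method are assumptions, not commoncode's actual hasher class.

```python
import hashlib


class SketchHasher:
    """Hypothetical stand-in for illustration; not commoncode's hasher."""

    def __init__(self, msg=None, hmodule=hashlib.sha1, digest_size=20):
        self.digest_size = digest_size
        # Only a missing message (None) yields no digest;
        # an empty bytes object is hashed like any other input.
        if msg is not None:
            self.h = hmodule(msg).digest()[:digest_size]
        else:
            self.h = None

    def hexdigest(self):
        # Return the hex form of the stored digest, or None if nothing was hashed.
        return self.h.hex() if self.h is not None else None


print(SketchHasher(b"").hexdigest())    # da39a3ee5e6b4b0d3255bfef95601890afd80709
print(SketchHasher(None).hexdigest())   # None
```

With this shape, "no message" and "empty message" stay distinguishable, which is the distinction the proposed msg is not None check restores.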
