WARC dump not deterministic #3240

@villesundell

Hello, and first of all thank you for Perma, it's a great service! 🙏

I am trying to save a WARC hash on the Bitcoin blockchain for temporal provenance. The original plan was to provide the GUID together with the SHA3-512 hash of the WARC. However, I noticed that the downloaded WARC is not deterministic: the two additional warcinfo records, which carry the download date and time, make each download unique.

However, in my understanding WARC dumps should be deterministic (https://flowvella.com/s/3e9w/0B1C19C0-D882-41D3-910D-0A77D47F4C58).

Is there a way to get a deterministic WARC (in this particular case: a file that is always the same in each download)?
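
As a possible client-side workaround (not a Perma feature), here is a minimal sketch that hashes a WARC while skipping every `warcinfo` record. It assumes the remaining records are byte-identical across downloads and that headers are not folded across lines. The function names `iter_warc_records` and `stable_warc_digest` are hypothetical, and only the Python standard library is used.

```python
import gzip
import hashlib


def iter_warc_records(data: bytes):
    """Yield (headers, raw_record_bytes) for each record in uncompressed WARC data."""
    pos = 0
    while pos < len(data):
        # Skip the CRLF CRLF separator that follows each record.
        while data.startswith(b"\r\n", pos):
            pos += 2
        if pos >= len(data):
            break
        header_end = data.index(b"\r\n\r\n", pos)
        lines = data[pos:header_end].decode("utf-8").split("\r\n")
        headers = {}
        for line in lines[1:]:  # lines[0] is the version line, e.g. "WARC/1.0"
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        # Content-Length gives the size of the record block after the headers.
        record_end = header_end + 4 + int(headers["content-length"])
        yield headers, data[pos:record_end]
        pos = record_end


def stable_warc_digest(data: bytes) -> str:
    """SHA3-512 over all records except warcinfo (assumption: only those vary)."""
    if data[:2] == b"\x1f\x8b":  # gzip magic bytes: decompress first
        data = gzip.decompress(data)
    digest = hashlib.sha3_512()
    for headers, raw in iter_warc_records(data):
        if headers.get("warc-type") == "warcinfo":
            continue  # skip records that carry the download date and time
        digest.update(raw)
    return digest.hexdigest()
```

Calling `stable_warc_digest(open("capture.warc.gz", "rb").read())` on two downloads of the same capture (the file name is a placeholder) should then give the same hex digest, provided the assumption above holds. Note that this digest differs from a plain SHA3-512 of the file, so anyone verifying it needs the same procedure.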
