
Allow CRC32c in place of MD5 #496

@zbjornson

The ASDF standard allows an optional MD5 hash for each block. MD5 is very sensitive to corruption, but it is much slower than a CRC (<1 GB/s vs. >20 GB/s), slow enough to have a significant impact on large data sets, and a 32-bit CRC is adequate for many use cases.

Would you be open to something like this?

Block header

...

  • checksum (16-byte string): An optional CRC32c checksum or MD5 hash of the used data in the block. See flags: if the CRC32C flag (0x2) is unset, the value is the MD5 hash. If the flag is set, the value is the 4-byte CRC32c followed by twelve 0-bytes, filling the 16-byte field. The special value of all zeros indicates that no verification should be performed.

Flags

The following bit flags are understood in the flags field:

  • STREAMED (0x1): ...
  • CRC32C (0x2): The block header checksum is the Castagnoli CRC32 (CRC32c) followed by twelve 0-bytes.
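
To make the proposal concrete, here is a minimal Python sketch of how a reader or writer might fill and check the 16-byte checksum field. It is only an illustration, not part of the ASDF library: the helper names (`crc32c`, `checksum_field`, `verify_block`) are hypothetical, the CRC is stored big-endian on the assumption that it should match the other block header integers, and the 4-byte CRC is zero-padded to the full 16-byte field width. The pure-Python CRC is for clarity only; a real implementation would use a hardware-accelerated routine to get the throughput mentioned above.

```python
import hashlib
import struct

FLAG_STREAMED = 0x1
FLAG_CRC32C = 0x2          # proposed flag, not in the current standard
CHECKSUM_SIZE = 16         # width of the block header checksum field

_POLY = 0x82F63B78         # Castagnoli polynomial, bit-reflected form


def _make_table():
    # Standard byte-at-a-time lookup table for a reflected CRC.
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ _POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c(data: bytes) -> int:
    """Compute CRC32c (Castagnoli) over data."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def checksum_field(data: bytes, flags: int) -> bytes:
    """Build the 16-byte checksum field for the used data of a block."""
    if flags & FLAG_CRC32C:
        # Assumption: big-endian CRC, then zero padding up to 16 bytes.
        crc_bytes = struct.pack(">I", crc32c(data))
        return crc_bytes + b"\x00" * (CHECKSUM_SIZE - len(crc_bytes))
    return hashlib.md5(data).digest()


def verify_block(data: bytes, flags: int, field: bytes) -> bool:
    """Return True if the block passes (or skips) verification."""
    if field == b"\x00" * CHECKSUM_SIZE:
        return True  # all zeros: no verification requested
    return field == checksum_field(data, flags)
```

For example, `verify_block(data, FLAG_CRC32C, checksum_field(data, FLAG_CRC32C))` returns `True`, while the same field checked against modified data returns `False`. Existing files are unaffected, since a block without the flag is still verified as MD5.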

Thanks!


I'm working on a task force to define the next version of the FCS file format, and ASDF has a lot of attractive qualities. I might have a few more comments/questions over the coming weeks.
