The ASDF standard allows an optional MD5 hash for each block. While MD5 is highly sensitive to changes in the data, it is very slow compared to a CRC (<1 GB/s vs. >20 GB/s), slow enough to have a significant impact on large data sets, and a 32-bit CRC is adequate for many use cases.
Would you be open to something like this?
Block header
...
checksum (16-byte string): An optional CRC32c checksum or MD5 hash of the used data in the block. See flags: If the CRC32C flag (0x2) is unset, then the value is the MD5 hash. If the CRC32C flag is set, then the value is the CRC32c followed by eight 0-bytes. The special value of all zeros indicates that no verification should be performed.
Flags
The following bit flags are understood in the flags field:
STREAMED (0x1): ...
CRC32C (0x2): The block header checksum is the Castagnoli CRC32 followed by eight 0-bytes.
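To make the idea concrete, here is a rough sketch of how a writer might fill the checksum field under this proposal. It is only an illustration, not a reference implementation: the names `crc32c`, `checksum_field`, and `CRC32C_FLAG` are hypothetical, the big-endian byte order of the CRC is an assumption, and the sketch simply zero-pads the CRC to the full 16-byte field, which is also an assumption about the final layout. The pure-Python CRC is written for clarity, not speed; a real implementation would use a hardware-accelerated routine.

```python
import hashlib
import struct

CRC32C_FLAG = 0x2  # hypothetical name for the proposed flag bit


def _make_table():
    # Byte-wise lookup table for the reflected Castagnoli polynomial.
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c(data: bytes) -> int:
    """Table-driven CRC32C (Castagnoli) of data, as an unsigned 32-bit int."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def checksum_field(data: bytes, flags: int) -> bytes:
    """Build the 16-byte checksum field for a block (sketch only)."""
    if flags & CRC32C_FLAG:
        # Assumption: CRC stored big-endian, then zero-padded to 16 bytes.
        return struct.pack(">I", crc32c(data)).ljust(16, b"\x00")
    return hashlib.md5(data).digest()


def verify(data: bytes, flags: int, stored: bytes) -> bool:
    """Return True if the block passes, or if verification is disabled."""
    if stored == b"\x00" * 16:
        return True  # all zeros: no verification requested
    return checksum_field(data, flags) == stored
```

A reader would then call `verify(block_data, flags, header_checksum)` and treat a `False` result as corruption. The field stays 16 bytes in both modes, so existing header parsing would not need to change.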
Thanks!
I'm working in a task force to define the next version of the FCS file format, and ASDF has a lot of attractive qualities. I might have a few more comments/questions coming over the following weeks.