@@ -706,3 +706,70 @@ Ideas for future features include:
706706 packager
707707
708708- Semantic signatures for package revocation, invalidation, or replacement
709+
710+ Design
711+ ------
712+
713+ ### Manifest Format
714+
715+ A filepack manifest contains all information needed to verify the contents of a
716+ directory. The `files` key of the manifest is a directory object mapping
717+ filenames to directory entries. An entry is either a directory or a file; a
718+ file entry contains the hash of the file's contents and the length of the
719+ file.
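
The nesting described above can be modeled as a recursive type. The following is a minimal sketch, not filepack's actual definitions: the names `Entry`, `Directory`, and `file_sizes` and the field names `hash` and `size` are assumptions chosen for illustration, and the real types live in the source files of the repository.

```rust
use std::collections::BTreeMap;

/// A directory maps filenames to entries (hypothetical model).
type Directory = BTreeMap<String, Entry>;

/// An entry is either a file record or a nested directory.
#[derive(Debug, Clone, PartialEq)]
enum Entry {
    /// 32-byte BLAKE3 hash of the contents plus the length in bytes.
    File { hash: [u8; 32], size: u64 },
    Directory(Directory),
}

/// Recursively collect `(relative path, size)` for every file entry.
fn file_sizes(dir: &Directory, prefix: &str, out: &mut Vec<(String, u64)>) {
    for (name, entry) in dir {
        let path = if prefix.is_empty() {
            name.clone()
        } else {
            format!("{}/{}", prefix, name)
        };
        match entry {
            Entry::File { size, .. } => out.push((path, *size)),
            Entry::Directory(children) => file_sizes(children, &path, out),
        }
    }
}
```

A `BTreeMap` is used here so that iteration order is deterministic, which is convenient for anything that later hashes the entries in order.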
720+
721+ The length of the file is not strictly necessary for verification, but is
722+ included so that truncated, empty, and overlong files can be identified, which
723+ may help in understanding verification failures.
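
To illustrate how a recorded length can help explain a failure, here is a small sketch that classifies a file whose hash did not match. The `Mismatch` enum and `diagnose` function are hypothetical names, not part of filepack, and the function assumes it is only called after a hash mismatch has already been detected.

```rust
/// Hypothetical explanation for a file that failed hash verification.
#[derive(Debug, PartialEq)]
enum Mismatch {
    /// The file on disk is empty but the manifest expects content.
    Empty,
    /// The file on disk is shorter than the recorded length.
    Truncated,
    /// The file on disk is longer than the recorded length.
    Overlong,
    /// Lengths agree, so the contents themselves must differ.
    Modified,
}

/// Compare the manifest's recorded length with the length found on disk.
fn diagnose(expected_size: u64, actual_size: u64) -> Mismatch {
    if actual_size == 0 && expected_size > 0 {
        Mismatch::Empty
    } else if actual_size < expected_size {
        Mismatch::Truncated
    } else if actual_size > expected_size {
        Mismatch::Overlong
    } else {
        Mismatch::Modified
    }
}
```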
724+
725+ ### File Hashes
726+
727+ The contents of files are hashed with
728+ [BLAKE3](https://github.com/BLAKE3-team/BLAKE3) using the official Rust
729+ implementation. BLAKE3 was chosen both for its speed and because it uses a
730+ Merkle tree construction. A Merkle tree allows for verified file
731+ streaming and subrange inclusion proofs, which both seem useful in the context
732+ of file hashing and verification.
733+
734+ ### Signatures
735+
736+ Filepack allows for the creation of Ed25519 signatures over the contents of a
737+ manifest, which thus commit to the contents of the directory covered by the
738+ manifest. Signatures are made not over the serialized manifest, but over a message
739+ containing a "fingerprint" hash, a Merkle tree hash created from the contents
740+ of the manifest. This keeps signatures independent of the manifest format,
741+ avoids issues with canonicalization of the manifest JSON, avoids hash loops due
742+ to the inclusion of signatures in the manifest itself, and allows proving the
743+ inclusion of files covered by a signature using a Merkle receipt.
744+
745+ ### Fingerprints
746+
747+ Although only package fingerprints are exposed externally, several types of
748+ fingerprints are used internally, namely directory, entry, file, and message
749+ fingerprints.
750+
751+ Fingerprints are constructed to be unique, both between and within types,
752+ meaning that it should be computationally infeasible to find two values with
753+ different types or contents that have the same fingerprint.
754+
755+ Fingerprints are BLAKE3 hashes. To guarantee that fingerprints are unique
756+ between types, the hasher is first initialized with a length-prefixed string
757+ unique to each type.
758+
759+ After the prefix, the value is hashed as a sequence of type-length-value (TLV) fields.
760+
761+ Fields are hashed in order, but may be skipped, in the case of optional fields,
762+ or repeated, in the case of fields containing multiple values.
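
The prefix-then-fields scheme can be sketched as follows. This is an illustration under stated assumptions, not filepack's actual encoding: the one-byte tag, the 8-byte little-endian lengths, the prefix strings, and the `FieldEncoder` type are all hypothetical. The sketch only builds the byte stream that would be fed to a BLAKE3 hasher; see the source files in the repository for the real layout.

```rust
/// Builds the bytes that would be hashed for one fingerprint.
/// Layout (an assumption): length-prefixed type string, then TLV fields.
struct FieldEncoder {
    bytes: Vec<u8>,
    last_tag: u8,
}

impl FieldEncoder {
    /// Start with a length-prefixed string unique to the fingerprint type.
    fn new(type_prefix: &str) -> Self {
        let mut bytes = Vec::new();
        bytes.extend_from_slice(&(type_prefix.len() as u64).to_le_bytes());
        bytes.extend_from_slice(type_prefix.as_bytes());
        Self { bytes, last_tag: 0 }
    }

    /// Append one field as tag, length, value. Tags must not decrease, so
    /// fields stay in order, but a tag may be skipped (optional field) or
    /// repeated (field with multiple values).
    fn field(&mut self, tag: u8, value: &[u8]) {
        assert!(tag >= self.last_tag, "fields must be hashed in order");
        self.last_tag = tag;
        self.bytes.push(tag);
        self.bytes.extend_from_slice(&(value.len() as u64).to_le_bytes());
        self.bytes.extend_from_slice(value);
    }

    /// Return the encoded bytes, ready to be passed to a hasher.
    fn finish(self) -> Vec<u8> {
        self.bytes
    }
}
```

Because every value carries its own length, the repeated values `"ab", "c"` and `"a", "bc"` produce different byte streams, and because every stream starts with a type-specific prefix, two values of different types cannot share an encoding.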
763+
764+ Currently, no fingerprint test vectors exist, and the best documentation is the
765+ code itself.
766+
767+ In particular, see:
768+
769+ - [FingerprintHasher](src/fingerprint_hasher.rs)
770+ - [FingerprintPrefix](src/fingerprint_prefix.rs)
771+ - [Manifest](src/manifest.rs)
772+ - [Directory](src/directory.rs)
773+ - [Entry](src/entry.rs)
774+ - [File](src/file.rs)
775+ - [Message](src/message.rs)