Description
I think the formalization of the collision-free property of the Merkle tree might have a defect.
The collision-free property of Merkle Hash is proved by the function extract, whose type is defined as follows.
https://github.com/project-everest/hacl-star/blob/144c44e1fa6e8062b2c50d4cd7ad0e41e7f0fe29/secure_api/merkle_tree/MerkleTree.Spec.fst#L485-L487
In the proof, a collision instance of the Merkle Hash (mt_collide #_ #f n i) is reduced to a collision instance of the base hash function (hash2_raw_collide).
A collision of the base hash function is formalized as an instance of hash2_raw_collide, defined as follows.
https://github.com/project-everest/hacl-star/blob/144c44e1fa6e8062b2c50d4cd7ad0e41e7f0fe29/secure_api/merkle_tree/MerkleTree.Spec.fst#L425-L430
The types hash and hash_fun_t are defined as follows.
https://github.com/project-everest/hacl-star/blob/144c44e1fa6e8062b2c50d4cd7ad0e41e7f0fe29/secure_api/merkle_tree/MerkleTree.Spec.fst#L13-L15
However, an instance of hash2_raw_collide can be constructed without a collision instance of the Merkle Hash. Let #f:hash_fun_t #hsz be the hash function used in hash2_raw_collide. The domain of #f is strictly larger than its codomain, and both are finite sets, so by the pigeonhole principle a collision of #f must exist.
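To make the pigeonhole argument concrete, here is a rough Python sketch rather than F* code. It assumes that a hash_fun_t instance takes two hsz-byte values and returns one hsz-byte value. The names HSZ, f, and find_collision are hypothetical, and the truncated SHA-256 is only a toy stand-in for the base hash. With HSZ = 1 there are 256 * 256 possible inputs but only 256 possible outputs, so a plain enumeration must find two distinct inputs with the same output, with no Merkle tree involved at all.

```python
import hashlib
from itertools import product

HSZ = 1  # toy hash size in bytes (assumption; chosen small so the search is instant)


def f(left: bytes, right: bytes) -> bytes:
    """Toy stand-in for a hash function: two HSZ-byte inputs, one HSZ-byte output."""
    return hashlib.sha256(left + right).digest()[:HSZ]


def find_collision():
    """Enumerate input pairs until two distinct pairs map to the same output."""
    seen = {}  # output -> first input pair that produced it
    values = [bytes([b]) for b in range(256)]
    for left, right in product(values, repeat=2):
        out = f(left, right)
        if out in seen:
            return seen[out], (left, right)
        seen[out] = (left, right)
    # 65536 inputs cannot map injectively into 256 outputs.
    raise AssertionError("unreachable by the pigeonhole principle")


pair1, pair2 = find_collision()
print("distinct inputs:", pair1, pair2)
print("same output:", f(*pair1) == f(*pair2))
```

The search here is only feasible because the toy size is tiny. For a realistic hsz the collision still exists by the same counting argument, which is all a ghost or non-constructive proof would need.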
Although the pigeonhole principle isn't used in the function extract, and Z3 is currently not powerful enough to prove the pigeonhole principle by itself, this formalization might lead to an incorrect proof.