Description
What is the problem this feature will solve?
MD5 hashing is commonly used for file handling over HTTP. However, the algorithm is slow and cannot fully utilize modern hardware.
What is the feature you are proposing to solve the problem?
While there is little room left to optimize a single hashing instance, techniques such as https://github.com/minio/md5-simd run multiple hashing instances side by side and can process 8 MD5 hashes in parallel using SIMD instructions.
In a real-world application such as an HTTP file server or client (using Content-MD5) with many parallel requests, this would provide a 2-6x performance improvement. In some of our applications, MD5 hashing takes more than 50% of CPU time.
This should be possible to implement without changing our APIs.
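To make the workload concrete, here is a minimal Python sketch of the pattern that would benefit: several independent request bodies, each hashed chunk by chunk into its own MD5 state to produce a Content-MD5 value (base64 of the raw digest, per RFC 1864). The `content_md5` helper and the `bodies` data are hypothetical and only illustrative. This is not a SIMD implementation; it just shows the independent per-request hash instances that a multi-lane backend could batch behind the same interface.

```python
import base64
import hashlib


def content_md5(chunks):
    """Return a Content-MD5 value: base64 of the raw 16-byte MD5 digest."""
    h = hashlib.md5()
    for chunk in chunks:
        # Each request keeps its own hash state and feeds it incrementally.
        h.update(chunk)
    return base64.b64encode(h.digest()).decode("ascii")


# Hypothetical in-flight request bodies, each arriving as a stream of chunks.
bodies = {
    "req-1": [b"hello ", b"world"],
    "req-2": [b""],
    "req-3": [b"a" * 1024, b"b" * 1024],
}

# One independent MD5 instance per request; these are the units a
# multi-lane SIMD implementation would process side by side.
headers = {name: content_md5(chunks) for name, chunks in bodies.items()}

for name, value in headers.items():
    print(name, value)
```

Because callers only see "feed chunks, get a digest", the batching could in principle happen below this interface, which is consistent with the expectation that no API change is needed.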
What alternatives have you considered?
Running hashing in a thread pool. The two approaches are not mutually exclusive. A thread pool would mainly help avoid latency spikes; in terms of throughput, simply forking the HTTP server provides similar results.
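As a rough illustration of the thread pool alternative, the Python sketch below offloads whole-buffer MD5 computations to a small pool so the calling thread is not blocked by any single large hash. The payloads, the `md5_hex` helper, and the worker count of 4 are assumptions chosen for the example, not a recommendation. In CPython, `hashlib` releases the GIL while hashing buffers larger than 2047 bytes, so the workers can run concurrently.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def md5_hex(data: bytes) -> str:
    """Hash one complete buffer and return the hex digest."""
    return hashlib.md5(data).hexdigest()


# Hypothetical payloads: eight 1 MiB buffers with different contents.
payloads = [bytes([i]) * (1 << 20) for i in range(8)]

# The worker count is an arbitrary choice for this sketch.
with ThreadPoolExecutor(max_workers=4) as pool:
    # map() preserves input order, so digests[i] belongs to payloads[i].
    digests = list(pool.map(md5_hex, payloads))

for digest in digests:
    print(digest)
```

Each worker still computes one hash at a time, so this spreads the cost across cores rather than reducing the cost per hash.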