SHA512 implementations could be optimized in two ways:
- add as much inline assembly as possible (as initiated in the experimental branch)
- use a 256-bit representation instead of a 64-bit one (a kind of "bitslicing" of the operations).
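
One possible reading of the 256-bit idea, sketched in portable C: pack four independent 64-bit SHA-512 words (for example, the same state word from four separate message blocks) into one 256-bit value and apply each operation to all four lanes at once. The type `u256x4` and the helper `big_sigma0_x4` are hypothetical names for illustration only, not taken from the existing code. The rotation amounts are those of the SHA-512 Sigma0 function in FIPS 180-4. Whether the compiler actually turns the loop into 256-bit vector instructions depends on the target and flags, so this is a sketch of the data layout rather than a tuned implementation.

```c
#include <stdint.h>

/* Hypothetical 256-bit word: four independent 64-bit SHA-512 lanes. */
typedef struct {
    uint64_t lane[4];
} u256x4;

/* Rotate a 64-bit word right by n bits (valid for 0 < n < 64). */
static inline uint64_t rotr64(uint64_t x, unsigned n)
{
    return (x >> n) | (x << (64 - n));
}

/* Scalar SHA-512 Sigma0: ROTR28(x) ^ ROTR34(x) ^ ROTR39(x). */
static inline uint64_t big_sigma0(uint64_t x)
{
    return rotr64(x, 28) ^ rotr64(x, 34) ^ rotr64(x, 39);
}

/* Same function applied lane by lane to the 256-bit representation.
 * The fixed-length loop is a candidate for auto-vectorization. */
static inline u256x4 big_sigma0_x4(u256x4 v)
{
    u256x4 r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = big_sigma0(v.lane[i]);
    return r;
}
```

The other round functions (Sigma1, Ch, Maj and the additions) would follow the same lane-wise pattern. The gain only appears when four blocks can be hashed in parallel, since the rounds within a single block depend on each other.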