[openssl-dev] [openssl.org #3897] request: add BLAKE2 hash function (let's kill md5sum!)

Samuel Neves sneves at dei.uc.pt
Thu Jun 11 00:22:16 UTC 2015


On 11-06-2015 00:36, Bill Cox wrote:
> Samuel Neves' SSE version is the one we all played with in the Password
> Hashing Competition.  The speed is amazing.  Is there a faster version
> available now?  Which version should we integrate into OpenSSL?

The problem with my implementation is that it relies on compiler intrinsics. This means that, depending on the compiler
version, flags, and the current moon phase, performance can degrade significantly for no apparent reason. Some older
compilers do not have those intrinsics at all. This is not desirable for inclusion in a library such as OpenSSL, which
is meant to be compiled in all sorts of setups.

Of course, we can take a known-good compilation output and place the assembly directly into the library. Andrew Moon's
code, linked by Zooko, appears to be that (though the implementation is different from mine) plus some extra massaging
to handle different calling conventions and ABIs. It is pretty good.

Regarding which version of BLAKE2 to include, I think people are conflating two different use cases here, which is
complicating things:

 - Use case #1 is for packet authentication. BLAKE2 now has an IETF draft in progress, meant primarily for that
purpose, and the BLAKE2*p variants are not included in it. It seems obvious to me that the IETF versions are the ones
to include, since if new ciphersuites show up with BLAKE2, OpenSSL will be forced to include them anyway.

 - Use case #2 is for large file integrity checks. This is where the parallel versions shine, largely benefiting from
multi-threading. OpenSSL does not natively support threading, as I understand it, so this is one downside. Another
downside is that the parallel implementations are slower for small messages, as Bill points out, which is unavoidable
given their tree structure. BLAKE2sp is likely the best option for large-scale hashing (as the WinRAR author also
concluded), but I'm not so sure that OpenSSL is ready for that.
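
To make the two use cases concrete, here is a minimal sketch using Python's hashlib, which (as far as I know) ships only
the sequential BLAKE2b and BLAKE2s, not the *p variants. The function names, the 16-byte tag length, and the 1 MiB chunk
size are illustrative assumptions, not anything proposed for OpenSSL.

```python
import hashlib
import hmac
import io


def packet_tag(key: bytes, packet: bytes) -> bytes:
    """Use case #1: keyed BLAKE2s as a MAC over a short packet.

    The 16-byte tag length is an illustrative choice; BLAKE2s accepts
    keys of up to 32 bytes.
    """
    return hashlib.blake2s(packet, key=key, digest_size=16).digest()


def verify_packet(key: bytes, packet: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(packet_tag(key, packet), tag)


def file_digest(stream, chunk_size: int = 1 << 20) -> str:
    """Use case #2: sequential BLAKE2b over a large input, read in chunks.

    This is the single-threaded variant; a BLAKE2bp/BLAKE2sp
    implementation would hash several lanes in parallel instead.
    """
    h = hashlib.blake2b()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()
```

For a real file, something like `file_digest(open(path, "rb"))` would do; the chunked loop keeps memory use flat no
matter how large the input is.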

Now, let's recall that the initial impetus for including BLAKE2 in OpenSSL---from Zooko's point of view---was so that
coreutils could trivially link to it and get great performance out of it. Since OpenSSL does not do threads, it's
unlikely that will happen without significant wrangling.

My personal recommendation is BLAKE2b. It is the fastest sequential variant on x86, ARMv7 with NEON, and also ARMv8. Are
there any supported OpenSSL platforms where this kind of speed is essential, but where BLAKE2s is significantly faster
than BLAKE2b (a reasonable proxy data point for this would be where SHA256 convincingly beats SHA512)?
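
One rough way to collect that proxy data point on a given machine is sketched below, assuming Python's hashlib is
available there. The helper names and buffer size are hypothetical, and the numbers reflect whatever backend hashlib was
built against (possibly OpenSSL's own assembly for SHA-2), so treat them as a hint rather than a proper benchmark.

```python
import hashlib
import time


def throughput_mb_s(name: str, size: int = 1 << 24, repeats: int = 3) -> float:
    """Best-of-N throughput, in MB/s, of one hashlib algorithm on a zero buffer."""
    constructor = getattr(hashlib, name)  # e.g. hashlib.sha256, hashlib.blake2b
    data = bytes(size)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        constructor(data).digest()
        best = min(best, time.perf_counter() - start)
    return size / best / 1e6


def compare(names=("sha256", "sha512", "blake2s", "blake2b"), size: int = 1 << 24):
    """Return {algorithm name: MB/s} for the 32-bit and 64-bit oriented hashes."""
    return {name: throughput_mb_s(name, size=size) for name in names}


if __name__ == "__main__":
    for name, rate in compare().items():
        print(f"{name:8s} {rate:10.1f} MB/s")
```

If sha256 clearly outruns sha512 on a platform, that is the kind of target where BLAKE2s might also beat BLAKE2b.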

Best regards,
Samuel Neves




