[openssl-dev] [openssl.org #3843] OpenSSL 1.0.1* and below: incorrect use of _lrotl()

Andy Polyakov via RT rt at openssl.org
Wed May 20 10:55:10 UTC 2015


Hi,

For reference: icc has not been cared for for quite some time. Initially
it was possible for me, as a university employee at the time, to use it,
but then they changed the terms and it became impossible for me to
maintain it. But I've just noticed they provide some starter version of
something, I'll see...

> Lei Zhang (re)discovered that OpenSSL 1.0.1* and below gets miscompiled,
> resulting in incorrect computation of at least SHA-1 hashes (and probably
> SHA-0, MD4, MD5) when it's compiled with icc for 64-bit Linux (x86_64 or
> mic), but not for Windows. The problem is already fixed in 1.0.2 and in
> LibreSSL.
> 
> The problem is that OpenSSL uses the _lrotl() intrinsic to rotate 32-bit
> integers, whereas it is defined to operate on "unsigned long", which
> obviously is 64-bit on many platforms.
> 
> Lei's report:
> 
> http://www.openwall.com/lists/john-dev/2015/03/26/1
> 
> A previous report (from 2011):
> 
> https://software.intel.com/en-us/articles/openssl-generates-incorrect-shamd5-value-if-built-with-icc-compiler
> 
> I suggest that this be fixed for all currently supported branches of
> OpenSSL.  For now, Lei switched to using LibreSSL in our John the Ripper
> -jumbo builds for Xeon Phi, but we'd like to (re-)include instructions
> for building with OpenSSL as well.
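
To make the width mismatch in the report above concrete, here is a
minimal sketch in C. It does not call icc's _lrotl(); rotl64_standin()
is a hypothetical stand-in for any rotate defined on a 64-bit
"unsigned long", and rotl32() is an ordinary portable 32-bit rotate.
All function names are illustrative assumptions, not OpenSSL code.

```c
#include <stdint.h>

/* Portable 32-bit left rotate; the masks keep both shift counts in 0..31. */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

/* Hypothetical stand-in for a rotate that operates on a 64-bit
 * "unsigned long", as on LP64 platforms such as 64-bit Linux. */
static uint64_t rotl64_standin(uint64_t x, unsigned n)
{
    n &= 63;
    return (x << n) | (x >> ((64 - n) & 63));
}

/* What a 32-bit rotate effectively computes when it is mapped onto the
 * 64-bit rotate: bits shifted out of bit 31 land in bits 32..63 and are
 * lost on truncation instead of wrapping around to bit 0. */
static uint32_t rotl32_via_64(uint32_t x, unsigned n)
{
    return (uint32_t)rotl64_standin((uint64_t)x, n);
}
```

For x = 0x80000001 and n = 5, rotl32() gives 0x00000030, whereas
rotl32_via_64() gives 0x00000020: the top bit never wraps around. On
platforms where "unsigned long" is 32 bits wide the two agree, which is
consistent with the Windows builds being unaffected.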

But linux-x86_64-icc is not present in, and was never supported in,
pre-1.0.2. So you must be providing a custom config line. This remark
doesn't mean that the fix can't be backported, but out of curiosity,
what's your config line? Is assembly engaged? If so, how fast is it? Or
is it that you count on the compiler to produce vector code that would
process multiple inputs in parallel with SIMD?

On a related note: what's Xeon Phi in this context? I mean, are we
talking about Knights Corner (which features its own
compatible-with-nothing SIMD instruction set) or Knights Landing (which
features AVX512)? If the latter, it might be interesting to extend
multi-block SHA support(*), which should allow achieving pretty cool
results (with vector rotate and ternary logic instructions, not to
mention 16 lanes:-). [As for "interesting": it's possible but not really
interesting in the Knights Corner case, because the effort is too
specific, just a single obscure and hardly available CPU, while AVX512
is planned even for other processors, so that code will be reusable.]

(*) BTW, did you try existing one?
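
As a rough illustration of the "multiple inputs in parallel" idea, the
scalar C sketch below applies one fragment of a SHA-1 round to 16
independent inputs, one per lane. It is not OpenSSL's multi-block code
and uses no intrinsics; LANES, lane_rotl32() and sha1_ch_step_lanes()
are hypothetical names. The assumption is that such a loop maps onto
16 x 32-bit vector lanes, where the rotate and the three-input "Ch"
function are the operations a vector rotate and a ternary logic
instruction would cover.

```c
#include <stdint.h>

#define LANES 16 /* sixteen 32-bit lanes fill one 512-bit vector */

/* Per-lane 32-bit left rotate. */
static inline uint32_t lane_rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

/* One fragment of a SHA-1 round, applied to LANES independent inputs:
 * out = ROTL(a, 5) + Ch(b, c, d), where Ch selects bits of c where b is
 * set and bits of d where b is clear. Each lane is a separate message,
 * so there are no dependencies between loop iterations. */
static void sha1_ch_step_lanes(uint32_t out[LANES],
                               const uint32_t a[LANES],
                               const uint32_t b[LANES],
                               const uint32_t c[LANES],
                               const uint32_t d[LANES])
{
    for (int i = 0; i < LANES; i++) {
        uint32_t ch = (b[i] & c[i]) | (~b[i] & d[i]);
        out[i] = lane_rotl32(a[i], 5) + ch;
    }
}
```

Whether a given compiler actually vectorizes this loop depends on the
compiler and its flags; the sketch only shows the data layout.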



