[openssl-dev] Usage of assembler code on ARM architectures

John Foley foleyj at cisco.com
Mon Mar 16 15:47:40 UTC 2015


My mistake, it looks like my memory was wrong on two counts.  First,
it was AES, not SHA, where I observed that no-asm was faster.  Second, it
was on a PowerPC cross-compiled target, not ARM.  The results from
"openssl speed aes-128-cbc" (in 1000s of bytes per second processed) are:

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
w/o no-asm       31010.47k    32988.82k    33549.41k    33693.05k    33825.67k
no-asm           42431.46k    46485.14k    47479.20k    47874.86k    47829.36k

This is using a Freescale 8548.
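
To put the gap in relative terms, here is a minimal sketch that computes the
no-asm/asm throughput ratio for each block size. The figures are copied from
the table above; the variable and function names are just illustrative, not
part of any OpenSSL tooling.

```python
# Throughput figures copied from the table above (1000s of bytes per second).
block_sizes = [16, 64, 256, 1024, 8192]
with_asm = [31010.47, 32988.82, 33549.41, 33693.05, 33825.67]  # "w/o no-asm" row
no_asm = [42431.46, 46485.14, 47479.20, 47874.86, 47829.36]    # "no-asm" row


def speedups(baseline, candidate):
    """Return the candidate/baseline throughput ratio for each block size."""
    return [c / b for b, c in zip(baseline, candidate)]


ratios = speedups(with_asm, no_asm)
for size, ratio in zip(block_sizes, ratios):
    print(f"{size:>5} bytes: no-asm is {ratio:.2f}x the asm build")
```

On these numbers the no-asm build is roughly 1.4x faster at every block size,
so the difference is not limited to small or large inputs.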


On 03/12/2015 03:37 PM, Andy Polyakov wrote:
>> I can't speak directly to your question on the iphone-cross target, but
>> can warn you that your mileage will vary when using the ARM assembly
>> modules.  We observed that some algorithms actually run slower when
>> using the ARM assembly modules.  It's been a couple of years and I don't
>> recall the details, but want to say some of the hash algorithms were
>> actually faster when using no-asm.
> Well, I can imagine compiler succeeding to generate code better than
> sha1-armv4-large, but I can't imagine compiler beating sha256 or sha512.
> Was it really some of algorithm*s* or just one? Anyway, why
> sha1-armv4-large? Two reasons: a) inner loops are not unrolled; b)
> over-reliance on merged rotate-n-arithmetic. "Over-reliance" means that
> it uses more such instructions than actually necessary, which can
> negatively affect performance. I realized this after having hard time
> getting sha256/512 to work well on Cortex-A53 (see sha512-armv8.pl, it's
> 64-bit module, but principle of merged rotate-n-arithmetic is same). It
> should also be noted that now there are additional code paths in
> sha1-armv4-large, namely NEON and ARMv8.
>
>> The results are likely to vary
>> depending on the actual chipset used.
> Right, ARM universe is very diverse. Assembly modules, i.e. all, not
> only ARM, are maintained to provide near-optimal performance across
> range of platforms, but sometimes optimizations conflict. In either case
> prerequisite is access to wide range of platforms and feedback. In other
> words, bring it up.
>
>> You'll probably want to test the
>> performance on the target hardware using the "openssl speed" command. 
>> You can do this on a jailbroken iOS device via SSH.
> For the record, I do development on a non-jailbroken unit, so that is
> not a hard requirement.