[openssl-dev] Usage of assembler code on ARM architectures
Andy Polyakov
appro at openssl.org
Tue Mar 17 12:37:17 UTC 2015
> My mistake, it looks like my memory was wrong on two accounts. First,
> it was AES, not SHA, where I observed the no-asm was faster. Second, it
> was on the PowerPC cross-compiled target, not ARM. The results from
> "openssl speed aes-128-cbc" are:
>
> type          16 bytes    64 bytes    256 bytes   1024 bytes  8192 bytes
> w/o no-asm    31010.47k   32988.82k   33549.41k   33693.05k   33825.67k
> no-asm        42431.46k   46485.14k   47479.20k   47874.86k   47829.36k
>
> This is using a Freescale 8548.
This is no mystery at all, and kind of intentional. If you examine the
commentary in aes-ppc.pl you'll notice that it relies on "compact"
subroutines, those that use 256-byte S-boxes, which require more
computations. It mentions that "compact" encrypt is ~2 times slower than
"traditional" encrypt. On the other side of the scales is the insecurity
of the "traditional" subroutine, which is susceptible to cache-timing
attacks. Well, it's not like "compact" is not susceptible, but it's
*much* more resistant. Indeed, vulnerability is quantified by the
probability of a cache line not being accessed as a result of a block
operation, and in the "compact" case it is as low as
(1-32/256)^160=5e-10 vs. (1-4/256)^160=0.08 in the "traditional" case
for the processor in question. Note that the C version is even worse
than the "non-compact" assembly subroutine.
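
For anyone who wants to check the two figures above, here is a minimal sketch of the arithmetic in Python. The per-lookup fractions (32/256 and 4/256) and the exponent 160 are taken as given from the text; reading 160 as the number of table lookups per block (10 rounds x 16 bytes for AES-128) is my assumption, and the function name is purely illustrative.

```python
# Probability that one particular cache line is never touched during a
# single block operation, assuming independent, uniformly distributed
# table lookups (a simplifying assumption, not a measured property).

def prob_line_untouched(fraction_per_lookup, lookups=160):
    """fraction_per_lookup: chance that a single lookup hits the line.
    lookups: lookups per block operation (160 is the figure from the text)."""
    return (1.0 - fraction_per_lookup) ** lookups

compact = prob_line_untouched(32 / 256)      # "compact" 256-byte S-box
traditional = prob_line_untouched(4 / 256)   # "traditional" tables

print(f"compact:     {compact:.1e}")
print(f"traditional: {traditional:.2f}")
print(f"ratio:       {traditional / compact:.1e}")
```

The larger this probability, the more often an observer sees a cache line left untouched, and the more each block operation can leak about the key-dependent lookups. Under this simple model the two cases differ by roughly eight orders of magnitude.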
You might argue that there is no room for an adversary in *your*
application and that performance should be favoured. By "no room" I mean
that it's probably a locked-down embedded system, where an adversary
being able to execute their own code is already considered a big enough
problem. Yes, but you have to *argue* in favour. Maybe it should be a
compile option...