[openssl-users] BN_MUL_MONT for ARM64 v8

Vijay Chander vijay.chander at gmail.com
Tue Feb 7 22:26:10 UTC 2017


Andy,
   1:2.5 is pretty good in my opinion for ARM!
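
   For anyone following along, the "scaled to same frequency" arithmetic
behind ratios like this can be sketched as below. It is only an
illustration: the function name and the raw 1:10.5 figure are hypothetical
(picked so that the 1 GHz vs 2.1 GHz clocks from this thread give about
1:5) and are not measurements from either board.

```python
def scale_to_same_frequency(raw_slowdown, slow_ghz, fast_ghz):
    """Convert a raw '1:N' slowdown into a clock-for-clock '1:N' slowdown.

    raw_slowdown: N in the measured ratio 1:N (slow part : fast part)
    slow_ghz:     clock of the slower part in GHz
    fast_ghz:     clock of the faster part in GHz
    """
    # Throughput scales roughly linearly with clock for compute-bound code,
    # so the clock advantage of the faster part is divided out.
    return raw_slowdown * slow_ghz / fast_ghz


# Hypothetical raw measurement: ARM at 1 GHz is 10.5x slower than x86 at 2.1 GHz.
scaled = scale_to_same_frequency(10.5, slow_ghz=1.0, fast_ghz=2.1)
print(f"clock-for-clock ratio is about 1:{scaled:.1f}")
```

   This assumes the code is compute-bound, so that throughput tracks clock
frequency; a memory-bound workload would not scale this way.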

   We will check out Mongoose.

   Hmm - will try to get to the bottom of those cache misses (at a lower
priority).

Thanks,
-vijay



On Tue, Feb 7, 2017 at 11:07 AM, Andy Polyakov <appro at openssl.org> wrote:

> > The A72 is running at 1 GHz compared to x86 at 2.1 GHz. So that should
> > hopefully get down to ~1:5.
>
> And Mongoose will take you to ~1:2.5 (scaled to same frequency that is).
> Which I'd say is a fair result. Well, still could have been a bit
> better, but it's not unreasonable given ISA differences. Keep in mind
> that presented x86_64 result is for code utilizing Intel-specific code
> extensions.
>
> > There is no L3 cache on the A72 eval board and performance counters do
> > show 9x more DRAM accesses for ARM compared to x86.
>
> This is unexpected, because it takes *fewer* memory references to
> perform it on ARMv8, since it has a larger register bank. And the cache
> requirement is not high enough for L3 to kick in... But in any case
> memory is not the bottleneck here...