[openssl-users] BN_MUL_MONT for ARM64 v8
Mike Mohr
akihana at gmail.com
Wed Feb 8 11:26:17 UTC 2017
Of course OpenSSL contains hand-optimized assembly routines. However, GMP
has been around since at least 1993 and the library specifically targets
heavily optimized multiple precision arithmetic. OpenSSL is a TLS/SSL
toolkit, and necessarily focuses on implementing SSL/TLS correctly - I'd
argue that the bigint subsystem is almost tangential to the other parts of
any SSL library. It is reasonable to expect a less optimized bigint
subsystem. I would be surprised if the native bigint code could compete
against GMP performance-wise, even when OpenSSL's optimized assembly code
is used. I haven't benchmarked OpenSSL's bigint subsystem and would be
interested in seeing a comparison against a correctly configured GMP.
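
Any such comparison across CPUs also needs the per-clock scaling used in the
quoted discussion below (a 1 GHz A72 against a 2.1 GHz x86). A minimal sketch
of that normalization follows. The function name and the raw throughput
figures are hypothetical, chosen only so the arithmetic is easy to follow;
they are not measurements from this thread.

```python
def per_clock_ratio(ops_a, ghz_a, ops_b, ghz_b):
    """Throughput of A relative to B after scaling both to the same clock."""
    if min(ops_a, ghz_a, ops_b, ghz_b) <= 0:
        raise ValueError("all inputs must be positive")
    return (ops_a / ghz_a) / (ops_b / ghz_b)


# Hypothetical raw figures (operations per second), NOT real measurements.
arm_ops, arm_ghz = 100.0, 1.0    # part clocked at 1 GHz
x86_ops, x86_ghz = 1050.0, 2.1   # part clocked at 2.1 GHz

raw = arm_ops / x86_ops          # ratio as measured, clocks ignored
scaled = per_clock_ratio(arm_ops, arm_ghz, x86_ops, x86_ghz)

print(f"raw ratio        1:{1 / raw:.1f}")
print(f"per-clock ratio  1:{1 / scaled:.1f}")
```

With these made-up inputs a raw gap of roughly 1:10.5 shrinks to 1:5 once
both parts are scaled to the same frequency, which is the kind of adjustment
the ratios quoted below rely on.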
On Tue, Feb 7, 2017 at 4:46 PM, Jakob Bohm <jb-openssl at wisemo.com> wrote:
> OpenSSL also has a lot of handwritten assembly language for ARM,
> x86 etc. Most of it written by Andy Polyakov.
>
> His response about what can and cannot be done on various ARM CPU
> models is most probably a result of this work.
>
> Also, OpenSSL has a more permissive license than GMP, so using
> GMP in OpenSSL would cause problems for many applications that
> use OpenSSL.
>
> On 08/02/2017 00:31, Mike Mohr wrote:
>
>> Have you considered using GMP as a big integer backend for openssl? It
>> has support for several arm variants using handwritten assembly code
>> and the developers go to great lengths to optimize runtime on all
>> supported platforms.
>>
>> On Feb 7, 2017 2:26 PM, "Vijay Chander" <vijay.chander at gmail.com> wrote:
>>
>> Andy,
>> 1:2.5 is pretty good in my opinion for ARM!
>>
>> We will check out Mongoose.
>>
>> Hmm - will try to get to the bottom of those cache misses (at a
>> lower priority).
>>
>> Thanks,
>> -vijay
>>
>>
>> On Tue, Feb 7, 2017 at 11:07 AM, Andy Polyakov <appro at openssl.org> wrote:
>>
>> > A72 is running 1GHz compared to x86 at 2.1GHz. So that should hopefully
>> > get down to ~1:5.
>>
>> And Mongoose will take you to ~1:2.5 (scaled to same frequency that is).
>> Which I'd say is a fair result. Well, still could have been a bit
>> better, but it's not unreasonable given ISA differences. Keep in mind
>> that presented x86_64 result is for code utilizing Intel-specific code
>> extensions.
>>
>> > There is no L3 cache on the A72 eval board and performance counters do
>> > show 9x more DRAM accesses for ARM compared to x86.
>>
>> This is unexpected, because it takes *fewer* references to memory to
>> perform it on ARMv8, because it has a larger register bank. And cache
>> requirement is not that high for L3 to kick in... But in any case memory
>> is not the bottleneck here...
>>
>>
>
> --
> Jakob Bohm, CIO, partner, WiseMo A/S. https://www.wisemo.com
> Transformervej 29, 2860 Søborg, Denmark. direct: +45 31 13 16 10
> This message is only for its intended recipient, delete if misaddressed.
> WiseMo - Remote Service Management for PCs, Phones and Embedded
>
>
> Enjoy
>
> Jakob