[openssl-dev] ARM optimised montgomery multiplication (armv4-mont)

Ravichandra mynidiravichandra at gmail.com
Wed Jun 17 11:18:05 UTC 2015


Hi Andy,
    When building for the armv8 architecture, is there any optimised mont
mul ASM code for the linux-aarch64 configuration?

Thanks
Ravichandra

On Wed, Jun 17, 2015 at 3:06 PM, Andy Polyakov <appro at openssl.org> wrote:

> Hi,
>
> >>>>> With some experimentation, it turns out that if I *stop* using the
> >>>>> crypto/bn/asm/bn/armv4-mont.pl generated asm "optimised" version,
> >>>>> the time for a simplish test to establish and close a simple SSL
> >>>>> connection went from 28 seconds to 18. (It's quite a slow target at
> >>>>> any time).
> >>>>>
> >>>>> In other words, this "optimised" version has slowed things down
> >>>>> dramatically. Has anyone queried the value of the asm of
> >>>>> armv4-mont.pl any time in the last few years?
> > [snip]
> >
> > I found the cause - although OPENSSL_BN_ASM_MONT was defined, I hadn't
> > noticed that a colleague had put a #define OPENSSL_NO_ASM somewhere else
> > (this isn't linux but a port to our own OS). It turns out that
> > (surprisingly) this combination changes behaviour rather than barfing -
> > it's even explicitly catered for in bn_asm.c.
>
> In other words, sanity is restored. Phew! Incidentally, as the next step I
> was going to ask for a copy of your bn_asm.o (yes, the binary .o, and yes,
> bn_asm.o, not armv4-mont.o): bn_mul_mont would have shown up in it and
> presumably been noticed as unexpected...
>
> > Regardless, the effect is that a different bn_mul_mont implementation
> > gets used, and the armv4-mont.pl implementation gets ignored entirely.
>
> Right. And as mentioned in the commentary, bn_mul_mont in bn_asm.c is just
> a template with no performance promises attached. Note that it still
> exhibits the previously mentioned breaking point...
>
> > With that fixed, I now have greatly improved performance as expected.
>
> So with armv4-mont actually in the loop, the breaking point is still
> beyond practical key lengths, even on ARM9.
>
> > An
> > unfortunate waste of time for us both, but thanks for the assistance.
>
> Given that the timings presented for ARM9 are rather astronomical, you
> might want to consider whether other algorithms could be used. EC is
> getting wider adoption now and can perform better. Not to mention that an
> optimized NIST P-256 EC implementation was recently added...
>
> Cheers.