[openssl-dev] Making assembly language optimizations working on Cortex-M3

Brian Smith brian at briansmith.org
Wed May 25 21:55:34 UTC 2016


[Sorry for the **long** delay in responding.]

Andy Polyakov <appro at openssl.org> wrote:

>
> http://git.openssl.org/gitweb/?p=openssl.git;a=commitdiff;h=11208dcfb9105e8afa37233185decefd45e89e17
> made whole assembly pack Thumb2-friendly, so that now you should be able
> to compile all modules. Please, double-check.


This is awesome!

I have a question about the `it` and `itt` instructions you inserted. You
wrapped them in `#ifdef __thumb2__`, which is not wrong, but AFAICT is
usually unnecessary. Is this to support some old assemblers that don't
compile `it` (etc.) into nothing for non-Thumb builds?


> There is no option to
> disable NEON (yet?), because a) I want to expose it to more build cases
> to catch eventual bugs; b) would like to suggest idea of supporting
> Cortex-M with -march=armv6t2 -mthumb. Latter means that you'll lose
> some performance, because it won't utilize word load instruction's
> capability to handle misaligned access in ARMv7. But on the other hand
> it won't have ideas about compiling NEON, and you'll be excused to think
> about which particular Cortex-M is targeted, one will be able to cover
> all with single config/build. Can it be viable compromise? One would
> still be able to tune for favorite Mx...


For Cortex-M4 and friends, one would really want to use the full ARMv7-M
instruction set (i.e. not compile for armv6t2). In general, Cortex-M
platforms are so limited that every bit of performance and space savings
matters. So, I think it is definitely worthwhile to support the non-NEON
ARMv7-M configuration. One easy way to do this would be to avoid building
NEON code when __TARGET_PROFILE_M is defined. Alternatively, similar to what
BoringSSL did, you could have an option that says "instead of doing runtime
feature detection, detect features at compile time based on __ARM_NEON__
and the like." I think such a configuration would also help the C compiler
do whole-program optimization better.
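
To make the two suggestions concrete, here is a rough sketch in C of what
compile-time feature selection could look like. It is not taken from OpenSSL
or BoringSSL. The names STATIC_CPU_FEATURES, BUILD_HAS_NEON,
TARGET_IS_M_PROFILE, probe_neon_at_runtime, cpu_has_neon and select_impl are
all hypothetical, and the runtime probe is only a placeholder. The check of
__ARM_ARCH_PROFILE is an added assumption for ACLE-style compilers (GCC,
Clang), alongside the __TARGET_PROFILE_M macro mentioned above.

```c
/* Sketch only: every identifier defined in this file is hypothetical. */

/* Does the compiler target NEON? (__ARM_NEON is the newer ACLE spelling.) */
#if defined(__ARM_NEON__) || defined(__ARM_NEON)
#define BUILD_HAS_NEON 1
#else
#define BUILD_HAS_NEON 0
#endif

/* Is this an M-profile (Cortex-M) target? M-profile cores never have NEON. */
#if defined(__TARGET_PROFILE_M) || \
    (defined(__ARM_ARCH_PROFILE) && __ARM_ARCH_PROFILE == 'M')
#define TARGET_IS_M_PROFILE 1
#else
#define TARGET_IS_M_PROFILE 0
#endif

#if defined(STATIC_CPU_FEATURES) || TARGET_IS_M_PROFILE
/* Compile-time path: the answer is a constant, so the compiler can drop
 * the unused branch (and the NEON code behind it) entirely. */
static inline int cpu_has_neon(void) {
  return BUILD_HAS_NEON && !TARGET_IS_M_PROFILE;
}
#else
/* Runtime path: placeholder probe. A real build would ask the OS or read
 * a capability word set up during library initialization. */
static int probe_neon_at_runtime(void) {
  return 0; /* assumption: report "no NEON" in this sketch */
}
static inline int cpu_has_neon(void) {
  return probe_neon_at_runtime();
}
#endif

enum impl { IMPL_GENERIC = 0, IMPL_NEON = 1 };

/* Pick an implementation; with the compile-time path this folds to a
 * constant. */
static inline enum impl select_impl(void) {
  return cpu_has_neon() ? IMPL_NEON : IMPL_GENERIC;
}
```

With something like this, a Cortex-M build never references the NEON code
path at all, and a build with -DSTATIC_CPU_FEATURES trades runtime detection
for a constant the optimizer can see through.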

Again, thanks for doing this!

Cheers,
Brian
-- 
https://briansmith.org/