[openssl-commits] [openssl] OpenSSL-fips-2_0-stable update
Andy Polyakov
appro at openssl.org
Wed May 13 14:49:59 UTC 2015
The branch OpenSSL-fips-2_0-stable has been updated
via 34f39b062c76fbd3082521b26edee7f53afc061d (commit)
via 6db8e3bdc9ef83d83b83f3eec9722c96daa91f82 (commit)
via 50e2a0ea4615124aa159e8f43317dedcf0cfcaa2 (commit)
via 3f137e6f1d326fee773a8af363f051d331c46fd2 (commit)
via 97fbb0c88c2f601f98e25e57b9f6f9679d14f3a8 (commit)
via 5837e90f08ffcf5ad84933793bc285630018ce26 (commit)
via 874faf2ffb22187ad5483d9691a3a2eb7112f161 (commit)
via 0b45df73d2b4cd52a390f2345ff52fb6705f2eba (commit)
via 2bd3976ed01e76496a509ecd3443559f2be6f60c (commit)
via c6d109051d1c2b9a453427a2a53ad3d40acc9276 (commit)
via 083ed53defb42ab4d3488bc7f80d9170d22293e7 (commit)
via b84813ec017cb03b8dd0b85bce2bb3e021c45685 (commit)
from 7447e65fccc95fa2ee97b40e43dc46f97e7b958b (commit)
- Log -----------------------------------------------------------------
commit 34f39b062c76fbd3082521b26edee7f53afc061d
Author: Andy Polyakov <appro at openssl.org>
Date: Mon May 11 12:16:01 2015 +0200
util/incore update that allows FINGERPRINT_premain-free build.
This complements the fips.c modification. The goal is to ensure that
FIPS_signature does not end up in the .bss segment, which is guaranteed
to be zeroed upon program start-up. One would expect explicitly
initialized values to end up in the .data segment, but it turned out
that values explicitly initialized with zeros can still end up in .bss.
The modification does not affect program flow, because the first byte
was the only one of significance [to FINGERPRINT_premain].
Reviewed-by: Dr. Stephen Henson <steve at openssl.org>
commit 6db8e3bdc9ef83d83b83f3eec9722c96daa91f82
Author: Andy Polyakov <appro at openssl.org>
Date: Mon May 11 12:04:12 2015 +0200
Add support for Android 5, both 32- and 64-bit cases.
A special note about the additional -pie flag in android-armv7: the
initial reason for adding it is that Android 5 refuses to execute
non-PIE binaries. But what about older systems and previously validated
platforms? Note that the flag is not used when compiling object code
(fipscanister.o in this context), only when linking applications, i.e.
the *supplementary* fips_algvs used during the validation procedure.
Reviewed-by: Dr. Stephen Henson <steve at openssl.org>
commit 50e2a0ea4615124aa159e8f43317dedcf0cfcaa2
Author: Andy Polyakov <appro at openssl.org>
Date: Mon May 11 11:56:30 2015 +0200
Additional vxWorks target.
Reviewed-by: Dr. Stephen Henson <steve at openssl.org>
commit 3f137e6f1d326fee773a8af363f051d331c46fd2
Author: Andy Polyakov <appro at openssl.org>
Date: Mon May 11 11:55:19 2015 +0200
fipsalgtest.pl update.
Reviewed-by: Dr. Stephen Henson <steve at openssl.org>
commit 97fbb0c88c2f601f98e25e57b9f6f9679d14f3a8
Author: Andy Polyakov <appro at openssl.org>
Date: Mon May 11 11:53:41 2015 +0200
Configure: add ios-cross target with ARM assembly support.
Reviewed-by: Dr. Stephen Henson <steve at openssl.org>
commit 5837e90f08ffcf5ad84933793bc285630018ce26
Author: Andy Polyakov <appro at openssl.org>
Date: Mon May 11 11:50:29 2015 +0200
Add iOS-specific armv4cpud.S module.
Normally it would be generated from a perlasm module, but doing so
would affect the existing armv4cpuid.S, which in turn would formally
void previously validated platforms. Hence a separate module is
generated.
Reviewed-by: Dr. Stephen Henson <steve at openssl.org>
commit 874faf2ffb22187ad5483d9691a3a2eb7112f161
Author: Andy Polyakov <appro at openssl.org>
Date: Mon May 11 11:43:55 2015 +0200
Adapt ARM assembly pack for iOS.
This is achieved by filtering perlasm output through arm-xlate.pl. But note
that this is done only if the "flavour" argument is not 'void'. As 'void' is
the default value for other ARM targets, perlasm output is not actually
filtered on previously validated platforms.
Reviewed-by: Dr. Stephen Henson <steve at openssl.org>
commit 0b45df73d2b4cd52a390f2345ff52fb6705f2eba
Author: Andy Polyakov <appro at openssl.org>
Date: Mon May 11 11:20:52 2015 +0200
crypto/modes/modes_lcl.h: let STRICT_ALIGNMENT be on iOS.
While ARMv7 in general is capable of unaligned access, not all instructions
actually are. The trouble is that the compiler doesn't seem to differentiate
between those capable and incapable of unaligned access. As a result,
exceptions could be observed in the xts128.c and ccm128.c modules.
Contemporary Linux kernels handle such exceptions by performing the requested
operation and resuming execution as if it had succeeded, while on iOS the
exception is fatal.
The correct solution would be to let STRICT_ALIGNMENT be on for all ARM
platforms, but doing so is in formal conflict with the FIPS maintenance policy.
Reviewed-by: Dr. Stephen Henson <steve at openssl.org>
commit 2bd3976ed01e76496a509ecd3443559f2be6f60c
Author: Andy Polyakov <appro at openssl.org>
Date: Mon May 11 11:39:04 2015 +0200
Add iOS-specific fips_algvs application.
Reviewed-by: Dr. Stephen Henson <steve at openssl.org>
commit c6d109051d1c2b9a453427a2a53ad3d40acc9276
Author: Andy Polyakov <appro at openssl.org>
Date: Mon May 11 11:36:48 2015 +0200
Configure: engage ARMv8 assembly pack in ios64-cross target.
Reviewed-by: Dr. Stephen Henson <steve at openssl.org>
commit 083ed53defb42ab4d3488bc7f80d9170d22293e7
Author: Andy Polyakov <appro at openssl.org>
Date: Mon May 11 11:34:56 2015 +0200
Engage ARMv8 assembly pack.
Reviewed-by: Dr. Stephen Henson <steve at openssl.org>
commit b84813ec017cb03b8dd0b85bce2bb3e021c45685
Author: Andy Polyakov <appro at openssl.org>
Date: Mon May 11 11:18:04 2015 +0200
Add ARMv8 assembly pack.
Reviewed-by: Dr. Stephen Henson <steve at openssl.org>
-----------------------------------------------------------------------
Summary of changes:
Configure | 12 +-
config | 11 +-
crypto/Makefile | 1 +
crypto/aes/Makefile | 4 +
crypto/aes/asm/aes-armv4.pl | 31 +-
crypto/aes/asm/aesv8-armx.pl | 968 ++++++++++++++++++++++++++++++
crypto/arm64cpuid.pl | 68 +++
crypto/arm_arch.h | 17 +-
crypto/armcap.c | 26 +
crypto/armv4cpuid_ios.S | 210 +++++++
crypto/bn/asm/armv4-gf2m.pl | 23 +-
crypto/bn/asm/armv4-mont.pl | 16 +-
crypto/evp/e_aes.c | 113 ++++
crypto/modes/Makefile | 3 +
crypto/modes/asm/ghash-armv4.pl | 33 +-
crypto/modes/asm/ghashv8-armx.pl | 376 ++++++++++++
crypto/modes/gcm128.c | 27 +-
crypto/modes/modes_lcl.h | 17 +-
crypto/perlasm/arm-xlate.pl | 165 ++++++
crypto/sha/Makefile | 3 +
crypto/sha/asm/sha1-armv4-large.pl | 16 +-
crypto/sha/asm/sha1-armv8.pl | 343 +++++++++++
crypto/sha/asm/sha256-armv4.pl | 16 +-
crypto/sha/asm/sha512-armv4.pl | 22 +-
crypto/sha/asm/sha512-armv8.pl | 428 ++++++++++++++
fips/fips.c | 2 +-
fips/fips_canister.c | 1 +
fips/fips_test_suite.c | 6 +
fips/fipsalgtest.pl | 38 +-
fips/fipssyms.h | 44 ++
iOS/Makefile | 76 +++
iOS/fips_algvs.app/Entitlements.plist | 8 +
iOS/fips_algvs.app/Info.plist | 24 +
iOS/fips_algvs.app/ResourceRules.plist | 25 +
iOS/fopen.m | 93 +++
iOS/incore_macho.c | 1016 ++++++++++++++++++++++++++++++++
test/fips_algvs.c | 71 +++
util/incore | 7 +-
38 files changed, 4280 insertions(+), 80 deletions(-)
create mode 100644 crypto/aes/asm/aesv8-armx.pl
create mode 100644 crypto/arm64cpuid.pl
create mode 100644 crypto/armv4cpuid_ios.S
create mode 100644 crypto/modes/asm/ghashv8-armx.pl
create mode 100644 crypto/perlasm/arm-xlate.pl
create mode 100644 crypto/sha/asm/sha1-armv8.pl
create mode 100644 crypto/sha/asm/sha512-armv8.pl
create mode 100644 iOS/Makefile
create mode 100644 iOS/fips_algvs.app/Entitlements.plist
create mode 100644 iOS/fips_algvs.app/Info.plist
create mode 100644 iOS/fips_algvs.app/ResourceRules.plist
create mode 100644 iOS/fopen.m
create mode 100644 iOS/incore_macho.c
diff --git a/Configure b/Configure
index 8fc25f4..613f829 100755
--- a/Configure
+++ b/Configure
@@ -136,6 +136,7 @@ my $mips32_asm=":bn-mips.o::aes_cbc.o aes-mips.o:::sha1-mips.o sha256-mips.o::::
my $mips64_asm=":bn-mips.o mips-mont.o::aes_cbc.o aes-mips.o:::sha1-mips.o sha256-mips.o sha512-mips.o::::::::";
my $s390x_asm="s390xcap.o s390xcpuid.o:bn-s390x.o s390x-mont.o s390x-gf2m.o::aes_ctr.o aes-s390x.o:::sha1-s390x.o sha256-s390x.o sha512-s390x.o::rc4-s390x.o:::::ghash-s390x.o:";
my $armv4_asm="armcap.o armv4cpuid.o:bn_asm.o armv4-mont.o armv4-gf2m.o::aes_cbc.o aes-armv4.o:::sha1-armv4-large.o sha256-armv4.o sha512-armv4.o:::::::ghash-armv4.o::void";
+my $aarch64_asm="armcap.o arm64cpuid.o mem_clr.o:::aes_core.o aes_cbc.o aesv8-armx.o:::sha1-armv8.o sha256-armv8.o sha512-armv8.o:::::::ghashv8-armx.o:";
my $parisc11_asm="pariscid.o:bn_asm.o parisc-mont.o::aes_core.o aes_cbc.o aes-parisc.o:::sha1-parisc.o sha256-parisc.o sha512-parisc.o::rc4-parisc.o:::::ghash-parisc.o::32";
my $parisc20_asm="pariscid.o:pa-risc2W.o parisc-mont.o::aes_core.o aes_cbc.o aes-parisc.o:::sha1-parisc.o sha256-parisc.o sha512-parisc.o::rc4-parisc.o:::::ghash-parisc.o::64";
my $ppc32_asm="ppccpuid.o ppccap.o:bn-ppc.o ppc-mont.o ppc64-mont.o::aes_core.o aes_cbc.o aes-ppc.o:::sha1-ppc.o sha256-ppc.o::::::::";
@@ -404,7 +405,8 @@ my %table=(
# Android: linux-* but without -DTERMIO and pointers to headers and libs.
"android","gcc:-mandroid -I\$(ANDROID_DEV)/include -B\$(ANDROID_DEV)/lib -O3 -fomit-frame-pointer -Wall::-D_REENTRANT::-ldl:BN_LLONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${no_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"android-x86","gcc:-mandroid -I\$(ANDROID_DEV)/include -B\$(ANDROID_DEV)/lib -O3 -fomit-frame-pointer -Wall::-D_REENTRANT::-ldl:BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:".eval{my $asm=${x86_elf_asm};$asm=~s/:elf/:android/;$asm}.":dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
-"android-armv7","gcc:-march=armv7-a -mandroid -I\$(ANDROID_DEV)/include -B\$(ANDROID_DEV)/lib -O3 -fomit-frame-pointer -Wall::-D_REENTRANT::-ldl:BN_LLONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${armv4_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
+"android-armv7","gcc:-march=armv7-a -mandroid -I\$(ANDROID_DEV)/include -B\$(ANDROID_DEV)/lib -O3 -fomit-frame-pointer -Wall::-D_REENTRANT::-pie%-ldl:BN_LLONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${armv4_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
+"android64-aarch64","gcc:-mandroid -fPIC -I\$(ANDROID_DEV)/include -B\$(ANDROID_DEV)/lib -O3 -Wall::-D_REENTRANT::-pie%-ldl:SIXTY_FOUR_BIT_LONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${aarch64_asm}:linux64:dlfcn:linux-shared:::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
#### *BSD [do see comment about ${BSDthreads} above!]
"BSD-generic32","gcc:-DTERMIOS -O3 -fomit-frame-pointer -Wall::${BSDthreads}:::BN_LLONG RC2_CHAR RC4_INDEX DES_INT DES_UNROLL:${no_asm}:dlfcn:bsd-gcc-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
@@ -586,7 +588,8 @@ my %table=(
"debug-darwin-ppc-cc","cc:-DBN_DEBUG -DREF_CHECK -DCONF_DEBUG -DCRYPTO_MDEBUG -DB_ENDIAN -g -Wall -O::-D_REENTRANT:MACOSX::BN_LLONG RC4_CHAR RC4_CHUNK DES_UNROLL BF_PTR:${ppc32_asm}:osx32:dlfcn:darwin-shared:-fPIC:-dynamiclib:.\$(SHLIB_MAJOR).\$(SHLIB_MINOR).dylib",
# iPhoneOS/iOS
"iphoneos-cross","llvm-gcc:-O3 -isysroot \$(CROSS_TOP)/SDKs/\$(CROSS_SDK) -fomit-frame-pointer -fno-common::-D_REENTRANT:iOS:-Wl,-search_paths_first%:BN_LLONG RC4_CHAR RC4_CHUNK DES_UNROLL BF_PTR:${no_asm}:dlfcn:darwin-shared:-fPIC -fno-common:-dynamiclib:.\$(SHLIB_MAJOR).\$(SHLIB_MINOR).dylib",
-"ios64-cross","clang:-O3 -arch arm64 -mios-version-min=7.0.0 -isysroot \$(CROSS_TOP)/SDKs/\$(CROSS_SDK) -fno-common::-D_REENTRANT:iOS:-Wl,-search_paths_first%:SIXTY_FOUR_BIT_LONG RC4_CHAR -RC4_CHUNK DES_INT DES_UNROLL -BF_PTR:${no_asm}:dlfcn:darwin-shared:-fPIC -fno-common:-dynamiclib:.\$(SHLIB_MAJOR).\$(SHLIB_MINOR).dylib",
+"ios-cross","cc:-O3 -arch armv7 -mios-version-min=7.0.0 -isysroot \$(CROSS_TOP)/SDKs/\$(CROSS_SDK) -fno-common::-D_REENTRANT:iOS:-Wl,-search_paths_first%:BN_LLONG RC4_CHAR RC4_CHUNK DES_UNROLL BF_PTR:armcap.o armv4cpuid_ios.o:bn_asm.o armv4-mont.o armv4-gf2m.o::aes_cbc.o aes-armv4.o:::sha1-armv4-large.o sha256-armv4.o sha512-armv4.o:::::::ghash-armv4.o::ios32:dlfcn:darwin-shared:-fPIC -fno-common:-dynamiclib:.\$(SHLIB_MAJOR).\$(SHLIB_MINOR).dylib",
+"ios64-cross","cc:-O3 -arch arm64 -mios-version-min=7.0.0 -isysroot \$(CROSS_TOP)/SDKs/\$(CROSS_SDK) -fno-common::-D_REENTRANT:iOS:-Wl,-search_paths_first%:SIXTY_FOUR_BIT_LONG RC4_CHAR -RC4_CHUNK DES_INT DES_UNROLL -BF_PTR:${aarch64_asm}:ios64:dlfcn:darwin-shared:-fPIC -fno-common:-dynamiclib:.\$(SHLIB_MAJOR).\$(SHLIB_MINOR).dylib",
##### A/UX
"aux3-gcc","gcc:-O2 -DTERMIO::(unknown):AUX:-lbsd:RC4_CHAR RC4_CHUNK DES_UNROLL BF_PTR:::",
@@ -603,6 +606,7 @@ my %table=(
##### VxWorks for various targets
"vxworks-ppc60x","ccppc:-D_REENTRANT -mrtp -mhard-float -mstrict-align -fno-implicit-fp -DPPC32_fp60x -O2 -fstrength-reduce -fno-builtin -fno-strict-aliasing -Wall -DCPU=PPC32 -DTOOL_FAMILY=gnu -DTOOL=gnu -I\$(WIND_BASE)/target/usr/h -I\$(WIND_BASE)/target/usr/h/wrn/coreip:::VXWORKS:-Wl,--defsym,__wrs_rtp_base=0xe0000000 -L \$(WIND_BASE)/target/usr/lib/ppc/PPC32/common:::::",
"vxworks-ppcgen","ccppc:-D_REENTRANT -mrtp -msoft-float -mstrict-align -O1 -fno-builtin -fno-strict-aliasing -Wall -DCPU=PPC32 -DTOOL_FAMILY=gnu -DTOOL=gnu -I\$(WIND_BASE)/target/usr/h -I\$(WIND_BASE)/target/usr/h/wrn/coreip:::VXWORKS:-Wl,--defsym,__wrs_rtp_base=0xe0000000 -L \$(WIND_BASE)/target/usr/lib/ppc/PPC32/sfcommon:::::",
+"vxworks-ppcgen-kernel","ccppc:-D_REENTRANT -msoft-float -mstrict-align -O1 -fno-builtin -fno-strict-aliasing -Wall -DCPU=PPC32 -DTOOL_FAMILY=gnu -DTOOL=gnu -I\$(WIND_BASE)/target/h -I\$(WIND_BASE)/target/h/wrn/coreip:::VXWORKS::::::",
"vxworks-ppc405","ccppc:-g -msoft-float -mlongcall -DCPU=PPC405 -I\$(WIND_BASE)/target/h:::VXWORKS:-r:::::",
"vxworks-ppc750","ccppc:-ansi -nostdinc -DPPC750 -D_REENTRANT -fvolatile -fno-builtin -fno-for-scope -fsigned-char -Wall -msoft-float -mlongcall -DCPU=PPC604 -I\$(WIND_BASE)/target/h \$(DEBUG_FLAG):::VXWORKS:-r:::::",
"vxworks-ppc750-debug","ccppc:-ansi -nostdinc -DPPC750 -D_REENTRANT -fvolatile -fno-builtin -fno-for-scope -fsigned-char -Wall -msoft-float -mlongcall -DCPU=PPC604 -I\$(WIND_BASE)/target/h -DBN_DEBUG -DREF_CHECK -DCONF_DEBUG -DBN_CTX_DEBUG -DCRYPTO_MDEBUG -DPEDANTIC -DDEBUG_SAFESTACK -DDEBUG -g:::VXWORKS:-r:::::",
@@ -1565,7 +1569,7 @@ if ($rmd160_obj =~ /\.o$/)
}
if ($aes_obj =~ /\.o$/)
{
- $cflags.=" -DAES_ASM";
+ $cflags.=" -DAES_ASM" if ($aes_obj =~ m/\baes\-/);
# aes_ctr.o is not a real file, only indication that assembler
# module implements AES_ctr32_encrypt...
$cflags.=" -DAES_CTR_ASM" if ($aes_obj =~ s/\s*aes_ctr\.o//);
@@ -1586,7 +1590,7 @@ else {
$wp_obj="wp_block.o";
}
$cmll_obj=$cmll_enc unless ($cmll_obj =~ /.o$/);
-if ($modes_obj =~ /ghash/)
+if ($modes_obj =~ /ghash\-/)
{
$cflags.=" -DGHASH_ASM";
}
diff --git a/config b/config
index b858d80..9d0383e 100755
--- a/config
+++ b/config
@@ -383,6 +383,10 @@ case "${SYSTEM}:${RELEASE}:${VERSION}:${MACHINE}" in
echo "nsr-tandem-nsk"; exit 0;
;;
+ vxworks:kernel*)
+ echo "${MACHINE}-kernel-vxworks"; exit 0;
+ ;;
+
vxworks*)
echo "${MACHINE}-whatever-vxworks"; exit 0;
;;
@@ -584,8 +588,9 @@ case "$GUESSOS" in
*-*-iphoneos)
options="$options -arch%20${MACHINE}"
OUT="iphoneos-cross" ;;
- arm64-*-ios64)
- options="$options -arch%20${MACHINE}"
+ armv7-*-ios)
+ OUT="ios-cross" ;;
+ arm64-*-ios*)
OUT="ios64-cross" ;;
alpha-*-linux2)
ISA=`awk '/cpu model/{print$4;exit(0);}' /proc/cpuinfo`
@@ -612,6 +617,7 @@ case "$GUESSOS" in
;;
ppc-*-linux2) OUT="linux-ppc" ;;
ppc60x-*-vxworks*) OUT="vxworks-ppc60x" ;;
+ ppcgen-kernel-vxworks*) OUT="vxworks-ppcgen-kernel" ;;
ppcgen-*-vxworks*) OUT="vxworks-ppcgen" ;;
pentium-*-vxworks*) OUT="vxworks-pentium" ;;
simlinux-*-vxworks*) OUT="vxworks-simlinux" ;;
@@ -866,6 +872,7 @@ case "$GUESSOS" in
*-*-qnx6) OUT="QNX6" ;;
x86-*-android|i?86-*-android) OUT="android-x86" ;;
armv[7-9]*-*-android) OUT="android-armv7" ;;
+ aarch64-*-android) OUT="android64-aarch64" ;;
*) OUT=`echo $GUESSOS | awk -F- '{print $3}'`;;
esac
diff --git a/crypto/Makefile b/crypto/Makefile
index 22cb2a5..7304684 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -87,6 +87,7 @@ ppccpuid.s: ppccpuid.pl; $(PERL) ppccpuid.pl $(PERLASM_SCHEME) $@
pariscid.s: pariscid.pl; $(PERL) pariscid.pl $(PERLASM_SCHEME) $@
alphacpuid.s: alphacpuid.pl
$(PERL) $< | $(CC) -E - | tee $@ > /dev/null
+arm64cpuid.S: arm64cpuid.pl; $(PERL) arm64cpuid.pl $(PERLASM_SCHEME) > $@
subdirs:
@target=all; $(RECURSIVE_MAKE)
diff --git a/crypto/aes/Makefile b/crypto/aes/Makefile
index 8edd358..1d9e82a 100644
--- a/crypto/aes/Makefile
+++ b/crypto/aes/Makefile
@@ -78,6 +78,10 @@ aes-parisc.s: asm/aes-parisc.pl
aes-mips.S: asm/aes-mips.pl
$(PERL) asm/aes-mips.pl $(PERLASM_SCHEME) $@
+aesv8-armx.S: asm/aesv8-armx.pl
+ $(PERL) asm/aesv8-armx.pl $(PERLASM_SCHEME) $@
+aesv8-armx.o: aesv8-armx.S
+
# GNU make "catch all"
aes-%.S: asm/aes-%.pl; $(PERL) $< $(PERLASM_SCHEME) $@
aes-armv4.o: aes-armv4.S
diff --git a/crypto/aes/asm/aes-armv4.pl b/crypto/aes/asm/aes-armv4.pl
index 55b6e04..ed51258 100644
--- a/crypto/aes/asm/aes-armv4.pl
+++ b/crypto/aes/asm/aes-armv4.pl
@@ -32,8 +32,20 @@
# Profiler-assisted and platform-specific optimization resulted in 16%
# improvement on Cortex A8 core and ~21.5 cycles per byte.
-while (($output=shift) && ($output!~/^\w[\w\-]*\.\w+$/)) {}
-open STDOUT,">$output";
+$flavour = shift;
+if ($flavour=~/^\w[\w\-]*\.\w+$/) { $output=$flavour; undef $flavour; }
+else { while (($output=shift) && ($output!~/^\w[\w\-]*\.\w+$/)) {} }
+
+if ($flavour && $flavour ne "void") {
+ $0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
+ ( $xlate="${dir}arm-xlate.pl" and -f $xlate ) or
+ ( $xlate="${dir}../../perlasm/arm-xlate.pl" and -f $xlate) or
+ die "can't locate arm-xlate.pl";
+
+ open STDOUT,"| \"$^X\" $xlate $flavour $output";
+} else {
+ open STDOUT,">$output";
+}
$s0="r0";
$s1="r1";
@@ -171,7 +183,12 @@ AES_encrypt:
stmdb sp!,{r1,r4-r12,lr}
mov $rounds,r0 @ inp
mov $key,r2
+#ifdef __APPLE__
+ mov $tbl,#AES_encrypt-AES_Te
+ sub $tbl,r3,$tbl @ Te
+#else
sub $tbl,r3,#AES_encrypt-AES_Te @ Te
+#endif
#if __ARM_ARCH__<7
ldrb $s0,[$rounds,#3] @ load input data in endian-neutral
ldrb $t1,[$rounds,#2] @ manner...
@@ -425,7 +442,12 @@ AES_set_encrypt_key:
bne .Labrt
.Lok: stmdb sp!,{r4-r12,lr}
+#ifdef __APPLE__
+ mov $tbl,#AES_set_encrypt_key-AES_Te-1024
+ sub $tbl,r3,$tbl @ Te4
+#else
sub $tbl,r3,#AES_set_encrypt_key-AES_Te-1024 @ Te4
+#endif
mov $rounds,r0 @ inp
mov lr,r1 @ bits
@@ -886,7 +908,12 @@ AES_decrypt:
stmdb sp!,{r1,r4-r12,lr}
mov $rounds,r0 @ inp
mov $key,r2
+#ifdef __APPLE__
+ mov $tbl,#AES_decrypt-AES_Td
+ sub $tbl,r3,$tbl @ Td
+#else
sub $tbl,r3,#AES_decrypt-AES_Td @ Td
+#endif
#if __ARM_ARCH__<7
ldrb $s0,[$rounds,#3] @ load input data in endian-neutral
ldrb $t1,[$rounds,#2] @ manner...
diff --git a/crypto/aes/asm/aesv8-armx.pl b/crypto/aes/asm/aesv8-armx.pl
new file mode 100644
index 0000000..104f417
--- /dev/null
+++ b/crypto/aes/asm/aesv8-armx.pl
@@ -0,0 +1,968 @@
+#!/usr/bin/env perl
+#
+# ====================================================================
+# Written by Andy Polyakov <appro at openssl.org> for the OpenSSL
+# project. The module is, however, dual licensed under OpenSSL and
+# CRYPTOGAMS licenses depending on where you obtain it. For further
+# details see http://www.openssl.org/~appro/cryptogams/.
+# ====================================================================
+#
+# This module implements support for ARMv8 AES instructions. The
+# module is endian-agnostic in sense that it supports both big- and
+# little-endian cases. As does it support both 32- and 64-bit modes
+# of operation. Latter is achieved by limiting amount of utilized
+# registers to 16, which implies additional NEON load and integer
+# instructions. This has no effect on mighty Apple A7, where results
+# are literally equal to the theoretical estimates based on AES
+# instruction latencies and issue rates. On Cortex-A53, an in-order
+# execution core, this costs up to 10-15%, which is partially
+# compensated by implementing dedicated code path for 128-bit
+# CBC encrypt case. On Cortex-A57 parallelizable mode performance
+# seems to be limited by sheer amount of NEON instructions...
+#
+# Performance in cycles per byte processed with 128-bit key:
+#
+# CBC enc CBC dec CTR
+# Apple A7 2.39 1.20 1.20
+# Cortex-A53 2.45 1.87 1.94
+# Cortex-A57 3.64 1.34 1.32
+
+$flavour = shift;
+$output = shift;
+
+$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
+( $xlate="${dir}arm-xlate.pl" and -f $xlate ) or
+( $xlate="${dir}../../perlasm/arm-xlate.pl" and -f $xlate) or
+die "can't locate arm-xlate.pl";
+
+open OUT,"| \"$^X\" $xlate $flavour $output";
+*STDOUT=*OUT;
+
+$prefix="aes_v8";
+
+$code=<<___;
+#include "arm_arch.h"
+
+#if __ARM_ARCH__>=7
+.text
+___
+$code.=".arch armv8-a+crypto\n" if ($flavour =~ /64/);
+$code.=".fpu neon\n.code 32\n" if ($flavour !~ /64/);
+
+# Assembler mnemonics are an eclectic mix of 32- and 64-bit syntax,
+# NEON is mostly 32-bit mnemonics, integer - mostly 64. Goal is to
+# maintain both 32- and 64-bit codes within single module and
+# transliterate common code to either flavour with regex vodoo.
+#
+{{{
+my ($inp,$bits,$out,$ptr,$rounds)=("x0","w1","x2","x3","w12");
+my ($zero,$rcon,$mask,$in0,$in1,$tmp,$key)=
+ $flavour=~/64/? map("q$_",(0..6)) : map("q$_",(0..3,8..10));
+
+
+$code.=<<___;
+.align 5
+.Lrcon:
+.long 0x01,0x01,0x01,0x01
+.long 0x0c0f0e0d,0x0c0f0e0d,0x0c0f0e0d,0x0c0f0e0d // rotate-n-splat
+.long 0x1b,0x1b,0x1b,0x1b
+
+.globl ${prefix}_set_encrypt_key
+.type ${prefix}_set_encrypt_key,%function
+.align 5
+${prefix}_set_encrypt_key:
+.Lenc_key:
+___
+$code.=<<___ if ($flavour =~ /64/);
+ stp x29,x30,[sp,#-16]!
+ add x29,sp,#0
+___
+$code.=<<___;
+ mov $ptr,#-1
+ cmp $inp,#0
+ b.eq .Lenc_key_abort
+ cmp $out,#0
+ b.eq .Lenc_key_abort
+ mov $ptr,#-2
+ cmp $bits,#128
+ b.lt .Lenc_key_abort
+ cmp $bits,#256
+ b.gt .Lenc_key_abort
+ tst $bits,#0x3f
+ b.ne .Lenc_key_abort
+
+ adr $ptr,.Lrcon
+ cmp $bits,#192
+
+ veor $zero,$zero,$zero
+ vld1.8 {$in0},[$inp],#16
+ mov $bits,#8 // reuse $bits
+ vld1.32 {$rcon,$mask},[$ptr],#32
+
+ b.lt .Loop128
+ b.eq .L192
+ b .L256
+
+.align 4
+.Loop128:
+ vtbl.8 $key,{$in0},$mask
+ vext.8 $tmp,$zero,$in0,#12
+ vst1.32 {$in0},[$out],#16
+ aese $key,$zero
+ subs $bits,$bits,#1
+
+ veor $in0,$in0,$tmp
+ vext.8 $tmp,$zero,$tmp,#12
+ veor $in0,$in0,$tmp
+ vext.8 $tmp,$zero,$tmp,#12
+ veor $key,$key,$rcon
+ veor $in0,$in0,$tmp
+ vshl.u8 $rcon,$rcon,#1
+ veor $in0,$in0,$key
+ b.ne .Loop128
+
+ vld1.32 {$rcon},[$ptr]
+
+ vtbl.8 $key,{$in0},$mask
+ vext.8 $tmp,$zero,$in0,#12
+ vst1.32 {$in0},[$out],#16
+ aese $key,$zero
+
+ veor $in0,$in0,$tmp
+ vext.8 $tmp,$zero,$tmp,#12
+ veor $in0,$in0,$tmp
+ vext.8 $tmp,$zero,$tmp,#12
+ veor $key,$key,$rcon
+ veor $in0,$in0,$tmp
+ vshl.u8 $rcon,$rcon,#1
+ veor $in0,$in0,$key
+
+ vtbl.8 $key,{$in0},$mask
+ vext.8 $tmp,$zero,$in0,#12
+ vst1.32 {$in0},[$out],#16
+ aese $key,$zero
+
+ veor $in0,$in0,$tmp
+ vext.8 $tmp,$zero,$tmp,#12
+ veor $in0,$in0,$tmp
+ vext.8 $tmp,$zero,$tmp,#12
+ veor $key,$key,$rcon
+ veor $in0,$in0,$tmp
+ veor $in0,$in0,$key
+ vst1.32 {$in0},[$out]
+ add $out,$out,#0x50
+
+ mov $rounds,#10
+ b .Ldone
+
+.align 4
+.L192:
+ vld1.8 {$in1},[$inp],#8
+ vmov.i8 $key,#8 // borrow $key
+ vst1.32 {$in0},[$out],#16
+ vsub.i8 $mask,$mask,$key // adjust the mask
+
+.Loop192:
+ vtbl.8 $key,{$in1},$mask
+ vext.8 $tmp,$zero,$in0,#12
+ vst1.32 {$in1},[$out],#8
+ aese $key,$zero
+ subs $bits,$bits,#1
+
+ veor $in0,$in0,$tmp
+ vext.8 $tmp,$zero,$tmp,#12
+ veor $in0,$in0,$tmp
+ vext.8 $tmp,$zero,$tmp,#12
+ veor $in0,$in0,$tmp
+
+ vdup.32 $tmp,${in0}[3]
+ veor $tmp,$tmp,$in1
+ veor $key,$key,$rcon
+ vext.8 $in1,$zero,$in1,#12
+ vshl.u8 $rcon,$rcon,#1
+ veor $in1,$in1,$tmp
+ veor $in0,$in0,$key
+ veor $in1,$in1,$key
+ vst1.32 {$in0},[$out],#16
+ b.ne .Loop192
+
+ mov $rounds,#12
+ add $out,$out,#0x20
+ b .Ldone
+
+.align 4
+.L256:
+ vld1.8 {$in1},[$inp]
+ mov $bits,#7
+ mov $rounds,#14
+ vst1.32 {$in0},[$out],#16
+
+.Loop256:
+ vtbl.8 $key,{$in1},$mask
+ vext.8 $tmp,$zero,$in0,#12
+ vst1.32 {$in1},[$out],#16
+ aese $key,$zero
+ subs $bits,$bits,#1
+
+ veor $in0,$in0,$tmp
+ vext.8 $tmp,$zero,$tmp,#12
+ veor $in0,$in0,$tmp
+ vext.8 $tmp,$zero,$tmp,#12
+ veor $key,$key,$rcon
+ veor $in0,$in0,$tmp
+ vshl.u8 $rcon,$rcon,#1
+ veor $in0,$in0,$key
+ vst1.32 {$in0},[$out],#16
+ b.eq .Ldone
+
+ vdup.32 $key,${in0}[3] // just splat
+ vext.8 $tmp,$zero,$in1,#12
+ aese $key,$zero
+
+ veor $in1,$in1,$tmp
+ vext.8 $tmp,$zero,$tmp,#12
+ veor $in1,$in1,$tmp
+ vext.8 $tmp,$zero,$tmp,#12
+ veor $in1,$in1,$tmp
+
+ veor $in1,$in1,$key
+ b .Loop256
+
+.Ldone:
+ str $rounds,[$out]
+ mov $ptr,#0
+
+.Lenc_key_abort:
+ mov x0,$ptr // return value
+ `"ldr x29,[sp],#16" if ($flavour =~ /64/)`
+ ret
+.size ${prefix}_set_encrypt_key,.-${prefix}_set_encrypt_key
+
+.globl ${prefix}_set_decrypt_key
+.type ${prefix}_set_decrypt_key,%function
+.align 5
+${prefix}_set_decrypt_key:
+___
+$code.=<<___ if ($flavour =~ /64/);
+ stp x29,x30,[sp,#-16]!
+ add x29,sp,#0
+___
+$code.=<<___ if ($flavour !~ /64/);
+ stmdb sp!,{r4,lr}
+___
+$code.=<<___;
+ bl .Lenc_key
+
+ cmp x0,#0
+ b.ne .Ldec_key_abort
+
+ sub $out,$out,#240 // restore original $out
+ mov x4,#-16
+ add $inp,$out,x12,lsl#4 // end of key schedule
+
+ vld1.32 {v0.16b},[$out]
+ vld1.32 {v1.16b},[$inp]
+ vst1.32 {v0.16b},[$inp],x4
+ vst1.32 {v1.16b},[$out],#16
+
+.Loop_imc:
+ vld1.32 {v0.16b},[$out]
+ vld1.32 {v1.16b},[$inp]
+ aesimc v0.16b,v0.16b
+ aesimc v1.16b,v1.16b
+ vst1.32 {v0.16b},[$inp],x4
+ vst1.32 {v1.16b},[$out],#16
+ cmp $inp,$out
+ b.hi .Loop_imc
+
+ vld1.32 {v0.16b},[$out]
+ aesimc v0.16b,v0.16b
+ vst1.32 {v0.16b},[$inp]
+
+ eor x0,x0,x0 // return value
+.Ldec_key_abort:
+___
+$code.=<<___ if ($flavour !~ /64/);
+ ldmia sp!,{r4,pc}
+___
+$code.=<<___ if ($flavour =~ /64/);
+ ldp x29,x30,[sp],#16
+ ret
+___
+$code.=<<___;
+.size ${prefix}_set_decrypt_key,.-${prefix}_set_decrypt_key
+___
+}}}
+{{{
+sub gen_block () {
+my $dir = shift;
+my ($e,$mc) = $dir eq "en" ? ("e","mc") : ("d","imc");
+my ($inp,$out,$key)=map("x$_",(0..2));
+my $rounds="w3";
+my ($rndkey0,$rndkey1,$inout)=map("q$_",(0..3));
+
+$code.=<<___;
+.globl ${prefix}_${dir}crypt
+.type ${prefix}_${dir}crypt,%function
+.align 5
+${prefix}_${dir}crypt:
+ ldr $rounds,[$key,#240]
+ vld1.32 {$rndkey0},[$key],#16
+ vld1.8 {$inout},[$inp]
+ sub $rounds,$rounds,#2
+ vld1.32 {$rndkey1},[$key],#16
+
+.Loop_${dir}c:
+ aes$e $inout,$rndkey0
+ vld1.32 {$rndkey0},[$key],#16
+ aes$mc $inout,$inout
+ subs $rounds,$rounds,#2
+ aes$e $inout,$rndkey1
+ vld1.32 {$rndkey1},[$key],#16
+ aes$mc $inout,$inout
+ b.gt .Loop_${dir}c
+
+ aes$e $inout,$rndkey0
+ vld1.32 {$rndkey0},[$key]
+ aes$mc $inout,$inout
+ aes$e $inout,$rndkey1
+ veor $inout,$inout,$rndkey0
+
+ vst1.8 {$inout},[$out]
+ ret
+.size ${prefix}_${dir}crypt,.-${prefix}_${dir}crypt
+___
+}
+&gen_block("en");
+&gen_block("de");
+}}}
+{{{
+my ($inp,$out,$len,$key,$ivp)=map("x$_",(0..4)); my $enc="w5";
+my ($rounds,$cnt,$key_,$step,$step1)=($enc,"w6","x7","x8","x12");
+my ($dat0,$dat1,$in0,$in1,$tmp0,$tmp1,$ivec,$rndlast)=map("q$_",(0..7));
+
+my ($dat,$tmp,$rndzero_n_last)=($dat0,$tmp0,$tmp1);
+
+### q8-q15 preloaded key schedule
+
+$code.=<<___;
+.globl ${prefix}_cbc_encrypt
+.type ${prefix}_cbc_encrypt,%function
+.align 5
+${prefix}_cbc_encrypt:
+___
+$code.=<<___ if ($flavour =~ /64/);
+ stp x29,x30,[sp,#-16]!
+ add x29,sp,#0
+___
+$code.=<<___ if ($flavour !~ /64/);
+ mov ip,sp
+ stmdb sp!,{r4-r8,lr}
+ vstmdb sp!,{d8-d15} @ ABI specification says so
+ ldmia ip,{r4-r5} @ load remaining args
+___
+$code.=<<___;
+ subs $len,$len,#16
+ mov $step,#16
+ b.lo .Lcbc_abort
+ cclr $step,eq
+
+ cmp $enc,#0 // en- or decrypting?
+ ldr $rounds,[$key,#240]
+ and $len,$len,#-16
+ vld1.8 {$ivec},[$ivp]
+ vld1.8 {$dat},[$inp],$step
+
+ vld1.32 {q8-q9},[$key] // load key schedule...
+ sub $rounds,$rounds,#6
+ add $key_,$key,x5,lsl#4 // pointer to last 7 round keys
+ sub $rounds,$rounds,#2
+ vld1.32 {q10-q11},[$key_],#32
+ vld1.32 {q12-q13},[$key_],#32
+ vld1.32 {q14-q15},[$key_],#32
+ vld1.32 {$rndlast},[$key_]
+
+ add $key_,$key,#32
+ mov $cnt,$rounds
+ b.eq .Lcbc_dec
+
+ cmp $rounds,#2
+ veor $dat,$dat,$ivec
+ veor $rndzero_n_last,q8,$rndlast
+ b.eq .Lcbc_enc128
+
+.Loop_cbc_enc:
+ aese $dat,q8
+ vld1.32 {q8},[$key_],#16
+ aesmc $dat,$dat
+ subs $cnt,$cnt,#2
+ aese $dat,q9
+ vld1.32 {q9},[$key_],#16
+ aesmc $dat,$dat
+ b.gt .Loop_cbc_enc
+
+ aese $dat,q8
+ aesmc $dat,$dat
+ subs $len,$len,#16
+ aese $dat,q9
+ aesmc $dat,$dat
+ cclr $step,eq
+ aese $dat,q10
+ aesmc $dat,$dat
+ add $key_,$key,#16
+ aese $dat,q11
+ aesmc $dat,$dat
+ vld1.8 {q8},[$inp],$step
+ aese $dat,q12
+ aesmc $dat,$dat
+ veor q8,q8,$rndzero_n_last
+ aese $dat,q13
+ aesmc $dat,$dat
+ vld1.32 {q9},[$key_],#16 // re-pre-load rndkey[1]
+ aese $dat,q14
+ aesmc $dat,$dat
+ aese $dat,q15
+
+ mov $cnt,$rounds
+ veor $ivec,$dat,$rndlast
+ vst1.8 {$ivec},[$out],#16
+ b.hs .Loop_cbc_enc
+
+ b .Lcbc_done
+
+.align 5
+.Lcbc_enc128:
+ vld1.32 {$in0-$in1},[$key_]
+ aese $dat,q8
+ aesmc $dat,$dat
+ b .Lenter_cbc_enc128
+.Loop_cbc_enc128:
+ aese $dat,q8
+ aesmc $dat,$dat
+ vst1.8 {$ivec},[$out],#16
+.Lenter_cbc_enc128:
+ aese $dat,q9
+ aesmc $dat,$dat
+ subs $len,$len,#16
+ aese $dat,$in0
+ aesmc $dat,$dat
+ cclr $step,eq
+ aese $dat,$in1
+ aesmc $dat,$dat
+ aese $dat,q10
+ aesmc $dat,$dat
+ aese $dat,q11
+ aesmc $dat,$dat
+ vld1.8 {q8},[$inp],$step
+ aese $dat,q12
+ aesmc $dat,$dat
+ aese $dat,q13
+ aesmc $dat,$dat
+ aese $dat,q14
+ aesmc $dat,$dat
+ veor q8,q8,$rndzero_n_last
+ aese $dat,q15
+ veor $ivec,$dat,$rndlast
+ b.hs .Loop_cbc_enc128
+
+ vst1.8 {$ivec},[$out],#16
+ b .Lcbc_done
+___
+{
+my ($dat2,$in2,$tmp2)=map("q$_",(10,11,9));
+$code.=<<___;
+.align 5
+.Lcbc_dec:
+ vld1.8 {$dat2},[$inp],#16
+ subs $len,$len,#32 // bias
+ add $cnt,$rounds,#2
+ vorr $in1,$dat,$dat
+ vorr $dat1,$dat,$dat
+ vorr $in2,$dat2,$dat2
+ b.lo .Lcbc_dec_tail
+
+ vorr $dat1,$dat2,$dat2
+ vld1.8 {$dat2},[$inp],#16
+ vorr $in0,$dat,$dat
+ vorr $in1,$dat1,$dat1
+ vorr $in2,$dat2,$dat2
+
+.Loop3x_cbc_dec:
+ aesd $dat0,q8
+ aesd $dat1,q8
+ aesd $dat2,q8
+ vld1.32 {q8},[$key_],#16
+ aesimc $dat0,$dat0
+ aesimc $dat1,$dat1
+ aesimc $dat2,$dat2
+ subs $cnt,$cnt,#2
+ aesd $dat0,q9
+ aesd $dat1,q9
+ aesd $dat2,q9
+ vld1.32 {q9},[$key_],#16
+ aesimc $dat0,$dat0
+ aesimc $dat1,$dat1
+ aesimc $dat2,$dat2
+ b.gt .Loop3x_cbc_dec
+
+ aesd $dat0,q8
+ aesd $dat1,q8
+ aesd $dat2,q8
+ veor $tmp0,$ivec,$rndlast
+ aesimc $dat0,$dat0
+ aesimc $dat1,$dat1
+ aesimc $dat2,$dat2
+ veor $tmp1,$in0,$rndlast
+ aesd $dat0,q9
+ aesd $dat1,q9
+ aesd $dat2,q9
+ veor $tmp2,$in1,$rndlast
+ subs $len,$len,#0x30
+ aesimc $dat0,$dat0
+ aesimc $dat1,$dat1
+ aesimc $dat2,$dat2
+ vorr $ivec,$in2,$in2
+ mov.lo x6,$len // x6, $cnt, is zero at this point
+ aesd $dat0,q12
+ aesd $dat1,q12
+ aesd $dat2,q12
+ add $inp,$inp,x6 // $inp is adjusted in such way that
+ // at exit from the loop $dat1-$dat2
+ // are loaded with last "words"
+ aesimc $dat0,$dat0
+ aesimc $dat1,$dat1
+ aesimc $dat2,$dat2
+ mov $key_,$key
+ aesd $dat0,q13
+ aesd $dat1,q13
+ aesd $dat2,q13
+ vld1.8 {$in0},[$inp],#16
+ aesimc $dat0,$dat0
+ aesimc $dat1,$dat1
+ aesimc $dat2,$dat2
+ vld1.8 {$in1},[$inp],#16
+ aesd $dat0,q14
+ aesd $dat1,q14
+ aesd $dat2,q14
+ vld1.8 {$in2},[$inp],#16
+ aesimc $dat0,$dat0
+ aesimc $dat1,$dat1
+ aesimc $dat2,$dat2
+ vld1.32 {q8},[$key_],#16 // re-pre-load rndkey[0]
+ aesd $dat0,q15
+ aesd $dat1,q15
+ aesd $dat2,q15
+
+ add $cnt,$rounds,#2
+ veor $tmp0,$tmp0,$dat0
+ veor $tmp1,$tmp1,$dat1
+ veor $dat2,$dat2,$tmp2
+ vld1.32 {q9},[$key_],#16 // re-pre-load rndkey[1]
+ vorr $dat0,$in0,$in0
+ vst1.8 {$tmp0},[$out],#16
+ vorr $dat1,$in1,$in1
+ vst1.8 {$tmp1},[$out],#16
+ vst1.8 {$dat2},[$out],#16
+ vorr $dat2,$in2,$in2
+ b.hs .Loop3x_cbc_dec
+
+ cmn $len,#0x30
+ b.eq .Lcbc_done
+ nop
+
+.Lcbc_dec_tail:
+ aesd $dat1,q8
+ aesd $dat2,q8
+ vld1.32 {q8},[$key_],#16
+ aesimc $dat1,$dat1
+ aesimc $dat2,$dat2
+ subs $cnt,$cnt,#2
+ aesd $dat1,q9
+ aesd $dat2,q9
+ vld1.32 {q9},[$key_],#16
+ aesimc $dat1,$dat1
+ aesimc $dat2,$dat2
+ b.gt .Lcbc_dec_tail
+
+ aesd $dat1,q8
+ aesd $dat2,q8
+ aesimc $dat1,$dat1
+ aesimc $dat2,$dat2
+ aesd $dat1,q9
+ aesd $dat2,q9
+ aesimc $dat1,$dat1
+ aesimc $dat2,$dat2
+ aesd $dat1,q12
+ aesd $dat2,q12
+ aesimc $dat1,$dat1
+ aesimc $dat2,$dat2
+ cmn $len,#0x20
+ aesd $dat1,q13
+ aesd $dat2,q13
+ aesimc $dat1,$dat1
+ aesimc $dat2,$dat2
+ veor $tmp1,$ivec,$rndlast
+ aesd $dat1,q14
+ aesd $dat2,q14
+ aesimc $dat1,$dat1
+ aesimc $dat2,$dat2
+ veor $tmp2,$in1,$rndlast
+ aesd $dat1,q15
+ aesd $dat2,q15
+ b.eq .Lcbc_dec_one
+ veor $tmp1,$tmp1,$dat1
+ veor $tmp2,$tmp2,$dat2
+ vorr $ivec,$in2,$in2
+ vst1.8 {$tmp1},[$out],#16
+ vst1.8 {$tmp2},[$out],#16
+ b .Lcbc_done
+
+.Lcbc_dec_one:
+ veor $tmp1,$tmp1,$dat2
+ vorr $ivec,$in2,$in2
+ vst1.8 {$tmp1},[$out],#16
+
+.Lcbc_done:
+ vst1.8 {$ivec},[$ivp]
+.Lcbc_abort:
+___
+}
+$code.=<<___ if ($flavour !~ /64/);
+ vldmia sp!,{d8-d15}
+ ldmia sp!,{r4-r8,pc}
+___
+$code.=<<___ if ($flavour =~ /64/);
+ ldr x29,[sp],#16
+ ret
+___
+$code.=<<___;
+.size ${prefix}_cbc_encrypt,.-${prefix}_cbc_encrypt
+___
+}}}
+{{{
+my ($inp,$out,$len,$key,$ivp)=map("x$_",(0..4));
+my ($rounds,$cnt,$key_)=("w5","w6","x7");
+my ($ctr,$tctr0,$tctr1,$tctr2)=map("w$_",(8..10,12));
+my $step="x12"; # aliases with $tctr2
+
+my ($dat0,$dat1,$in0,$in1,$tmp0,$tmp1,$ivec,$rndlast)=map("q$_",(0..7));
+my ($dat2,$in2,$tmp2)=map("q$_",(10,11,9));
+
+my ($dat,$tmp)=($dat0,$tmp0);
+
+### q8-q15 preloaded key schedule
+
+$code.=<<___;
+.globl ${prefix}_ctr32_encrypt_blocks
+.type ${prefix}_ctr32_encrypt_blocks,%function
+.align 5
+${prefix}_ctr32_encrypt_blocks:
+___
+$code.=<<___ if ($flavour =~ /64/);
+ stp x29,x30,[sp,#-16]!
+ add x29,sp,#0
+___
+$code.=<<___ if ($flavour !~ /64/);
+ mov ip,sp
+ stmdb sp!,{r4-r10,lr}
+ vstmdb sp!,{d8-d15} @ ABI specification says so
+ ldr r4, [ip] @ load remaining arg
+___
+$code.=<<___;
+ ldr $rounds,[$key,#240]
+
+ ldr $ctr, [$ivp, #12]
+ vld1.32 {$dat0},[$ivp]
+
+ vld1.32 {q8-q9},[$key] // load key schedule...
+ sub $rounds,$rounds,#4
+ mov $step,#16
+ cmp $len,#2
+ add $key_,$key,x5,lsl#4 // pointer to last 5 round keys
+ sub $rounds,$rounds,#2
+ vld1.32 {q12-q13},[$key_],#32
+ vld1.32 {q14-q15},[$key_],#32
+ vld1.32 {$rndlast},[$key_]
+ add $key_,$key,#32
+ mov $cnt,$rounds
+ cclr $step,lo
+#ifndef __ARMEB__
+ rev $ctr, $ctr
+#endif
+ vorr $dat1,$dat0,$dat0
+ add $tctr1, $ctr, #1
+ vorr $dat2,$dat0,$dat0
+ add $ctr, $ctr, #2
+ vorr $ivec,$dat0,$dat0
+ rev $tctr1, $tctr1
+ vmov.32 ${dat1}[3],$tctr1
+ b.ls .Lctr32_tail
+ rev $tctr2, $ctr
+ sub $len,$len,#3 // bias
+ vmov.32 ${dat2}[3],$tctr2
+ b .Loop3x_ctr32
+
+.align 4
+.Loop3x_ctr32:
+ aese $dat0,q8
+ aese $dat1,q8
+ aese $dat2,q8
+ vld1.32 {q8},[$key_],#16
+ aesmc $dat0,$dat0
+ aesmc $dat1,$dat1
+ aesmc $dat2,$dat2
+ subs $cnt,$cnt,#2
+ aese $dat0,q9
+ aese $dat1,q9
+ aese $dat2,q9
+ vld1.32 {q9},[$key_],#16
+ aesmc $dat0,$dat0
+ aesmc $dat1,$dat1
+ aesmc $dat2,$dat2
+ b.gt .Loop3x_ctr32
+
+ aese $dat0,q8
+ aese $dat1,q8
+ aese $dat2,q8
+ mov $key_,$key
+ aesmc $tmp0,$dat0
+ vld1.8 {$in0},[$inp],#16
+ aesmc $tmp1,$dat1
+ aesmc $dat2,$dat2
+ vorr $dat0,$ivec,$ivec
+ aese $tmp0,q9
+ vld1.8 {$in1},[$inp],#16
+ aese $tmp1,q9
+ aese $dat2,q9
+ vorr $dat1,$ivec,$ivec
+ aesmc $tmp0,$tmp0
+ vld1.8 {$in2},[$inp],#16
+ aesmc $tmp1,$tmp1
+ aesmc $tmp2,$dat2
+ vorr $dat2,$ivec,$ivec
+ add $tctr0,$ctr,#1
+ aese $tmp0,q12
+ aese $tmp1,q12
+ aese $tmp2,q12
+ veor $in0,$in0,$rndlast
+ add $tctr1,$ctr,#2
+ aesmc $tmp0,$tmp0
+ aesmc $tmp1,$tmp1
+ aesmc $tmp2,$tmp2
+ veor $in1,$in1,$rndlast
+ add $ctr,$ctr,#3
+ aese $tmp0,q13
+ aese $tmp1,q13
+ aese $tmp2,q13
+ veor $in2,$in2,$rndlast
+ rev $tctr0,$tctr0
+ aesmc $tmp0,$tmp0
+ vld1.32 {q8},[$key_],#16 // re-pre-load rndkey[0]
+ aesmc $tmp1,$tmp1
+ aesmc $tmp2,$tmp2
+ vmov.32 ${dat0}[3], $tctr0
+ rev $tctr1,$tctr1
+ aese $tmp0,q14
+ aese $tmp1,q14
+ aese $tmp2,q14
+ vmov.32 ${dat1}[3], $tctr1
+ rev $tctr2,$ctr
+ aesmc $tmp0,$tmp0
+ aesmc $tmp1,$tmp1
+ aesmc $tmp2,$tmp2
+ vmov.32 ${dat2}[3], $tctr2
+ subs $len,$len,#3
+ aese $tmp0,q15
+ aese $tmp1,q15
+ aese $tmp2,q15
+
+ mov $cnt,$rounds
+ veor $in0,$in0,$tmp0
+ veor $in1,$in1,$tmp1
+ veor $in2,$in2,$tmp2
+ vld1.32 {q9},[$key_],#16 // re-pre-load rndkey[1]
+ vst1.8 {$in0},[$out],#16
+ vst1.8 {$in1},[$out],#16
+ vst1.8 {$in2},[$out],#16
+ b.hs .Loop3x_ctr32
+
+ adds $len,$len,#3
+ b.eq .Lctr32_done
+ cmp $len,#1
+ mov $step,#16
+ cclr $step,eq
+
+.Lctr32_tail:
+ aese $dat0,q8
+ aese $dat1,q8
+ vld1.32 {q8},[$key_],#16
+ aesmc $dat0,$dat0
+ aesmc $dat1,$dat1
+ subs $cnt,$cnt,#2
+ aese $dat0,q9
+ aese $dat1,q9
+ vld1.32 {q9},[$key_],#16
+ aesmc $dat0,$dat0
+ aesmc $dat1,$dat1
+ b.gt .Lctr32_tail
+
+ aese $dat0,q8
+ aese $dat1,q8
+ aesmc $dat0,$dat0
+ aesmc $dat1,$dat1
+ aese $dat0,q9
+ aese $dat1,q9
+ aesmc $dat0,$dat0
+ aesmc $dat1,$dat1
+ vld1.8 {$in0},[$inp],$step
+ aese $dat0,q12
+ aese $dat1,q12
+ vld1.8 {$in1},[$inp]
+ aesmc $dat0,$dat0
+ aesmc $dat1,$dat1
+ aese $dat0,q13
+ aese $dat1,q13
+ aesmc $dat0,$dat0
+ aesmc $dat1,$dat1
+ aese $dat0,q14
+ aese $dat1,q14
+ veor $in0,$in0,$rndlast
+ aesmc $dat0,$dat0
+ aesmc $dat1,$dat1
+ veor $in1,$in1,$rndlast
+ aese $dat0,q15
+ aese $dat1,q15
+
+ cmp $len,#1
+ veor $in0,$in0,$dat0
+ veor $in1,$in1,$dat1
+ vst1.8 {$in0},[$out],#16
+ b.eq .Lctr32_done
+ vst1.8 {$in1},[$out]
+
+.Lctr32_done:
+___
+$code.=<<___ if ($flavour !~ /64/);
+ vldmia sp!,{d8-d15}
+ ldmia sp!,{r4-r10,pc}
+___
+$code.=<<___ if ($flavour =~ /64/);
+ ldr x29,[sp],#16
+ ret
+___
+$code.=<<___;
+.size ${prefix}_ctr32_encrypt_blocks,.-${prefix}_ctr32_encrypt_blocks
+___
+}}}
+$code.=<<___;
+#endif
+___
+########################################
+if ($flavour =~ /64/) { ######## 64-bit code
+ my %opcode = (
+ "aesd" => 0x4e285800, "aese" => 0x4e284800,
+ "aesimc"=> 0x4e287800, "aesmc" => 0x4e286800 );
+
+ local *unaes = sub {
+ my ($mnemonic,$arg)=@_;
+
+ $arg =~ m/[qv]([0-9]+)[^,]*,\s*[qv]([0-9]+)/o &&
+ sprintf ".inst\t0x%08x\t//%s %s",
+ $opcode{$mnemonic}|$1|($2<<5),
+ $mnemonic,$arg;
+ };
+
+ foreach(split("\n",$code)) {
+ s/\`([^\`]*)\`/eval($1)/geo;
+
+ s/\bq([0-9]+)\b/"v".($1<8?$1:$1+8).".16b"/geo; # old->new registers
+ s/@\s/\/\//o; # old->new style commentary
+
+ #s/[v]?(aes\w+)\s+([qv].*)/unaes($1,$2)/geo or
+ s/cclr\s+([wx])([^,]+),\s*([a-z]+)/csel $1$2,$1zr,$1$2,$3/o or
+ s/mov\.([a-z]+)\s+([wx][0-9]+),\s*([wx][0-9]+)/csel $2,$3,$2,$1/o or
+ s/vmov\.i8/movi/o or # fix up legacy mnemonics
+ s/vext\.8/ext/o or
+ s/vrev32\.8/rev32/o or
+ s/vtst\.8/cmtst/o or
+ s/vshr/ushr/o or
+ s/^(\s+)v/$1/o or # strip off v prefix
+ s/\bbx\s+lr\b/ret/o;
+
+ # fix up remaining legacy suffixes
+ s/\.[ui]?8//o;
+ m/\],#8/o and s/\.16b/\.8b/go;
+ s/\.[ui]?32//o and s/\.16b/\.4s/go;
+ s/\.[ui]?64//o and s/\.16b/\.2d/go;
+ s/\.[42]([sd])\[([0-3])\]/\.$1\[$2\]/o;
+
+ print $_,"\n";
+ }
+} else { ######## 32-bit code
+ my %opcode = (
+ "aesd" => 0xf3b00340, "aese" => 0xf3b00300,
+ "aesimc"=> 0xf3b003c0, "aesmc" => 0xf3b00380 );
+
+ local *unaes = sub {
+ my ($mnemonic,$arg)=@_;
+
+ if ($arg =~ m/[qv]([0-9]+)[^,]*,\s*[qv]([0-9]+)/o) {
+ my $word = $opcode{$mnemonic}|(($1&7)<<13)|(($1&8)<<19)
+ |(($2&7)<<1) |(($2&8)<<2);
+ # ARMv7 instructions are always encoded little-endian, hence
+ # the byte-by-byte emission. The correct solution is to use the
+ # .inst directive, but older assemblers don't implement it:-(
+ sprintf ".byte\t0x%02x,0x%02x,0x%02x,0x%02x\t@ %s %s",
+ $word&0xff,($word>>8)&0xff,
+ ($word>>16)&0xff,($word>>24)&0xff,
+ $mnemonic,$arg;
+ }
+ };
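
[Not part of the commit — an illustrative Python sketch of the `unaes` bit-packing above, for cross-checking. It reproduces the hand-encoded `aese.8 q0,q0` probe bytes `0x00,0x03,0xb0,0xf3` that appear in armv4cpuid_ios.S further down in this diff:]

```python
# Illustrative sketch: mirror the Perl unaes() packing for 32-bit
# NEON AES instructions. Qd goes into bits 13..15 (low 3) and 22
# (high bit); Qm into bits 1..3 (low 3) and 5 (high bit).
OPCODE = {
    "aesd":   0xf3b00340, "aese":  0xf3b00300,
    "aesimc": 0xf3b003c0, "aesmc": 0xf3b00380,
}

def unaes(mnemonic, d, m):
    word = (OPCODE[mnemonic] | ((d & 7) << 13) | ((d & 8) << 19)
            | ((m & 7) << 1) | ((m & 8) << 2))
    # ARMv7 instructions are encoded little-endian, hence the
    # least-significant byte is emitted first by the .byte directive.
    return [word & 0xff, (word >> 8) & 0xff,
            (word >> 16) & 0xff, (word >> 24) & 0xff]

# aese q0,q0 -> the hand-coded probe bytes 0x00,0x03,0xb0,0xf3
print(unaes("aese", 0, 0))
```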
+
+ sub unvtbl {
+ my $arg=shift;
+
+ $arg =~ m/q([0-9]+),\s*\{q([0-9]+)\},\s*q([0-9]+)/o &&
+ sprintf "vtbl.8 d%d,{q%d},d%d\n\t".
+ "vtbl.8 d%d,{q%d},d%d", 2*$1,$2,2*$3, 2*$1+1,$2,2*$3+1;
+ }
+
+ sub unvdup32 {
+ my $arg=shift;
+
+ $arg =~ m/q([0-9]+),\s*q([0-9]+)\[([0-3])\]/o &&
+ sprintf "vdup.32 q%d,d%d[%d]",$1,2*$2+($3>>1),$3&1;
+ }
+
+ sub unvmov32 {
+ my $arg=shift;
+
+ $arg =~ m/q([0-9]+)\[([0-3])\],(.*)/o &&
+ sprintf "vmov.32 d%d[%d],%s",2*$1+($2>>1),$2&1,$3;
+ }
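
[Not part of the commit — the `unvtbl`, `unvdup32` and `unvmov32` helpers above all rely on the same NEON register aliasing: 128-bit `qN` overlays 64-bit `d(2N)` (low half) and `d(2N+1)` (high half), so a 32-bit lane `qN[i]` maps to `d(2N + (i>>1))`, lane `i&1`. An illustrative sketch:]

```python
# Illustrative sketch: the q-to-d lane aliasing used by the
# unvdup32/unvmov32 translators. qN overlays d(2N) (low 64 bits)
# and d(2N+1) (high 64 bits); 32-bit lane i of qN therefore lives
# in d-register 2N + (i >> 1), at d-lane i & 1.
def q_lane_to_d(q, lane):
    return 2 * q + (lane >> 1), lane & 1

# e.g. vmov.32 q3[2],r0 is rewritten as vmov.32 d7[0],r0
print(q_lane_to_d(3, 2))
```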
+
+ foreach(split("\n",$code)) {
+ s/\`([^\`]*)\`/eval($1)/geo;
+
+ s/\b[wx]([0-9]+)\b/r$1/go; # new->old registers
+ s/\bv([0-9])\.[12468]+[bsd]\b/q$1/go; # new->old registers
+ s/\/\/\s?/@ /o; # new->old style commentary
+
+ # fix up remaining new-style suffixes
+ s/\{q([0-9]+)\},\s*\[(.+)\],#8/sprintf "{d%d},[$2]!",2*$1/eo or
+ s/\],#[0-9]+/]!/o;
+
+ s/[v]?(aes\w+)\s+([qv].*)/unaes($1,$2)/geo or
+ s/cclr\s+([^,]+),\s*([a-z]+)/mov$2 $1,#0/o or
+ s/vtbl\.8\s+(.*)/unvtbl($1)/geo or
+ s/vdup\.32\s+(.*)/unvdup32($1)/geo or
+ s/vmov\.32\s+(.*)/unvmov32($1)/geo or
+ s/^(\s+)b\./$1b/o or
+ s/^(\s+)mov\./$1mov/o or
+ s/^(\s+)ret/$1bx\tlr/o;
+
+ print $_,"\n";
+ }
+}
+
+close STDOUT;
diff --git a/crypto/arm64cpuid.pl b/crypto/arm64cpuid.pl
new file mode 100644
index 0000000..bfec664
--- /dev/null
+++ b/crypto/arm64cpuid.pl
@@ -0,0 +1,68 @@
+#!/usr/bin/env perl
+
+$flavour = shift;
+$output = shift;
+
+$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
+( $xlate="${dir}arm-xlate.pl" and -f $xlate ) or
+( $xlate="${dir}perlasm/arm-xlate.pl" and -f $xlate) or
+die "can't locate arm-xlate.pl";
+
+open OUT,"| \"$^X\" $xlate $flavour $output";
+*STDOUT=*OUT;
+
+$code.=<<___;
+#include "arm_arch.h"
+
+.text
+.arch armv8-a+crypto
+
+.align 5
+.globl _armv7_neon_probe
+.type _armv7_neon_probe,%function
+_armv7_neon_probe:
+ orr v15.16b, v15.16b, v15.16b
+ ret
+.size _armv7_neon_probe,.-_armv7_neon_probe
+
+.globl _armv7_tick
+.type _armv7_tick,%function
+_armv7_tick:
+#ifdef __APPLE__
+ mrs x0, CNTPCT_EL0
+#else
+ mrs x0, CNTVCT_EL0
+#endif
+ ret
+.size _armv7_tick,.-_armv7_tick
+
+.globl _armv8_aes_probe
+.type _armv8_aes_probe,%function
+_armv8_aes_probe:
+ aese v0.16b, v0.16b
+ ret
+.size _armv8_aes_probe,.-_armv8_aes_probe
+
+.globl _armv8_sha1_probe
+.type _armv8_sha1_probe,%function
+_armv8_sha1_probe:
+ sha1h s0, s0
+ ret
+.size _armv8_sha1_probe,.-_armv8_sha1_probe
+
+.globl _armv8_sha256_probe
+.type _armv8_sha256_probe,%function
+_armv8_sha256_probe:
+ sha256su0 v0.4s, v0.4s
+ ret
+.size _armv8_sha256_probe,.-_armv8_sha256_probe
+.globl _armv8_pmull_probe
+.type _armv8_pmull_probe,%function
+_armv8_pmull_probe:
+ pmull v0.1q, v0.1d, v0.1d
+ ret
+.size _armv8_pmull_probe,.-_armv8_pmull_probe
+___
+
+print $code;
+close STDOUT;
diff --git a/crypto/arm_arch.h b/crypto/arm_arch.h
index a50c366..7a37775 100644
--- a/crypto/arm_arch.h
+++ b/crypto/arm_arch.h
@@ -10,13 +10,22 @@
# define __ARMEL__
# endif
# elif defined(__GNUC__)
+# if defined(__aarch64__)
+# define __ARM_ARCH__ 8
+# if __BYTE_ORDER__==__ORDER_BIG_ENDIAN__
+# define __ARMEB__
+# else
+# define __ARMEL__
+# endif
/*
* Why doesn't gcc define __ARM_ARCH__? Instead it defines a
* bunch of the macros below. See the all_architectures[] table in
* gcc/config/arm/arm.c. On a side note it defines
* __ARMEL__/__ARMEB__ for little-/big-endian.
*/
-# if defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) || \
+# elif defined(__ARM_ARCH_8A__)
+# define __ARM_ARCH__ 8
+# elif defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) || \
defined(__ARM_ARCH_7R__)|| defined(__ARM_ARCH_7M__) || \
defined(__ARM_ARCH_7EM__)
# define __ARM_ARCH__ 7
@@ -42,10 +51,14 @@
#if !__ASSEMBLER__
extern unsigned int OPENSSL_armcap_P;
+#endif
#define ARMV7_NEON (1<<0)
#define ARMV7_TICK (1<<1)
-#endif
+#define ARMV8_AES (1<<2)
+#define ARMV8_SHA1 (1<<3)
+#define ARMV8_SHA256 (1<<4)
+#define ARMV8_PMULL (1<<5)
#endif
#endif
diff --git a/crypto/armcap.c b/crypto/armcap.c
index 5258d2f..2579389 100644
--- a/crypto/armcap.c
+++ b/crypto/armcap.c
@@ -20,6 +20,10 @@ static void ill_handler (int sig) { siglongjmp(ill_jmp,sig); }
*/
void _armv7_neon_probe(void);
unsigned int _armv7_tick(void);
+void _armv8_aes_probe(void);
+void _armv8_sha1_probe(void);
+void _armv8_sha256_probe(void);
+void _armv8_pmull_probe(void);
unsigned int OPENSSL_rdtsc(void)
{
@@ -68,6 +72,28 @@ void OPENSSL_cpuid_setup(void)
{
_armv7_neon_probe();
OPENSSL_armcap_P |= ARMV7_NEON;
+#ifdef __aarch64__
+ if (sigsetjmp(ill_jmp,1) == 0)
+ {
+ _armv8_pmull_probe();
+ OPENSSL_armcap_P |= ARMV8_PMULL|ARMV8_AES;
+ }
+ else if (sigsetjmp(ill_jmp,1) == 0)
+ {
+ _armv8_aes_probe();
+ OPENSSL_armcap_P |= ARMV8_AES;
+ }
+ if (sigsetjmp(ill_jmp,1) == 0)
+ {
+ _armv8_sha1_probe();
+ OPENSSL_armcap_P |= ARMV8_SHA1;
+ }
+ if (sigsetjmp(ill_jmp,1) == 0)
+ {
+ _armv8_sha256_probe();
+ OPENSSL_armcap_P |= ARMV8_SHA256;
+ }
+#endif
}
if (sigsetjmp(ill_jmp,1) == 0)
{
diff --git a/crypto/armv4cpuid_ios.S b/crypto/armv4cpuid_ios.S
new file mode 100644
index 0000000..cce9a79
--- /dev/null
+++ b/crypto/armv4cpuid_ios.S
@@ -0,0 +1,210 @@
+#include "arm_arch.h"
+
+.text
+.code 32
+
+.align 5
+.globl _OPENSSL_atomic_add
+
+_OPENSSL_atomic_add:
+#if __ARM_ARCH__>=6
+Ladd: ldrex r2,[r0]
+ add r3,r2,r1
+ strex r2,r3,[r0]
+ cmp r2,#0
+ bne Ladd
+ mov r0,r3
+ bx lr
+#else
+ stmdb sp!,{r4,r5,r6,lr}
+ ldr r2,Lspinlock
+ adr r3,Lspinlock
+ mov r4,r0
+ mov r5,r1
+ add r6,r3,r2 @ &spinlock
+ b .+8
+Lspin: bl sched_yield
+ mov r0,#-1
+ swp r0,r0,[r6]
+ cmp r0,#0
+ bne Lspin
+
+ ldr r2,[r4]
+ add r2,r2,r5
+ str r2,[r4]
+ str r0,[r6] @ release spinlock
+ ldmia sp!,{r4,r5,r6,lr}
+ tst lr,#1
+ moveq pc,lr
+.word 0xe12fff1e @ bx lr
+#endif
+
+
+.globl _OPENSSL_cleanse
+
+_OPENSSL_cleanse:
+ eor ip,ip,ip
+ cmp r1,#7
+ subhs r1,r1,#4
+ bhs Lot
+ cmp r1,#0
+ beq Lcleanse_done
+Little:
+ strb ip,[r0],#1
+ subs r1,r1,#1
+ bhi Little
+ b Lcleanse_done
+
+Lot: tst r0,#3
+ beq Laligned
+ strb ip,[r0],#1
+ sub r1,r1,#1
+ b Lot
+Laligned:
+ str ip,[r0],#4
+ subs r1,r1,#4
+ bhs Laligned
+ adds r1,r1,#4
+ bne Little
+Lcleanse_done:
+#if __ARM_ARCH__>=5
+ bx lr
+#else
+ tst lr,#1
+ moveq pc,lr
+.word 0xe12fff1e @ bx lr
+#endif
+
+
+
+
+.align 5
+.globl __armv7_neon_probe
+
+__armv7_neon_probe:
+ vorr q0,q0,q0
+ bx lr
+
+
+.globl __armv7_tick
+
+__armv7_tick:
+#ifdef __APPLE__
+ mrrc p15,0,r0,r1,c14 @ CNTPCT
+#else
+ mrrc p15,1,r0,r1,c14 @ CNTVCT
+#endif
+ bx lr
+
+
+.globl __armv8_aes_probe
+
+__armv8_aes_probe:
+.byte 0x00,0x03,0xb0,0xf3 @ aese.8 q0,q0
+ bx lr
+
+
+.globl __armv8_sha1_probe
+
+__armv8_sha1_probe:
+.byte 0x40,0x0c,0x00,0xf2 @ sha1c.32 q0,q0,q0
+ bx lr
+
+
+.globl __armv8_sha256_probe
+
+__armv8_sha256_probe:
+.byte 0x40,0x0c,0x00,0xf3 @ sha256h.32 q0,q0,q0
+ bx lr
+
+.globl __armv8_pmull_probe
+
+__armv8_pmull_probe:
+.byte 0x00,0x0e,0xa0,0xf2 @ vmull.p64 q0,d0,d0
+ bx lr
+
+.globl _OPENSSL_wipe_cpu
+
+_OPENSSL_wipe_cpu:
+ ldr r0,LOPENSSL_armcap
+ adr r1,LOPENSSL_armcap
+ ldr r0,[r1,r0]
+#ifdef __APPLE__
+ ldr r0,[r0]
+#endif
+ eor r2,r2,r2
+ eor r3,r3,r3
+ eor ip,ip,ip
+ tst r0,#1
+ beq Lwipe_done
+ veor q0, q0, q0
+ veor q1, q1, q1
+ veor q2, q2, q2
+ veor q3, q3, q3
+ veor q8, q8, q8
+ veor q9, q9, q9
+ veor q10, q10, q10
+ veor q11, q11, q11
+ veor q12, q12, q12
+ veor q13, q13, q13
+ veor q14, q14, q14
+ veor q15, q15, q15
+Lwipe_done:
+ mov r0,sp
+#if __ARM_ARCH__>=5
+ bx lr
+#else
+ tst lr,#1
+ moveq pc,lr
+.word 0xe12fff1e @ bx lr
+#endif
+
+
+.globl _OPENSSL_instrument_bus
+
+_OPENSSL_instrument_bus:
+ eor r0,r0,r0
+#if __ARM_ARCH__>=5
+ bx lr
+#else
+ tst lr,#1
+ moveq pc,lr
+.word 0xe12fff1e @ bx lr
+#endif
+
+
+.globl _OPENSSL_instrument_bus2
+
+_OPENSSL_instrument_bus2:
+ eor r0,r0,r0
+#if __ARM_ARCH__>=5
+ bx lr
+#else
+ tst lr,#1
+ moveq pc,lr
+.word 0xe12fff1e @ bx lr
+#endif
+
+
+.align 5
+LOPENSSL_armcap:
+.word OPENSSL_armcap_P-.
+#if __ARM_ARCH__>=6
+.align 5
+#else
+Lspinlock:
+.word atomic_add_spinlock-Lspinlock
+.align 5
+
+.data
+.align 2
+atomic_add_spinlock:
+.word 0
+#endif
+
+.comm _OPENSSL_armcap_P,4
+.non_lazy_symbol_pointer
+OPENSSL_armcap_P:
+.indirect_symbol _OPENSSL_armcap_P
+.long 0
+.private_extern _OPENSSL_armcap_P
diff --git a/crypto/bn/asm/armv4-gf2m.pl b/crypto/bn/asm/armv4-gf2m.pl
index c52e0b7..737659f 100644
--- a/crypto/bn/asm/armv4-gf2m.pl
+++ b/crypto/bn/asm/armv4-gf2m.pl
@@ -21,8 +21,20 @@
# runs in even less cycles, ~30, improvement is measurable only on
# longer keys. One has to optimize code elsewhere to get NEON glow...
-while (($output=shift) && ($output!~/^\w[\w\-]*\.\w+$/)) {}
-open STDOUT,">$output";
+$flavour = shift;
+if ($flavour=~/^\w[\w\-]*\.\w+$/) { $output=$flavour; undef $flavour; }
+else { while (($output=shift) && ($output!~/^\w[\w\-]*\.\w+$/)) {} }
+
+if ($flavour && $flavour ne "void") {
+ $0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
+ ( $xlate="${dir}arm-xlate.pl" and -f $xlate ) or
+ ( $xlate="${dir}../../perlasm/arm-xlate.pl" and -f $xlate) or
+ die "can't locate arm-xlate.pl";
+
+ open STDOUT,"| \"$^X\" $xlate $flavour $output";
+} else {
+ open STDOUT,">$output";
+}
sub Dlo() { shift=~m|q([1]?[0-9])|?"d".($1*2):""; }
sub Dhi() { shift=~m|q([1]?[0-9])|?"d".($1*2+1):""; }
@@ -170,11 +182,18 @@ bn_GF2m_mul_2x2:
#if __ARM_ARCH__>=7
ldr r12,.LOPENSSL_armcap
.Lpic: ldr r12,[pc,r12]
+#ifdef __APPLE__
+ ldr r12,[r12]
+#endif
tst r12,#1
beq .Lialu
veor $A1,$A1
+#ifdef __APPLE__
+ vmov $B1,r3,r3 @ two copies of b1
+#else
vmov.32 $B1,r3,r3 @ two copies of b1
+#endif
vmov.32 ${A1}[0],r1 @ a1
veor $A0,$A0
diff --git a/crypto/bn/asm/armv4-mont.pl b/crypto/bn/asm/armv4-mont.pl
index f78a8b5..aa00f38 100644
--- a/crypto/bn/asm/armv4-mont.pl
+++ b/crypto/bn/asm/armv4-mont.pl
@@ -23,8 +23,20 @@
# than 1/2KB. Windows CE port would be trivial, as it's exclusively
# about decorations, ABI and instruction syntax are identical.
-while (($output=shift) && ($output!~/^\w[\w\-]*\.\w+$/)) {}
-open STDOUT,">$output";
+$flavour = shift;
+if ($flavour=~/^\w[\w\-]*\.\w+$/) { $output=$flavour; undef $flavour; }
+else { while (($output=shift) && ($output!~/^\w[\w\-]*\.\w+$/)) {} }
+
+if ($flavour && $flavour ne "void") {
+ $0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
+ ( $xlate="${dir}arm-xlate.pl" and -f $xlate ) or
+ ( $xlate="${dir}../../perlasm/arm-xlate.pl" and -f $xlate) or
+ die "can't locate arm-xlate.pl";
+
+ open STDOUT,"| \"$^X\" $xlate $flavour $output";
+} else {
+ open STDOUT,">$output";
+}
$num="r0"; # starts as num argument, but holds &tp[num-1]
$ap="r1";
diff --git a/crypto/evp/e_aes.c b/crypto/evp/e_aes.c
index 45e8504..3854b51 100644
--- a/crypto/evp/e_aes.c
+++ b/crypto/evp/e_aes.c
@@ -471,6 +471,35 @@ const EVP_CIPHER *EVP_aes_##keylen##_##mode(void) \
{ return &aes_##keylen##_##mode; }
#endif
+#if defined(OPENSSL_CPUID_OBJ) && defined(__aarch64__)
+#include "arm_arch.h"
+#if __ARM_ARCH__>=7
+# define HWAES_CAPABLE (OPENSSL_armcap_P & ARMV8_AES)
+# define HWAES_set_encrypt_key aes_v8_set_encrypt_key
+# define HWAES_set_decrypt_key aes_v8_set_decrypt_key
+# define HWAES_encrypt aes_v8_encrypt
+# define HWAES_decrypt aes_v8_decrypt
+# define HWAES_cbc_encrypt aes_v8_cbc_encrypt
+# define HWAES_ctr32_encrypt_blocks aes_v8_ctr32_encrypt_blocks
+#endif
+#endif
+
+#if defined(HWAES_CAPABLE)
+int HWAES_set_encrypt_key(const unsigned char *userKey, const int bits,
+ AES_KEY *key);
+int HWAES_set_decrypt_key(const unsigned char *userKey, const int bits,
+ AES_KEY *key);
+void HWAES_encrypt(const unsigned char *in, unsigned char *out,
+ const AES_KEY *key);
+void HWAES_decrypt(const unsigned char *in, unsigned char *out,
+ const AES_KEY *key);
+void HWAES_cbc_encrypt(const unsigned char *in, unsigned char *out,
+ size_t length, const AES_KEY *key,
+ unsigned char *ivec, const int enc);
+void HWAES_ctr32_encrypt_blocks(const unsigned char *in, unsigned char *out,
+ size_t len, const AES_KEY *key, const unsigned char ivec[16]);
+#endif
+
#define BLOCK_CIPHER_generic_pack(nid,keylen,flags) \
BLOCK_CIPHER_generic(nid,keylen,16,16,cbc,cbc,CBC,flags|EVP_CIPH_FLAG_DEFAULT_ASN1) \
BLOCK_CIPHER_generic(nid,keylen,16,0,ecb,ecb,ECB,flags|EVP_CIPH_FLAG_DEFAULT_ASN1) \
@@ -489,6 +518,19 @@ static int aes_init_key(EVP_CIPHER_CTX *ctx, const unsigned char *key,
mode = ctx->cipher->flags & EVP_CIPH_MODE;
if ((mode == EVP_CIPH_ECB_MODE || mode == EVP_CIPH_CBC_MODE)
&& !enc)
+#ifdef HWAES_CAPABLE
+ if (HWAES_CAPABLE)
+ {
+ ret = HWAES_set_decrypt_key(key,ctx->key_len*8,&dat->ks);
+ dat->block = (block128_f)HWAES_decrypt;
+ dat->stream.cbc = NULL;
+#ifdef HWAES_cbc_encrypt
+ if (mode==EVP_CIPH_CBC_MODE)
+ dat->stream.cbc = (cbc128_f)HWAES_cbc_encrypt;
+#endif
+ }
+ else
+#endif
#ifdef BSAES_CAPABLE
if (BSAES_CAPABLE && mode==EVP_CIPH_CBC_MODE)
{
@@ -517,6 +559,26 @@ static int aes_init_key(EVP_CIPHER_CTX *ctx, const unsigned char *key,
NULL;
}
else
+#ifdef HWAES_CAPABLE
+ if (HWAES_CAPABLE)
+ {
+ ret = HWAES_set_encrypt_key(key,ctx->key_len*8,&dat->ks);
+ dat->block = (block128_f)HWAES_encrypt;
+ dat->stream.cbc = NULL;
+#ifdef HWAES_cbc_encrypt
+ if (mode==EVP_CIPH_CBC_MODE)
+ dat->stream.cbc = (cbc128_f)HWAES_cbc_encrypt;
+ else
+#endif
+#ifdef HWAES_ctr32_encrypt_blocks
+ if (mode==EVP_CIPH_CTR_MODE)
+ dat->stream.ctr = (ctr128_f)HWAES_ctr32_encrypt_blocks;
+ else
+#endif
+ (void)0; /* terminate potentially open 'else' */
+ }
+ else
+#endif
#ifdef BSAES_CAPABLE
if (BSAES_CAPABLE && mode==EVP_CIPH_CTR_MODE)
{
@@ -809,6 +871,21 @@ static int aes_gcm_init_key(EVP_CIPHER_CTX *ctx, const unsigned char *key,
return 1;
if (key)
{ do {
+#ifdef HWAES_CAPABLE
+ if (HWAES_CAPABLE)
+ {
+ HWAES_set_encrypt_key(key,ctx->key_len*8,&gctx->ks);
+ CRYPTO_gcm128_init(&gctx->gcm,&gctx->ks,
+ (block128_f)HWAES_encrypt);
+#ifdef HWAES_ctr32_encrypt_blocks
+ gctx->ctr = (ctr128_f)HWAES_ctr32_encrypt_blocks;
+#else
+ gctx->ctr = NULL;
+#endif
+ break;
+ }
+ else
+#endif
#ifdef BSAES_CAPABLE
if (BSAES_CAPABLE)
{
@@ -1047,6 +1124,29 @@ static int aes_xts_init_key(EVP_CIPHER_CTX *ctx, const unsigned char *key,
{
xctx->stream = NULL;
/* key_len is two AES keys */
+#ifdef HWAES_CAPABLE
+ if (HWAES_CAPABLE)
+ {
+ if (enc)
+ {
+ HWAES_set_encrypt_key(key, ctx->key_len * 4, &xctx->ks1);
+ xctx->xts.block1 = (block128_f)HWAES_encrypt;
+ }
+ else
+ {
+ HWAES_set_decrypt_key(key, ctx->key_len * 4, &xctx->ks1);
+ xctx->xts.block1 = (block128_f)HWAES_decrypt;
+ }
+
+ HWAES_set_encrypt_key(key + ctx->key_len/2,
+ ctx->key_len * 4, &xctx->ks2);
+ xctx->xts.block2 = (block128_f)HWAES_encrypt;
+
+ xctx->xts.key1 = &xctx->ks1;
+ break;
+ }
+ else
+#endif
#ifdef VPAES_CAPABLE
if (VPAES_CAPABLE)
{
@@ -1189,6 +1289,19 @@ static int aes_ccm_init_key(EVP_CIPHER_CTX *ctx, const unsigned char *key,
return 1;
if (key) do
{
+#ifdef HWAES_CAPABLE
+ if (HWAES_CAPABLE)
+ {
+ HWAES_set_encrypt_key(key,ctx->key_len*8,&cctx->ks);
+
+ CRYPTO_ccm128_init(&cctx->ccm, cctx->M, cctx->L,
+ &cctx->ks, (block128_f)HWAES_encrypt);
+ cctx->str = NULL;
+ cctx->key_set = 1;
+ break;
+ }
+ else
+#endif
#ifdef VPAES_CAPABLE
if (VPAES_CAPABLE)
{
diff --git a/crypto/modes/Makefile b/crypto/modes/Makefile
index 8119693..f4930c6 100644
--- a/crypto/modes/Makefile
+++ b/crypto/modes/Makefile
@@ -56,11 +56,14 @@ ghash-alpha.s: asm/ghash-alpha.pl
$(PERL) $< | $(CC) -E - | tee $@ > /dev/null
ghash-parisc.s: asm/ghash-parisc.pl
$(PERL) asm/ghash-parisc.pl $(PERLASM_SCHEME) $@
+ghashv8-armx.S: asm/ghashv8-armx.pl
+ $(PERL) asm/ghashv8-armx.pl $(PERLASM_SCHEME) $@
# GNU make "catch all"
ghash-%.S: asm/ghash-%.pl; $(PERL) $< $(PERLASM_SCHEME) $@
ghash-armv4.o: ghash-armv4.S
+ghashv8-armx.o: ghashv8-armx.S
files:
$(PERL) $(TOP)/util/files.pl Makefile >> $(TOP)/MINFO
diff --git a/crypto/modes/asm/ghash-armv4.pl b/crypto/modes/asm/ghash-armv4.pl
index d91586e..3799b2b 100644
--- a/crypto/modes/asm/ghash-armv4.pl
+++ b/crypto/modes/asm/ghash-armv4.pl
@@ -57,8 +57,20 @@
# *native* byte order on current platform. See gcm128.c for working
# example...
-while (($output=shift) && ($output!~/^\w[\w\-]*\.\w+$/)) {}
-open STDOUT,">$output";
+$flavour = shift;
+if ($flavour=~/^\w[\w\-]*\.\w+$/) { $output=$flavour; undef $flavour; }
+else { while (($output=shift) && ($output!~/^\w[\w\-]*\.\w+$/)) {} }
+
+if ($flavour && $flavour ne "void") {
+ $0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
+ ( $xlate="${dir}arm-xlate.pl" and -f $xlate ) or
+ ( $xlate="${dir}../../perlasm/arm-xlate.pl" and -f $xlate) or
+ die "can't locate arm-xlate.pl";
+
+ open STDOUT,"| \"$^X\" $xlate $flavour $output";
+} else {
+ open STDOUT,">$output";
+}
$Xi="r0"; # argument block
$Htbl="r1";
@@ -112,6 +124,11 @@ $code=<<___;
.text
.code 32
+#ifdef __APPLE__
+#define ldrplb ldrbpl
+#define ldrneb ldrbne
+#endif
+
.type rem_4bit,%object
.align 5
rem_4bit:
@@ -326,9 +343,9 @@ $code.=<<___;
.align 4
gcm_gmult_neon:
sub $Htbl,#16 @ point at H in GCM128_CTX
- vld1.64 `&Dhi("$IN")`,[$Xi,:64]!@ load Xi
+ vld1.64 `&Dhi("$IN")`,[$Xi]! @ load Xi
vmov.i32 $mod,#0xe1 @ our irreducible polynomial
- vld1.64 `&Dlo("$IN")`,[$Xi,:64]!
+ vld1.64 `&Dlo("$IN")`,[$Xi]!
vshr.u64 $mod,#32
vldmia $Htbl,{$Hhi-$Hlo} @ load H
veor $zero,$zero
@@ -349,9 +366,9 @@ gcm_gmult_neon:
.type gcm_ghash_neon,%function
.align 4
gcm_ghash_neon:
- vld1.64 `&Dhi("$Z")`,[$Xi,:64]! @ load Xi
+ vld1.64 `&Dhi("$Z")`,[$Xi]! @ load Xi
vmov.i32 $mod,#0xe1 @ our irreducible polynomial
- vld1.64 `&Dlo("$Z")`,[$Xi,:64]!
+ vld1.64 `&Dlo("$Z")`,[$Xi]!
vshr.u64 $mod,#32
vldmia $Xi,{$Hhi-$Hlo} @ load H
veor $zero,$zero
@@ -410,8 +427,8 @@ gcm_ghash_neon:
vrev64.8 $Z,$Z
#endif
sub $Xi,#16
- vst1.64 `&Dhi("$Z")`,[$Xi,:64]! @ write out Xi
- vst1.64 `&Dlo("$Z")`,[$Xi,:64]
+ vst1.64 `&Dhi("$Z")`,[$Xi]! @ write out Xi
+ vst1.64 `&Dlo("$Z")`,[$Xi]
bx lr
.size gcm_ghash_neon,.-gcm_ghash_neon
diff --git a/crypto/modes/asm/ghashv8-armx.pl b/crypto/modes/asm/ghashv8-armx.pl
new file mode 100644
index 0000000..300e8d5
--- /dev/null
+++ b/crypto/modes/asm/ghashv8-armx.pl
@@ -0,0 +1,376 @@
+#!/usr/bin/env perl
+#
+# ====================================================================
+# Written by Andy Polyakov <appro at openssl.org> for the OpenSSL
+# project. The module is, however, dual licensed under OpenSSL and
+# CRYPTOGAMS licenses depending on where you obtain it. For further
+# details see http://www.openssl.org/~appro/cryptogams/.
+# ====================================================================
+#
+# GHASH for ARMv8 Crypto Extension, 64-bit polynomial multiplication.
+#
+# June 2014
+#
+# Initial version was developed in tight cooperation with Ard
+# Biesheuvel <ard.biesheuvel at linaro.org> from bits-n-pieces from
+# other assembly modules. Just like aesv8-armx.pl this module
+# supports both AArch32 and AArch64 execution modes.
+#
+# July 2014
+#
+# Implement 2x aggregated reduction [see ghash-x86.pl for background
+# information].
+#
+# Current performance in cycles per processed byte:
+#
+# PMULL[2] 32-bit NEON(*)
+# Apple A7 0.92 5.62
+# Cortex-A53 1.01 8.39
+# Cortex-A57 1.17 7.61
+#
+# (*) presented for reference/comparison purposes;
+
+$flavour = shift;
+$output = shift;
+
+$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
+( $xlate="${dir}arm-xlate.pl" and -f $xlate ) or
+( $xlate="${dir}../../perlasm/arm-xlate.pl" and -f $xlate) or
+die "can't locate arm-xlate.pl";
+
+open OUT,"| \"$^X\" $xlate $flavour $output";
+*STDOUT=*OUT;
+
+$Xi="x0"; # argument block
+$Htbl="x1";
+$inp="x2";
+$len="x3";
+
+$inc="x12";
+
+{
+my ($Xl,$Xm,$Xh,$IN)=map("q$_",(0..3));
+my ($t0,$t1,$t2,$xC2,$H,$Hhl,$H2)=map("q$_",(8..14));
+
+$code=<<___;
+#include "arm_arch.h"
+
+.text
+___
+$code.=".arch armv8-a+crypto\n" if ($flavour =~ /64/);
+$code.=".fpu neon\n.code 32\n" if ($flavour !~ /64/);
+
+$code.=<<___;
+.global gcm_init_v8
+.type gcm_init_v8,%function
+.align 4
+gcm_init_v8:
+ vld1.64 {$t1},[x1] @ load H
+ vmov.i8 $xC2,#0xe1
+ vshl.i64 $xC2,$xC2,#57 @ 0xc2.0
+ vext.8 $IN,$t1,$t1,#8
+ vshr.u64 $t2,$xC2,#63
+ vdup.32 $t1,${t1}[1]
+ vext.8 $t0,$t2,$xC2,#8 @ t0=0xc2....01
+ vshr.u64 $t2,$IN,#63
+ vshr.s32 $t1,$t1,#31 @ broadcast carry bit
+ vand $t2,$t2,$t0
+ vshl.i64 $IN,$IN,#1
+ vext.8 $t2,$t2,$t2,#8
+ vand $t0,$t0,$t1
+ vorr $IN,$IN,$t2 @ H<<<=1
+ veor $H,$IN,$t0 @ twisted H
+ vst1.64 {$H},[x0],#16
+
+ @ calculate H^2
+ vext.8 $t0,$H,$H,#8 @ Karatsuba pre-processing
+ vpmull.p64 $Xl,$H,$H
+ veor $t0,$t0,$H
+ vpmull2.p64 $Xh,$H,$H
+ vpmull.p64 $Xm,$t0,$t0
+
+ vext.8 $t1,$Xl,$Xh,#8 @ Karatsuba post-processing
+ veor $t2,$Xl,$Xh
+ veor $Xm,$Xm,$t1
+ veor $Xm,$Xm,$t2
+ vpmull.p64 $t2,$Xl,$xC2 @ 1st phase
+
+ vmov $Xh#lo,$Xm#hi @ Xh|Xm - 256-bit result
+ vmov $Xm#hi,$Xl#lo @ Xm is rotated Xl
+ veor $Xl,$Xm,$t2
+
+ vext.8 $t2,$Xl,$Xl,#8 @ 2nd phase
+ vpmull.p64 $Xl,$Xl,$xC2
+ veor $t2,$t2,$Xh
+ veor $H2,$Xl,$t2
+
+ vext.8 $t1,$H2,$H2,#8 @ Karatsuba pre-processing
+ veor $t1,$t1,$H2
+ vext.8 $Hhl,$t0,$t1,#8 @ pack Karatsuba pre-processed
+ vst1.64 {$Hhl-$H2},[x0]
+
+ ret
+.size gcm_init_v8,.-gcm_init_v8
+
+.global gcm_gmult_v8
+.type gcm_gmult_v8,%function
+.align 4
+gcm_gmult_v8:
+ vld1.64 {$t1},[$Xi] @ load Xi
+ vmov.i8 $xC2,#0xe1
+ vld1.64 {$H-$Hhl},[$Htbl] @ load twisted H, ...
+ vshl.u64 $xC2,$xC2,#57
+#ifndef __ARMEB__
+ vrev64.8 $t1,$t1
+#endif
+ vext.8 $IN,$t1,$t1,#8
+
+ vpmull.p64 $Xl,$H,$IN @ H.lo·Xi.lo
+ veor $t1,$t1,$IN @ Karatsuba pre-processing
+ vpmull2.p64 $Xh,$H,$IN @ H.hi·Xi.hi
+ vpmull.p64 $Xm,$Hhl,$t1 @ (H.lo+H.hi)·(Xi.lo+Xi.hi)
+
+ vext.8 $t1,$Xl,$Xh,#8 @ Karatsuba post-processing
+ veor $t2,$Xl,$Xh
+ veor $Xm,$Xm,$t1
+ veor $Xm,$Xm,$t2
+ vpmull.p64 $t2,$Xl,$xC2 @ 1st phase
+
+ vmov $Xh#lo,$Xm#hi @ Xh|Xm - 256-bit result
+ vmov $Xm#hi,$Xl#lo @ Xm is rotated Xl
+ veor $Xl,$Xm,$t2
+
+ vext.8 $t2,$Xl,$Xl,#8 @ 2nd phase
+ vpmull.p64 $Xl,$Xl,$xC2
+ veor $t2,$t2,$Xh
+ veor $Xl,$Xl,$t2
+
+#ifndef __ARMEB__
+ vrev64.8 $Xl,$Xl
+#endif
+ vext.8 $Xl,$Xl,$Xl,#8
+ vst1.64 {$Xl},[$Xi] @ write out Xi
+
+ ret
+.size gcm_gmult_v8,.-gcm_gmult_v8
+
+.global gcm_ghash_v8
+.type gcm_ghash_v8,%function
+.align 4
+gcm_ghash_v8:
+___
+$code.=<<___ if ($flavour !~ /64/);
+ vstmdb sp!,{d8-d15}
+___
+$code.=<<___;
+ vld1.64 {$Xl},[$Xi] @ load [rotated] Xi
+ subs $len,$len,#32
+ vmov.i8 $xC2,#0xe1
+ mov $inc,#16
+ vld1.64 {$H-$Hhl},[$Htbl],#32 @ load twisted H, ..., H^2
+ vld1.64 {$H2},[$Htbl]
+ cclr $inc,eq
+ vext.8 $Xl,$Xl,$Xl,#8
+ vld1.64 {$t0},[$inp],#16 @ load [rotated] I[0]
+ vshl.u64 $xC2,$xC2,#57 @ 0xc2.0
+#ifndef __ARMEB__
+ vrev64.8 $t0,$t0
+ vrev64.8 $Xl,$Xl
+#endif
+ vext.8 $IN,$t0,$t0,#8
+ b.lo .Lodd_tail_v8
+___
+{ my ($Xln,$Xmn,$Xhn,$In) = map("q$_",(4..7));
+ #######
+ # Xi+2 =[H*(Ii+1 + Xi+1)] mod P =
+ # [(H*Ii+1) + (H*Xi+1)] mod P =
+ # [(H*Ii+1) + H^2*(Ii+Xi)] mod P
+ #
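+ # [Not part of the commit: the aggregated-reduction identity above
+ # follows from distributivity in GF(2^128) once Xi+1 = H*(Ii+Xi) is
+ # substituted, so only one reduction is needed per two blocks. An
+ # illustrative Python check with carry-less arithmetic (it ignores
+ # GHASH's reflected bit ordering, which doesn't affect the algebra):]

```python
# Illustrative check of the 2x aggregation identity:
#   H*(Ii+1 ^ Xi+1) == H*Ii+1 ^ H^2*(Ii ^ Xi)   when Xi+1 = H*(Ii ^ Xi)
P = (1 << 128) | 0x87  # x^128 + x^7 + x^2 + x + 1

def clmul(a, b):
    """Carry-less (polynomial) multiplication over GF(2)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def gmul(a, b):
    """Multiplication in GF(2^128): carry-less product reduced mod P."""
    x = clmul(a, b)
    while x.bit_length() > 128:
        x ^= P << (x.bit_length() - 129)
    return x

H  = (1 << 127) ^ 0x1b
Xi = (1 << 126) ^ 0x42
I0 = (1 << 125) ^ 0x77
I1 = (1 << 124) ^ 0x3c

Xi1 = gmul(H, I0 ^ Xi)                          # one block at a time
lhs = gmul(H, I1 ^ Xi1)                         # Xi+2, sequential
rhs = gmul(H, I1) ^ gmul(gmul(H, H), I0 ^ Xi)   # aggregated form
assert lhs == rhs
```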
+$code.=<<___;
+ vld1.64 {$t1},[$inp],$inc @ load [rotated] I[1]
+#ifndef __ARMEB__
+ vrev64.8 $t1,$t1
+#endif
+ vext.8 $In,$t1,$t1,#8
+ veor $IN,$IN,$Xl @ I[i]^=Xi
+ vpmull.p64 $Xln,$H,$In @ H·Ii+1
+ veor $t1,$t1,$In @ Karatsuba pre-processing
+ vpmull2.p64 $Xhn,$H,$In
+ b .Loop_mod2x_v8
+
+.align 4
+.Loop_mod2x_v8:
+ vext.8 $t2,$IN,$IN,#8
+ subs $len,$len,#32
+ vpmull.p64 $Xl,$H2,$IN @ H^2.lo·Xi.lo
+ cclr $inc,lo
+
+ vpmull.p64 $Xmn,$Hhl,$t1
+ veor $t2,$t2,$IN @ Karatsuba pre-processing
+ vpmull2.p64 $Xh,$H2,$IN @ H^2.hi·Xi.hi
+ veor $Xl,$Xl,$Xln @ accumulate
+ vpmull2.p64 $Xm,$Hhl,$t2 @ (H^2.lo+H^2.hi)·(Xi.lo+Xi.hi)
+ vld1.64 {$t0},[$inp],$inc @ load [rotated] I[i]
+
+ veor $Xh,$Xh,$Xhn
+ cclr $inc,eq
+ veor $Xm,$Xm,$Xmn
+
+ vext.8 $t1,$Xl,$Xh,#8 @ Karatsuba post-processing
+ veor $t2,$Xl,$Xh
+ veor $Xm,$Xm,$t1
+ vld1.64 {$t1},[$inp],$inc @ load [rotated] I[i+1]
+#ifndef __ARMEB__
+ vrev64.8 $t0,$t0
+#endif
+ veor $Xm,$Xm,$t2
+ vpmull.p64 $t2,$Xl,$xC2 @ 1st phase
+
+#ifndef __ARMEB__
+ vrev64.8 $t1,$t1
+#endif
+ vmov $Xh#lo,$Xm#hi @ Xh|Xm - 256-bit result
+ vmov $Xm#hi,$Xl#lo @ Xm is rotated Xl
+ vext.8 $In,$t1,$t1,#8
+ vext.8 $IN,$t0,$t0,#8
+ veor $Xl,$Xm,$t2
+ vpmull.p64 $Xln,$H,$In @ H·Ii+1
+ veor $IN,$IN,$Xh @ accumulate $IN early
+
+ vext.8 $t2,$Xl,$Xl,#8 @ 2nd phase
+ vpmull.p64 $Xl,$Xl,$xC2
+ veor $IN,$IN,$t2
+ veor $t1,$t1,$In @ Karatsuba pre-processing
+ veor $IN,$IN,$Xl
+ vpmull2.p64 $Xhn,$H,$In
+ b.hs .Loop_mod2x_v8
+
+ veor $Xh,$Xh,$t2
+ vext.8 $IN,$t0,$t0,#8 @ re-construct $IN
+ adds $len,$len,#32
+ veor $Xl,$Xl,$Xh @ re-construct $Xl
+ b.eq .Ldone_v8
+___
+}
+$code.=<<___;
+.Lodd_tail_v8:
+ vext.8 $t2,$Xl,$Xl,#8
+ veor $IN,$IN,$Xl @ inp^=Xi
+ veor $t1,$t0,$t2 @ $t1 is rotated inp^Xi
+
+ vpmull.p64 $Xl,$H,$IN @ H.lo·Xi.lo
+ veor $t1,$t1,$IN @ Karatsuba pre-processing
+ vpmull2.p64 $Xh,$H,$IN @ H.hi·Xi.hi
+ vpmull.p64 $Xm,$Hhl,$t1 @ (H.lo+H.hi)·(Xi.lo+Xi.hi)
+
+ vext.8 $t1,$Xl,$Xh,#8 @ Karatsuba post-processing
+ veor $t2,$Xl,$Xh
+ veor $Xm,$Xm,$t1
+ veor $Xm,$Xm,$t2
+ vpmull.p64 $t2,$Xl,$xC2 @ 1st phase
+
+ vmov $Xh#lo,$Xm#hi @ Xh|Xm - 256-bit result
+ vmov $Xm#hi,$Xl#lo @ Xm is rotated Xl
+ veor $Xl,$Xm,$t2
+
+ vext.8 $t2,$Xl,$Xl,#8 @ 2nd phase
+ vpmull.p64 $Xl,$Xl,$xC2
+ veor $t2,$t2,$Xh
+ veor $Xl,$Xl,$t2
+
+.Ldone_v8:
+#ifndef __ARMEB__
+ vrev64.8 $Xl,$Xl
+#endif
+ vext.8 $Xl,$Xl,$Xl,#8
+ vst1.64 {$Xl},[$Xi] @ write out Xi
+
+___
+$code.=<<___ if ($flavour !~ /64/);
+ vldmia sp!,{d8-d15}
+___
+$code.=<<___;
+ ret
+.size gcm_ghash_v8,.-gcm_ghash_v8
+___
+}
+$code.=<<___;
+.asciz "GHASH for ARMv8, CRYPTOGAMS by <appro\@openssl.org>"
+.align 2
+___
+
+if ($flavour =~ /64/) { ######## 64-bit code
+ sub unvmov {
+ my $arg=shift;
+
+ $arg =~ m/q([0-9]+)#(lo|hi),\s*q([0-9]+)#(lo|hi)/o &&
+ sprintf "ins v%d.d[%d],v%d.d[%d]",$1,($2 eq "lo")?0:1,$3,($4 eq "lo")?0:1;
+ }
+ foreach(split("\n",$code)) {
+ s/cclr\s+([wx])([^,]+),\s*([a-z]+)/csel $1$2,$1zr,$1$2,$3/o or
+ s/vmov\.i8/movi/o or # fix up legacy mnemonics
+ s/vmov\s+(.*)/unvmov($1)/geo or
+ s/vext\.8/ext/o or
+ s/vshr\.s/sshr\.s/o or
+ s/vshr/ushr/o or
+ s/^(\s+)v/$1/o or # strip off v prefix
+ s/\bbx\s+lr\b/ret/o;
+
+ s/\bq([0-9]+)\b/"v".($1<8?$1:$1+8).".16b"/geo; # old->new registers
+ s/@\s/\/\//o; # old->new style commentary
+
+ # fix up remaining legacy suffixes
+ s/\.[ui]?8(\s)/$1/o;
+ s/\.[uis]?32//o and s/\.16b/\.4s/go;
+ m/\.p64/o and s/\.16b/\.1q/o; # 1st pmull argument
+ m/l\.p64/o and s/\.16b/\.1d/go; # 2nd and 3rd pmull arguments
+ s/\.[uisp]?64//o and s/\.16b/\.2d/go;
+ s/\.[42]([sd])\[([0-3])\]/\.$1\[$2\]/o;
+
+ print $_,"\n";
+ }
+} else { ######## 32-bit code
+ sub unvdup32 {
+ my $arg=shift;
+
+ $arg =~ m/q([0-9]+),\s*q([0-9]+)\[([0-3])\]/o &&
+ sprintf "vdup.32 q%d,d%d[%d]",$1,2*$2+($3>>1),$3&1;
+ }
+ sub unvpmullp64 {
+ my ($mnemonic,$arg)=@_;
+
+ if ($arg =~ m/q([0-9]+),\s*q([0-9]+),\s*q([0-9]+)/o) {
+ my $word = 0xf2a00e00|(($1&7)<<13)|(($1&8)<<19)
+ |(($2&7)<<17)|(($2&8)<<4)
+ |(($3&7)<<1) |(($3&8)<<2);
+ $word |= 0x00010001 if ($mnemonic =~ "2");
+ # ARMv7 instructions are always encoded little-endian, so the
+ # word can be emitted as raw bytes; the correct solution is the
+ # .inst directive, but older assemblers don't implement it:-(
+ sprintf ".byte\t0x%02x,0x%02x,0x%02x,0x%02x\t@ %s %s",
+ $word&0xff,($word>>8)&0xff,
+ ($word>>16)&0xff,($word>>24)&0xff,
+ $mnemonic,$arg;
+ }
+ }
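The hand-assembled encoding above can be sanity-checked outside the build. Below is a minimal Python sketch replicating the unvpmullp64() formula (the base constant and field shifts are copied from the Perl above; `pmull_word` is a hypothetical helper name, not part of the patch):

```python
def pmull_word(d, n, m, high=False):
    # Assemble the VMULL.P64 word the way unvpmullp64() does: the low
    # 3 bits of each q-register index go into the main encoding fields,
    # the 4th bit into the D/N/M extension bits.
    word = (0xf2a00e00
            | ((d & 7) << 13) | ((d & 8) << 19)
            | ((n & 7) << 17) | ((n & 8) << 4)
            | ((m & 7) << 1)  | ((m & 8) << 2))
    if high:                      # vpmull2 selects the high halves
        word |= 0x00010001
    return word

# vpmull.p64 q1,q2,q3, emitted little-endian as the .byte directive does
print(",".join("0x%02x" % b for b in pmull_word(1, 2, 3).to_bytes(4, "little")))
```

For vpmull.p64 q1,q2,q3 this yields the word 0xf2a42e06, i.e. the bytes 0x06,0x2e,0xa4,0xf2 in emission order.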
+
+ foreach(split("\n",$code)) {
+ s/\b[wx]([0-9]+)\b/r$1/go; # new->old registers
+ s/\bv([0-9])\.[12468]+[bsd]\b/q$1/go; # new->old registers
+ s/\/\/\s?/@ /o; # new->old style commentary
+
+ # fix up remaining new-style suffixes
+ s/\],#[0-9]+/]!/o;
+
+ s/cclr\s+([^,]+),\s*([a-z]+)/mov$2 $1,#0/o or
+ s/vdup\.32\s+(.*)/unvdup32($1)/geo or
+ s/v?(pmull2?)\.p64\s+(.*)/unvpmullp64($1,$2)/geo or
+ s/\bq([0-9]+)#(lo|hi)/sprintf "d%d",2*$1+($2 eq "hi")/geo or
+ s/^(\s+)b\./$1b/o or
+ s/^(\s+)ret/$1bx\tlr/o;
+
+ print $_,"\n";
+ }
+}
+
+close STDOUT; # enforce flush
diff --git a/crypto/modes/gcm128.c b/crypto/modes/gcm128.c
index 8dfeae5..a5b76c5 100644
--- a/crypto/modes/gcm128.c
+++ b/crypto/modes/gcm128.c
@@ -645,7 +645,7 @@ static void gcm_gmult_1bit(u64 Xi[2],const u64 H[2])
#endif
-#if TABLE_BITS==4 && defined(GHASH_ASM)
+#if TABLE_BITS==4 && (defined(GHASH_ASM) || defined(OPENSSL_CPUID_OBJ))
# if !defined(I386_ONLY) && \
(defined(__i386) || defined(__i386__) || \
defined(__x86_64) || defined(__x86_64__) || \
@@ -666,13 +666,22 @@ void gcm_ghash_4bit_mmx(u64 Xi[2],const u128 Htable[16],const u8 *inp,size_t len
void gcm_gmult_4bit_x86(u64 Xi[2],const u128 Htable[16]);
void gcm_ghash_4bit_x86(u64 Xi[2],const u128 Htable[16],const u8 *inp,size_t len);
# endif
-# elif defined(__arm__) || defined(__arm)
+# elif defined(__arm__) || defined(__arm) || defined(__aarch64__)
# include "arm_arch.h"
# if __ARM_ARCH__>=7
# define GHASH_ASM_ARM
# define GCM_FUNCREF_4BIT
+# if defined(__aarch64__)
+# define PMULL_CAPABLE (OPENSSL_armcap_P & ARMV8_PMULL)
+# endif
+# if defined(__arm__) || defined(__arm)
+# define NEON_CAPABLE (OPENSSL_armcap_P & ARMV7_NEON)
+# endif
void gcm_gmult_neon(u64 Xi[2],const u128 Htable[16]);
void gcm_ghash_neon(u64 Xi[2],const u128 Htable[16],const u8 *inp,size_t len);
+void gcm_init_v8(u128 Htable[16],const u64 Xi[2]);
+void gcm_gmult_v8(u64 Xi[2],const u128 Htable[16]);
+void gcm_ghash_v8(u64 Xi[2],const u128 Htable[16],const u8 *inp,size_t len);
# endif
# elif defined(_TMS320C6400_PLUS)
# define GHASH_ASM_C64Xplus
@@ -740,10 +749,20 @@ void CRYPTO_gcm128_init(GCM128_CONTEXT *ctx,void *key,block128_f block)
ctx->ghash = gcm_ghash_4bit;
# endif
# elif defined(GHASH_ASM_ARM)
- if (OPENSSL_armcap_P & ARMV7_NEON) {
+# ifdef PMULL_CAPABLE
+ if (PMULL_CAPABLE) {
+ gcm_init_v8(ctx->Htable,ctx->H.u);
+ ctx->gmult = gcm_gmult_v8;
+ ctx->ghash = gcm_ghash_v8;
+ } else
+# endif
+# ifdef NEON_CAPABLE
+ if (NEON_CAPABLE) {
ctx->gmult = gcm_gmult_neon;
ctx->ghash = gcm_ghash_neon;
- } else {
+ } else
+# endif
+ {
gcm_init_4bit(ctx->Htable,ctx->H.u);
ctx->gmult = gcm_gmult_4bit;
ctx->ghash = gcm_ghash_4bit;
diff --git a/crypto/modes/modes_lcl.h b/crypto/modes/modes_lcl.h
index 4dab6a6..01ad9f3 100644
--- a/crypto/modes/modes_lcl.h
+++ b/crypto/modes/modes_lcl.h
@@ -26,13 +26,16 @@ typedef unsigned int u32;
typedef unsigned char u8;
#define STRICT_ALIGNMENT 1
-#if defined(__i386) || defined(__i386__) || \
- defined(__x86_64) || defined(__x86_64__) || \
- defined(_M_IX86) || defined(_M_AMD64) || defined(_M_X64) || \
- defined(__s390__) || defined(__s390x__) || \
- ( (defined(__arm__) || defined(__arm)) && \
- (defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) || \
- defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__)) )
+#if defined(__i386) || defined(__i386__) || \
+ defined(__x86_64) || defined(__x86_64__) || \
+ defined(_M_IX86) || defined(_M_AMD64) || defined(_M_X64) || \
+ defined(__s390__) || defined(__s390x__) || \
+ ( \
+ ( (defined(__arm__) || defined(__arm)) && \
+ (defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) || \
+ defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__)) ) && \
+ !( defined(__arm__) && defined(__APPLE__) ) \
+ )
# undef STRICT_ALIGNMENT
#endif
diff --git a/crypto/perlasm/arm-xlate.pl b/crypto/perlasm/arm-xlate.pl
new file mode 100644
index 0000000..22dc7e4
--- /dev/null
+++ b/crypto/perlasm/arm-xlate.pl
@@ -0,0 +1,165 @@
+#!/usr/bin/env perl
+
+# ARM assembler distiller by <appro>.
+
+my $flavour = shift;
+my $output = shift;
+open STDOUT,">$output" or die "can't open $output: $!";
+
+$flavour = "linux32" if (!$flavour or $flavour eq "void");
+
+my %GLOBALS;
+my $dotinlocallabels=($flavour=~/linux/)?1:0;
+
+################################################################
+# directives which need special treatment on different platforms
+################################################################
+my $arch = sub {
+ if ($flavour =~ /linux/) { ".arch\t".join(',',@_); }
+ else { ""; }
+};
+my $fpu = sub {
+ if ($flavour =~ /linux/) { ".fpu\t".join(',',@_); }
+ else { ""; }
+};
+my $hidden = sub {
+ if ($flavour =~ /ios/) { ".private_extern\t".join(',',@_); }
+ else { ".hidden\t".join(',',@_); }
+};
+my $comm = sub {
+ my @args = split(/,\s*/,shift);
+ my $name = @args[0];
+ my $global = \$GLOBALS{$name};
+ my $ret;
+
+ if ($flavour =~ /ios32/) {
+ $ret = ".comm\t_$name,@args[1]\n";
+ $ret .= ".non_lazy_symbol_pointer\n";
+ $ret .= "$name:\n";
+ $ret .= ".indirect_symbol\t_$name\n";
+ $ret .= ".long\t0";
+ $name = "_$name";
+ } else { $ret = ".comm\t".join(',',@args); }
+
+ $$global = $name;
+ $ret;
+};
+my $globl = sub {
+ my $name = shift;
+ my $global = \$GLOBALS{$name};
+ my $ret;
+
+ SWITCH: for ($flavour) {
+ /ios/ && do { $name = "_$name";
+ last;
+ };
+ }
+
+ $ret = ".globl $name" if (!$ret);
+ $$global = $name;
+ $ret;
+};
+my $global = $globl;
+my $extern = sub {
+ &$globl(@_);
+ return; # return nothing
+};
+my $type = sub {
+ if ($flavour =~ /linux/) { ".type\t".join(',',@_); }
+ else { ""; }
+};
+my $size = sub {
+ if ($flavour =~ /linux/) { ".size\t".join(',',@_); }
+ else { ""; }
+};
+my $inst = sub {
+ if ($flavour =~ /linux/) { ".inst\t".join(',',@_); }
+ else { ".long\t".join(',',@_); }
+};
+my $asciz = sub {
+ my $line = join(",",@_);
+ if ($line =~ /^"(.*)"$/)
+ { ".byte " . join(",",unpack("C*",$1),0) . "\n.align 2"; }
+ else
+ { ""; }
+};
+
+sub range {
+ my ($r,$sfx,$start,$end) = @_;
+
+ join(",",map("$r$_$sfx",($start..$end)));
+}
+
+sub expand_line {
+ my $line = shift;
+ my @ret = ();
+
+ pos($line)=0;
+
+ while ($line =~ m/\G[^@\/\{\"]*/g) {
+ if ($line =~ m/\G(@|\/\/|$)/gc) {
+ last;
+ }
+ elsif ($line =~ m/\G\{/gc) {
+ my $saved_pos = pos($line);
+ $line =~ s/\G([rdqv])([0-9]+)([^\-]*)\-\1([0-9]+)\3/range($1,$3,$2,$4)/e;
+ pos($line) = $saved_pos;
+ $line =~ m/\G[^\}]*\}/g;
+ }
+ elsif ($line =~ m/\G\"/gc) {
+ $line =~ m/\G[^\"]*\"/g;
+ }
+ }
+
+ $line =~ s/\b(\w+)/$GLOBALS{$1} or $1/ge;
+
+ return $line;
+}
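The brace-matching in expand_line() exists so register-list ranges such as {d8-d15} survive translation. A minimal Python sketch of the same range expansion (the regex and replacement mirror range() and the substitution above; `expand_ranges` is a hypothetical name):

```python
import re

def expand_ranges(line):
    # Mirror expand_line()'s range substitution: "{d8-d15}" ->
    # "{d8,d9,...,d15}". Group order: register letter, start, suffix, end.
    def repl(m):
        reg, start, sfx, end = m.group(1), int(m.group(2)), m.group(3), int(m.group(4))
        return ",".join("%s%d%s" % (reg, i, sfx) for i in range(start, end + 1))
    return re.sub(r"([rdqv])([0-9]+)([^\-]*)-\1([0-9]+)\3", repl, line)

print(expand_ranges("vldmia sp!,{d8-d15}"))
```

The backreferences `\1` and `\3` enforce that both ends of the range use the same register letter and suffix, just as the Perl does.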
+
+while($line=<>) {
+
+ if ($line =~ m/^\s*(#|@|\/\/)/) { print $line; next; }
+
+ $line =~ s|/\*.*\*/||; # get rid of C-style comments...
+ $line =~ s|^\s+||; # ... and skip white spaces in beginning...
+ $line =~ s|\s+$||; # ... and at the end
+
+ {
+ $line =~ s|[\b\.]L(\w{2,})|L$1|g; # common denominator for Locallabel
+ $line =~ s|\bL(\w{2,})|\.L$1|g if ($dotinlocallabels);
+ }
+
+ {
+ $line =~ s|(^[\.\w]+)\:\s*||;
+ my $label = $1;
+ if ($label) {
+ printf "%s:",($GLOBALS{$label} or $label);
+ }
+ }
+
+ if ($line !~ m/^[#@]/) {
+ $line =~ s|^\s*(\.?)(\S+)\s*||;
+ my $c = $1; $c = "\t" if ($c eq "");
+ my $mnemonic = $2;
+ my $opcode;
+ if ($mnemonic =~ m/([^\.]+)\.([^\.]+)/) {
+ $opcode = eval("\$$1_$2");
+ } else {
+ $opcode = eval("\$$mnemonic");
+ }
+
+ my $arg=expand_line($line);
+
+ if (ref($opcode) eq 'CODE') {
+ $line = &$opcode($arg);
+ } elsif ($mnemonic) {
+ $line = $c.$mnemonic;
+ $line.= "\t$arg" if ($arg);
+ }
+ }
+
+ print $line if ($line);
+ print "\n";
+}
+
+close STDOUT;
diff --git a/crypto/sha/Makefile b/crypto/sha/Makefile
index b1582f2..63e1171 100644
--- a/crypto/sha/Makefile
+++ b/crypto/sha/Makefile
@@ -90,6 +90,9 @@ sha512-%.S: asm/sha512-%.pl; $(PERL) $< $(PERLASM_SCHEME) $@
sha1-armv4-large.o: sha1-armv4-large.S
sha256-armv4.o: sha256-armv4.S
sha512-armv4.o: sha512-armv4.S
+sha1-armv8.o: sha1-armv8.S
+sha256-armv8.o: sha256-armv8.S
+sha512-armv8.o: sha512-armv8.S
files:
$(PERL) $(TOP)/util/files.pl Makefile >> $(TOP)/MINFO
diff --git a/crypto/sha/asm/sha1-armv4-large.pl b/crypto/sha/asm/sha1-armv4-large.pl
index 33da3e0..6c0adb9 100644
--- a/crypto/sha/asm/sha1-armv4-large.pl
+++ b/crypto/sha/asm/sha1-armv4-large.pl
@@ -52,8 +52,20 @@
# Profiler-assisted and platform-specific optimization resulted in 10%
# improvement on Cortex A8 core and 12.2 cycles per byte.
-while (($output=shift) && ($output!~/^\w[\w\-]*\.\w+$/)) {}
-open STDOUT,">$output";
+$flavour = shift;
+if ($flavour=~/^\w[\w\-]*\.\w+$/) { $output=$flavour; undef $flavour; }
+else { while (($output=shift) && ($output!~/^\w[\w\-]*\.\w+$/)) {} }
+
+if ($flavour && $flavour ne "void") {
+ $0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
+ ( $xlate="${dir}arm-xlate.pl" and -f $xlate ) or
+ ( $xlate="${dir}../../perlasm/arm-xlate.pl" and -f $xlate) or
+ die "can't locate arm-xlate.pl";
+
+ open STDOUT,"| \"$^X\" $xlate $flavour $output";
+} else {
+ open STDOUT,">$output";
+}
$ctx="r0";
$inp="r1";
diff --git a/crypto/sha/asm/sha1-armv8.pl b/crypto/sha/asm/sha1-armv8.pl
new file mode 100644
index 0000000..6be8624
--- /dev/null
+++ b/crypto/sha/asm/sha1-armv8.pl
@@ -0,0 +1,343 @@
+#!/usr/bin/env perl
+#
+# ====================================================================
+# Written by Andy Polyakov <appro at openssl.org> for the OpenSSL
+# project. The module is, however, dual licensed under OpenSSL and
+# CRYPTOGAMS licenses depending on where you obtain it. For further
+# details see http://www.openssl.org/~appro/cryptogams/.
+# ====================================================================
+#
+# SHA1 for ARMv8.
+#
+# Performance in cycles per processed byte and improvement coefficient
+# over code generated with "default" compiler:
+#
+# hardware-assisted software(*)
+# Apple A7 2.31 4.13 (+14%)
+# Cortex-A53 2.19 8.73 (+108%)
+# Cortex-A57 2.35 7.88 (+74%)
+#
+# (*) Software results are presented mostly for reference purposes.
+
+$flavour = shift;
+$output = shift;
+
+$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
+( $xlate="${dir}arm-xlate.pl" and -f $xlate ) or
+( $xlate="${dir}../../perlasm/arm-xlate.pl" and -f $xlate) or
+die "can't locate arm-xlate.pl";
+
+open OUT,"| \"$^X\" $xlate $flavour $output";
+*STDOUT=*OUT;
+
+($ctx,$inp,$num)=("x0","x1","x2");
+@Xw=map("w$_",(3..17,19));
+@Xx=map("x$_",(3..17,19));
+@V=($A,$B,$C,$D,$E)=map("w$_",(20..24));
+($t0,$t1,$t2,$K)=map("w$_",(25..28));
+
+
+sub BODY_00_19 {
+my ($i,$a,$b,$c,$d,$e)=@_;
+my $j=($i+2)&15;
+
+$code.=<<___ if ($i<15 && !($i&1));
+ lsr @Xx[$i+1],@Xx[$i],#32
+___
+$code.=<<___ if ($i<14 && !($i&1));
+ ldr @Xx[$i+2],[$inp,#`($i+2)*4-64`]
+___
+$code.=<<___ if ($i<14 && ($i&1));
+#ifdef __ARMEB__
+ ror @Xx[$i+1],@Xx[$i+1],#32
+#else
+ rev32 @Xx[$i+1],@Xx[$i+1]
+#endif
+___
+$code.=<<___ if ($i<14);
+ bic $t0,$d,$b
+ and $t1,$c,$b
+ ror $t2,$a,#27
+ add $d,$d,$K // future e+=K
+ orr $t0,$t0,$t1
+ add $e,$e,$t2 // e+=rot(a,5)
+ ror $b,$b,#2
+ add $d,$d,@Xw[($i+1)&15] // future e+=X[i]
+ add $e,$e,$t0 // e+=F(b,c,d)
+___
+$code.=<<___ if ($i==19);
+ movz $K,#0xeba1
+ movk $K,#0x6ed9,lsl#16
+___
+$code.=<<___ if ($i>=14);
+ eor @Xw[$j],@Xw[$j],@Xw[($j+2)&15]
+ bic $t0,$d,$b
+ and $t1,$c,$b
+ ror $t2,$a,#27
+ eor @Xw[$j],@Xw[$j],@Xw[($j+8)&15]
+ add $d,$d,$K // future e+=K
+ orr $t0,$t0,$t1
+ add $e,$e,$t2 // e+=rot(a,5)
+ eor @Xw[$j],@Xw[$j],@Xw[($j+13)&15]
+ ror $b,$b,#2
+ add $d,$d,@Xw[($i+1)&15] // future e+=X[i]
+ add $e,$e,$t0 // e+=F(b,c,d)
+ ror @Xw[$j],@Xw[$j],#31
+___
+}
+
+sub BODY_40_59 {
+my ($i,$a,$b,$c,$d,$e)=@_;
+my $j=($i+2)&15;
+
+$code.=<<___ if ($i==59);
+ movz $K,#0xc1d6
+ movk $K,#0xca62,lsl#16
+___
+$code.=<<___;
+ orr $t0,$b,$c
+ and $t1,$b,$c
+ eor @Xw[$j],@Xw[$j],@Xw[($j+2)&15]
+ ror $t2,$a,#27
+ and $t0,$t0,$d
+ add $d,$d,$K // future e+=K
+ eor @Xw[$j],@Xw[$j],@Xw[($j+8)&15]
+ add $e,$e,$t2 // e+=rot(a,5)
+ orr $t0,$t0,$t1
+ ror $b,$b,#2
+ eor @Xw[$j],@Xw[$j],@Xw[($j+13)&15]
+ add $d,$d,@Xw[($i+1)&15] // future e+=X[i]
+ add $e,$e,$t0 // e+=F(b,c,d)
+ ror @Xw[$j],@Xw[$j],#31
+___
+}
+
+sub BODY_20_39 {
+my ($i,$a,$b,$c,$d,$e)=@_;
+my $j=($i+2)&15;
+
+$code.=<<___ if ($i==39);
+ movz $K,#0xbcdc
+ movk $K,#0x8f1b,lsl#16
+___
+$code.=<<___ if ($i<78);
+ eor @Xw[$j],@Xw[$j],@Xw[($j+2)&15]
+ eor $t0,$d,$b
+ ror $t2,$a,#27
+ add $d,$d,$K // future e+=K
+ eor @Xw[$j],@Xw[$j],@Xw[($j+8)&15]
+ eor $t0,$t0,$c
+ add $e,$e,$t2 // e+=rot(a,5)
+ ror $b,$b,#2
+ eor @Xw[$j],@Xw[$j],@Xw[($j+13)&15]
+ add $d,$d,@Xw[($i+1)&15] // future e+=X[i]
+ add $e,$e,$t0 // e+=F(b,c,d)
+ ror @Xw[$j],@Xw[$j],#31
+___
+$code.=<<___ if ($i==78);
+ ldp @Xw[1],@Xw[2],[$ctx]
+ eor $t0,$d,$b
+ ror $t2,$a,#27
+ add $d,$d,$K // future e+=K
+ eor $t0,$t0,$c
+ add $e,$e,$t2 // e+=rot(a,5)
+ ror $b,$b,#2
+ add $d,$d,@Xw[($i+1)&15] // future e+=X[i]
+ add $e,$e,$t0 // e+=F(b,c,d)
+___
+$code.=<<___ if ($i==79);
+ ldp @Xw[3],@Xw[4],[$ctx,#8]
+ eor $t0,$d,$b
+ ror $t2,$a,#27
+ eor $t0,$t0,$c
+ add $e,$e,$t2 // e+=rot(a,5)
+ ror $b,$b,#2
+ ldr @Xw[5],[$ctx,#16]
+ add $e,$e,$t0 // e+=F(b,c,d)
+___
+}
+
+$code.=<<___;
+#include "arm_arch.h"
+
+.text
+
+.extern OPENSSL_armcap_P
+.globl sha1_block_data_order
+.type sha1_block_data_order,%function
+.align 6
+sha1_block_data_order:
+ ldr x16,.LOPENSSL_armcap_P
+ adr x17,.LOPENSSL_armcap_P
+ add x16,x16,x17
+ ldr w16,[x16]
+ tst w16,#ARMV8_SHA1
+ b.ne .Lv8_entry
+
+ stp x29,x30,[sp,#-96]!
+ add x29,sp,#0
+ stp x19,x20,[sp,#16]
+ stp x21,x22,[sp,#32]
+ stp x23,x24,[sp,#48]
+ stp x25,x26,[sp,#64]
+ stp x27,x28,[sp,#80]
+
+ ldp $A,$B,[$ctx]
+ ldp $C,$D,[$ctx,#8]
+ ldr $E,[$ctx,#16]
+
+.Loop:
+ ldr @Xx[0],[$inp],#64
+ movz $K,#0x7999
+ sub $num,$num,#1
+ movk $K,#0x5a82,lsl#16
+#ifdef __ARMEB__
+ ror $Xx[0],@Xx[0],#32
+#else
+ rev32 @Xx[0],@Xx[0]
+#endif
+ add $E,$E,$K // warm it up
+ add $E,$E,@Xw[0]
+___
+for($i=0;$i<20;$i++) { &BODY_00_19($i,@V); unshift(@V,pop(@V)); }
+for(;$i<40;$i++) { &BODY_20_39($i,@V); unshift(@V,pop(@V)); }
+for(;$i<60;$i++) { &BODY_40_59($i,@V); unshift(@V,pop(@V)); }
+for(;$i<80;$i++) { &BODY_20_39($i,@V); unshift(@V,pop(@V)); }
+$code.=<<___;
+ add $B,$B,@Xw[2]
+ add $C,$C,@Xw[3]
+ add $A,$A,@Xw[1]
+ add $D,$D,@Xw[4]
+ add $E,$E,@Xw[5]
+ stp $A,$B,[$ctx]
+ stp $C,$D,[$ctx,#8]
+ str $E,[$ctx,#16]
+ cbnz $num,.Loop
+
+ ldp x19,x20,[sp,#16]
+ ldp x21,x22,[sp,#32]
+ ldp x23,x24,[sp,#48]
+ ldp x25,x26,[sp,#64]
+ ldp x27,x28,[sp,#80]
+ ldr x29,[sp],#96
+ ret
+.size sha1_block_data_order,.-sha1_block_data_order
+___
+{{{
+my ($ABCD,$E,$E0,$E1)=map("v$_.16b",(0..3));
+my @MSG=map("v$_.16b",(4..7));
+my @Kxx=map("v$_.4s",(16..19));
+my ($W0,$W1)=("v20.4s","v21.4s");
+my $ABCD_SAVE="v22.16b";
+
+$code.=<<___;
+.type sha1_block_armv8,%function
+.align 6
+sha1_block_armv8:
+.Lv8_entry:
+ stp x29,x30,[sp,#-16]!
+ add x29,sp,#0
+
+ adr x4,.Lconst
+ eor $E,$E,$E
+ ld1.32 {$ABCD},[$ctx],#16
+ ld1.32 {$E}[0],[$ctx]
+ sub $ctx,$ctx,#16
+ ld1.32 {@Kxx[0]-@Kxx[3]},[x4]
+
+.Loop_hw:
+ ld1 {@MSG[0]-@MSG[3]},[$inp],#64
+ sub $num,$num,#1
+ rev32 @MSG[0],@MSG[0]
+ rev32 @MSG[1],@MSG[1]
+
+ add.i32 $W0,@Kxx[0],@MSG[0]
+ rev32 @MSG[2],@MSG[2]
+ orr $ABCD_SAVE,$ABCD,$ABCD // offload
+
+ add.i32 $W1,@Kxx[0],@MSG[1]
+ rev32 @MSG[3],@MSG[3]
+ sha1h $E1,$ABCD
+ sha1c $ABCD,$E,$W0 // 0
+ add.i32 $W0,@Kxx[$j],@MSG[2]
+ sha1su0 @MSG[0],@MSG[1],@MSG[2]
+___
+for ($j=0,$i=1;$i<20-3;$i++) {
+my $f=("c","p","m","p")[$i/5];
+$code.=<<___;
+ sha1h $E0,$ABCD // $i
+ sha1$f $ABCD,$E1,$W1
+ add.i32 $W1,@Kxx[$j],@MSG[3]
+ sha1su1 @MSG[0],@MSG[3]
+___
+$code.=<<___ if ($i<20-4);
+ sha1su0 @MSG[1],@MSG[2],@MSG[3]
+___
+ ($E0,$E1)=($E1,$E0); ($W0,$W1)=($W1,$W0);
+ push(@MSG,shift(@MSG)); $j++ if ((($i+3)%5)==0);
+}
+$code.=<<___;
+ sha1h $E0,$ABCD // $i
+ sha1p $ABCD,$E1,$W1
+ add.i32 $W1,@Kxx[$j],@MSG[3]
+
+ sha1h $E1,$ABCD // 18
+ sha1p $ABCD,$E0,$W0
+
+ sha1h $E0,$ABCD // 19
+ sha1p $ABCD,$E1,$W1
+
+ add.i32 $E,$E,$E0
+ add.i32 $ABCD,$ABCD,$ABCD_SAVE
+
+ cbnz $num,.Loop_hw
+
+ st1.32 {$ABCD},[$ctx],#16
+ st1.32 {$E}[0],[$ctx]
+
+ ldr x29,[sp],#16
+ ret
+.size sha1_block_armv8,.-sha1_block_armv8
+.align 6
+.Lconst:
+.long 0x5a827999,0x5a827999,0x5a827999,0x5a827999 //K_00_19
+.long 0x6ed9eba1,0x6ed9eba1,0x6ed9eba1,0x6ed9eba1 //K_20_39
+.long 0x8f1bbcdc,0x8f1bbcdc,0x8f1bbcdc,0x8f1bbcdc //K_40_59
+.long 0xca62c1d6,0xca62c1d6,0xca62c1d6,0xca62c1d6 //K_60_79
+.LOPENSSL_armcap_P:
+.quad OPENSSL_armcap_P-.
+.asciz "SHA1 block transform for ARMv8, CRYPTOGAMS by <appro\@openssl.org>"
+.align 2
+.comm OPENSSL_armcap_P,4,4
+___
+}}}
+
+{ my %opcode = (
+ "sha1c" => 0x5e000000, "sha1p" => 0x5e001000,
+ "sha1m" => 0x5e002000, "sha1su0" => 0x5e003000,
+ "sha1h" => 0x5e280800, "sha1su1" => 0x5e281800 );
+
+ sub unsha1 {
+ my ($mnemonic,$arg)=@_;
+
+ $arg =~ m/[qv]([0-9]+)[^,]*,\s*[qv]([0-9]+)[^,]*(?:,\s*[qv]([0-9]+))?/o
+ &&
+ sprintf ".inst\t0x%08x\t//%s %s",
+ $opcode{$mnemonic}|$1|($2<<5)|($3<<16),
+ $mnemonic,$arg;
+ }
+}
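unsha1() packs the three register numbers into fixed bit fields of the table entries. A minimal Python sketch of that packing (opcode values copied from the %opcode hash above; `sha1_inst` is a hypothetical name):

```python
# Opcode table copied from the %opcode hash in the patch.
OPCODES = {"sha1c": 0x5e000000, "sha1p": 0x5e001000,
           "sha1m": 0x5e002000, "sha1su0": 0x5e003000,
           "sha1h": 0x5e280800, "sha1su1": 0x5e281800}

def sha1_inst(mnemonic, d, n, m=0):
    # Pack Vd into bits 0-4, Vn into bits 5-9 and Vm into bits 16-20,
    # exactly as unsha1() does; the two-operand forms (sha1h, sha1su1)
    # simply leave the Vm field zero, matching the Perl's optional $3.
    return OPCODES[mnemonic] | d | (n << 5) | (m << 16)

print(".inst\t0x%08x" % sha1_inst("sha1c", 0, 1, 2))
```

So "sha1c v0,v1,v2" becomes the .inst word 0x5e020020, which is what the regex-driven substitution in the foreach loop below emits.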
+
+foreach(split("\n",$code)) {
+
+ s/\`([^\`]*)\`/eval($1)/geo;
+
+ s/\b(sha1\w+)\s+([qv].*)/unsha1($1,$2)/geo;
+
+ s/\.\w?32\b//o and s/\.16b/\.4s/go;
+ m/(ld|st)1[^\[]+\[0\]/o and s/\.4s/\.s/go;
+
+ print $_,"\n";
+}
+
+close STDOUT;
diff --git a/crypto/sha/asm/sha256-armv4.pl b/crypto/sha/asm/sha256-armv4.pl
index 9c84e8d..252a583 100644
--- a/crypto/sha/asm/sha256-armv4.pl
+++ b/crypto/sha/asm/sha256-armv4.pl
@@ -23,8 +23,20 @@
# Profiler-assisted and platform-specific optimization resulted in 16%
# improvement on Cortex A8 core and ~17 cycles per processed byte.
-while (($output=shift) && ($output!~/^\w[\w\-]*\.\w+$/)) {}
-open STDOUT,">$output";
+$flavour = shift;
+if ($flavour=~/^\w[\w\-]*\.\w+$/) { $output=$flavour; undef $flavour; }
+else { while (($output=shift) && ($output!~/^\w[\w\-]*\.\w+$/)) {} }
+
+if ($flavour && $flavour ne "void") {
+ $0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
+ ( $xlate="${dir}arm-xlate.pl" and -f $xlate ) or
+ ( $xlate="${dir}../../perlasm/arm-xlate.pl" and -f $xlate) or
+ die "can't locate arm-xlate.pl";
+
+ open STDOUT,"| \"$^X\" $xlate $flavour $output";
+} else {
+ open STDOUT,">$output";
+}
$ctx="r0"; $t0="r0";
$inp="r1"; $t3="r1";
diff --git a/crypto/sha/asm/sha512-armv4.pl b/crypto/sha/asm/sha512-armv4.pl
index 7faf37b..c032afd 100644
--- a/crypto/sha/asm/sha512-armv4.pl
+++ b/crypto/sha/asm/sha512-armv4.pl
@@ -38,8 +38,20 @@ $hi="HI";
$lo="LO";
# ====================================================================
-while (($output=shift) && ($output!~/^\w[\w\-]*\.\w+$/)) {}
-open STDOUT,">$output";
+$flavour = shift;
+if ($flavour=~/^\w[\w\-]*\.\w+$/) { $output=$flavour; undef $flavour; }
+else { while (($output=shift) && ($output!~/^\w[\w\-]*\.\w+$/)) {} }
+
+if ($flavour && $flavour ne "void") {
+ $0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
+ ( $xlate="${dir}arm-xlate.pl" and -f $xlate ) or
+ ( $xlate="${dir}../../perlasm/arm-xlate.pl" and -f $xlate) or
+ die "can't locate arm-xlate.pl";
+
+ open STDOUT,"| \"$^X\" $xlate $flavour $output";
+} else {
+ open STDOUT,">$output";
+}
$ctx="r0"; # parameter block
$inp="r1";
@@ -221,17 +233,21 @@ WORD64(0x4cc5d4be,0xcb3e42b6, 0x597f299c,0xfc657e2a)
WORD64(0x5fcb6fab,0x3ad6faec, 0x6c44198c,0x4a475817)
.size K512,.-K512
.LOPENSSL_armcap:
-.word OPENSSL_armcap_P-sha512_block_data_order
+.word OPENSSL_armcap_P-.Lsha512_block_data_order
.skip 32-4
.global sha512_block_data_order
.type sha512_block_data_order,%function
sha512_block_data_order:
+.Lsha512_block_data_order:
sub r3,pc,#8 @ sha512_block_data_order
add $len,$inp,$len,lsl#7 @ len to point at the end of inp
#if __ARM_ARCH__>=7
ldr r12,.LOPENSSL_armcap
ldr r12,[r3,r12] @ OPENSSL_armcap_P
+#ifdef __APPLE__
+ ldr r12,[r12]
+#endif
tst r12,#1
bne .LNEON
#endif
diff --git a/crypto/sha/asm/sha512-armv8.pl b/crypto/sha/asm/sha512-armv8.pl
new file mode 100644
index 0000000..45eb719
--- /dev/null
+++ b/crypto/sha/asm/sha512-armv8.pl
@@ -0,0 +1,428 @@
+#!/usr/bin/env perl
+#
+# ====================================================================
+# Written by Andy Polyakov <appro at openssl.org> for the OpenSSL
+# project. The module is, however, dual licensed under OpenSSL and
+# CRYPTOGAMS licenses depending on where you obtain it. For further
+# details see http://www.openssl.org/~appro/cryptogams/.
+# ====================================================================
+#
+# SHA256/512 for ARMv8.
+#
+# Performance in cycles per processed byte and improvement coefficient
+# over code generated with "default" compiler:
+#
+# SHA256-hw SHA256(*) SHA512
+# Apple A7 1.97 10.5 (+33%) 6.73 (-1%(**))
+# Cortex-A53 2.38 15.6 (+110%) 10.1 (+190%(***))
+# Cortex-A57 2.31 11.6 (+86%) 7.51 (+260%(***))
+#
+# (*) Software SHA256 results are of lesser relevance, presented
+# mostly for informational purposes.
+# (**) The result is a trade-off: it's possible to improve it by
+# 10% (or by 1 cycle per round), but at the cost of 20% loss
+# on Cortex-A53 (or by 4 cycles per round).
+# (***) Super-impressive coefficients over gcc-generated code are an
+# indication of some compiler "pathology", most notably code
+# generated with -mgeneral-regs-only is significantly faster
+# and lags behind assembly only by 50-90%.
+
+$flavour=shift;
+$output=shift;
+
+$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
+( $xlate="${dir}arm-xlate.pl" and -f $xlate ) or
+( $xlate="${dir}../../perlasm/arm-xlate.pl" and -f $xlate) or
+die "can't locate arm-xlate.pl";
+
+open OUT,"| \"$^X\" $xlate $flavour $output";
+*STDOUT=*OUT;
+
+if ($output =~ /512/) {
+ $BITS=512;
+ $SZ=8;
+ @Sigma0=(28,34,39);
+ @Sigma1=(14,18,41);
+ @sigma0=(1, 8, 7);
+ @sigma1=(19,61, 6);
+ $rounds=80;
+ $reg_t="x";
+} else {
+ $BITS=256;
+ $SZ=4;
+ @Sigma0=( 2,13,22);
+ @Sigma1=( 6,11,25);
+ @sigma0=( 7,18, 3);
+ @sigma1=(17,19,10);
+ $rounds=64;
+ $reg_t="w";
+}
+
+$func="sha${BITS}_block_data_order";
+
+($ctx,$inp,$num,$Ktbl)=map("x$_",(0..2,30));
+
+@X=map("$reg_t$_",(3..15,0..2));
+@V=($A,$B,$C,$D,$E,$F,$G,$H)=map("$reg_t$_",(20..27));
+($t0,$t1,$t2,$t3)=map("$reg_t$_",(16,17,19,28));
+
+sub BODY_00_xx {
+my ($i,$a,$b,$c,$d,$e,$f,$g,$h)=@_;
+my $j=($i+1)&15;
+my ($T0,$T1,$T2)=(@X[($i-8)&15],@X[($i-9)&15],@X[($i-10)&15]);
+ $T0=@X[$i+3] if ($i<11);
+
+$code.=<<___ if ($i<16);
+#ifndef __ARMEB__
+ rev @X[$i],@X[$i] // $i
+#endif
+___
+$code.=<<___ if ($i<13 && ($i&1));
+ ldp @X[$i+1],@X[$i+2],[$inp],#2*$SZ
+___
+$code.=<<___ if ($i==13);
+ ldp @X[14],@X[15],[$inp]
+___
+$code.=<<___ if ($i>=14);
+ ldr @X[($i-11)&15],[sp,#`$SZ*(($i-11)%4)`]
+___
+$code.=<<___ if ($i>0 && $i<16);
+ add $a,$a,$t1 // h+=Sigma0(a)
+___
+$code.=<<___ if ($i>=11);
+ str @X[($i-8)&15],[sp,#`$SZ*(($i-8)%4)`]
+___
+# While ARMv8 specifies merged rotate-n-logical operations such as
+# 'eor x,y,z,ror#n', they were found to negatively affect performance
+# on Apple A7. The reason seems to be that they require even 'y' to
+# be available earlier. This means that such a merged instruction is
+# not necessarily the best choice on the critical path... On the other
+# hand Cortex-A5x handles merged instructions much better than disjoint
+# rotate and logical... See (**) footnote above.
+$code.=<<___ if ($i<15);
+ ror $t0,$e,#$Sigma1[0]
+ add $h,$h,$t2 // h+=K[i]
+ eor $T0,$e,$e,ror#`$Sigma1[2]-$Sigma1[1]`
+ and $t1,$f,$e
+ bic $t2,$g,$e
+ add $h,$h,@X[$i&15] // h+=X[i]
+ orr $t1,$t1,$t2 // Ch(e,f,g)
+ eor $t2,$a,$b // a^b, b^c in next round
+ eor $t0,$t0,$T0,ror#$Sigma1[1] // Sigma1(e)
+ ror $T0,$a,#$Sigma0[0]
+ add $h,$h,$t1 // h+=Ch(e,f,g)
+ eor $t1,$a,$a,ror#`$Sigma0[2]-$Sigma0[1]`
+ add $h,$h,$t0 // h+=Sigma1(e)
+ and $t3,$t3,$t2 // (b^c)&=(a^b)
+ add $d,$d,$h // d+=h
+ eor $t3,$t3,$b // Maj(a,b,c)
+ eor $t1,$T0,$t1,ror#$Sigma0[1] // Sigma0(a)
+ add $h,$h,$t3 // h+=Maj(a,b,c)
+ ldr $t3,[$Ktbl],#$SZ // *K++, $t2 in next round
+ //add $h,$h,$t1 // h+=Sigma0(a)
+___
+$code.=<<___ if ($i>=15);
+ ror $t0,$e,#$Sigma1[0]
+ add $h,$h,$t2 // h+=K[i]
+ ror $T1, at X[($j+1)&15],#$sigma0[0]
+ and $t1,$f,$e
+ ror $T2, at X[($j+14)&15],#$sigma1[0]
+ bic $t2,$g,$e
+ ror $T0,$a,#$Sigma0[0]
+ add $h,$h,@X[$i&15] // h+=X[i]
+ eor $t0,$t0,$e,ror#$Sigma1[1]
+ eor $T1,$T1,@X[($j+1)&15],ror#$sigma0[1]
+ orr $t1,$t1,$t2 // Ch(e,f,g)
+ eor $t2,$a,$b // a^b, b^c in next round
+ eor $t0,$t0,$e,ror#$Sigma1[2] // Sigma1(e)
+ eor $T0,$T0,$a,ror#$Sigma0[1]
+ add $h,$h,$t1 // h+=Ch(e,f,g)
+ and $t3,$t3,$t2 // (b^c)&=(a^b)
+ eor $T2,$T2,@X[($j+14)&15],ror#$sigma1[1]
+ eor $T1,$T1,@X[($j+1)&15],lsr#$sigma0[2] // sigma0(X[i+1])
+ add $h,$h,$t0 // h+=Sigma1(e)
+ eor $t3,$t3,$b // Maj(a,b,c)
+ eor $t1,$T0,$a,ror#$Sigma0[2] // Sigma0(a)
+ eor $T2,$T2,@X[($j+14)&15],lsr#$sigma1[2] // sigma1(X[i+14])
+ add @X[$j],@X[$j],@X[($j+9)&15]
+ add $d,$d,$h // d+=h
+ add $h,$h,$t3 // h+=Maj(a,b,c)
+ ldr $t3,[$Ktbl],#$SZ // *K++, $t2 in next round
+ add @X[$j],@X[$j],$T1
+ add $h,$h,$t1 // h+=Sigma0(a)
+ add @X[$j],@X[$j],$T2
+___
+ ($t2,$t3)=($t3,$t2);
+}
+
+$code.=<<___;
+#include "arm_arch.h"
+
+.text
+
+.extern OPENSSL_armcap_P
+.globl $func
+.type $func,%function
+.align 6
+$func:
+___
+$code.=<<___ if ($SZ==4);
+ ldr x16,.LOPENSSL_armcap_P
+ adr x17,.LOPENSSL_armcap_P
+ add x16,x16,x17
+ ldr w16,[x16]
+ tst w16,#ARMV8_SHA256
+ b.ne .Lv8_entry
+___
+$code.=<<___;
+ stp x29,x30,[sp,#-128]!
+ add x29,sp,#0
+
+ stp x19,x20,[sp,#16]
+ stp x21,x22,[sp,#32]
+ stp x23,x24,[sp,#48]
+ stp x25,x26,[sp,#64]
+ stp x27,x28,[sp,#80]
+ sub sp,sp,#4*$SZ
+
+ ldp $A,$B,[$ctx] // load context
+ ldp $C,$D,[$ctx,#2*$SZ]
+ ldp $E,$F,[$ctx,#4*$SZ]
+ add $num,$inp,$num,lsl#`log(16*$SZ)/log(2)` // end of input
+ ldp $G,$H,[$ctx,#6*$SZ]
+ adr $Ktbl,.LK$BITS
+ stp $ctx,$num,[x29,#96]
+
+.Loop:
+ ldp @X[0],@X[1],[$inp],#2*$SZ
+ ldr $t2,[$Ktbl],#$SZ // *K++
+ eor $t3,$B,$C // magic seed
+ str $inp,[x29,#112]
+___
+for ($i=0;$i<16;$i++) { &BODY_00_xx($i,@V); unshift(@V,pop(@V)); }
+$code.=".Loop_16_xx:\n";
+for (;$i<32;$i++) { &BODY_00_xx($i,@V); unshift(@V,pop(@V)); }
+$code.=<<___;
+ cbnz $t2,.Loop_16_xx
+
+ ldp $ctx,$num,[x29,#96]
+ ldr $inp,[x29,#112]
+ sub $Ktbl,$Ktbl,#`$SZ*($rounds+1)` // rewind
+
+ ldp @X[0],@X[1],[$ctx]
+ ldp @X[2],@X[3],[$ctx,#2*$SZ]
+ add $inp,$inp,#14*$SZ // advance input pointer
+ ldp @X[4],@X[5],[$ctx,#4*$SZ]
+ add $A,$A,@X[0]
+ ldp @X[6],@X[7],[$ctx,#6*$SZ]
+ add $B,$B,@X[1]
+ add $C,$C,@X[2]
+ add $D,$D,@X[3]
+ stp $A,$B,[$ctx]
+ add $E,$E,@X[4]
+ add $F,$F,@X[5]
+ stp $C,$D,[$ctx,#2*$SZ]
+ add $G,$G,@X[6]
+ add $H,$H,@X[7]
+ cmp $inp,$num
+ stp $E,$F,[$ctx,#4*$SZ]
+ stp $G,$H,[$ctx,#6*$SZ]
+ b.ne .Loop
+
+ ldp x19,x20,[x29,#16]
+ add sp,sp,#4*$SZ
+ ldp x21,x22,[x29,#32]
+ ldp x23,x24,[x29,#48]
+ ldp x25,x26,[x29,#64]
+ ldp x27,x28,[x29,#80]
+ ldp x29,x30,[sp],#128
+ ret
+.size $func,.-$func
+
+.align 6
+.type .LK$BITS,%object
+.LK$BITS:
+___
+$code.=<<___ if ($SZ==8);
+ .quad 0x428a2f98d728ae22,0x7137449123ef65cd
+ .quad 0xb5c0fbcfec4d3b2f,0xe9b5dba58189dbbc
+ .quad 0x3956c25bf348b538,0x59f111f1b605d019
+ .quad 0x923f82a4af194f9b,0xab1c5ed5da6d8118
+ .quad 0xd807aa98a3030242,0x12835b0145706fbe
+ .quad 0x243185be4ee4b28c,0x550c7dc3d5ffb4e2
+ .quad 0x72be5d74f27b896f,0x80deb1fe3b1696b1
+ .quad 0x9bdc06a725c71235,0xc19bf174cf692694
+ .quad 0xe49b69c19ef14ad2,0xefbe4786384f25e3
+ .quad 0x0fc19dc68b8cd5b5,0x240ca1cc77ac9c65
+ .quad 0x2de92c6f592b0275,0x4a7484aa6ea6e483
+ .quad 0x5cb0a9dcbd41fbd4,0x76f988da831153b5
+ .quad 0x983e5152ee66dfab,0xa831c66d2db43210
+ .quad 0xb00327c898fb213f,0xbf597fc7beef0ee4
+ .quad 0xc6e00bf33da88fc2,0xd5a79147930aa725
+ .quad 0x06ca6351e003826f,0x142929670a0e6e70
+ .quad 0x27b70a8546d22ffc,0x2e1b21385c26c926
+ .quad 0x4d2c6dfc5ac42aed,0x53380d139d95b3df
+ .quad 0x650a73548baf63de,0x766a0abb3c77b2a8
+ .quad 0x81c2c92e47edaee6,0x92722c851482353b
+ .quad 0xa2bfe8a14cf10364,0xa81a664bbc423001
+ .quad 0xc24b8b70d0f89791,0xc76c51a30654be30
+ .quad 0xd192e819d6ef5218,0xd69906245565a910
+ .quad 0xf40e35855771202a,0x106aa07032bbd1b8
+ .quad 0x19a4c116b8d2d0c8,0x1e376c085141ab53
+ .quad 0x2748774cdf8eeb99,0x34b0bcb5e19b48a8
+ .quad 0x391c0cb3c5c95a63,0x4ed8aa4ae3418acb
+ .quad 0x5b9cca4f7763e373,0x682e6ff3d6b2b8a3
+ .quad 0x748f82ee5defb2fc,0x78a5636f43172f60
+ .quad 0x84c87814a1f0ab72,0x8cc702081a6439ec
+ .quad 0x90befffa23631e28,0xa4506cebde82bde9
+ .quad 0xbef9a3f7b2c67915,0xc67178f2e372532b
+ .quad 0xca273eceea26619c,0xd186b8c721c0c207
+ .quad 0xeada7dd6cde0eb1e,0xf57d4f7fee6ed178
+ .quad 0x06f067aa72176fba,0x0a637dc5a2c898a6
+ .quad 0x113f9804bef90dae,0x1b710b35131c471b
+ .quad 0x28db77f523047d84,0x32caab7b40c72493
+ .quad 0x3c9ebe0a15c9bebc,0x431d67c49c100d4c
+ .quad 0x4cc5d4becb3e42b6,0x597f299cfc657e2a
+ .quad 0x5fcb6fab3ad6faec,0x6c44198c4a475817
+ .quad 0 // terminator
+___
+$code.=<<___ if ($SZ==4);
+ .long 0x428a2f98,0x71374491,0xb5c0fbcf,0xe9b5dba5
+ .long 0x3956c25b,0x59f111f1,0x923f82a4,0xab1c5ed5
+ .long 0xd807aa98,0x12835b01,0x243185be,0x550c7dc3
+ .long 0x72be5d74,0x80deb1fe,0x9bdc06a7,0xc19bf174
+ .long 0xe49b69c1,0xefbe4786,0x0fc19dc6,0x240ca1cc
+ .long 0x2de92c6f,0x4a7484aa,0x5cb0a9dc,0x76f988da
+ .long 0x983e5152,0xa831c66d,0xb00327c8,0xbf597fc7
+ .long 0xc6e00bf3,0xd5a79147,0x06ca6351,0x14292967
+ .long 0x27b70a85,0x2e1b2138,0x4d2c6dfc,0x53380d13
+ .long 0x650a7354,0x766a0abb,0x81c2c92e,0x92722c85
+ .long 0xa2bfe8a1,0xa81a664b,0xc24b8b70,0xc76c51a3
+ .long 0xd192e819,0xd6990624,0xf40e3585,0x106aa070
+ .long 0x19a4c116,0x1e376c08,0x2748774c,0x34b0bcb5
+ .long 0x391c0cb3,0x4ed8aa4a,0x5b9cca4f,0x682e6ff3
+ .long 0x748f82ee,0x78a5636f,0x84c87814,0x8cc70208
+ .long 0x90befffa,0xa4506ceb,0xbef9a3f7,0xc67178f2
+ .long 0 //terminator
+___
+$code.=<<___;
+.size .LK$BITS,.-.LK$BITS
+.align 3
+.LOPENSSL_armcap_P:
+ .quad OPENSSL_armcap_P-.
+.asciz "SHA$BITS block transform for ARMv8, CRYPTOGAMS by <appro\@openssl.org>"
+.align 2
+___
+
+if ($SZ==4) {
+my $Ktbl="x3";
+
+my ($ABCD,$EFGH,$abcd)=map("v$_.16b",(0..2));
+my @MSG=map("v$_.16b",(4..7));
+my ($W0,$W1)=("v16.4s","v17.4s");
+my ($ABCD_SAVE,$EFGH_SAVE)=("v18.16b","v19.16b");
+
+$code.=<<___;
+.type sha256_block_armv8,%function
+.align 6
+sha256_block_armv8:
+.Lv8_entry:
+ stp x29,x30,[sp,#-16]!
+ add x29,sp,#0
+
+ ld1.32 {$ABCD,$EFGH},[$ctx]
+ adr $Ktbl,.LK256
+
+.Loop_hw:
+ ld1 {@MSG[0]-@MSG[3]},[$inp],#64
+ sub $num,$num,#1
+ ld1.32 {$W0},[$Ktbl],#16
+ rev32 @MSG[0],@MSG[0]
+ rev32 @MSG[1],@MSG[1]
+ rev32 @MSG[2],@MSG[2]
+ rev32 @MSG[3],@MSG[3]
+ orr $ABCD_SAVE,$ABCD,$ABCD // offload
+ orr $EFGH_SAVE,$EFGH,$EFGH
+___
+for($i=0;$i<12;$i++) {
+$code.=<<___;
+ ld1.32 {$W1},[$Ktbl],#16
+ add.i32 $W0,$W0,@MSG[0]
+ sha256su0 @MSG[0], at MSG[1]
+ orr $abcd,$ABCD,$ABCD
+ sha256h $ABCD,$EFGH,$W0
+ sha256h2 $EFGH,$abcd,$W0
+ sha256su1 @MSG[0],@MSG[2],@MSG[3]
+___
+ ($W0,$W1)=($W1,$W0); push(@MSG,shift(@MSG));
+}
+$code.=<<___;
+ ld1.32 {$W1},[$Ktbl],#16
+ add.i32 $W0,$W0,@MSG[0]
+ orr $abcd,$ABCD,$ABCD
+ sha256h $ABCD,$EFGH,$W0
+ sha256h2 $EFGH,$abcd,$W0
+
+ ld1.32 {$W0},[$Ktbl],#16
+ add.i32 $W1,$W1,@MSG[1]
+ orr $abcd,$ABCD,$ABCD
+ sha256h $ABCD,$EFGH,$W1
+ sha256h2 $EFGH,$abcd,$W1
+
+ ld1.32 {$W1},[$Ktbl]
+ add.i32 $W0,$W0,@MSG[2]
+ sub $Ktbl,$Ktbl,#$rounds*$SZ-16 // rewind
+ orr $abcd,$ABCD,$ABCD
+ sha256h $ABCD,$EFGH,$W0
+ sha256h2 $EFGH,$abcd,$W0
+
+ add.i32 $W1,$W1,@MSG[3]
+ orr $abcd,$ABCD,$ABCD
+ sha256h $ABCD,$EFGH,$W1
+ sha256h2 $EFGH,$abcd,$W1
+
+ add.i32 $ABCD,$ABCD,$ABCD_SAVE
+ add.i32 $EFGH,$EFGH,$EFGH_SAVE
+
+ cbnz $num,.Loop_hw
+
+ st1.32 {$ABCD,$EFGH},[$ctx]
+
+ ldr x29,[sp],#16
+ ret
+.size sha256_block_armv8,.-sha256_block_armv8
+___
+}
+
+$code.=<<___;
+.comm OPENSSL_armcap_P,4,4
+___
+
+{ my %opcode = (
+ "sha256h" => 0x5e004000, "sha256h2" => 0x5e005000,
+ "sha256su0" => 0x5e282800, "sha256su1" => 0x5e006000 );
+
+ sub unsha256 {
+ my ($mnemonic,$arg)=@_;
+
+ $arg =~ m/[qv]([0-9]+)[^,]*,\s*[qv]([0-9]+)[^,]*(?:,\s*[qv]([0-9]+))?/o
+ &&
+ sprintf ".inst\t0x%08x\t//%s %s",
+ $opcode{$mnemonic}|$1|($2<<5)|($3<<16),
+ $mnemonic,$arg;
+ }
+}
+
+foreach(split("\n",$code)) {
+
+ s/\`([^\`]*)\`/eval($1)/geo;
+
+ s/\b(sha256\w+)\s+([qv].*)/unsha256($1,$2)/geo;
+
+ s/\.\w?32\b//o and s/\.16b/\.4s/go;
+ m/(ld|st)1[^\[]+\[0\]/o and s/\.4s/\.s/go;
+
+ print $_,"\n";
+}
+
+close STDOUT;
diff --git a/fips/fips.c b/fips/fips.c
index 8c9e187..0269609 100644
--- a/fips/fips.c
+++ b/fips/fips.c
@@ -151,7 +151,7 @@ extern const unsigned char FIPS_rodata_start[], FIPS_rodata_end[];
#ifdef _TMS320C6X
const
#endif
-unsigned char FIPS_signature [20] = { 0 };
+unsigned char FIPS_signature [20] = { 0, 0xff };
__fips_constseg
static const char FIPS_hmac_key[]="etaonrishdlcupfm";
diff --git a/fips/fips_canister.c b/fips/fips_canister.c
index dcdb067..adbe696 100644
--- a/fips/fips_canister.c
+++ b/fips/fips_canister.c
@@ -29,6 +29,7 @@ const void *FIPS_text_end(void);
#if !defined(FIPS_REF_POINT_IS_CROSS_COMPILER_AWARE)
# if (defined(__ANDROID__) && (defined(__arm__) || defined(__arm) || \
+ defined(__aarch64__) || \
defined(__i386__)|| defined(__i386))) || \
(defined(__vxworks) && (defined(__ppc__) || defined(__ppc) || \
defined(__mips__)|| defined(__mips))) || \
diff --git a/fips/fips_test_suite.c b/fips/fips_test_suite.c
index e2506ff..3c9bbaa 100644
--- a/fips/fips_test_suite.c
+++ b/fips/fips_test_suite.c
@@ -1325,6 +1325,12 @@ int main(int argc, char **argv)
FIPS_post_set_callback(post_cb);
+#if (defined(__arm__) || defined(__aarch64__))
+ extern unsigned int OPENSSL_armcap_P;
+ if (0 == OPENSSL_armcap_P)
+ fprintf(stderr, "Optimizations disabled\n");
+#endif
+
printf("\tFIPS-mode test application\n");
printf("\t%s\n\n", FIPS_module_version_text());
diff --git a/fips/fipsalgtest.pl b/fips/fipsalgtest.pl
index 672c261..3009521 100644
--- a/fips/fipsalgtest.pl
+++ b/fips/fipsalgtest.pl
@@ -7,17 +7,6 @@
# FIPS test definitions
# List of all the unqualified file names we expect and command lines to run
-# DSA tests
-my @fips_dsa_test_list = (
-
- "DSA",
-
- [ "PQGGen", "fips_dssvs pqg", "path:[^C]DSA/.*PQGGen" ],
- [ "KeyPair", "fips_dssvs keypair", "path:[^C]DSA/.*KeyPair" ],
- [ "SigGen", "fips_dssvs siggen", "path:[^C]DSA/.*SigGen" ],
- [ "SigVer", "fips_dssvs sigver", "path:[^C]DSA/.*SigVer" ]
-
-);
my @fips_dsa_pqgver_test_list = (
"DSA",
@@ -38,16 +27,7 @@ my @fips_dsa2_test_list = (
);
-# ECDSA and ECDSA2 tests
-my @fips_ecdsa_test_list = (
-
- "ECDSA",
-
- [ "KeyPair", "fips_ecdsavs KeyPair", "path:/ECDSA/.*KeyPair" ],
- [ "PKV", "fips_ecdsavs PKV", "path:/ECDSA/.*PKV" ],
- [ "SigGen", "fips_ecdsavs SigGen", "path:/ECDSA/.*SigGen" ],
- [ "SigVer", "fips_ecdsavs SigVer", "path:/ECDSA/.*SigVer" ],
-);
+# ECDSA2 tests
my @fips_ecdsa2_test_list = (
"ECDSA2",
@@ -357,10 +337,8 @@ my @fips_des3_test_list = (
"Triple DES",
[ "TCBCinvperm", "fips_desmovs -f" ],
- [ "TCBCMMT1", "fips_desmovs -f" ],
[ "TCBCMMT2", "fips_desmovs -f" ],
[ "TCBCMMT3", "fips_desmovs -f" ],
- [ "TCBCMonte1", "fips_desmovs -f" ],
[ "TCBCMonte2", "fips_desmovs -f" ],
[ "TCBCMonte3", "fips_desmovs -f" ],
[ "TCBCpermop", "fips_desmovs -f" ],
@@ -368,10 +346,8 @@ my @fips_des3_test_list = (
[ "TCBCvarkey", "fips_desmovs -f" ],
[ "TCBCvartext", "fips_desmovs -f" ],
[ "TCFB64invperm", "fips_desmovs -f" ],
- [ "TCFB64MMT1", "fips_desmovs -f" ],
[ "TCFB64MMT2", "fips_desmovs -f" ],
[ "TCFB64MMT3", "fips_desmovs -f" ],
- [ "TCFB64Monte1", "fips_desmovs -f" ],
[ "TCFB64Monte2", "fips_desmovs -f" ],
[ "TCFB64Monte3", "fips_desmovs -f" ],
[ "TCFB64permop", "fips_desmovs -f" ],
@@ -379,10 +355,8 @@ my @fips_des3_test_list = (
[ "TCFB64varkey", "fips_desmovs -f" ],
[ "TCFB64vartext", "fips_desmovs -f" ],
[ "TCFB8invperm", "fips_desmovs -f" ],
- [ "TCFB8MMT1", "fips_desmovs -f" ],
[ "TCFB8MMT2", "fips_desmovs -f" ],
[ "TCFB8MMT3", "fips_desmovs -f" ],
- [ "TCFB8Monte1", "fips_desmovs -f" ],
[ "TCFB8Monte2", "fips_desmovs -f" ],
[ "TCFB8Monte3", "fips_desmovs -f" ],
[ "TCFB8permop", "fips_desmovs -f" ],
@@ -390,10 +364,8 @@ my @fips_des3_test_list = (
[ "TCFB8varkey", "fips_desmovs -f" ],
[ "TCFB8vartext", "fips_desmovs -f" ],
[ "TECBinvperm", "fips_desmovs -f" ],
- [ "TECBMMT1", "fips_desmovs -f" ],
[ "TECBMMT2", "fips_desmovs -f" ],
[ "TECBMMT3", "fips_desmovs -f" ],
- [ "TECBMonte1", "fips_desmovs -f" ],
[ "TECBMonte2", "fips_desmovs -f" ],
[ "TECBMonte3", "fips_desmovs -f" ],
[ "TECBpermop", "fips_desmovs -f" ],
@@ -401,10 +373,8 @@ my @fips_des3_test_list = (
[ "TECBvarkey", "fips_desmovs -f" ],
[ "TECBvartext", "fips_desmovs -f" ],
[ "TOFBinvperm", "fips_desmovs -f" ],
- [ "TOFBMMT1", "fips_desmovs -f" ],
[ "TOFBMMT2", "fips_desmovs -f" ],
[ "TOFBMMT3", "fips_desmovs -f" ],
- [ "TOFBMonte1", "fips_desmovs -f" ],
[ "TOFBMonte2", "fips_desmovs -f" ],
[ "TOFBMonte3", "fips_desmovs -f" ],
[ "TOFBpermop", "fips_desmovs -f" ],
@@ -419,10 +389,8 @@ my @fips_des3_cfb1_test_list = (
# DES3 CFB1 tests
[ "TCFB1invperm", "fips_desmovs -f" ],
- [ "TCFB1MMT1", "fips_desmovs -f" ],
[ "TCFB1MMT2", "fips_desmovs -f" ],
[ "TCFB1MMT3", "fips_desmovs -f" ],
- [ "TCFB1Monte1", "fips_desmovs -f" ],
[ "TCFB1Monte2", "fips_desmovs -f" ],
[ "TCFB1Monte3", "fips_desmovs -f" ],
[ "TCFB1permop", "fips_desmovs -f" ],
@@ -475,8 +443,6 @@ my @fips_ecdh_test_list = (
#
my %verify_special = (
- "DSA:PQGGen" => "fips_dssvs pqgver",
- "DSA:KeyPair" => "fips_dssvs keyver",
"DSA:SigGen" => "fips_dssvs sigver",
"DSA2:PQGGen" => "fips_dssvs pqgver",
"DSA2:KeyPair" => "fips_dssvs keyver",
@@ -650,10 +616,8 @@ if (!$fips_enabled{"v2"}) {
}
}
-push @fips_test_list, @fips_dsa_test_list if $fips_enabled{"dsa"};
push @fips_test_list, @fips_dsa2_test_list if $fips_enabled{"dsa2"};
push @fips_test_list, @fips_dsa_pqgver_test_list if $fips_enabled{"dsa-pqgver"};
-push @fips_test_list, @fips_ecdsa_test_list if $fips_enabled{"ecdsa"};
push @fips_test_list, @fips_ecdsa2_test_list if $fips_enabled{"ecdsa2"};
push @fips_test_list, @fips_rsa_test_list if $fips_enabled{"rsa"};
push @fips_test_list, @fips_rsa_pss0_test_list if $fips_enabled{"rsa-pss0"};
diff --git a/fips/fipssyms.h b/fips/fipssyms.h
index 5719aea..76db619 100644
--- a/fips/fipssyms.h
+++ b/fips/fipssyms.h
@@ -668,6 +668,50 @@
#define bn_mul_mont_gather5 fips_bn_mul_mont_gather5
#define bn_scatter5 fips_bn_scatter5
#define bn_gather5 fips_bn_gather5
+#define _armv8_aes_probe _fips_armv8_aes_probe
+#define _armv8_pmull_probe _fips_armv8_pmull_probe
+#define _armv8_sha1_probe _fips_armv8_sha1_probe
+#define _armv8_sha256_probe _fips_armv8_sha256_probe
+#define aes_v8_encrypt fips_aes_v8_encrypt
+#define aes_v8_decrypt fips_aes_v8_decrypt
+#define aes_v8_set_encrypt_key fips_aes_v8_set_encrypt_key
+#define aes_v8_set_decrypt_key fips_aes_v8_set_decrypt_key
+#define aes_v8_cbc_encrypt fips_aes_v8_cbc_encrypt
+#define aes_v8_ctr32_encrypt_blocks fips_aes_v8_ctr32_encrypt_blocks
+#define gcm_init_v8 fips_gcm_init_v8
+#define gcm_gmult_v8 fips_gcm_gmult_v8
+#define gcm_ghash_v8 fips_gcm_ghash_v8
+#if defined(__APPLE__) && __ASSEMBLER__
+#define _OPENSSL_armcap_P _fips_openssl_armcap_P
+#define __armv7_neon_probe __fips_armv7_neon_probe
+#define __armv7_tick __fips_armv7_tick
+#define __armv8_aes_probe __fips_armv8_aes_probe
+#define __armv8_pmull_probe __fips_armv8_pmull_probe
+#define __armv8_sha1_probe __fips_armv8_sha1_probe
+#define __armv8_sha256_probe __fips_armv8_sha256_probe
+#define _aes_v8_encrypt _fips_aes_v8_encrypt
+#define _aes_v8_decrypt _fips_aes_v8_decrypt
+#define _aes_v8_set_encrypt_key _fips_aes_v8_set_encrypt_key
+#define _aes_v8_set_decrypt_key _fips_aes_v8_set_decrypt_key
+#define _aes_v8_cbc_encrypt _fips_aes_v8_cbc_encrypt
+#define _aes_v8_ctr32_encrypt_blocks _fips_aes_v8_ctr32_encrypt_blocks
+#define _gcm_init_v8 _fips_gcm_init_v8
+#define _gcm_gmult_v8 _fips_gcm_gmult_v8
+#define _gcm_ghash_v8 _fips_gcm_ghash_v8
+#define _sha1_block_data_order _fips_sha1_block_data_order
+#define _sha256_block_data_order _fips_sha256_block_data_order
+#define _sha512_block_data_order _fips_sha512_block_data_order
+#define _AES_decrypt _fips_aes_decrypt
+#define _AES_encrypt _fips_aes_encrypt
+#define _AES_set_decrypt_key _fips_aes_set_decrypt_key
+#define _AES_set_encrypt_key _fips_aes_set_encrypt_key
+#define _gcm_gmult_4bit _fips_gcm_gmult_4bit
+#define _gcm_ghash_4bit _fips_gcm_ghash_4bit
+#define _gcm_gmult_neon _fips_gcm_gmult_neon
+#define _gcm_ghash_neon _fips_gcm_ghash_neon
+#define _bn_GF2m_mul_2x2 _fips_bn_GF2m_mul_2x2
+#define _OPENSSL_cleanse _FIPS_openssl_cleanse
+#endif
#if defined(_MSC_VER)
# pragma const_seg("fipsro$b")
diff --git a/iOS/Makefile b/iOS/Makefile
new file mode 100644
index 0000000..db26da6
--- /dev/null
+++ b/iOS/Makefile
@@ -0,0 +1,76 @@
+#
+# OpenSSL/iOS/Makefile
+#
+
+DIR= iOS
+TOP= ..
+CC= cc
+INCLUDES= -I$(TOP) -I$(TOP)/include
+CFLAG= -g -static
+MAKEFILE= Makefile
+PERL= perl
+RM= rm -f
+
+EXE=incore_macho
+
+CFLAGS= $(INCLUDES) $(CFLAG)
+
+top:
+ @$(MAKE) -f $(TOP)/Makefile reflect THIS=exe
+
+exe: fips_algvs.app/fips_algvs
+
+incore_macho: incore_macho.c $(TOP)/crypto/sha/sha1dgst.c
+ $(HOSTCC) $(HOSTCFLAGS) -I$(TOP)/include -I$(TOP)/crypto -o $@ incore_macho.c $(TOP)/crypto/sha/sha1dgst.c
+
+fips_algvs.app/fips_algvs: $(TOP)/test/fips_algvs.c $(TOP)/fips/fipscanister.o fopen.m incore_macho
+ FIPS_SIG=./incore_macho \
+ $(TOP)/fips/fipsld $(CFLAGS) -I$(TOP)/fips -o $@ \
+ $(TOP)/test/fips_algvs.c $(TOP)/fips/fipscanister.o \
+ fopen.m -framework Foundation || rm $@
+ codesign -f -s "iPhone Developer" --entitlements fips_algvs.app/Entitlements.plist fips_algvs.app || rm $@
+
+install:
+ @[ -n "$(INSTALLTOP)" ] # should be set by top Makefile...
+ @set -e; for i in $(EXE); \
+ do \
+ (echo installing $$i; \
+ cp $$i $(INSTALL_PREFIX)$(INSTALLTOP)/bin/$$i.new; \
+ chmod 755 $(INSTALL_PREFIX)$(INSTALLTOP)/bin/$$i.new; \
+ mv -f $(INSTALL_PREFIX)$(INSTALLTOP)/bin/$$i.new $(INSTALL_PREFIX)$(INSTALLTOP)/bin/$$i ); \
+ done;
+ @set -e; for i in $(SCRIPTS); \
+ do \
+ (echo installing $$i; \
+ cp $$i $(INSTALL_PREFIX)$(OPENSSLDIR)/misc/$$i.new; \
+ chmod 755 $(INSTALL_PREFIX)$(OPENSSLDIR)/misc/$$i.new; \
+ mv -f $(INSTALL_PREFIX)$(OPENSSLDIR)/misc/$$i.new $(INSTALL_PREFIX)$(OPENSSLDIR)/misc/$$i ); \
+ done
+
+tags:
+ ctags $(SRC)
+
+tests:
+
+links:
+
+lint:
+ lint -DLINT $(INCLUDES) $(SRC)>fluff
+
+depend:
+ @if [ -z "$(THIS)" ]; then \
+ $(MAKE) -f $(TOP)/Makefile reflect THIS=$@; \
+ else \
+ $(MAKEDEPEND) -- $(CFLAG) $(INCLUDES) $(DEPFLAG) -- $(PROGS) $(SRC); \
+ fi
+
+dclean:
+ $(PERL) -pe 'if (/^# DO NOT DELETE THIS LINE/) {print; exit(0);}' $(MAKEFILE) >Makefile.new
+ mv -f Makefile.new $(MAKEFILE)
+
+clean:
+ rm -f *.o *.obj lib tags core .pure .nfs* *.old *.bak fluff $(EXE)
+ rm -f fips_algvs.app/fips_algvs
+
+# DO NOT DELETE THIS LINE -- make depend depends on it.
+
diff --git a/iOS/fips_algvs.app/Entitlements.plist b/iOS/fips_algvs.app/Entitlements.plist
new file mode 100644
index 0000000..929c4e9
--- /dev/null
+++ b/iOS/fips_algvs.app/Entitlements.plist
@@ -0,0 +1,8 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
+<plist version="1.0">
+<dict>
+ <key>get-task-allow</key>
+ <true/>
+</dict>
+</plist>
\ No newline at end of file
diff --git a/iOS/fips_algvs.app/Info.plist b/iOS/fips_algvs.app/Info.plist
new file mode 100644
index 0000000..3fd8fb4
--- /dev/null
+++ b/iOS/fips_algvs.app/Info.plist
@@ -0,0 +1,24 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
+<plist version="1.0">
+<dict>
+ <key>CFBundleName</key>
+ <string>fips_algvs</string>
+ <key>CFBundleSupportedPlatforms</key>
+ <array>
+ <string>iPhoneOS</string>
+ </array>
+ <key>CFBundleExecutable</key>
+ <string>fips_algvs</string>
+ <key>CFBundleIdentifier</key>
+ <string>fips_algvs</string>
+ <key>CFBundleResourceSpecification</key>
+ <string>ResourceRules.plist</string>
+ <key>LSRequiresIPhoneOS</key>
+ <true/>
+ <key>CFBundleDisplayName</key>
+ <string>fips_algvs</string>
+ <key>CFBundleVersion</key>
+ <string>1.0</string>
+</dict>
+</plist>
diff --git a/iOS/fips_algvs.app/ResourceRules.plist b/iOS/fips_algvs.app/ResourceRules.plist
new file mode 100644
index 0000000..e7ec329
--- /dev/null
+++ b/iOS/fips_algvs.app/ResourceRules.plist
@@ -0,0 +1,25 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
+<plist version="1.0">
+<dict>
+ <key>rules</key>
+ <dict>
+ <key>.*</key>
+ <true/>
+ <key>Info.plist</key>
+ <dict>
+ <key>omit</key>
+ <true/>
+ <key>weight</key>
+ <real>10</real>
+ </dict>
+ <key>ResourceRules.plist</key>
+ <dict>
+ <key>omit</key>
+ <true/>
+ <key>weight</key>
+ <real>100</real>
+ </dict>
+ </dict>
+</dict>
+</plist>
diff --git a/iOS/fopen.m b/iOS/fopen.m
new file mode 100644
index 0000000..8d2e790
--- /dev/null
+++ b/iOS/fopen.m
@@ -0,0 +1,93 @@
+#include <stdio.h>
+#include <dlfcn.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <unistd.h>
+#include <Foundation/Foundation.h>
+
+static FILE *(*libc_fopen)(const char *, const char *) = NULL;
+
+__attribute__((constructor))
+static void pre_main(void)
+{
+ /*
+ * Pull reference to fopen(3) from libc.
+ */
+ void *handle = dlopen("libSystem.B.dylib",RTLD_LAZY);
+
+ if (handle) {
+ libc_fopen = dlsym(handle,"fopen");
+ dlclose(handle);
+ }
+
+ /*
+ * Change to Documents directory.
+ */
+ NSString *docs = [NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES) lastObject];
+
+ NSFileManager *filemgr = [NSFileManager defaultManager];
+ [filemgr changeCurrentDirectoryPath: docs];
+ [filemgr release];
+}
+
+char *mkdirhier(char *path)
+{
+ char *slash;
+ struct stat buf;
+
+ if (path[0]=='.' && path[1]=='/') path+=2;
+
+ if ((slash = strrchr(path,'/'))) {
+ *slash = '\0';
+ if (stat(path,&buf)==0) {
+ *slash = '/';
+ return NULL;
+ }
+ (void)mkdirhier(path);
+ mkdir (path,0777);
+ *slash = '/';
+ }
+
+ return slash;
+}
+/*
+ * Replacement fopen(3)
+ */
+FILE *fopen(const char *filename, const char *mode)
+{
+ FILE *ret;
+
+ if ((ret = (*libc_fopen)(filename,mode)) == NULL) {
+ /*
+ * If file is not present in Documents directory, try from Bundle.
+ */
+ NSString *nsspath = [NSString stringWithFormat:@"%@/%s",
+ [[NSBundle mainBundle] bundlePath],
+ filename];
+
+ if ((ret = (*libc_fopen)([nsspath cStringUsingEncoding:NSUTF8StringEncoding],mode)) == NULL &&
+ mode[0]=='w' &&
+ ((filename[0]!='.' && filename[0]!='/') ||
+ (filename[0]=='.' && filename[1]=='/')) ) {
+ /*
+ * If not present in Bundle, create directory in Documents
+ */
+ char *path = strdup(filename), *slash;
+ static int once = 1;
+
+ if ((slash = mkdirhier(path)) && once) {
+ /*
+ * For some reason iOS truncates first created file
+ * upon program exit, so we create one preemptively...
+ */
+ once = 0;
+ strcpy(slash,"/.0");
+ creat(path,0444);
+ }
+ free(path);
+ ret = (*libc_fopen)(filename,mode);
+ }
+ }
+
+ return ret;
+}
diff --git a/iOS/incore_macho.c b/iOS/incore_macho.c
new file mode 100644
index 0000000..8842764
--- /dev/null
+++ b/iOS/incore_macho.c
@@ -0,0 +1,1016 @@
+/* incore_macho.c */
+/* ====================================================================
+ * Copyright (c) 2011 The OpenSSL Project. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ *
+ * 3. All advertising materials mentioning features or use of this
+ * software must display the following acknowledgment:
+ * "This product includes software developed by the OpenSSL Project
+ * for use in the OpenSSL Toolkit. (http://www.openssl.org/)"
+ *
+ * 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to
+ * endorse or promote products derived from this software without
+ * prior written permission. For written permission, please contact
+ * openssl-core@openssl.org.
+ *
+ * 5. Products derived from this software may not be called "OpenSSL"
+ * nor may "OpenSSL" appear in their names without prior written
+ * permission of the OpenSSL Project.
+ *
+ * 6. Redistributions of any form whatsoever must retain the following
+ * acknowledgment:
+ * "This product includes software developed by the OpenSSL Project
+ * for use in the OpenSSL Toolkit (http://www.openssl.org/)"
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY
+ * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OpenSSL PROJECT OR
+ * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ */
+
+/* ====================================================================
+ * Copyright 2011 Thursby Software Systems, Inc. All rights reserved.
+ *
+ * The portions of the attached software ("Contribution") is developed by
+ * Thursby Software Systems, Inc and is licensed pursuant to the OpenSSL
+ * open source license.
+ *
+ * The Contribution, originally written by Paul W. Nelson of
+ * Thursby Software Systems, Inc, consists of the fingerprint calculation
+ * required for the FIPS140 integrity check.
+ *
+ * No patent licenses or other rights except those expressly stated in
+ * the OpenSSL open source license shall be deemed granted or received
+ * expressly, by implication, estoppel, or otherwise.
+ *
+ * No assurances are provided by Thursby that the Contribution does not
+ * infringe the patent or other intellectual property rights of any third
+ * party or that the license provides you with all the necessary rights
+ * to make use of the Contribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND. IN
+ * ADDITION TO THE DISCLAIMERS INCLUDED IN THE LICENSE, THURSBY
+ * SPECIFICALLY DISCLAIMS ANY LIABILITY FOR CLAIMS BROUGHT BY YOU OR ANY
+ * OTHER ENTITY BASED ON INFRINGEMENT OF INTELLECTUAL PROPERTY RIGHTS OR
+ * OTHERWISE.
+ */
+
+#include <stdio.h>
+#include <ctype.h>
+#include <mach-o/loader.h>
+#include <mach-o/nlist.h>
+#include <mach-o/stab.h>
+#include <mach-o/reloc.h>
+#include <mach-o/fat.h>
+#include <fcntl.h>
+#include <sys/mman.h>
+#include <sys/vmparam.h>
+#include <sys/stat.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <openssl/crypto.h>
+#include <openssl/sha.h>
+#include <openssl/hmac.h>
+#include <openssl/fips.h>
+
+#ifndef CPU_SUBRTPE_V7F
+# define CPU_SUBRTPE_V7F ((cpu_subtype_t) 10)
+#endif
+/* iPhone 5 and iPad 4 (A6 Processors) */
+#ifndef CPU_SUBTYPE_ARM_V7S
+# define CPU_SUBTYPE_ARM_V7S ((cpu_subtype_t) 11)
+#endif
+#ifndef CPU_SUBTYPE_ARM_V7K
+# define CPU_SUBTYPE_ARM_V7K ((cpu_subtype_t) 12)
+#endif
+#ifndef CPU_SUBTYPE_ARM_V8
+# define CPU_SUBTYPE_ARM_V8 ((cpu_subtype_t) 13)
+#endif
+
+#ifndef CPU_TYPE_ARM64
+# define CPU_TYPE_ARM64 (CPU_TYPE_ARM | CPU_ARCH_ABI64)
+#endif
+
+static int gVerbosity = 0;
+
+static void hexdump(const unsigned char *buf,size_t len,
+ unsigned long address,FILE* fp)
+{
+ unsigned long addr;
+ int i;
+
+ addr = 0;
+ while(addr<len)
+ {
+ fprintf(fp,"%6.6lx - ",addr+address);
+ for(i=0;i<16;i++)
+ {
+ if(addr+i<len)
+ fprintf(fp,"%2.2x ",buf[addr+i]);
+ else
+ fprintf(fp," ");
+ }
+ fprintf(fp," \"");
+ for(i=0;i<16;i++)
+ {
+ if(addr+i<len)
+ {
+ if(isprint(buf[addr+i]) && (buf[addr+i]<0x7e) )
+ putc(buf[addr+i],fp);
+ else
+ putc('.',fp);
+ }
+ }
+ fprintf(fp,"\"\n");
+ addr += 16;
+ }
+ fflush(fp);
+}
+
+struct segment_rec;
+typedef struct section_rec {
+ char sectname[16];
+ char segname[16];
+ uint64_t addr;
+ uint64_t size;
+ uint32_t offset;
+ uint32_t align;
+ uint32_t reloff;
+ uint32_t nreloc;
+ uint32_t flags;
+ struct segment_rec* segment;
+ struct section_rec* _next;
+} section_t;
+
+typedef struct segment_rec {
+ char segname[16];
+ uint64_t vmaddr;
+ uint64_t vmsize;
+ off_t fileoff;
+ uint64_t filesize;
+ vm_prot_t maxprot;
+ vm_prot_t initprot;
+ uint32_t nsects;
+ uint32_t flags;
+ unsigned char* mapped;
+ struct segment_rec* _next;
+} segment_t;
+
+typedef struct symtab_entry_rec {
+ uint32_t n_strx;
+ uint8_t n_type;
+ uint8_t n_sect;
+ int16_t n_desc;
+ uint64_t n_value;
+ const char * n_symbol;
+ section_t* section;
+ unsigned char* mapped; /* pointer to the actual data in mapped file */
+ struct symtab_entry_rec* _next;
+} symtab_entry_t;
+
+
+typedef struct macho_file_rec
+{
+ const char * filename;
+ void* mapped;
+ size_t size; /* number of valid bytes at 'mapped' */
+ uint32_t align; /* byte alignment for this arch */
+ int isBigEndian;/* 1 if everything is byte swapped */
+
+ cpu_type_t cpu_type;
+ cpu_subtype_t cpu_subtype;
+ section_t* sec_head;
+ section_t* sec_tail;
+
+ segment_t* seg_head;
+ segment_t* seg_tail;
+
+ symtab_entry_t* sym_head;
+ symtab_entry_t* sym_tail;
+ struct macho_file_rec *next;
+ char * fingerprint_computed;
+ char * fingerprint_original;
+
+} macho_file_t;
+
+static const char *cputype(cpu_type_t cputype, cpu_subtype_t subtype)
+{
+ const char *rval = "unknown";
+ switch( cputype )
+ {
+ case CPU_TYPE_I386: rval = "i386"; break;
+ case CPU_TYPE_X86_64: rval = "x86_64"; break;
+ case CPU_TYPE_ARM64: rval = "aarch64"; break;
+ case CPU_TYPE_ARM:
+ {
+ switch( subtype )
+ {
+ case CPU_SUBTYPE_ARM_V6: rval = "armv6"; break;
+ case CPU_SUBTYPE_ARM_V7: rval = "armv7"; break;
+ case CPU_SUBTYPE_ARM_V7S: rval = "armv7s"; break;
+ case CPU_SUBTYPE_ARM_V7K: rval = "armv7k"; break;
+ case CPU_SUBTYPE_ARM_V8: rval = "armv8"; break;
+ default: rval = "arm"; break;
+ }
+ }
+ }
+ return rval;
+}
+
+static void *add_section( macho_file_t *macho, void *pCommand,
+ uint8_t is64bit, struct segment_rec *segment )
+{
+ void* rval = 0;
+ uint32_t flags;
+
+ section_t* sec = (section_t*)calloc(1, sizeof(section_t));
+ if(!sec) return NULL;
+
+ if(is64bit)
+ {
+ struct section_64* pSec = (struct section_64*)pCommand;
+ flags = pSec->flags;
+ memcpy( sec->sectname, pSec->sectname, 16 );
+ memcpy( sec->segname, pSec->segname, 16 );
+ sec->addr = pSec->addr;
+ sec->size = pSec->size;
+ sec->offset = pSec->offset;
+ sec->align = pSec->align;
+ sec->reloff = pSec->reloff;
+ sec->nreloc = pSec->nreloc;
+ sec->flags = pSec->flags;
+ rval = pCommand + sizeof(struct section_64);
+ }
+ else
+ {
+ struct section* pSec = (struct section*)pCommand;
+ flags = pSec->flags;
+ memcpy( sec->sectname, pSec->sectname, 16 );
+ memcpy( sec->segname, pSec->segname, 16 );
+ sec->addr = pSec->addr;
+ sec->size = pSec->size;
+ sec->offset = pSec->offset;
+ sec->align = pSec->align;
+ sec->reloff = pSec->reloff;
+ sec->nreloc = pSec->nreloc;
+ sec->flags = pSec->flags;
+ rval = pCommand + sizeof(struct section);
+ }
+ if( gVerbosity > 2 )
+ fprintf(stderr, " flags=%x\n", flags);
+ sec->segment = segment;
+ sec->_next = NULL;
+ if( macho->sec_head )
+ macho->sec_tail->_next = sec;
+ else
+ macho->sec_head = sec;
+ macho->sec_tail = sec;
+ return rval;
+}
+
+static section_t *lookup_section(macho_file_t* macho, uint32_t nsect)
+{
+ section_t *rval = macho->sec_head;
+
+ if(nsect == 0) return NULL;
+
+ while( rval != NULL && --nsect > 0 )
+ rval = rval->_next;
+ return rval;
+}
+
+static void *add_segment( macho_file_t *macho, void *pCommand, uint8_t is64bit )
+{
+ void *rval = 0;
+ segment_t *seg = (segment_t *)calloc(1, sizeof(segment_t));
+
+ if(!seg)
+ return 0;
+ if(is64bit)
+ {
+ struct segment_command_64 *pSeg = (struct segment_command_64*)pCommand;
+
+ memcpy( seg->segname, pSeg->segname, 16 );
+ seg->vmaddr = pSeg->vmaddr;
+ seg->vmsize = pSeg->vmsize;
+ seg->fileoff = pSeg->fileoff;
+ seg->filesize = pSeg->filesize;
+ seg->maxprot = pSeg->maxprot;
+ seg->initprot = pSeg->initprot;
+ seg->nsects = pSeg->nsects;
+ seg->flags = pSeg->flags;
+ rval = pCommand + sizeof(struct segment_command_64);
+ } else {
+ struct segment_command *pSeg = (struct segment_command*)pCommand;
+
+ memcpy( seg->segname, pSeg->segname, 16 );
+ seg->vmaddr = pSeg->vmaddr;
+ seg->vmsize = pSeg->vmsize;
+ seg->fileoff = pSeg->fileoff;
+ seg->filesize = pSeg->filesize;
+ seg->maxprot = pSeg->maxprot;
+ seg->initprot = pSeg->initprot;
+ seg->nsects = pSeg->nsects;
+ seg->flags = pSeg->flags;
+ rval = pCommand + sizeof(struct segment_command);
+ }
+ seg->_next = NULL;
+ seg->mapped = macho->mapped + seg->fileoff;
+
+ if( macho->seg_head )
+ macho->seg_tail->_next = seg;
+ else
+ macho->seg_head = seg;
+ macho->seg_tail = seg;
+
+ if( gVerbosity > 2 )
+ fprintf(stderr, "Segment %s: flags=%x\n", seg->segname, seg->flags );
+
+ unsigned int ii;
+ for( ii=0; ii<seg->nsects; ii++ )
+ {
+ rval = add_section(macho, rval, is64bit, seg);
+ }
+ return rval;
+}
+
+static const char *type_str(uint8_t n_type)
+{
+ static char result[16] = {};
+ int idx = 0;
+ uint8_t stab;
+
+ memset(result, 0, sizeof(result));
+ if( n_type & N_PEXT )
+ result[idx++] = 'P';
+ if( n_type & N_EXT )
+ result[idx++] = 'E';
+ if( idx > 0 )
+ result[idx++] = ':';
+ switch( n_type & N_TYPE )
+ {
+ case N_UNDF: result[idx++] = 'U'; break;
+ case N_ABS: result[idx++] = 'A'; break;
+ case N_PBUD: result[idx++] = 'P'; break;
+ case N_SECT: result[idx++] = 'S'; break;
+ case N_INDR: result[idx++] = 'I'; break;
+ default: result[idx++] = '*'; break;
+ }
+ stab = n_type & N_STAB;
+ if( stab )
+ {
+ result[idx++] = ':';
+ result[idx++] = '0'+(stab >> 5);
+ }
+ result[idx++] = 0;
+ return result;
+}
+
+static symtab_entry_t *lookup_entry_by_name( macho_file_t *macho,
+ const char *name)
+{
+ symtab_entry_t *entry;
+
+ for( entry = macho->sym_head; entry; entry = entry->_next )
+ {
+ if(strcmp(entry->n_symbol,name)==0 && (entry->n_type & N_STAB)==0 )
+ {
+ if( entry->section == NULL )
+ {
+ entry->section = lookup_section( macho, entry->n_sect );
+ if( entry->section )
+ {
+ section_t* sec = entry->section;
+ segment_t* seg = sec->segment;
+ uint64_t offset = entry->n_value - seg->vmaddr;
+
+ entry->mapped = seg->mapped+offset;
+ }
+ else
+ entry = 0;
+ }
+ break;
+ }
+ }
+ return entry;
+}
+
+static void check_symtab(macho_file_t *macho,void *pCommand,uint8_t is64bit )
+{
+
+ struct symtab_command *pSym = (struct symtab_command *)pCommand;
+ void *pS = macho->mapped + pSym->symoff;
+ unsigned int ii = 0;
+
+ /* collect symbols */
+ for( ii=0; ii<pSym->nsyms; ii++ )
+ {
+ struct nlist *pnlist=(struct nlist*)pS;
+ symtab_entry_t *entry=(symtab_entry_t*)calloc(1,sizeof(symtab_entry_t));
+
+ if(!entry)
+ {
+ fprintf(stderr, "out of memory!\n");
+ _exit(1);
+ }
+ entry->n_strx = pnlist->n_un.n_strx;
+ entry->n_type = pnlist->n_type;
+ entry->n_sect = pnlist->n_sect;
+ entry->n_desc = pnlist->n_desc;
+ entry->section = NULL;
+ if(is64bit)
+ {
+ struct nlist_64 *pnlist64 = (struct nlist_64*)pS;
+
+ entry->n_value = pnlist64->n_value;
+ pS += sizeof(struct nlist_64);
+ }
+ else
+ {
+ entry->n_value = pnlist->n_value;
+ pS += sizeof(struct nlist);
+ }
+ entry->n_symbol=(const char *)macho->mapped+pSym->stroff+entry->n_strx;
+ entry->_next = NULL;
+ if( macho->sym_head )
+ macho->sym_tail->_next = entry;
+ else
+ macho->sym_head = entry;
+ macho->sym_tail = entry;
+ }
+ if( gVerbosity > 2 )
+ {
+ /* dump info */
+ symtab_entry_t* entry;
+
+ for( entry = macho->sym_head; entry; entry=entry->_next )
+ {
+ /* only do non-debug symbols */
+ if( (entry->n_type & N_STAB) == 0 )
+ fprintf(stderr, "%32.32s %18llx type=%s, sect=%d\n",
+ entry->n_symbol, entry->n_value,
+ type_str(entry->n_type), entry->n_sect);
+ }
+ }
+}
+
+static int load_architecture( macho_file_t* inFile )
+{
+ /* check the header */
+ unsigned int ii;
+ void * pCurrent = inFile->mapped;
+ struct mach_header* header = (struct mach_header*)pCurrent;
+
+ if( header->magic != MH_MAGIC && header->magic != MH_MAGIC_64 )
+ {
+ fprintf(stderr, "%s is not a mach-o file\n", inFile->filename);
+ return -1;
+ }
+ else if( header->filetype == MH_BUNDLE )
+ {
+ fprintf(stderr, "%s is not a mach-o executable file (filetype MH_BUNDLE, should be MH_EXECUTE or MH_DYLIB)\n", inFile->filename);
+ return -1;
+ }
+ else if( header->filetype == MH_DYLINKER )
+ {
+ fprintf(stderr, "%s is not a mach-o executable file (filetype MH_DYLINKER, should be MH_EXECUTE or MH_DYLIB)\n", inFile->filename);
+ return -1;
+ }
+ else if( !(header->filetype == MH_EXECUTE || header->filetype == MH_DYLIB) )
+ {
+ fprintf(stderr, "%s is not a mach-o executable file (filetype %d, should be MH_EXECUTE or MH_DYLIB)\n", inFile->filename, header->filetype);
+ return -1;
+ }
+
+ if( gVerbosity > 1 )
+ fprintf(stderr, "loading %s(%s)\n", inFile->filename, cputype(header->cputype, header->cpusubtype));
+
+ inFile->cpu_type = header->cputype;
+ inFile->cpu_subtype = header->cpusubtype;
+
+ if( header->magic == MH_MAGIC )
+ pCurrent += sizeof( struct mach_header );
+ else if( header->magic == MH_MAGIC_64 )
+ pCurrent += sizeof( struct mach_header_64 );
+ for( ii=0; ii<header->ncmds; ii++ )
+ {
+ struct load_command* command = (struct load_command*)pCurrent;
+ const char * lc_name;
+
+ switch( command->cmd )
+ {
+ case LC_SEGMENT:
+ {
+ lc_name = "LC_SEGMENT";
+ add_segment(inFile, pCurrent, header->magic == MH_MAGIC_64);
+ break;
+ }
+ case LC_SYMTAB:
+ {
+ lc_name = "LC_SYMTAB";
+ check_symtab(inFile, pCurrent, header->magic == MH_MAGIC_64 );
+ break;
+ }
+ case LC_SYMSEG: lc_name = "LC_SYMSEG"; break;
+ case LC_THREAD: lc_name = "LC_THREAD"; break;
+ case LC_UNIXTHREAD: lc_name = "LC_UNIXTHREAD"; break;
+ case LC_LOADFVMLIB: lc_name = "LC_LOADFVMLIB"; break;
+ case LC_IDFVMLIB: lc_name = "LC_IDFVMLIB"; break;
+ case LC_IDENT: lc_name = "LC_IDENT"; break;
+ case LC_FVMFILE: lc_name = "LC_FVMFILE"; break;
+ case LC_PREPAGE: lc_name = "LC_PREPAGE"; break;
+ case LC_DYSYMTAB: lc_name = "LC_DYSYMTAB"; break;
+ case LC_LOAD_DYLIB: lc_name = "LC_LOAD_DYLIB"; break;
+ case LC_ID_DYLIB: lc_name = "LC_ID_DYLIB"; break;
+ case LC_LOAD_DYLINKER: lc_name = "LC_LOAD_DYLINKER"; break;
+ case LC_ID_DYLINKER: lc_name = "LC_ID_DYLINKER"; break;
+ case LC_PREBOUND_DYLIB: lc_name = "LC_PREBOUND_DYLIB"; break;
+ case LC_ROUTINES: lc_name = "LC_ROUTINES"; break;
+ case LC_SUB_FRAMEWORK: lc_name = "LC_SUB_FRAMEWORK"; break;
+ case LC_SUB_UMBRELLA: lc_name = "LC_SUB_UMBRELLA"; break;
+ case LC_SUB_CLIENT: lc_name = "LC_SUB_CLIENT"; break;
+ case LC_SUB_LIBRARY: lc_name = "LC_SUB_LIBRARY"; break;
+ case LC_TWOLEVEL_HINTS: lc_name = "LC_TWOLEVEL_HINTS"; break;
+ case LC_PREBIND_CKSUM: lc_name = "LC_PREBIND_CKSUM"; break;
+ case LC_LOAD_WEAK_DYLIB: lc_name = "LC_LOAD_WEAK_DYLIB"; break;
+ case LC_SEGMENT_64:
+ {
+ lc_name = "LC_SEGMENT_64";
+ add_segment(inFile, pCurrent, TRUE);
+ break;
+ }
+ case LC_ROUTINES_64: lc_name = "LC_ROUTINES_64"; break;
+ case LC_UUID: lc_name = "LC_UUID"; break;
+ case LC_RPATH: lc_name = "LC_RPATH"; break;
+ case LC_CODE_SIGNATURE: lc_name = "LC_CODE_SIGNATURE"; break;
+ case LC_SEGMENT_SPLIT_INFO:
+ lc_name = "LC_SEGMENT_SPLIT_INFO"; break;
+ case LC_REEXPORT_DYLIB: lc_name = "LC_REEXPORT_DYLIB"; break;
+ case LC_LAZY_LOAD_DYLIB: lc_name = "LC_LAZY_LOAD_DYLIB"; break;
+ case LC_ENCRYPTION_INFO: lc_name = "LC_ENCRYPTION_INFO"; break;
+ case LC_DYLD_INFO: lc_name = "LC_DYLD_INFO"; break;
+ case LC_DYLD_INFO_ONLY: lc_name = "LC_DYLD_INFO_ONLY"; break;
+ case LC_LOAD_UPWARD_DYLIB: lc_name = "LC_LOAD_UPWARD_DYLIB"; break;
+ case LC_VERSION_MIN_MACOSX:
+ lc_name = "LC_VERSION_MIN_MACOSX"; break;
+ case LC_VERSION_MIN_IPHONEOS:
+ lc_name = "LC_VERSION_MIN_IPHONEOS"; break;
+ case LC_FUNCTION_STARTS: lc_name = "LC_FUNCTION_STARTS"; break;
+ case LC_DYLD_ENVIRONMENT: lc_name = "LC_DYLD_ENVIRONMENT"; break;
+ default: lc_name=NULL; break;
+ }
+ if( gVerbosity > 1 )
+ {
+ if(lc_name)
+ fprintf(stderr,"command %s: size=%d\n",lc_name,
+ command->cmdsize );
+ else
+ fprintf(stderr,"command %x, size=%d\n",command->cmd,
+ command->cmdsize);
+ }
+ pCurrent += command->cmdsize;
+ }
+ return 0;
+}
+
+#define HOSTORDER_VALUE(val) (isBigEndian ? OSSwapBigToHostInt32(val) : (val))
+
+static macho_file_t *load_file(macho_file_t *inFile)
+{
+ macho_file_t *rval = NULL;
+ void *pCurrent = inFile->mapped;
+ struct fat_header *fat = (struct fat_header *)pCurrent;
+
+ if( fat->magic==FAT_MAGIC || fat->magic==FAT_CIGAM )
+ {
+ int isBigEndian = fat->magic == FAT_CIGAM;
+ unsigned int ii = 0;
+ struct fat_arch *pArch = NULL;
+ uint32_t nfat_arch = 0;
+
+ pCurrent += sizeof(struct fat_header);
+ pArch = pCurrent;
+ nfat_arch = HOSTORDER_VALUE(fat->nfat_arch);
+ for( ii=0; ii<nfat_arch; ii++)
+ {
+ macho_file_t *archfile=(macho_file_t *)calloc(1,
+ sizeof(macho_file_t));
+ if( archfile )
+ {
+ archfile->filename = strdup(inFile->filename);
+ archfile->mapped = inFile->mapped +
+ HOSTORDER_VALUE(pArch->offset);
+ archfile->size = HOSTORDER_VALUE(pArch->size);
+ archfile->align = HOSTORDER_VALUE(pArch->align);
+ archfile->isBigEndian = isBigEndian;
+ archfile->cpu_type = HOSTORDER_VALUE(pArch->cputype);
+ archfile->cpu_subtype = HOSTORDER_VALUE(pArch->cpusubtype);
+ if( load_architecture(archfile) == 0 )
+ {
+ archfile->next = rval;
+ rval = archfile;
+ }
+ }
+ else
+ return NULL; /* no memory */
+ pArch++;
+ }
+ }
+ else
+ {
+ struct mach_header* header = (struct mach_header*)pCurrent;
+
+ if( header->magic != MH_MAGIC && header->magic != MH_MAGIC_64 )
+ {
+ fprintf(stderr, "%s is not a mach-o file\n", inFile->filename);
+ }
+ else if( header->filetype == MH_BUNDLE )
+ {
+ fprintf(stderr, "%s is not a mach-o executable file "
+ "(filetype MH_BUNDLE, should be MH_EXECUTE or MH_DYLIB)\n", inFile->filename);
+ }
+ else if( header->filetype == MH_DYLINKER )
+ {
+ fprintf(stderr, "%s is not a mach-o executable file "
+ "(filetype MH_DYLINKER, should be MH_EXECUTE or MH_DYLIB)\n", inFile->filename);
+ }
+ else if( !(header->filetype == MH_EXECUTE || header->filetype == MH_DYLIB) )
+ {
+ fprintf(stderr, "%s is not a mach-o executable file "
+ "(filetype %d, should be MH_EXECUTE or MH_DYLIB)\n",
+ inFile->filename, header->filetype );
+ }
+ if( load_architecture(inFile) == 0 )
+ {
+ inFile->next = 0;
+ rval = inFile;
+ }
+ }
+ return rval;
+}
+
+#define FIPS_SIGNATURE_SIZE 20
+#define FIPS_FINGERPRINT_SIZE 40
+
+static void debug_symbol( symtab_entry_t* sym )
+{
+ if( gVerbosity > 1 )
+ {
+ section_t* sec = sym->section;
+ segment_t* seg = sec->segment;
+ fprintf(stderr, "%-40.40s: %llx sect=%s, segment=%s prot=(%x->%x)\n",
+ sym->n_symbol, sym->n_value, sec->sectname,
+ seg->segname, seg->initprot, seg->maxprot );
+ }
+}
+
+/*
+ * Minimalistic HMAC from fips_standalone_sha1.c
+ */
+static void hmac_init(SHA_CTX *md_ctx,SHA_CTX *o_ctx,
+ const char *key)
+ {
+ size_t len=strlen(key);
+ int i;
+ unsigned char keymd[HMAC_MAX_MD_CBLOCK];
+ unsigned char pad[HMAC_MAX_MD_CBLOCK];
+
+ if (len > SHA_CBLOCK)
+ {
+ SHA1_Init(md_ctx);
+ SHA1_Update(md_ctx,key,len);
+ SHA1_Final(keymd,md_ctx);
+ len=20;
+ }
+ else
+ memcpy(keymd,key,len);
+ memset(&keymd[len],'\0',HMAC_MAX_MD_CBLOCK-len);
+
+ for(i=0 ; i < HMAC_MAX_MD_CBLOCK ; i++)
+ pad[i]=0x36^keymd[i];
+ SHA1_Init(md_ctx);
+ SHA1_Update(md_ctx,pad,SHA_CBLOCK);
+
+ for(i=0 ; i < HMAC_MAX_MD_CBLOCK ; i++)
+ pad[i]=0x5c^keymd[i];
+ SHA1_Init(o_ctx);
+ SHA1_Update(o_ctx,pad,SHA_CBLOCK);
+ }
+
+static void hmac_final(unsigned char *md,SHA_CTX *md_ctx,SHA_CTX *o_ctx)
+ {
+ unsigned char buf[20];
+
+ SHA1_Final(buf,md_ctx);
+ SHA1_Update(o_ctx,buf,sizeof buf);
+ SHA1_Final(md,o_ctx);
+ }
+
+static int fingerprint(macho_file_t* inFile, int addFingerprint)
+{
+ int rval = 0;
+ unsigned char signature[FIPS_SIGNATURE_SIZE];
+ char signature_string[FIPS_FINGERPRINT_SIZE+1];
+ unsigned int len = sizeof(signature);
+ const char *fingerprint = NULL;
+ int ii = 0;
+
+#define LOOKUP_SYMBOL( symname, prot ) \
+ symtab_entry_t *symname = \
+ lookup_entry_by_name( inFile, "_" #symname ); \
+ if( ! symname ) { \
+ fprintf(stderr, "%s: Not a FIPS executable (" \
+ #symname " not found)\n", inFile->filename ); \
+ return -1;\
+ } \
+ if( (symname->section->segment->initprot & \
+ (PROT_READ|PROT_WRITE|PROT_EXEC)) != (prot) ) { \
+ fprintf(stderr, #symname \
+ " segment has the wrong protection.\n"); \
+ debug_symbol(symname);return -1;\
+ }
+
+ LOOKUP_SYMBOL( FIPS_rodata_start, PROT_READ | PROT_EXEC );
+ LOOKUP_SYMBOL( FIPS_rodata_end, PROT_READ | PROT_EXEC );
+ LOOKUP_SYMBOL( FIPS_text_startX, PROT_READ | PROT_EXEC );
+ LOOKUP_SYMBOL( FIPS_text_endX, PROT_READ | PROT_EXEC );
+ LOOKUP_SYMBOL( FIPS_signature, PROT_WRITE | PROT_READ );
+ LOOKUP_SYMBOL( FINGERPRINT_ascii_value, PROT_READ | PROT_EXEC );
+
+ if( gVerbosity > 1 )
+ {
+ debug_symbol( FIPS_rodata_start );
+ debug_symbol( FIPS_rodata_end );
+ debug_symbol( FIPS_text_startX );
+ debug_symbol( FIPS_text_endX );
+ debug_symbol( FIPS_signature );
+ debug_symbol( FINGERPRINT_ascii_value );
+
+ fingerprint = (const char *)FINGERPRINT_ascii_value->mapped;
+ fprintf(stderr, "fingerprint: ");
+ for(ii=0; ii<40; ii++ )
+ {
+ if( fingerprint[ii] == 0 )
+ break;
+ putc(fingerprint[ii], stderr);
+ }
+ putc('\n', stderr);
+ }
+
+ /* compute the HMAC-SHA-1 fingerprint over the text and rodata ranges */
+ {
+ const unsigned char * p1 = FIPS_text_startX->mapped;
+ const unsigned char * p2 = FIPS_text_endX->mapped;
+ const unsigned char * p3 = FIPS_rodata_start->mapped;
+ const unsigned char * p4 = FIPS_rodata_end->mapped;
+ static const char FIPS_hmac_key[]="etaonrishdlcupfm";
+ SHA_CTX md_ctx,o_ctx;
+
+ hmac_init(&md_ctx,&o_ctx,FIPS_hmac_key);
+
+ if (p1<=p3 && p2>=p3)
+ p3=p1, p4=p2>p4?p2:p4, p1=NULL, p2=NULL;
+ else if (p3<=p1 && p4>=p1)
+ p3=p3, p4=p2>p4?p2:p4, p1=NULL, p2=NULL;
+
+ if (p1) {
+ SHA1_Update(&md_ctx,p1,(size_t)p2-(size_t)p1);
+ }
+ if (FIPS_signature->mapped>=p3 && FIPS_signature->mapped<p4)
+ {
+ /* "punch" hole */
+ SHA1_Update(&md_ctx,p3,(size_t)FIPS_signature->mapped-(size_t)p3);
+ p3 = FIPS_signature->mapped+FIPS_SIGNATURE_SIZE;
+ if (p3<p4) {
+ SHA1_Update(&md_ctx,p3,(size_t)p4-(size_t)p3);
+ }
+ }
+ else {
+ SHA1_Update(&md_ctx,p3,(size_t)p4-(size_t)p3);
+ }
+
+ hmac_final(signature,&md_ctx,&o_ctx);
+
+ {
+ char *pString = NULL;
+ unsigned int i = 0;
+
+ memset( signature_string, 0, sizeof(signature_string));
+ pString = signature_string;
+ for (i=0;i<len;i++)
+ {
+ snprintf(pString, 3, "%02x",signature[i]);
+ pString+=2;
+ }
+ *pString = 0;
+ }
+ }
+ fingerprint = (char *)FINGERPRINT_ascii_value->mapped;
+ inFile->fingerprint_original = strndup(fingerprint,FIPS_FINGERPRINT_SIZE);
+ inFile->fingerprint_computed = strdup(signature_string);
+
+ if( addFingerprint )
+ {
+ void *fp_page = NULL;
+ void *fp_end = NULL;
+
+ if(strcmp(fingerprint,"?have to make sure this string is unique")!=0)
+ {
+ if (memcmp((char*)fingerprint, signature_string, FIPS_FINGERPRINT_SIZE)!=0)
+ {
+ fprintf(stderr,
+ "%s(%s) original fingerprint incorrect: %s\n",
+ inFile->filename,
+ cputype(inFile->cpu_type, inFile->cpu_subtype),
+ fingerprint);
+ }
+ }
+
+ fp_page = (void*)((uintptr_t)fingerprint & ~PAGE_MASK);
+ fp_end = (void*)((uintptr_t)(fingerprint+(PAGE_SIZE*2)) & ~PAGE_MASK);
+ if( mprotect( fp_page, fp_end-fp_page, PROT_READ|PROT_WRITE ) )
+ {
+ perror("Can't write the fingerprint - mprotect failed");
+ fprintf(stderr, "fp_page=%p, fp_end=%p, len=%ld\n",
+ fp_page, fp_end, (size_t)(fp_end-fp_page));
+ rval = 1;
+ }
+ else
+ {
+ memcpy((char*)fingerprint, signature_string, FIPS_FINGERPRINT_SIZE);
+ if( msync(fp_page, (fp_end-fp_page), 0) )
+ perror("msync failed");
+ }
+ if( gVerbosity > 0 )
+ fprintf(stderr, "%s(%s) fingerprint: %s\n", inFile->filename,
+ cputype(inFile->cpu_type,inFile->cpu_subtype),
+ signature_string);
+ }
+ if( *fingerprint == '?' )
+ {
+ printf("%s(%s) has no fingerprint.\n", inFile->filename,
+ cputype(inFile->cpu_type, inFile->cpu_subtype));
+ rval = 2;
+ }
+ else if( strncmp( fingerprint, signature_string, FIPS_FINGERPRINT_SIZE) == 0 )
+ {
+ if( ! addFingerprint )
+ printf("%s(%s) fingerprint is correct: %s\n", inFile->filename,
+ cputype(inFile->cpu_type, inFile->cpu_subtype),
+ signature_string);
+ }
+ else
+ {
+ printf("%s(%s) fingerprint %.40s is not correct\n", inFile->filename,
+ cputype(inFile->cpu_type,inFile->cpu_subtype), fingerprint);
+ printf("calculated: %s\n", signature_string);
+ rval = -1;
+ }
+ return rval;
+}
+
+static int make_fingerprint( const char * inApp, int addFingerprint )
+{
+ int rval = 1;
+ int appfd = -1;
+ if( addFingerprint )
+ appfd = open( inApp, O_RDWR );
+ if( appfd < 0 )
+ {
+ if( addFingerprint )
+ fprintf(stderr, "Can't modify %s. Verifying only.\n", inApp);
+ addFingerprint = 0;
+ appfd = open( inApp, O_RDONLY );
+ }
+ if( appfd >= 0 )
+ {
+ struct stat stbuf;
+ fstat(appfd, &stbuf);
+ void * pApp = mmap(0, (size_t)stbuf.st_size, PROT_READ,
+ MAP_SHARED, appfd, (off_t)0);
+ if( pApp == MAP_FAILED )
+ {
+ perror(inApp);
+ }
+ else
+ {
+ macho_file_t theFile;
+ macho_file_t* architectures;
+ macho_file_t* pArchitecture;
+
+ memset( &theFile, 0, sizeof(theFile) );
+ theFile.filename = inApp;
+ theFile.mapped = pApp;
+ architectures = load_file(&theFile);
+ for( pArchitecture = architectures; pArchitecture;
+ pArchitecture = pArchitecture->next )
+ {
+ rval = fingerprint(pArchitecture, addFingerprint);
+ if( rval && addFingerprint )
+ {
+ printf("Failure\n");
+ break;
+ }
+ }
+ if((rval==0) && addFingerprint)
+ {
+ printf("Fingerprint Stored\n");
+ }
+ munmap(pApp, (size_t)stbuf.st_size);
+ }
+ close(appfd);
+ }
+ else
+ {
+ fprintf(stderr, "Can't open %s\n", inApp );
+ }
+ return rval;
+}
+
+static void print_usage(const char * prog)
+{
+ fprintf(stderr, "usage:\n\t%s [--debug] [--quiet] [-exe|-dso|-dylib] executable\n", prog);
+ _exit(1);
+}
+
+int main (int argc, const char * argv[])
+{
+ const char * pname = argv[0];
+ const char * filename = NULL;
+ int addFingerprint = 1;
+ const char * verbose_env = getenv("FIPS_SIG_VERBOSE");
+
+ if( verbose_env )
+ gVerbosity = atoi(verbose_env);
+
+ if( gVerbosity < 0 )
+ gVerbosity = 1;
+
+ while( --argc )
+ {
+ ++argv;
+ if( strcmp(*argv,"-exe")==0 || strcmp(*argv,"--exe")==0 ||
+ strcmp(*argv,"-dso")==0 || strcmp(*argv,"--dso")==0 ||
+ strcmp(*argv,"-dylib")==0 || strcmp(*argv,"--dylib")==0 ||
+ strcmp(*argv,"--verify")==0 )
+ {
+ if(strcmp(*argv,"--verify")==0)
+ addFingerprint=0;
+
+ if( argc > 0 )
+ {
+ filename = *++argv;
+ argc--;
+ }
+ }
+ else if(strcmp(*argv,"-d")==0 || strcmp(*argv,"-debug")==0 || strcmp(*argv,"--debug")==0)
+ {
+ if( gVerbosity < 2 )
+ gVerbosity = 2;
+ else
+ gVerbosity++;
+ }
+ else if(strcmp(*argv,"-q")==0 || strcmp(*argv,"-quiet")==0 || strcmp(*argv,"--quiet")==0)
+ gVerbosity = 0;
+ else if(strncmp(*argv,"-",1)!=0) {
+ filename = *argv;
+ }
+ }
+
+ if( !filename )
+ {
+ print_usage(pname);
+ return 1;
+ }
+
+ if( access(filename, R_OK) )
+ {
+ fprintf(stderr, "Can't access %s\n", filename);
+ return 1;
+ }
+
+ return make_fingerprint( filename, addFingerprint );
+}
+
diff --git a/test/fips_algvs.c b/test/fips_algvs.c
index ed03507..8ff75dc 100644
--- a/test/fips_algvs.c
+++ b/test/fips_algvs.c
@@ -70,6 +70,67 @@ int main(int argc, char **argv)
}
#else
+#if defined(__vxworks)
+
+#include <taskLibCommon.h>
+#include <string.h>
+
+int fips_algvs_main(int argc, char **argv);
+#define main fips_algvs_main
+
+static int fips_algvs_argv(char *a0)
+{
+ char *argv[32] = { "fips_algvs" };
+ int argc = 1;
+ int main_ret;
+
+ if (a0) {
+ char *scan = a0, *arg = a0;
+
+ while (*scan) {
+ if (*scan++ == ' ') {
+ scan[-1] = '\0';
+ argv[argc++] = arg;
+ if (argc == (sizeof(argv)/sizeof(argv[0])-1))
+ break;
+
+ while (*scan == ' ') scan++;
+ arg = scan;
+ }
+ }
+ if (*scan == '\0') argv[argc++] = arg;
+ }
+
+ argv[argc] = NULL;
+
+ main_ret = fips_algvs_main(argc, argv);
+
+ if (a0) free(a0);
+
+ return main_ret;
+}
+
+int fips_algvs(int a0)
+{
+ return taskSpawn("fips_algvs", 100, (VX_FP_TASK | VX_SPE_TASK), 100000,
+ (FUNCPTR)fips_algvs_argv,
+ a0 ? strdup(a0) : 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
+}
+
+static FILE *fips_fopen(const char *path, const char *mode)
+{
+ char fips_path [256];
+
+ if (path[0] != '/' && strlen(path) < (sizeof(fips_path)-8)) {
+ strcpy(fips_path,"/fips0/");
+ strcat(fips_path,path);
+ return fopen(fips_path,mode);
+ }
+ return fopen(path,mode);
+}
+#define fopen fips_fopen
+#endif
+
#define FIPS_ALGVS
extern int fips_aesavs_main(int argc, char **argv);
@@ -265,6 +326,16 @@ int main(int argc, char **argv)
SysInit();
#endif
+#if (defined(__arm__) || defined(__aarch64__))
+ if (*args && !strcmp(*args, "-noaccel"))
+ {
+ extern unsigned int OPENSSL_armcap_P;
+
+ OPENSSL_armcap_P=0;
+ args++;
+ argc--;
+ }
+#endif
if (*args && *args[0] != '-')
{
rv = run_prg(argc - 1, args);
diff --git a/util/incore b/util/incore
index e6e6ecf..bb765b1 100755
--- a/util/incore
+++ b/util/incore
@@ -382,7 +382,7 @@ if (!$legacy_mode) {
}
$FINGERPRINT_ascii_value
- = $exe->Lookup("FINGERPRINT_ascii_value") or die;
+ = $exe->Lookup("FINGERPRINT_ascii_value");
}
if ($FIPS_text_startX && $FIPS_text_endX) {
@@ -439,9 +439,12 @@ $fingerprint = FIPS_incore_fingerprint();
if ($legacy_mode) {
print unpack("H*",$fingerprint);
-} else {
+} elsif (defined($FINGERPRINT_ascii_value)) {
seek(FD,$FINGERPRINT_ascii_value->{st_offset},0) or die "$!";
print FD unpack("H*",$fingerprint) or die "$!";
+} else {
+ seek(FD,$FIPS_signature->{st_offset},0) or die "$!";
+ print FD $fingerprint or die "$!";
}
close (FD);