[openssl/openssl] 3f42f4: Improve chacha20 performance on aarch64 by interlea...
daniel-hu-arm
noreply at github.com
Thu Sep 1 08:03:41 UTC 2022
Branch: refs/heads/master
Home: https://github.com/openssl/openssl
Commit: 3f42f41ad19c631287386fd8d58f9e02466c5e3f
https://github.com/openssl/openssl/commit/3f42f41ad19c631287386fd8d58f9e02466c5e3f
Author: Daniel Hu <Daniel.Hu at arm.com>
Date: 2022-09-01 (Thu, 01 Sep 2022)
Changed paths:
M crypto/chacha/asm/chacha-armv8-sve.pl
Log Message:
-----------
Improve chacha20 performance on aarch64 by interleaving scalar with SVE/SVE2
The patch processes one extra block with scalar instructions, in
parallel with the blocks handled by SVE/SVE2. This is especially
helpful when the vector length is only 128 bits.
The actual performance uplift is complicated and depends on the
vector length and the input data size. The SVE/SVE2 implementation
doesn't always perform better than Neon, but it should prevail in
most cases.
On a CPU with 256-bit SVE/SVE2, interleaved processing can
handle 9 blocks in parallel (8 blocks by SVE and 1 by scalar).
On 128-bit SVE/SVE2 it is 5 blocks. An input size that is a
multiple of 9 or 5 blocks on the respective CPU can typically be
handled at maximum speed.
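The block counts above can be reproduced with a little arithmetic. The sketch below assumes that each 32-bit vector lane carries one ChaCha20 block (an assumption that matches the 8-block and 4-block figures quoted here, not something read from the patch itself) and that one further block is handled by scalar code. The function names are hypothetical, and how the real code treats leftover bytes is not modelled.

```python
CHACHA_BLOCK_BYTES = 64   # ChaCha20 works on 64-byte blocks
LANE_BITS = 32            # ChaCha20 state words are 32 bits wide


def blocks_per_pass(vector_bits):
    """Blocks per interleaved pass: one per 32-bit lane (assumed), plus one scalar block."""
    return vector_bits // LANE_BITS + 1


def split_input(nbytes, vector_bits):
    """Return (full interleaved passes, leftover bytes) for a given input size."""
    pass_bytes = blocks_per_pass(vector_bits) * CHACHA_BLOCK_BYTES
    return divmod(nbytes, pass_bytes)


for bits, size in [(256, 576), (256, 512), (128, 320), (128, 256)]:
    passes, leftover = split_input(size, bits)
    print(f"{bits}-bit, {size} bytes: {passes} full pass(es), {leftover} bytes left over")
```

Under this model, 576 bytes is exactly one 9-block pass on 256-bit hardware and 320 bytes is exactly one 5-block pass on 128-bit hardware, while 512 and 256 bytes do not fill an interleaved pass.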
Here is test data for 256-bit and 128-bit SVE/SVE2, obtained by
running "openssl speed -evp chacha20 -bytes 576" (and other sizes).
Throughput is in thousands of bytes per second, as reported by
openssl speed.
----------------------------------+---------------------------------
           256-bit SVE            |         128-bit SVE2
----------------------------------|---------------------------------
Input     576 bytes     512 bytes |    320 bytes     256 bytes
----------------------------------|---------------------------------
SVE     1716361.91k   1556699.18k |  1615789.06k   1302864.40k
----------------------------------|---------------------------------
Neon    1262643.44k   1509044.05k |   680075.67k   1060532.31k
----------------------------------+---------------------------------
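For a quick read of the table, the SVE-to-Neon throughput ratios can be computed directly from the quoted figures. This is only a convenience sketch; the dictionary layout and names are illustrative, and the numbers are copied from the table above.

```python
# Throughput figures copied from the table above ("k" values from openssl speed).
# Key: (configuration, input size in bytes); value: (SVE, Neon).
results = {
    ("256-bit SVE", 576): (1716361.91, 1262643.44),
    ("256-bit SVE", 512): (1556699.18, 1509044.05),
    ("128-bit SVE2", 320): (1615789.06, 680075.67),
    ("128-bit SVE2", 256): (1302864.40, 1060532.31),
}


def speedup(sve, neon):
    """Ratio of SVE throughput to Neon throughput."""
    return sve / neon


ratios = {key: speedup(*pair) for key, pair in results.items()}

for (cpu, size), ratio in ratios.items():
    print(f"{cpu:13s} {size:4d} bytes: SVE/Neon = {ratio:.2f}x")
```

The ratios come out at roughly 1.36x and 1.03x for the 256-bit case and 2.38x and 1.23x for the 128-bit case, so the gain is largest at the sizes that are exact multiples of 9 or 5 blocks.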
If the input size gets very large, the advantage of SVE/SVE2 over
Neon will fade out.
Signed-off-by: Daniel Hu <Daniel.Hu at arm.com>
Change-Id: Ieedfcb767b9c08280d7c8c9a8648919c69728fab
Reviewed-by: Tomas Mraz <tomas at openssl.org>
Reviewed-by: Paul Dale <pauli at openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18901)