[openssl-users] Performance problems with OpenSSL and threading

John Foley foleyj at cisco.com
Fri Apr 24 23:17:13 UTC 2015


When OpenSSL is built with OPENSSL_FIPS, some of the algorithms still
invoke the FIPS flavor of the algorithm even when FIPS mode is disabled
at run time.  For example, this code is from EVP_DigestUpdate():

int EVP_DigestUpdate(EVP_MD_CTX *ctx, const void *data, size_t count)
{
#ifdef OPENSSL_FIPS
    return FIPS_digestupdate(ctx, data, count);
#else
    return ctx->update(ctx, data, count);
#endif
}

In 1.0.1, locks 39 and 40 are:

# define CRYPTO_LOCK_FIPS                39
# define CRYPTO_LOCK_FIPS2               40

You should try adding some logic to skip the lock in your lock handler
when the lock ID is 39 or 40. Again, it should be safe to not lock on
these two as long as the FIPS POST has completed.
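
For illustration, a minimal sketch of such a handler.  The handler name
is illustrative; it assumes you track POST completion yourself in a
fips_post_done flag and that "locks" is the pthread mutex array you
allocated with CRYPTO_num_locks():

#include <pthread.h>
#include <openssl/crypto.h>

static pthread_mutex_t *locks;       /* one mutex per static lock ID */
static volatile int fips_post_done;  /* set by the app once the POST passes */

static void locking_callback(int mode, int type, const char *file, int line)
{
    /* Per the note above: once the POST has completed it should be safe
     * to skip CRYPTO_LOCK_FIPS (39) and CRYPTO_LOCK_FIPS2 (40). */
    if (fips_post_done &&
        (type == CRYPTO_LOCK_FIPS || type == CRYPTO_LOCK_FIPS2))
        return;

    if (mode & CRYPTO_LOCK)
        pthread_mutex_lock(&locks[type]);
    else
        pthread_mutex_unlock(&locks[type]);
}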



On 04/24/2015 05:56 PM, Bryan Call wrote:
> In my last email I ran the benchmark on Fedora 21 (big mistake).
> Here are the results when running it back on the 28/56 core RHEL 6.5
> server, showing contention in a different place (fips).  Is there a
> reason it would be calling into the FIPS code to get a lock if FIPS is
> not enabled?
>
> Contention log:
> [Apr 24 21:35:09.731] Server {0x7f5529ccd700} ERROR: contention for
> lock - total contention: 70662 waiting: 16 file: fips.c line: 471 type: 40
> [Apr 24 21:35:09.732] Server {0x7f55299ca700} ERROR: contention for
> lock - total contention: 71605 waiting: 2 file: fips.c line: 107 type: 39
>
>
> Lines in fips.c that see the most contention:
> [bcall at bm77 trafficserver]$ grep fips.c diags.log | cut -f 19 -d ' ' |
> sort | uniq -c | sort -rn
>  875379 471
>  288338 107
>  275472 127
>
>
> Different lock types and their contention frequency (multiple benchmarks):
>  875379 40  <- in fips.c
>  563810 39  <- also in fips.c
>  440518 1
>   44145 18
>   17447 2
>   11356 10
>    1377 19
>     687 12
>     532 25
>     302 33
>     239 9
>      58 16
>      40 21
>      32 11
>      19 3
>       1 14
>
>
> Code to disable FIPS and log:
> #ifdef OPENSSL_FIPS
>     int mode = FIPS_mode();
>     Debug("ssl", "FIPS_mode: %d", mode);
>     if (mode) {
>       FIPS_mode_set(0);
>       Debug("ssl", "FIPS_mode: %d", FIPS_mode());
>     }
> #endif
>
> [Apr 24 21:43:45.860] Server {0x7f7628146800} DEBUG: (ssl) FIPS_mode: 0
>
> -Bryan
>
>
>
>
>> On Apr 24, 2015, at 10:56 AM, John Foley <foleyj at cisco.com
>> <mailto:foleyj at cisco.com>> wrote:
>>
>> When you create the private key given to SSL_CTX, be sure to
>> instantiate a private key for each thread as well.  If you share the
>> same key across all threads, this can introduce contention.
>>
>> Here are the type 1 & 2 locks:
>>
>> # define CRYPTO_LOCK_ERR                 1
>> # define CRYPTO_LOCK_EX_DATA             2
>>
>> Have you checked the OpenSSL error log to see what errors are being
>> generated?
>>
>> Also, if you disable FIPS, does the thread contention still occur?
>>
>>
>>
>> On 04/24/2015 01:36 PM, Bryan Call wrote:
>>> We are using a single SSL_CTX across all the threads, so I will
>>> create a separate SSL_CTX per thread.  I implemented dynlock callbacks
>>> yesterday, but I didn't see them being used in the tests I did.  I
>>> also added atomic counters to see which types of locks are seeing
>>> contention (see logs below).
>>>
>>> Most of the lock contention seems to be around type 2 and type 1
>>> locks when our lock callback is called.  Regarding the FIPS/POST and
>>> SSL_CTX recommendations in your previous email, which lock types
>>> would those help with?
>>>
>>> Log output:
>>> [Apr 24 10:21:24.339] Server {0xb310000} ERROR: <SSLUtils.cc:178
>>> (SSL_locking_callback)> contention for lock - total contention: 3938
>>> waiting: 1 file:
>>> /SourceCache/OpenSSL098/OpenSSL098-52.20.2/src/crypto/ex_data.c
>>> line: 496 type: 2
>>> [Apr 24 10:21:24.754] Server {0xb10a000} ERROR: <SSLUtils.cc:178
>>> (SSL_locking_callback)> contention for lock - total contention: 1350
>>> waiting: 1 file:
>>> /SourceCache/OpenSSL098/OpenSSL098-52.20.2/src/crypto/err/err_def.c
>>> line: 343 type: 1
>>>
>>>
>>> -Bryan
>>>
>>>
>>>
>>>
>>>> On Apr 23, 2015, at 4:46 PM, John Foley <foleyj at cisco.com
>>>> <mailto:foleyj at cisco.com>> wrote:
>>>>
>>>> Looking at your call stack, it appears you're testing with FIPS
>>>> mode enabled.  There's a lock in FIPS_selftest_failed() that's used
>>>> to ensure the crypto algorithms inside the FIPS boundary are not
>>>> utilized until after the POST has completed and passed.
>>>> Unfortunately this lock remains installed for the lifetime of the
>>>> process.  We've seen thread contention in this area when using
>>>> OpenSSL.  One work-around is to modify your locking handler
>>>> (SSL_locking_callback() in your stack trace) so that it does not lock
>>>> when the lock ID is CRYPTO_LOCK_FIPS.  But you would only want to
>>>> skip this particular lock ID after the POST has completed.
>>>>
>>>> If the above doesn't help, another item to check is whether you're
>>>> instantiating an SSL_CTX for each thread.  I've seen some
>>>> developers use a single SSL_CTX for multiple threads.  This results
>>>> in a single instance of the server cert and private key being
>>>> shared across all the threads, which causes thread contention on
>>>> the asymmetric key operations associated with the cert and private
>>>> key (e.g. RSA).  Be sure that each thread instantiates its own
>>>> SSL_CTX and certificate/private key.
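>>>>
>>>> A rough per-thread sketch (the worker function and the file names are
>>>> just placeholders for however your application loads its cert/key):
>>>>
>>>> #include <openssl/ssl.h>
>>>>
>>>> void *worker(void *arg)
>>>> {
>>>>     /* Each thread builds its own context, certificate and key. */
>>>>     SSL_CTX *ctx = SSL_CTX_new(SSLv23_server_method());
>>>>     SSL_CTX_use_certificate_chain_file(ctx, "cert.pem");
>>>>     SSL_CTX_use_PrivateKey_file(ctx, "key.pem", SSL_FILETYPE_PEM);
>>>>     /* ... accept connections, SSL_new(ctx), handshake, I/O ... */
>>>>     SSL_CTX_free(ctx);
>>>>     return NULL;
>>>> }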
>>>>
>>>> Lastly, add some debugs to count the locks by lock ID in your lock
>>>> handler.  There are about 40 different lock IDs defined in
>>>> crypto/crypto.h.  You'll need to identify which lock is causing the
>>>> thread contention.  Once you know that, then you can look in the
>>>> code to see where that lock is used and troubleshoot from there.
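>>>>
>>>> For example, a per-ID counter in the handler might look like this
>>>> (counting_locking_callback and MAX_LOCK_IDS are illustrative names;
>>>> the array is sized generously since roughly 40 static IDs exist):
>>>>
>>>> #include <pthread.h>
>>>> #include <openssl/crypto.h>
>>>>
>>>> #define MAX_LOCK_IDS 64
>>>>
>>>> static pthread_mutex_t *locks;   /* your existing mutex array */
>>>> static unsigned long lock_counts[MAX_LOCK_IDS];
>>>>
>>>> static void counting_locking_callback(int mode, int type,
>>>>                                       const char *file, int line)
>>>> {
>>>>     if ((mode & CRYPTO_LOCK) && type >= 0 && type < MAX_LOCK_IDS)
>>>>         __sync_fetch_and_add(&lock_counts[type], 1); /* GCC atomic add */
>>>>
>>>>     if (mode & CRYPTO_LOCK)
>>>>         pthread_mutex_lock(&locks[type]);
>>>>     else
>>>>         pthread_mutex_unlock(&locks[type]);
>>>> }
>>>>
>>>> Dump lock_counts[] periodically; CRYPTO_get_lock_name() turns an ID
>>>> back into a readable name.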
>>>>
>>>>
>>>>
>>>> On 04/23/2015 07:23 PM, Bryan Call wrote:
>>>>> I was running a benchmark on a 28 core (56 hyper-threaded) server
>>>>> that is running 84 threads in the process and I am seeing a lot of
>>>>> lock contention, much of it in the calls to SSL_get_error() and
>>>>> ssl3_accept().  I am running RHEL 6.5
>>>>> and openssl-1.0.1e-30.el6_6.7.x86_64.  We have
>>>>> used SSL_locking_callback() to set up pthread mutex locks, but
>>>>> haven't set up dynlocks yet.  Would dynlocks help in this scenario?
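>>>>>
>>>>> For reference, this is roughly what registering dynlock callbacks
>>>>> looks like; a sketch with pthread mutexes, and the function names
>>>>> here are illustrative:
>>>>>
>>>>> #include <stdlib.h>
>>>>> #include <pthread.h>
>>>>> #include <openssl/crypto.h>
>>>>>
>>>>> struct CRYPTO_dynlock_value { pthread_mutex_t mutex; };
>>>>>
>>>>> static struct CRYPTO_dynlock_value *dyn_create(const char *file, int line)
>>>>> {
>>>>>     struct CRYPTO_dynlock_value *v = malloc(sizeof(*v));
>>>>>     if (v != NULL)
>>>>>         pthread_mutex_init(&v->mutex, NULL);
>>>>>     return v;
>>>>> }
>>>>>
>>>>> static void dyn_lock(int mode, struct CRYPTO_dynlock_value *v,
>>>>>                      const char *file, int line)
>>>>> {
>>>>>     if (mode & CRYPTO_LOCK)
>>>>>         pthread_mutex_lock(&v->mutex);
>>>>>     else
>>>>>         pthread_mutex_unlock(&v->mutex);
>>>>> }
>>>>>
>>>>> static void dyn_destroy(struct CRYPTO_dynlock_value *v,
>>>>>                         const char *file, int line)
>>>>> {
>>>>>     pthread_mutex_destroy(&v->mutex);
>>>>>     free(v);
>>>>> }
>>>>>
>>>>> static void setup_dynlocks(void)   /* call once at startup */
>>>>> {
>>>>>     CRYPTO_set_dynlock_create_callback(dyn_create);
>>>>>     CRYPTO_set_dynlock_lock_callback(dyn_lock);
>>>>>     CRYPTO_set_dynlock_destroy_callback(dyn_destroy);
>>>>> }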
>>>>>
>>>>>
>>>>> Output from perf:
>>>>> Samples: 1M of event 'cycles', Event count (approx.): 602783506611
>>>>> - 48.04% [ET_NET 0] [kernel.kallsyms]   [k] _spin_lock
>>>>>    - _spin_lock
>>>>>       + 48.05% futex_wait_setup
>>>>>       + 47.79% futex_wake
>>>>>       +  1.20% tcp_v4_rcv
>>>>>       +  0.98% task_rq_lock
>>>>>       +  0.64% double_lock_balance
>>>>> +  3.94% [ET_NET 0] libcrypto.so.1.0.1e  [.] bn_mul_mont
>>>>> +  1.93% [ET_NET 0] libcrypto.so.1.0.1e  [.] BN_usub
>>>>> +  1.80% [ET_NET 0] libcrypto.so.1.0.1e  [.] 0x00000000000e5200
>>>>> +  1.49% [ET_NET 0] libcrypto.so.1.0.1e  [.] sha256_block_data_order
>>>>> +  0.97% [ET_NET 0] libcrypto.so.1.0.1e  [.] gcm_ghash_clmul
>>>>> +  0.85% [ET_NET 0] libpthread-2.12.so   [.] pthread_mutex_lock
>>>>> +  0.84% [ET_NET 0] libcrypto.so.1.0.1e  [.] aesni_ctr32_encrypt_blocks
>>>>> +  0.74% [ET_NET 0] libc-2.12.so         [.] malloc
>>>>> +  0.72% [ET_NET 0] libc-2.12.so         [.] memcpy
>>>>> +  0.57% [ET_NET 0] libcrypto.so.1.0.1e  [.] BN_mod_mul_montgomery
>>>>> +  0.51% [ET_NET 0] libpthread-2.12.so   [.] pthread_mutex_unlock
>>>>>
>>>>> pstack of the process (only a couple of the interesting threads):
>>>>> Thread 1 (Thread 0x7f1fba151800 (LWP 1106)):
>>>>> #0 0x00007f1fbaaeb264 in __lll_lock_wait () from
>>>>> /lib64/libpthread.so.0
>>>>> #1 0x00007f1fbaae6508 in _L_lock_854 () from /lib64/libpthread.so.0
>>>>> #2 0x00007f1fbaae63d7 in pthread_mutex_lock () from
>>>>> /lib64/libpthread.so.0
>>>>> #3 0x000000000072822f in SSL_locking_callback(int, int, char
>>>>> const*, int) ()
>>>>> #4 0x00000031d2ae7f03 in ?? () from /usr/lib64/libcrypto.so.10
>>>>> #5 0x00000031d2ae7d8c in ?? () from /usr/lib64/libcrypto.so.10
>>>>> #6 0x00000031d2ae80ab in ERR_get_state () from
>>>>> /usr/lib64/libcrypto.so.10
>>>>> #7 0x00000031d2ae8557 in ?? () from /usr/lib64/libcrypto.so.10
>>>>> #8 0x00000031d2e424a9 in SSL_get_error () from /usr/lib64/libssl.so.10
>>>>> #9 0x00000000007249ef in
>>>>> SSLNetVConnection::load_buffer_and_write(long, long&, long&,
>>>>> MIOBufferAccessor&, int&) ()
>>>>> #10 0x00000000007389df in write_to_net_io(NetHandler*,
>>>>> UnixNetVConnection*, EThread*) ()
>>>>> #11 0x0000000000738523 in write_to_net(NetHandler*,
>>>>> UnixNetVConnection*, EThread*) ()
>>>>> #12 0x0000000000731922 in NetHandler::mainNetEvent(int, Event*) ()
>>>>> #13 0x00000000004f6d48 in Continuation::handleEvent(int, void*) ()
>>>>> #14 0x000000000075877a in EThread::process_event(Event*, int) ()
>>>>> #15 0x0000000000758c84 in EThread::ex
>>>>>
>>>>> Thread 56 (Thread 0x7f1fb0ecd700 (LWP 1172)):
>>>>> #0 0x00007f1fbaae77d9 in pthread_mutex_unlock () from
>>>>> /lib64/libpthread.so.0
>>>>> #1 0x0000000000728265 in SSL_locking_callback(int, int, char
>>>>> const*, int) ()
>>>>> #2 0x00000031d2b57402 in ?? () from /usr/lib64/libcrypto.so.10
>>>>> #3 0x00000031d2b574bd in FIPS_selftest_failed () from
>>>>> /usr/lib64/libcrypto.so.10
>>>>> #4 0x00000031d2ae9f1c in EVP_DigestInit_ex () from
>>>>> /usr/lib64/libcrypto.so.10
>>>>> #5 0x00000031d2a7977e in HMAC_Init_ex () from
>>>>> /usr/lib64/libcrypto.so.10
>>>>> #6 0x00000031d2a79c29 in ?? () from /usr/lib64/libcrypto.so.10
>>>>> #7 0x00000031d2af7a42 in EVP_PKEY_CTX_ctrl () from
>>>>> /usr/lib64/libcrypto.so.10
>>>>> #8 0x00000031d2ae9ffb in EVP_DigestInit_ex () from
>>>>> /usr/lib64/libcrypto.so.10
>>>>> #9 0x00000031d2af952a in ?? () from /usr/lib64/libcrypto.so.10
>>>>> #10 0x00000031d2e34cf2 in tls1_PRF () from /usr/lib64/libssl.so.10
>>>>> #11 0x00000031d2e35796 in tls1_final_finish_mac () from
>>>>> /usr/lib64/libssl.so.10
>>>>> #12 0x00000031d2e2db2e in ssl3_get_message () from
>>>>> /usr/lib64/libssl.so.10
>>>>> #13 0x00000031d2e1dc3c in ssl3_get_cert_verify () from
>>>>> /usr/lib64/libssl.so.10
>>>>> #14 0x00000031d2e220d6 in ssl3_accept () from /usr/lib64/libssl.so.10
>>>>> #15 0x0000000000725287 in
>>>>> SSLNetVConnection::sslServerHandShakeEvent(int&) ()
>>>>> #16 0x00000000007251b3 in
>>>>> SSLNetVConnection::sslStartHandShake(int, int&) ()
>>>>> #17 0x0000000000723df3 in
>>>>> SSLNetVConnection::net_read_io(NetHandler*, EThread*) ()
>>>>> #18 0x0000000000731852 in NetHandler::mainNetEvent(int, Event*) ()
>>>>> #19 0x00000000004f6d48 in Continuation::handleEvent(int, void*) ()
>>>>> #20 0x000000000075877a in EThread::process_event(Event*, int) ()
>>>>> #21 0x0000000000758c84 in EThread::execute() ()
>>>>> #22 0x0000000000757cf8 in spawn_thread_internal(void*) ()
>>>>> #23 0x00007f1fbaae49d1 in start_thread () from /lib64/libpthread.so.0
>>>>> #24 0x00000030ff0e88fd in clone () from /lib64/libc.so.6
>>>>>
>>>>> -Bryan


