ssl3_get_record:decryption failed on some machines
Matt Caswell
matt at openssl.org
Mon Nov 25 10:35:01 UTC 2019
On 25/11/2019 08:45, fergtm at hyperion.io wrote:
> Sorry to bring this up again, but I really don't know how to fix it. I already
> rewrote my code to use SSL_read/SSL_write instead of an SSL filter BIO, but I
> still get the same error.
>
> I can reproduce it when the sender is nginx, socat openssl-listen or openssl
> s_server. Both the server and client are running on the same machine.
>
> The SSL object is not using a socket BIO; instead I use a BIO pair. I may be
> using the BIO pair incorrectly, but I haven't found any complete examples of
> how to use them.
>
> It works perfectly if I use a debug build of OpenSSL.
This suggests it *could* be a compiler bug. You might want to experiment
with different optimization levels to see if that makes a difference.

Matt
>
> Thanks
>
> -----Original Message-----
> From: openssl-users <openssl-users-bounces at openssl.org> On Behalf Of
> Fernando Gutierrez Mendez
> Sent: Monday, November 18, 2019 2:34 PM
> To: openssl-users at openssl.org
> Subject: Re: ssl3_get_record:decryption failed on some machines
>
> The writer is my own code, but I can also reproduce the problem when the server
> is nginx and the client is my app.
>
> In my code I do not use OpenSSL socket BIOs; instead I do reads/writes through
> a BIO pair:
>
> pairBase = BIO_new(BIO_s_bio());
> pairInt = BIO_new(BIO_s_bio());
>
> [...]
>
> BIO_make_bio_pair(pairBase, pairInt);
>
> [...]
>
> sslBIO = BIO_new_ssl(ssl_ctx, 1 /* Client */);
>
> [...]
>
> BIO_push(sslBIO, pairInt);
>
> After each BIO_read/BIO_write to sslBIO I read/write any available data from
> the network to pairBase.
>
> I think I'm handling partial writes correctly:
>
> SSL_CTX_set_mode(ssl_ctx, SSL_MODE_AUTO_RETRY |
> SSL_MODE_ENABLE_PARTIAL_WRITE | SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER);
>
> [..]
>
> ret = BIO_write(sslBIO, buf, (int)length);
>
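> /* A non-positive result without the retry flag is a hard failure. */
> if (ret <= 0 && !BIO_should_retry(sslBIO))
> {
>     /* Handle error */
>     return;
> }
>
> /* On a partial write, advance past the bytes that were accepted. */
> if (ret > 0)
> {
>     buf = ((uint8_t *)buf) + (size_t)ret;
>     length -= (size_t)ret;
> }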
>
> but again the problem reproduces even if the writer is nginx.
>
> Thanks
>
> On Mon, Nov 18, 2019 at 02:19:30PM -0500, Viktor Dukhovni wrote:
>>> On Nov 18, 2019, at 1:44 PM, Fernando Gutierrez Mendez <fergtm at hyperion.io> wrote:
>>>
>>> I use non-blocking IO with an SSL BIO, so a call to BIO_read eventually
>>> returns -1. When this happens I call BIO_should_retry to test whether this is
>>> due to an error or to the underlying non-blocking transport.
>>
>> Is the writer side also non-blocking? Is it your own code?
>>
>>> This code works correctly, but after transferring between 1Mb and 5Mb (it
>>> varies every time) BIO_should_retry returns false and SSL_get_error returns
>>> SSL_ERROR_SSL. The error is "139964546914112:error:1408F119:SSL
>>> routines:ssl3_get_record:decryption failed or bad record
>>> mac:../ssl/record/ssl3_record.c:677"
>>
>> One way to get a decryption integrity failure is for a non-blocking
>> writer to not handle partial writes correctly. If, on an incomplete
>> write, the writer resends the whole buffer rather than only what it
>> failed to send last time, the TCP stream ends up stuttering
>> ciphertext, and the reader sees data integrity errors.
>>
>> This can be seen by looking for unexpected runs of repeated ciphertext
>> in a PCAP capture of the data.
>>
>> Whether the data sent to a particular reader ever ends up blocked at
>> the TCP layer for a given writer can depend on various network-layer
>> issues making some machines more prone to problems than others.
>>
>> --
>> Viktor.
>>
>
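
The partial-write failure mode Viktor describes can be illustrated without
OpenSSL at all. What follows is only a minimal sketch, not code from this
thread: fake_sink, sink_write, send_resuming and send_stuttering are
hypothetical names, and the sink is an in-memory stand-in for a transport
that accepts a short write once. It contrasts a writer that resumes after
the bytes already accepted with one that resends the whole buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory transport. If short_once is non-zero, the next
 * write accepts at most that many bytes; the limit is then cleared. */
typedef struct {
    uint8_t out[64];     /* bytes that reached the "wire" */
    size_t  used;        /* number of valid bytes in out */
    size_t  short_once;  /* one-shot cap on the next write, 0 = no cap */
} fake_sink;

/* Accept up to len bytes and return how many were actually taken. */
static int sink_write(fake_sink *s, const uint8_t *buf, size_t len)
{
    size_t n = len;

    if (s->short_once != 0 && s->short_once < n)
        n = s->short_once;
    s->short_once = 0;

    assert(n <= sizeof(s->out) - s->used);  /* sketch only: no overflow path */
    memcpy(s->out + s->used, buf, n);
    s->used += n;
    return (int)n;
}

/* Correct: after a short write, continue from the first unsent byte. */
static void send_resuming(fake_sink *s, const uint8_t *buf, size_t len)
{
    size_t off = 0;

    while (off < len) {
        int n = sink_write(s, buf + off, len - off);
        off += (size_t)n;
    }
}

/* Buggy: after a short write, resend the whole buffer from the start,
 * so the bytes already accepted appear twice in the stream. */
static void send_stuttering(fake_sink *s, const uint8_t *buf, size_t len)
{
    int n;

    do {
        n = sink_write(s, buf, len);
    } while ((size_t)n < len);
}
```

With the 10-byte message "0123456789" and the first write capped at 4 bytes,
send_resuming leaves exactly those 10 bytes in the sink, while
send_stuttering leaves 14 bytes beginning "01230123". If the bytes were TLS
records, a repeated run like that would fail the receiver's integrity check.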