[External] RE: SSL handshake hanging

Helde, Paavo Paavo.Helde at PERKINELMER.COM
Thu Apr 20 08:39:48 UTC 2023


Thank you for the reply! I do not have access to the network where this is happening, so debugging is complicated. It now seems the problem is with low-level TCP connections, which sometimes (very rarely) go stale or get stuck for unknown reasons. There is no trace of such a connection in the server-side logs, so it looks like the connection never reaches the server. It seems OpenSSL only got involved because the SSL handshake is the first thing that takes place on the connection.

Anyway, it looks like we can work around this by setting an SO_RCVTIMEO timeout on the socket, and closing and reopening the socket when the timeout is hit. We might consider using libcurl in the future.
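
For illustration, a minimal sketch of that workaround. It is written in Python only to keep it short and self-contained; in C the same option is set with setsockopt(). The helper names open_with_recv_timeout and run_with_reconnect, the default timeout, and the retry count are all hypothetical, and the struct timeval packing assumes 64-bit Linux.

```python
import socket
import struct


def open_with_recv_timeout(address, seconds):
    """Connect a blocking TCP socket and set SO_RCVTIMEO on it."""
    sock = socket.create_connection(address)
    whole, micros = divmod(int(seconds * 1_000_000), 1_000_000)
    # Assumption: struct timeval is two native longs (64-bit Linux layout).
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVTIMEO,
                    struct.pack("ll", whole, micros))
    return sock


def run_with_reconnect(address, first_exchange, seconds=5.0, attempts=3):
    """Run first_exchange(sock); on a receive timeout, close and reopen.

    Returns (sock, attempt_number) once first_exchange succeeds.
    """
    for attempt in range(1, attempts + 1):
        sock = open_with_recv_timeout(address, seconds)
        try:
            first_exchange(sock)  # in the real client: the TLS handshake
        except BlockingIOError:
            # A blocking recv() that hits SO_RCVTIMEO fails with
            # EAGAIN/EWOULDBLOCK, which Python reports as BlockingIOError.
            sock.close()
            continue
        except Exception:
            sock.close()
            raise
        return sock, attempt
    raise TimeoutError(f"no response after {attempts} attempts")
```

Each retry uses a fresh socket, so a connection that went stale somewhere in the network is abandoned rather than reused.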

Thanks, and regards
Paavo

-----Original Message-----
From: openssl-users <openssl-users-bounces at openssl.org> On Behalf Of Michael Wojcik via openssl-users
Sent: Tuesday, 18 April 2023 19:19
To: openssl-users at openssl.org
Subject: [External] RE: SSL handshake hanging

> From: openssl-users <openssl-users-bounces at openssl.org> On Behalf Of 
> Helde, Paavo
> Sent: Tuesday, 18 April, 2023 03:32

> We are using openssl for client-side HTTP connections. Sometimes they 
> randomly hang during the SSL handshake. It looks like there are some 
> network or server-side problems; earlier the same server was 
> responding with an error like:

> SSL_write() failed with error code: SSL_ERROR_SYSCALL

> According to google this means: The SSL_ERROR_SYSCALL with errno value 
> of 0 indicates unexpected EOF from the peer.

Without more information this doesn't tell us anything useful. A network trace on the client side would at least tell us whether the client is closing its end, and whether it's doing it with a TCP FIN or RST. A network trace on the server side to compare would also be useful.

> Later another request is made to the same server, which hangs indefinitely. Stack backtrace in gdb:

> #0  0x00007ff999c54ab4 in read ()
> ...
> #13 0x00007ff97c99d54c in SSL_connect ()

So it's doing a blocking receive during the handshake and the server isn't responding. Again, I'd do a network trace, or ideally network traces on both the client and the server.

> My question is, what can I do on the client side to debug the problem, 
> or at least to avoid such hanging? I guess I can set a socket read 
> timeout beforehand, and reset it after the handshake, or is there a 
> better way?

Socket receive timeouts (for stacks that support them) or non-blocking sockets are the best options for preventing long blocking socket receive operations, yes. You may well want a socket receive timeout (possibly with a different value) after the handshake completes too. If the server finishes the handshake and receives your HTTP request, but then takes a day to send the response, do you want your application to block for that long?
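
As a rough sketch of using one timeout for the handshake and a different one afterwards, here is the idea in Python, whose ssl module is a wrapper around OpenSSL. Note that settimeout() is Python's own timeout mechanism rather than SO_RCVTIMEO, but the effect on a stalled handshake is the same. The function name tls_connect and both default values are made up for the example.

```python
import socket
import ssl


def tls_connect(host, port, handshake_timeout=10.0, io_timeout=60.0,
                context=None):
    """Open a TLS connection with separate handshake and I/O timeouts."""
    context = context or ssl.create_default_context()
    # This timeout covers the TCP connect and, below, the TLS handshake.
    raw = socket.create_connection((host, port), timeout=handshake_timeout)
    try:
        # wrap_socket() performs the handshake on an already connected
        # socket; if the peer stays silent it raises socket.timeout.
        tls = context.wrap_socket(raw, server_hostname=host)
    except OSError:
        raw.close()
        raise
    # After the handshake, switch to the timeout wanted for the
    # request/response phase.
    tls.settimeout(io_timeout)
    return tls
```

A caller would catch socket.timeout around tls_connect() and decide whether to retry, report the failure, or give up.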

In practice, most applications either have to let the user interrupt blocking network operations, or enforce a reasonable timeout -- where "reasonable" depends on the application requirements.

> This is openssl 1.1, would it make sense to switch over to openssl 3.0?

Yes, since 1.1.1 goes out of support in a few months. Won't help with this issue, though.

Another option, and one that I would recommend, is to not use OpenSSL directly at all. Use a library that does HTTPS and handles timeouts for you, such as libcurl. HTTP is a difficult protocol to implement correctly; OpenSSL is an API that is difficult to use correctly. Abstractions are your friend.
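
To show what that looks like from the caller's side, here is a small sketch. It uses Python's standard http.client as a stand-in for libcurl (it is not libcurl, just another library that does HTTPS and takes a timeout); the function name fetch and the default timeout are invented for the example.

```python
import http.client
import socket


def fetch(host, path="/", port=443, timeout=10.0):
    """GET https://host:port/path, giving up if any step blocks too long."""
    # The timeout applies to the connect, the TLS handshake, and each
    # blocking read of the response.
    conn = http.client.HTTPSConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

With a server that never answers the handshake, fetch() raises socket.timeout instead of hanging.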

--
Michael Wojcik

