Server application hangs on SSL_read, even when client disconnects

Sat Nov 14 07:52:51 UTC 2020

Hello Michael,

Thanks for all this information.

I made the change you suggested (the parent process now closes its copy
of the conversation socket). I also enabled keepalive, with values suited
to my application.

I hope this will solve my issue, but since the problem can take several
weeks to occur, I will not know immediately whether this was the cause :-)

Many thanks for your help.

Regards,
Brice

On Fri, 13 Nov 2020 at 18:52, Michael Wojcik
<Michael.Wojcik at microfocus.com> wrote:

> > From: Brice André <brice at famille-andre.be>
> > Sent: Friday, 13 November, 2020 09:13
>
> > "Does the server parent process close its copy of the conversation
> > socket?"
> > I checked my code, and it seems that it does not. Is it needed?
>
> You'll want to do it, for a few reasons:
>
> - You'll be leaking descriptors in the server, and eventually it will
>   hit its limit.
> - If the child process dies without cleanly closing its end of the
>   conversation, the parent will still have an open descriptor for the
>   socket, so the network stack won't terminate the TCP connection.
> - A related problem: If the child just closes its socket without
>   calling shutdown, no FIN will be sent to the client system (because
>   the parent still has its copy of the socket open). The client system
>   will have the connection in one of the termination states (FIN_WAIT,
>   maybe? I don't have my references handy) until it times out.
> - A bug in the parent process might cause it to operate on the
>   connected socket, causing unexpected traffic on the connection.
> - All such sockets will be inherited by future child processes, and one
>   of them might erroneously perform some operation on one of them.
>   Obviously there could also be a security issue with this, depending
>   on what your application does.
>
> Basically, when a descriptor is "handed off" to a child process by
> forking, you generally want to close it in the parent, unless it's used
> for parent-child communication. (There are some cases where the parent
> wants to keep it open for some reason, but they're rare.)
>
> On a similar note, if you exec a different program in the child process
> (I wasn't sure from your description), it's a good idea for the parent
> to set the FD_CLOEXEC option (with fcntl) on its listening socket and
> any other descriptors that shouldn't be passed along to child
> processes. You could close these manually in the child process between
> the fork and exec, but FD_CLOEXEC is often easier to maintain.
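
As an illustration of the FD_CLOEXEC suggestion, a small sketch using
fcntl. The helper name set_cloexec is hypothetical.

```c
#include <fcntl.h>

/*
 * Hypothetical helper: mark fd close-on-exec while preserving any other
 * descriptor flags. Returns 0 on success, -1 on error (errno is set by
 * fcntl).
 */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    if (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1)
        return -1;
    return 0;
}
```

The parent would call it once on the listening socket, for example
set_cloexec(listen_fd), right after creating it. The flag only takes
effect at exec; a child that forks without exec still inherits the
descriptor and has to close it explicitly.
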
>
> For some applications, you might just dup2 the socket over descriptor 0
> or descriptor 3, depending on whether the child needs access to stdio,
> and then close everything higher.
>
> Closing descriptors not needed by the child process is a good idea even
> if you don't exec, since it can prevent various problems and
> vulnerabilities that result from certain classes of bugs. It's a
> defensive measure.
>
> The best source for this sort of recommendation, in my opinion, remains
> W. Richard Stevens' /Advanced Programming in the UNIX Environment/. The
> book is old, and Linux isn't UNIX, but I don't know of any better
> explanation of how and why to do things in a UNIX-like OS.
>
> And my favorite source of TCP/IP information is Stevens' /TCP/IP
> Illustrated/.
>
> > Could it explain my problem?
>
> In this case, I don't offhand see how it does, but I may be overlooking
> something.
>
> > I suppose that, if for some reason, the communication with the client
> > is lost (crash of client, loss of network, etc.) and keepalive is not
> > enabled, this may fully explain my problem?
>
> It would give you those symptoms, yes.
>
> > If yes, do you have an idea of why keepalive is not enabled?
>
> The Host Requirements RFC (RFC 1122) mandates that it be disabled by
> default. I think the primary reasoning for that was to avoid
> re-establishing virtual circuits (e.g. dial-up connections) for
> long-running connections that had long idle periods.
>
> Linux may well have a kernel tunable or similar to enable TCP keepalive
> by default, but it seems to be switched off on your system. You'd have
> to consult the documentation for your distribution, I think.
>
> By default (again per the Host Requirements RFC), it takes quite a long
> time for TCP keepalive to detect a broken connection. It doesn't start
> probing until the connection has been idle for 2 hours, and then you
> have to wait for the TCP retransmit timer times the retransmit count to
> be exhausted - typically over 10 minutes. Again, some OSes let you
> change these defaults, and some let you change them on an individual
> connection.
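
For the per-connection case, a sketch assuming Linux, where the
TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT socket options are
available (other systems use different names or lack them). The helper
name enable_keepalive and the values in the usage note are hypothetical,
not the ones chosen for the application in this thread.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (Linux-specific options): enable keepalive on one
 * TCP socket and override the system-wide timing for that socket only.
 *   idle_s     - idle seconds before the first probe
 *   interval_s - seconds between unanswered probes
 *   probes     - unanswered probes before the connection is dropped
 * Returns 0 on success, -1 on error (errno is set by setsockopt).
 */
int enable_keepalive(int fd, int idle_s, int interval_s, int probes)
{
    int on = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) == -1)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_s, sizeof idle_s) == -1)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL,
                   &interval_s, sizeof interval_s) == -1)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,
                   &probes, sizeof probes) == -1)
        return -1;
    return 0;
}
```

With enable_keepalive(conn_fd, 60, 10, 5), a dead peer on an idle
connection would be noticed after roughly 60 + 10 * 5 = 110 seconds, at
which point a blocked read on the socket fails instead of hanging
indefinitely.
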
>
> --
> Michael Wojcik