[openssl-dev] [openssl/openssl] Dtls listen refactor (#5024)

Tue Jan 16 19:44:38 UTC 2018

Matt Caswell <matt at openssl.org> wrote:
    >> a) when the existing FD is connect(2) any future traffic to the bound
    >> port will get rejected with no port.  So the application really has to
    >> open a new socket first.  The application can do this two ways: it can
    >> open a new socket on which to receive new connections, or it can open
    >> a new socket on which to communicate with the new client.  The second
    >> method is better for reason (b) below.  Either way, the socket used to
    >> communicate with the client needs to be bind(2)ed to the address that
    >> the client used to communicate with the server, and DTLSv1_listen()
    >> didn't collect or return that information.

    > The second way is what is intended.

Unfortunately, there remains a race condition because we have to call bind()
before connect() on the new socket.  Under load, if a packet is received
between the bind() and the connect(), it might go onto the wrong socket
queue. So some packets that could have been processed will get dropped and
have to be retransmitted by the client.

It could be solved if there was a way to punt packets received on the wrong
socket to the correct BIO on the input side.  I looked into this, but decided
it was too difficult...

That would also let one operate a multitude of DTLS connections using a single
socket, which might be a boon to some users.  Mis-designed, it would scale
badly on multi-threaded machines and involve lots of ugly locks.
I don't want to consider the impacts if one had to pass packets between processes...
I don't advocate a solution like this (I'll live with the dropped packets),
but I think it's worth making people aware that they might see mis-directed
packets get dropped.

    > Maybe I misunderstand your point -
    > but the client address *is* returned? Admittedly it's wrapped in a
    > BIO_ADDR, but it's easy to get the underlying "raw" address using
    > BIO_ADDR_rawaddress():

    > Why isn't recvfrom() suitable (which is what the code currently uses to
    > get the address)?

The address of the remote client is returned ("getpeername()") by DTLSv1_listen().
That's all that recvfrom() gives you.

recvfrom() was a reasonable API for SunOS 3.x machines with a single 10Mb/s
interface with a single IPv4 address.  I loved all that at the time...
But it doesn't work that well when we might have a dozen different kinds of
IPv6 addresses on each virtual interface.

The address that I'm talking about needing is the one the remote client used
to reach us.  That's the destination IP of the incoming packet ("getsockname()" in TCP speak).

With TCP this is never an issue because the kernel creates the new socket and
copies the right stuff in for us when it creates the socket.

With UDP, the source address for outgoing packets needs to match or the
client may get a response from an address that it didn't connect to.  Worse,
there might be firewalls or policy routing that would cause the packet to
disappear or get routed differently.  In my application, I am definitely
dealing with connections over IPv6 Link-Local addresses with a multitude of
virtual links.

In your code example:
    bind(client_fd, &server_addr, sizeof(struct sockaddr_in6));

server_addr has to be set from the destination address of the incoming
packet; it is not a global that the admin set, or that SIP negotiated.
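
A minimal sketch of that bind-then-connect sequence, assuming the local
address has already been recovered from the incoming packet.  The function
name open_client_socket() is hypothetical and not part of OpenSSL.  For a
Link-Local address, sin6_scope_id in `local` has to carry the index of the
interface the packet arrived on.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a per-client UDP socket that is bound to the
 * local address the client sent to, then connected to the client.
 * Returns the new descriptor, or -1 on error.
 */
static int open_client_socket(const struct sockaddr_in6 *local,
                              const struct sockaddr_in6 *peer)
{
    int on = 1;
    int fd;

    assert(local != NULL && peer != NULL);

    fd = socket(AF_INET6, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    /*
     * The listening socket already holds this address and port, so both
     * sockets need address reuse enabled.  Assumption: SO_REUSEADDR is
     * enough here; some systems want SO_REUSEPORT instead.
     */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0)
        goto fail;

    if (bind(fd, (const struct sockaddr *)local, sizeof(*local)) < 0)
        goto fail;

    /*
     * Race window: until connect() returns, datagrams from any peer that
     * target this address and port may be queued on either socket.
     */
    if (connect(fd, (const struct sockaddr *)peer, sizeof(*peer)) < 0)
        goto fail;

    return fd;

fail:
    close(fd);
    return -1;
}
```

The window between bind() and connect() is where a packet can land on the
wrong socket queue; the sketch does nothing to close it.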

In the bad old days, this meant opening a socket for every interface on the
machine, and re-reading the list of interfaces based upon some heuristic.
(routing socket, SIGHUP, ...)

Even getting a list of interfaces (and their addresses) is itself an
OS-dependent activity!  And, if you use the old BSD interface on Linux,
you'll miss a bunch of interfaces, because the Linux kernel people thought
that it would confuse BSD APIs if interfaces were returned that the BSD
interface didn't create.  So you can't even win there.
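
For illustration, a short sketch of enumerating IPv6 addresses, assuming a
platform that provides getifaddrs(3), such as glibc or the BSDs.  The helper
name list_ipv6_addrs() is hypothetical, and this is not necessarily the
interface any given application should use.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: print every IPv6 address with its interface name
 * and scope id.  Returns the number of addresses printed, or -1 on error.
 */
static int list_ipv6_addrs(FILE *out)
{
    struct ifaddrs *ifa_list, *ifa;
    char text[INET6_ADDRSTRLEN];
    int count = 0;

    assert(out != NULL);

    if (getifaddrs(&ifa_list) < 0)
        return -1;

    for (ifa = ifa_list; ifa != NULL; ifa = ifa->ifa_next) {
        const struct sockaddr_in6 *sin6;

        /* Entries without an address, or with a non-IPv6 one, are skipped. */
        if (ifa->ifa_addr == NULL || ifa->ifa_addr->sa_family != AF_INET6)
            continue;

        sin6 = (const struct sockaddr_in6 *)ifa->ifa_addr;
        if (inet_ntop(AF_INET6, &sin6->sin6_addr, text, sizeof(text)) == NULL)
            continue;

        fprintf(out, "%s %s scope=%u\n", ifa->ifa_name, text,
                (unsigned int)sin6->sin6_scope_id);
        count++;
    }

    freeifaddrs(ifa_list);
    return count;
}
```

The list is only a snapshot, so the problem of noticing when interfaces come
and go remains.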

The IPv6 API gives us recvmsg() and IPV6_PKTINFO (struct in6_pktinfo), which
makes it all sane.  But, we never got a standard interface for IPv4: Linux
uses something that looks identical to IPv6, and BSD has something with
slightly different names.
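
A minimal sketch of the IPv6 case, assuming a system that implements the
RFC 3542 advanced sockets API (on glibc, struct in6_pktinfo is only visible
with _GNU_SOURCE defined).  The helper names enable_pktinfo() and
recv_with_dst() are hypothetical and not part of OpenSSL.

```c
#define _GNU_SOURCE             /* glibc: exposes struct in6_pktinfo */
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Ask the kernel to attach the destination address to each datagram. */
static int enable_pktinfo(int fd)
{
    int on = 1;

    return setsockopt(fd, IPPROTO_IPV6, IPV6_RECVPKTINFO, &on, sizeof(on));
}

/*
 * Receive one datagram.  On success, returns the payload length, stores the
 * sender in *peer, and stores the destination address and arrival interface
 * in *local.  Returns -1 on error or if no IPV6_PKTINFO was delivered.
 */
static ssize_t recv_with_dst(int fd, void *buf, size_t len,
                             struct sockaddr_in6 *peer,
                             struct in6_pktinfo *local)
{
    union {
        char buf[CMSG_SPACE(sizeof(struct in6_pktinfo))];
        struct cmsghdr align;   /* forces correct alignment of buf */
    } ctrl;
    struct iovec iov;
    struct msghdr msg;
    struct cmsghdr *c;
    ssize_t n;

    assert(peer != NULL && local != NULL);

    iov.iov_base = buf;
    iov.iov_len = len;

    memset(&msg, 0, sizeof(msg));
    msg.msg_name = peer;
    msg.msg_namelen = sizeof(*peer);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    n = recvmsg(fd, &msg, 0);
    if (n < 0)
        return -1;

    /* Walk the ancillary data looking for the packet info record. */
    for (c = CMSG_FIRSTHDR(&msg); c != NULL; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == IPPROTO_IPV6 && c->cmsg_type == IPV6_PKTINFO) {
            memcpy(local, CMSG_DATA(c), sizeof(*local));
            return n;
        }
    }
    return -1;
}
```

local->ipi6_addr is the address the client used to reach us, and
local->ipi6_ifindex is what sin6_scope_id needs for a Link-Local address.
For IPv4, Linux offers IP_PKTINFO with struct in_pktinfo, while the BSDs
offer IP_RECVDSTADDR.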

--
]               Never tell me the odds!                 | ipv6 mesh networks [
]   Michael Richardson, Sandelman Software Works        | network architect  [
]     mcr at sandelman.ca  http://www.sandelman.ca/        |   ruby on rails    [
