[openssl-users] BIO_read hangs, how can I know if the server wants to send data?

Tue Apr 26 15:09:23 UTC 2016

> From: openssl-users [mailto:openssl-users-bounces at openssl.org] On Behalf
> Of Matt Caswell
> Sent: Tuesday, April 26, 2016 10:06
> To: openssl-users at openssl.org
> Subject: Re: [openssl-users] BIO_read hangs, how can I know if the server
> wants to send data?
> 
> On 26/04/16 14:28, Hanno Böck wrote:
> >
> > What I want to do: Send a couple of HTTP requests over one connection
> > (with HTTP/1.1, keep-alive enabled).
> > Seems simple enough: I send a HTTP request and then read what the
> > server sends, then send the next.
> >
> > However: How do I know when the server has stopped sending?
> > I have attached a code sample (it's missing lots of error checking in
> > the initialization phase, but that's just for simplification of the
> > code and shouldn't matter for now).
> 
> There are a few ways of doing this:
> 
> 1) Track it at the application protocol layer. For example read the
> "Content-Length" HTTP header and wait until you've received that amount
> of data. This is probably the best way. The other ways below only tell
> you whether the network *currently* has any data to provide to you - not
> whether the server has finished sending.

A couple of points:

- This problem applies to any TCP-based application, regardless of whether TLS is used. TCP is a full-duplex byte-stream protocol. It does not provide any record-boundary or flow-direction indicators. I would strongly recommend consulting a good TCP communications reference such as Stevens' /UNIX Network Programming/ or Comer's /Internetworking with TCP/IP/.

- You can't rely on the presence of a Content-Length header in the server's response. For HTTP/1.1, the ways in which a response can be delimited are:
	- Responses to some request types, such as HEAD, do not allow a message body, regardless of what header lines are present. In this case the response is delimited by the blank line that follows the head.
	- Some response types (the 1xx range, 204, and 304) do not allow a message body in the response, and are delimited by the blank line at the end of the head.
	- A response can be delimited by terminating the connection.
	- A response can include a message body whose length is exactly the number of octets specified in the optional Content-Length header.
	- A response can be delimited by using a self-delimiting transfer encoding. In practice, this means using the "chunked" transfer-encoding, and indicating the end of the message body with a zero-length chunk. If trailers are allowed, the actual end of the response is the end of the trailers. If the chunked T-E is used, any Content-Length header MUST be ignored.
	- A response can be delimited by using a self-delimiting Content-type, such as MIME multipart types, if the client accepts such content.
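
As a rough illustration of how a client might apply the rules above once it has parsed the response head, here is a minimal sketch in C. The names (classify_body, enum body_framing) are made up for this example, the header values are assumed to be already extracted and trimmed, and it deliberately ignores the multipart case and CONNECT. It also treats any Transfer-Encoding that does not end in "chunked" as "read until close".

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

enum body_framing {
    BODY_NONE, BODY_CHUNKED, BODY_LENGTH, BODY_UNTIL_CLOSE, BODY_INVALID
};

/* Case-insensitive check that a trimmed Transfer-Encoding value has
 * "chunked" as its final coding, e.g. "chunked" or "gzip, chunked". */
static int ends_with_chunked(const char *te)
{
    static const char want[] = "chunked";
    size_t n = strlen(te), w = sizeof want - 1;

    if (n < w)
        return 0;
    if (n > w) {
        char prev = te[n - w - 1];
        if (prev != ',' && prev != ' ' && prev != '\t')
            return 0;
    }
    for (size_t i = 0; i < w; i++)
        if (tolower((unsigned char)te[n - w + i]) != want[i])
            return 0;
    return 1;
}

/* Hypothetical helper: te and cl are the Transfer-Encoding and
 * Content-Length values, or NULL if the header is absent. */
enum body_framing classify_body(int is_head_request, int status,
                                const char *te, const char *cl,
                                unsigned long long *length_out)
{
    /* No body allowed: the blank line after the head ends the response. */
    if (is_head_request || (status >= 100 && status < 200) ||
        status == 204 || status == 304)
        return BODY_NONE;

    /* Chunked wins; any Content-Length is ignored. */
    if (te != NULL && ends_with_chunked(te))
        return BODY_CHUNKED;

    if (te == NULL && cl != NULL) {
        char *end;
        if (!isdigit((unsigned char)cl[0]))
            return BODY_INVALID;
        errno = 0;
        *length_out = strtoull(cl, &end, 10);
        if (errno != 0 || *end != '\0')
            return BODY_INVALID;
        return BODY_LENGTH;
    }

    /* Nothing else applies: the body runs until the server closes. */
    return BODY_UNTIL_CLOSE;
}
```

The order of the tests matters: the "no body" cases are checked first, and chunked is checked before Content-Length so that a stray Content-Length cannot override it.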

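Thus determining the end of an HTTP message is not trivial. See RFC 2616, section 4.4, for details. (RFC 7230, section 3.3.3, has since replaced it, and no longer treats multipart types as a way of determining message length.)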
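
To give a feel for the chunked case in particular, here is a small sketch in C that scans the bytes accumulated so far and reports where the chunked body ends, trailers included. The function name chunked_body_end is invented for this example. It assumes the buffer starts at the first chunk-size line, ignores chunk extensions, is stricter about whitespace than a production parser would be, and rescans from the start each time rather than keeping state.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Locate the next CRLF in [p, end), or return NULL if there is none yet. */
static const char *find_crlf(const char *p, const char *end)
{
    for (; p + 1 < end; p++)
        if (p[0] == '\r' && p[1] == '\n')
            return p;
    return NULL;
}

/* Returns the number of bytes occupied by the complete chunked body
 * (chunks, zero-length last chunk, trailers, final blank line),
 * 0 if more data is needed, or (size_t)-1 if the data is malformed. */
size_t chunked_body_end(const char *buf, size_t len)
{
    const char *p = buf, *end = buf + len;

    for (;;) {
        const char *eol = find_crlf(p, end);
        const char *q = p;
        size_t size = 0;

        if (eol == NULL)
            return 0;
        /* Parse the hexadecimal chunk size by hand. */
        while (q < eol && isxdigit((unsigned char)*q)) {
            int c = tolower((unsigned char)*q);
            if (size > (SIZE_MAX >> 4))
                return (size_t)-1;              /* size would overflow */
            size = size * 16 + (size_t)(c <= '9' ? c - '0' : c - 'a' + 10);
            q++;
        }
        if (q == p || (q < eol && *q != ';'))
            return (size_t)-1;                  /* no digits, or junk */

        p = eol + 2;
        if (size == 0)
            break;                              /* last chunk */

        /* Need the chunk data plus its trailing CRLF. */
        if (size > (size_t)(end - p) || (size_t)(end - p) - size < 2)
            return 0;
        if (p[size] != '\r' || p[size + 1] != '\n')
            return (size_t)-1;
        p += size + 2;
    }

    /* Trailers: zero or more header lines, then a blank line. */
    for (;;) {
        const char *eol = find_crlf(p, end);
        int blank;

        if (eol == NULL)
            return 0;
        blank = (eol == p);
        p = eol + 2;
        if (blank)
            return (size_t)(p - buf);
    }
}
```

A caller would keep appending whatever each read returns to its buffer and call this again until the result is nonzero. Anything in the buffer past the returned offset belongs to the next response on the connection.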

It's even worse for HTTP/2, but then HTTP/2 is worse in many ways.

In practice, interactive HTTP user agents (browsers) use a combination of methods and heuristics, because they have to deal with broken servers, broken code running under servers, broken intermediary nodes (gateways and proxies), network problems, etc. Thus they try to apply the rules for determining the end of the response, but they also try to render data as it's received, and after a while they'll time out and decide that a message has ended.
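
The timeout fallback can be sketched for a plain POSIX descriptor as below. This is only an illustration: read_with_timeout and READ_TIMED_OUT are names invented here, and the code knows nothing about TLS. With OpenSSL, a readable socket does not guarantee that application data is available, and the library may already be holding decrypted bytes (see SSL_pending), so a real client would need to account for both.

```c
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define READ_TIMED_OUT (-2)

/* Wait up to timeout_ms for fd to become readable, then read once.
 * Returns the number of bytes read, 0 on end of stream, -1 on error,
 * or READ_TIMED_OUT if nothing arrived in time. */
long read_with_timeout(int fd, void *buf, size_t n, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc < 0)
        return -1;
    if (rc == 0)
        return READ_TIMED_OUT;
    /* Readable, hung up, or in error: let read() report which. */
    return (long)read(fd, buf, n);
}
```

A timeout here only says that nothing arrived within the interval. It does not say that the server has finished, which is why it is a last resort rather than a substitute for the framing rules.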

-- 
Michael Wojcik
Technology Specialist, Micro Focus