[openssl-users] Nonblocking IO: Kindly need your urgent authoritative confirmation that the OpenSSL API's SSL_read and SSL_write and select() must indeed be used together *exactly* like this, as to keep us all safe (from infinite loop & zombification bugs)!

Tinker tinkr at openmailbox.org
Sun Feb 22 21:22:16 UTC 2015


(This is a resubmit of the same post with the numbered list represented 
better here in text format, as the previous mail got the numbered list 
presented well only in the HTML version.)


Dear OpenSSL list,

I need your authoritative answer on the following question.

This same question has surely been raised on this mailing list already, 
in more or less similar wording. My question is:



Please help me get clear on *exactly* what my program needs to do in 
response to an SSL_WANTS_READ or SSL_WANTS_WRITE result (that is, 
SSL_ERROR_WANT_READ or SSL_ERROR_WANT_WRITE as reported by 
SSL_get_error()) from any of SSL_read, SSL_write and SSL_shutdown.

This is for using OpenSSL in nonblocking IO mode on a socket. I will 
also use the BIO mode later, so let's include that in the question too.


The reason I ask is that if my understanding of how this should be 
handled is not *ABSOLUTELY CORRECT*, then I will be at risk of 
implementing the select() calls in my program incorrectly, and that 
would lead to a risk of either *INFINITE LOOP* or *ZOMBIFICATION* 
(because of doing too few or too many select() calls), which would be 
*ABOUT AS BAD AS A PROGRAM BUG COULD EVER BE*.



I humbly ask you to help me get clear on this, simply by confirming to 
me that the understanding expressed in each of the following points is 
*ABSOLUTELY CORRECT*.

With your clear confirmation that each of these points is indeed 
correct, perhaps the last word needed on this topic will have been 
said - thank you very much!!



The points that I kindly ask you to confirm as absolutely correct are:

  ## 1
By giving me an SSL_WANTS_READ result for an SSL_read call, OpenSSL 
tells me that it cannot do any more SSL_read work for me, i.e. any more 
reading from the SSL channel, unless I re-invoke SSL_read, *and before 
that re-invocation* have performed a select() for *readability* on this 
socket (that is, included this socket in the *readfds* argument to the 
select() call) and thereby obtained confirmation that more data has 
arrived on this socket.

  ## 2
By giving me an SSL_WANTS_READ result for an SSL_write call, OpenSSL 
tells me that it cannot do any more SSL_write work for me, i.e. any 
more writing to the SSL channel, unless I re-invoke SSL_write, *and 
before that re-invocation* have performed a select() for *readability* 
on this socket (that is, included this socket in the *readfds* argument 
to the select() call) and thereby obtained confirmation that more data 
has arrived on this socket.

  ## 3
By giving me an SSL_WANTS_READ result for an SSL_shutdown call, OpenSSL 
tells me that it cannot do any more SSL_shutdown work for me, i.e. any 
more pushing forward of the clean closure of the SSL channel, unless I 
re-invoke SSL_shutdown, *and before that re-invocation* have performed 
a select() for *readability* on this socket (that is, included this 
socket in the *readfds* argument to the select() call) and thereby 
obtained confirmation that more data has arrived on this socket.

  ## 4
Thus in all points 1)-3), SSL_WANTS_READ is essentially OpenSSL's way to 
say that it ran out of input data on the socket, and that it needs more 
input data in order to proceed with the respective operation at hand.


  ## 5
By giving me an SSL_WANTS_WRITE result for an SSL_read call, OpenSSL 
tells me that it cannot do any more SSL_read work for me, i.e. any more 
reading from the SSL channel, unless I re-invoke SSL_read, *and before 
that re-invocation* have performed a select() for *writability* on this 
socket (that is, included this socket in the *writefds* argument to the 
select() call) and thereby obtained confirmation that the next SSL_read 
call will be able to write more data to the socket than the SSL_read 
call that returned SSL_WANTS_WRITE could.

  ## 6
By giving me an SSL_WANTS_WRITE result for an SSL_write call, OpenSSL 
tells me that it cannot do any more SSL_write work for me, i.e. any 
more writing to the SSL channel, unless I re-invoke SSL_write, *and 
before that re-invocation* have performed a select() for *writability* 
on this socket (that is, included this socket in the *writefds* 
argument to the select() call) and thereby obtained confirmation that 
the next SSL_write call will be able to write more data to the socket 
than the SSL_write call that returned SSL_WANTS_WRITE could.

  ## 7
By giving me an SSL_WANTS_WRITE result for an SSL_shutdown call, 
OpenSSL tells me that it cannot do any more SSL_shutdown work for me, 
i.e. any more pushing forward of the clean closure of the SSL channel, 
unless I re-invoke SSL_shutdown, *and before that re-invocation* have 
performed a select() for *writability* on this socket (that is, 
included this socket in the *writefds* argument to the select() call) 
and thereby obtained confirmation that the next SSL_shutdown call will 
be able to write more data to the socket than the SSL_shutdown call 
that returned SSL_WANTS_WRITE could.

  ## 8
Thus in all points 5)-7), SSL_WANTS_WRITE is essentially OpenSSL's way 
to say that the OS's write buffer filled up, so that OpenSSL's write() 
call on the socket would not accept any more data at the time, and that 
for this reason it needs separate ascertainment that the socket is now 
writable, i.e. that a write() to the socket will accept at least one 
more byte of data.


  ## 9
Beyond the SSL_read and SSL_write calls specifically called for in 
points 1)-3) and 5)-7) (and only where called for), my program is *not 
required or otherwise expected to perform any more SSL_read and 
SSL_write calls whatsoever, at any point*.

  ## 10
One consequence of all of the above 1)-9) concerns the case where I 
want to implement a flexible SSL channel reading procedure in my 
program, in terms of SSL_read() and select() only. The point of my 
procedure is to provide a facility for reading from an SSL channel in 
such a way that
* A certain number of bytes are required to be read before it returns, 
and the procedure should block until it gets that many bytes (let's call 
these "bytes_needed"), and
* A certain number of further bytes are allowed to be read before it 
returns, in case they happen to be available for reading already at the 
time of the call, but there will be no blocking to get that many bytes 
(let's call these "bytes_optional").

So, this procedure would be defined something like:

my_flexible_reading_procedure(SSL* ssl, int socket, void* buffer, 
int bytes_needed, int bytes_optional).

Then, my program invokes this procedure, for needing 1 byte and 
optionally accepting another 999.

my_flexible_reading_procedure invokes SSL_read(), which returns 
SSL_WANTS_READ.

my_flexible_reading_procedure calls select() for readability, and 
eventually select() tells us that more input data is available.

my_flexible_reading_procedure then invokes SSL_read() again, and it 
tells us it read 5 bytes.

my_flexible_reading_procedure has now gotten all bytes it needed (which 
was 1), but it *must assume* that the reason SSL_read() returned only 5 
bytes instead of all the 1 + 999 = 1000 bytes was some technicality 
regarding buffering or the like, so my_flexible_reading_procedure must 
re-invoke SSL_read():

my_flexible_reading_procedure invokes SSL_read() again, and this time it 
returns SSL_WANTS_READ.

Now, thanks to the clarifications above, we know that an SSL_WANTS_READ 
result just means that OpenSSL has read all input data available from 
the socket at this time, and that if my_flexible_reading_procedure 
wants to read any more SSL channel data then it first needs to wait for 
more data to come in on the socket from the other party, by making 
another select() for readability.

my_flexible_reading_procedure doesn't want this though, because it 
already got all the bytes it needed, which was 1.

So therefore, my_flexible_reading_procedure returns with a success 
result, reporting that it got 5 bytes, and we know that the SSL channel 
is in a healthy state, and my program is free to make another SSL_read, 
SSL_write or SSL_shutdown call at any moment it wishes to, and then of 
course if that respective call would return SSL_WANTS_READ or 
SSL_WANTS_WRITE in turn, then that will need to be handled as described 
in 1)-3) and 5)-7) above.
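
The walk-through in point 10 can be sketched roughly as follows. To keep 
the sketch compilable without OpenSSL, the TLS layer is hidden behind a 
hypothetical struct channel, so the signature differs from the one given 
above. The names io_status, try_read, wait_ready and demo_point_10, and 
the scripted fake channel, are all illustrative assumptions. A real 
try_read would call SSL_read() and translate SSL_get_error(), and a real 
wait_ready would call select() on the socket.

```c
#include <assert.h>
#include <string.h>

/* Outcome of one nonblocking read attempt (hypothetical names). */
enum io_status { IO_OK, IO_WANT_READ, IO_WANT_WRITE, IO_CLOSED, IO_ERROR };

/*
 * Hypothetical stand-in for the TLS layer. A real try_read would call
 * SSL_read() and, on a result <= 0, map SSL_get_error():
 *   SSL_ERROR_WANT_READ   -> IO_WANT_READ
 *   SSL_ERROR_WANT_WRITE  -> IO_WANT_WRITE
 *   SSL_ERROR_ZERO_RETURN -> IO_CLOSED, anything else -> IO_ERROR.
 * A real wait_ready would select() on the socket for readability or
 * writability according to `want`.
 */
struct channel {
    void *ctx;
    enum io_status (*try_read)(void *ctx, void *buf, int len, int *got);
    int (*wait_ready)(void *ctx, enum io_status want); /* 0 ok, -1 error */
};

/*
 * Read at least bytes_needed bytes, waiting as required, plus up to
 * bytes_optional more if they can be had without waiting.
 * Returns the number of bytes stored, or -1 on error or early close.
 */
int my_flexible_reading_procedure(struct channel *ch, void *buffer,
                                  int bytes_needed, int bytes_optional)
{
    char *out = buffer;
    int total = 0;
    int limit;

    assert(bytes_needed >= 0 && bytes_optional >= 0);
    limit = bytes_needed + bytes_optional;

    while (total < limit) {
        int got = 0;
        enum io_status st =
            ch->try_read(ch->ctx, out + total, limit - total, &got);

        if (st == IO_OK) {
            if (got <= 0)
                return -1;      /* defensive: avoid spinning forever */
            total += got;
        } else if (st == IO_WANT_READ || st == IO_WANT_WRITE) {
            if (total >= bytes_needed)
                break;          /* enough data: do not wait for more */
            if (ch->wait_ready(ch->ctx, st) != 0)
                return -1;      /* waiting failed */
            /* loop around and retry the same read */
        } else if (st == IO_CLOSED && total >= bytes_needed) {
            break;              /* peer closed after we had enough */
        } else {
            return -1;          /* error, or closed too early */
        }
    }
    return total;
}

/* Scripted fake channel replaying the scenario from point 10. */
struct script { int step; int waits; };

static enum io_status fake_read(void *ctx, void *buf, int len, int *got)
{
    struct script *s = ctx;
    if (s->step++ == 1 && len >= 5) {   /* second call delivers 5 bytes */
        memcpy(buf, "hello", 5);
        *got = 5;
        return IO_OK;
    }
    return IO_WANT_READ;                /* first and third call: no data */
}

static int fake_wait(void *ctx, enum io_status want)
{
    struct script *s = ctx;
    (void)want;
    s->waits++;                         /* count the simulated select() */
    return 0;
}

/* Needs 1 byte, accepts 999 more; reports how often it had to wait. */
int demo_point_10(int *waits_out)
{
    struct script s = { 0, 0 };
    struct channel ch = { &s, fake_read, fake_wait };
    char buffer[1000];
    int n = my_flexible_reading_procedure(&ch, buffer, 1, 999);
    *waits_out = s.waits;
    return n;
}
```

With the scripted channel, demo_point_10 returns 5 after exactly one 
wait: the first want-read happens before the needed byte has arrived 
and so leads to a wait, while the second one ends the call without 
waiting.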

  ## 11
This means that libCurl's handling of the SSL_ERROR_WANT_READ result 
(and of the SSL_ERROR_WANT_WRITE result too, right?) from SSL_read, 
defined on rows 1000-1007 and 3079-3083 in 
https://github.com/bagder/curl/blob/aba2c4dca2601cb942f47ea9622e01001c01b799/lib/vtls/openssl.c 
,

and its handling of SSL_ERROR_WANT_READ and SSL_ERROR_WANT_WRITE results 
from SSL_write, defined on rows 3025-3031,

is *absolutely broken* right?

(Perhaps libCurl's use of OpenSSL is done in some limited scope and 
context so that somehow doing it like this works, but generally 
speaking, implementing it like this is *absolutely broken*.)

  ## 12
Nginx's handling of the SSL_ERROR_WANT_READ and SSL_ERROR_WANT_WRITE 
results from SSL_read and SSL_write is correct though, 
https://github.com/git-mirror/nginx/blob/76c07f20962badcadaa0e01d0d8be73cc9ed461f/src/event/ngx_event_openssl.c 
: SSL_ERROR_WANT_READ is routed to a select() for more input data 
(that's on rows 803 and 1318), and SSL_ERROR_WANT_WRITE is routed to a 
select() for writability (that's on rows 819 and 1061).

(Presuming that what "c->read->ready = 0;" really does is to trigger a 
select() for more input data, and what "c->write->ready = 0;" really 
does is to trigger a select() for writability.)


Thank you very much for taking the time to clarify this matter!

Mikael

