matt at openssl.org
Mon Feb 15 13:42:56 UTC 2016
I have just pushed to github some code that I have been working on to
implement a feature I have called "pipelining". This is still WIP,
although it is fairly well advanced. I am keen to hear any feedback. You
can see the PR here:
The idea is that some engines may be able to process multiple
simultaneous crypto operations. This capability could be utilised to
parallelise the processing of a single connection. For example a single
write could be split into multiple records and each one encrypted
independently and in parallel (this can only work in TLS1.1+ where we
get a new IV per record).
This initial version only provides pipelining for TLS (no DTLS at this
stage). Documentation still needs to be added (hence WIP).
It works similarly to how the existing MULTIBLOCK code works now. The
primary differences are that MULTIBLOCK only ever deals with batches of
4 or 8 records in one go and the engine is responsible for writing the
entire set of records including all the record headers. This means the
engine ciphers are TLS protocol specific. They couldn't ever be used for
DTLS and it doesn't make any sense to call them directly (i.e. not in
the context of a TLS connection).
The pipelining support is protocol agnostic and hence can be used for
TLS or directly via EVP. It is also much more flexible about the number
of pipelines that are used at any one time.
Two new SSL/SSL_CTX parameters can be used to control write
pipelining: max_pipelines and split_send_fragment.
max_pipelines defines the maximum number of pipelines that can ever be
used in one go for a single connection. It must always be less than or
equal to SSL_MAX_PIPELINES (currently defined to be 32). By default only
one pipeline will be used (i.e. normal non-parallel operation).
split_send_fragment defines how data is split up into pipelines. The
number of pipelines used is the amount of data provided to the
SSL_write call divided by split_send_fragment (rounded up, and capped
at max_pipelines).
For example, if split_send_fragment is set to 2000 and max_pipelines is 4:
SSL_write called with 0-2000 bytes == 1 pipeline used
SSL_write called with 2001-4000 bytes == 2 pipelines used
SSL_write called with 4001-6000 bytes == 3 pipelines used
SSL_write called with 6001+ bytes == 4 pipelines used
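The mapping above can be sketched as a small helper. This is purely
illustrative: the function name and the explicit ceiling-division logic
are my own rendering of the rule described, not code from the PR.

```c
#include <stddef.h>

/* Illustrative only: number of pipelines used for an SSL_write of
 * `len` bytes -- ceiling division by split_send_fragment, capped at
 * max_pipelines, per the rule described above. */
static size_t num_pipelines(size_t len, size_t split_send_fragment,
                            size_t max_pipelines)
{
    size_t n;

    if (len == 0)
        return 1;   /* at least one pipeline is always used */
    n = (len + split_send_fragment - 1) / split_send_fragment;
    return n > max_pipelines ? max_pipelines : n;
}
```

With split_send_fragment = 2000 and max_pipelines = 4 this reproduces
the table above, e.g. 2001 bytes gives 2 pipelines and anything over
6000 bytes is capped at 4.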
split_send_fragment must always be less than or equal to
max_send_fragment. By default it is set to be equal to
max_send_fragment. This will mean that the same number of records will
always be created as would have been created in the non-parallel case,
although the data will be apportioned differently. In the parallel case
data will be spread equally between the pipelines.
Read pipelining is controlled slightly differently from write
pipelining. While reading we are constrained by the number of
records that the peer (and the network) can provide to us in one go. The
more records we can get in one go the more opportunity we have to
parallelise the processing.
There are two parameters that affect this:
- the number of pipelines that we are willing to process in one go,
  controlled by max_pipelines (as for write pipelining);
- the size of our read buffer. A subsequent commit will provide an API
  for adjusting the size of the buffer.
Another requirement for this to work is that read_ahead must be set. The
read_ahead parameter will attempt to read as much data into our read
buffer as the network can provide. Without this set, data is read into
the read buffer on demand. Setting the max_pipelines parameter to a
value greater than 1 will automatically also turn read_ahead on.
Finally, the read pipelining as currently implemented will only
parallelise the processing of application data records. This would only
make a difference for renegotiation, so it is unlikely to have a
significant impact in practice.
You need to have an engine that provides ciphers that support this. I
have updated the dasync engine to provide some suitable ciphers. I've
also added some arguments to s_client and s_server to support setting of
the split_send_fragment and max_pipelines variables for write
pipelining, and the size of the read buffer for read pipelining.