[openssl-dev] Work on a new RNG for OpenSSL

Benjamin Kaduk bkaduk at akamai.com
Wed Jun 28 17:01:29 UTC 2017


On 06/26/2017 11:28 PM, Paul Dale wrote:
> Given the variety of RNGs available, would an EVP RNG interface make sense?  With a safe default in place (and no weak generators provided), the decision can be left to the user.
> A side benefit is that the unit tests could implement a simple, fully deterministic generator, and the code for producing known sequences could be removed from the code base.

There are some benefits to this idea, as you note, but it does not seem
like a clear "immediate win" to me.  Maybe this is just an emotional
response that has not fully absorbed "no weak generators provided"; I
can't really articulate any reason to oppose it other than "randomness
is so low-level that we should just provide it and not options for it".
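
To make the idea concrete, I'd imagine a dispatch table shaped roughly
like the other EVP method tables.  A sketch (all of these names are
purely illustrative -- none of them is actual OpenSSL API):

    #include <stddef.h>

    /* Hypothetical EVP-style RNG method table, mirroring the shape of
     * EVP_MD / EVP_CIPHER dispatch; illustrative names only. */
    typedef struct evp_rand_method_st {
        const char *name;
        int (*instantiate)(void *ctx, const unsigned char *seed,
                           size_t seedlen);
        int (*reseed)(void *ctx, const unsigned char *seed,
                      size_t seedlen);
        int (*generate)(void *ctx, unsigned char *out, size_t outlen);
        void (*cleanup)(void *ctx);
    } EVP_RAND_METHOD;

The unit-test win you mention falls out naturally: the tests register a
fully deterministic method (say, a counter run through a fixed-key
cipher) and get known-answer output with no special cases in the
production code.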

>
> Defence in depth seems prudent: independent sources with agglomeration and whitening.

As Kurt noted, [on modern OSes,] it is really unclear what sources are
available to us that are not already being used by the kernel.  Rich had
commented about the DragonFly (kernel) implementation, "wow, is it
really that easy?".  To a large extent, yes, a secure RNG can present as
being that simple/easy -- if you're writing it in the kernel!  The
kernel has easy and direct access to lots of interrupt-driven entropy
sources, any hardware generators present, etc., as well as rdrand/etc.
It doesn't have to worry about fork-safety or syscall overhead, and can
basically just implement the raw crypto needed for
whitening/mixing/stretching.
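
To illustrate just one of those userspace burdens: fork-safety alone
forces bookkeeping like the following sketch (illustrative only; a
getpid() check can be defeated by pid reuse, so pthread_atfork() or a
kernel-supplied fork indicator is preferable where available):

    #include <sys/types.h>
    #include <unistd.h>

    static pid_t rng_pid;      /* pid that last (re)seeded the pool */

    /* Call at the top of every generate operation. */
    static void rng_check_fork(void)
    {
        pid_t pid = getpid();

        if (pid != rng_pid) {  /* we are in a freshly forked child */
            /* reseed from the kernel here, so that parent and child
             * do not emit identical output streams */
            rng_pid = pid;
        }
    }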

So, [on these same modern OSes,] what benefit do we really get from
using multiple "independent" sources?  They are unlikely to actually be
independent if the kernel is consuming them as well and we consume the
kernel.

Now, of course OpenSSL runs on OSes that do not provide a modern kernel
RNG and we will need some solution for them, which will likely look as
you describe.  I'm just not convinced there is much value in duplicating
what the kernel is doing in the cases that the kernel does it well.

>
> We shouldn't trust the user to provide entropy.  I've seen what is typically provided.  Uninitialised buffers aren't random.  User inputs (mouse and keyboard) likewise aren't very good.  That both are still being suggested is frustrating.  I've seen worse suggestions, some to the effect that "time(NULL) ^ getpid()" is too good and just time() is enough.

Definitely.  But, as we're not the kernel, finding good sources of real
randomness as a generic userspace process is quite hard.

>
> As for specific questions and comments:
>
> John Denker wrote:
>> If you trust the ambient OS to provide a seed, why not
>> trust it for everything, and not bother to implement an
>> openssl-specific RNG at all?
> I can think of a few possibilities:

Ah, preemptive replies to my comments above, excellent.

> * Diversifying the sources provides resistance to compromise of individual sources.  Although a full kernel compromise is unrecoverable, a kernel bug that leaked the internal pools in a read-only manner isn't unforeseeable.

It is not unforeseeable, sure, but neither are lots of other things.
Spewing the contents of the openssl process-local randomness pool on the
network isn't unforeseeable, either; do we have any reason to think
there is substantially more risk from one unknown than the other?

> * Not all operating systems have good RNGs.

Sure, and we need to support the ones that don't have good RNGs.
But on the ones that do, what do we gain from duplicating the effort?

>
> * Draining the kernel's entropy pools is unfriendly behaviour, other processes will typically want some randomness too.
>
> * At boot time the kernel pools are empty (low or no quality).  This compounds when several things require seeding.

I'm not sure what you mean by "draining the kernel's entropy pools".
That is, if you adhere to the belief that taking random bits out of a
generator removes entropy that must be replenished, does that not apply
equally to any generator/pool we write for ourselves?  Or maybe you are
just referring to the behavior of linux /dev/random, in which case I
would point to the suggestion from Ted (the author/maintainer of linux
/dev/random) to just use (getrandom or) /dev/random, and his tacit
agreement that the behavior of reducing the entropy count on reads from
/dev/random is not really needed anymore.
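
For reference, that suggestion reduces to something like this
linux-flavored sketch (abbreviated error handling; getrandom(2) needs
linux >= 3.17 and a libc that exposes it, e.g. glibc >= 2.25, and I've
used /dev/urandom as the fallback device):

    #include <errno.h>
    #include <fcntl.h>
    #include <stddef.h>
    #include <unistd.h>
    #include <sys/random.h>

    /* Fill buf with len bytes from the kernel RNG; returns 1 on
     * success, 0 on failure. */
    static int get_kernel_entropy(unsigned char *buf, size_t len)
    {
        while (len > 0) {
            ssize_t n = getrandom(buf, len, 0); /* blocks until the
                                                 * pool is seeded */
            if (n < 0) {
                if (errno == EINTR)
                    continue;
                break;        /* e.g. ENOSYS: fall back to the device */
            }
            buf += n;
            len -= (size_t)n;
        }
        if (len == 0)
            return 1;

        int fd = open("/dev/urandom", O_RDONLY);
        if (fd < 0)
            return 0;
        while (len > 0) {
            ssize_t n = read(fd, buf, len);
            if (n <= 0) {
                if (n < 0 && errno == EINTR)
                    continue;
                close(fd);
                return 0;
            }
            buf += n;
            len -= (size_t)n;
        }
        close(fd);
        return 1;
    }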

At boot time *all* pools are empty.  FreeBSD has a random seed file on
disk, loaded at next boot, that helps with this (I didn't check linux),
and openssl has/can use ~/.rnd or similar, but those files are not
immune to out-of-band compromise.  To be properly confident of good
randomness, fresh randomness needs to be collected from the environment
and added to the pool, and the kernel is in a much better position to do
so (and to know when it has enough!) than we are.
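
And to be clear, the ~/.rnd mechanism I mention is already expressible
with the existing public API; roughly:

    #include <stdio.h>
    #include <openssl/rand.h>

    static void seed_from_file(void)
    {
        char path[256];

        if (RAND_file_name(path, sizeof(path)) == NULL)
            return;            /* no usable $RANDFILE or home dir */

        /* Mix up to 1024 bytes of saved state into the pool... */
        RAND_load_file(path, 1024);

        /* ...and persist fresh output so the next start isn't cold.
         * As noted above, the file can be tampered with out-of-band,
         * so it supplements -- never replaces -- fresh entropy. */
        if (RAND_write_file(path) <= 0)
            fprintf(stderr, "failed to write seed file %s\n", path);
    }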

> * Performance is also a consideration, although with a gradual collection strategy this should be less of a concern.  Except at start up.

Given that we're going to be implementing several variants anyway, I
propose to defer decisions based on performance considerations until
hard data are actually available.

>
> John Denker wrote:
>> Are you designing to resist an output-text-only attack?  Or do you
>> also want "some" ability to recover from a compromise of
>> the PRNG internal state?
> Yes to both ideally.  Feeding additional seed material into a PRNG could help in the latter case.  It depends on how easy it is to compromise the PRNG state.  If it is trivial in terms of resources, it isn't going to be recoverable.  A short reseed interval can be effective against a slow attack (but not always).

Of course, if we just use the kernel RNG's output directly, we have no
internal state to compromise (though the kernel does, but the kernel
also does its own reseeding).
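
(For the case where we do keep local state, the reseed-interval idea
from the quoted text might look like this sketch -- the names, the
interval, and the elided state handling are all illustrative, and it
assumes the get_kernel_entropy() helper from earlier in this message:)

    #include <stddef.h>

    #define RESEED_INTERVAL 1024       /* illustrative, not tuned */

    extern int get_kernel_entropy(unsigned char *buf, size_t len);

    struct drbg {
        unsigned long calls_since_reseed;
        /* cipher/hash state elided */
    };

    static int drbg_generate(struct drbg *d, unsigned char *out,
                             size_t len)
    {
        if (d->calls_since_reseed++ >= RESEED_INTERVAL) {
            unsigned char seed[32];

            if (!get_kernel_entropy(seed, sizeof(seed)))
                return 0;      /* refuse output rather than run on
                                * possibly-stale state */
            /* mix seed into the internal state here */
            d->calls_since_reseed = 0;
        }
        /* produce len bytes of output from the internal state (elided) */
        (void)out;
        (void)len;
        return 1;
    }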

>
>
> John Denker wrote:
>>> Do you think we need to use multiple sources of randomness?
>> Quality is more important than quantity.
> Given the difficulty of quantifying the quality of sources, I think that having multiple sources is prudent.  Finding out that a source is low quality when it was believed to be good is less embarrassing when you've other sources in operation.  Likewise, if a source fails.

More sources are better, sure.  But don't count the same source twice!
(That is, don't credit a source that the kernel already consumes when we
are also consuming the kernel.)
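
To be concrete about the "agglomeration and whitening" from earlier in
the thread: the mixing itself is the easy part, e.g. this sketch; the
entropy *accounting* is where double counting bites:

    #include <stddef.h>
    #include <openssl/sha.h>

    /* Hash independent inputs together: a bad source can't hurt the
     * output, and one good source carries it.  But credit nothing for
     * a source the kernel already consumed if the kernel's output is
     * also an input here. */
    static void mix_sources(const unsigned char *a, size_t alen,
                            const unsigned char *b, size_t blen,
                            unsigned char out[SHA256_DIGEST_LENGTH])
    {
        SHA256_CTX ctx;

        SHA256_Init(&ctx);
        SHA256_Update(&ctx, a, alen);
        SHA256_Update(&ctx, b, blen);
        SHA256_Final(out, &ctx);
    }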

-Ben
