[openssl-dev] Plea for a new public OpenSSL RNG API

Tue Aug 29 09:45:26 UTC 2017

Hi everybody,

on the [openssl-dev] mailing list, there has been a long ongoing discussion about the new RAND_DRBG API and how it compares with the old RAND_METHOD API (see "[openssl-dev] Work on a new RNG for OpenSSL"). Two of the most controversial questions were:

 - Do we really need a new RNG API? Should the RAND_DRBG API be made public or kept private? (Currently, it's exported from libcrypto but only used internally by libssl.)
 - How much control should the user (programmer) be given over the reseeding process and/or should he be allowed to add his own additional randomness?

Many developers seem to be realizing the interesting possibilities of the DRBG API and are asking for public access to this new and promising API. One of the driving forces behind it is the question about how to do seeding and reseeding right. Among others, Uri Blumenthal asked for making the DRBG API public.

Currently, the OpenSSL core members seem to be reluctant to make the API public, at least at this early stage. I understand Rich Salz's viewpoint that this requires a thorough discussion, because a public interface can't be easily changed and wrong decisions in the early phase can become a heavy burden.

Nevertheless, I agree with Uri Blumenthal that the DRBG API should be made public. So here comes my

======================================
Plea for a new public OpenSSL RNG API:
======================================

    The new RAND_DRBG is the superior API. It shouldn't be kept private and hidden behind the ancient RAND_METHOD API.
    The philosophy of the two APIs is not very well compatible, in particular when it comes to reseeding and adding
    additional unpredictable input. Hiding the RAND_DRBG behind the RAND_METHOD API only causes problems.
    Also, it will force people to patch their OpenSSL copy if they want to use the superior API.

    The RAND_DRBG API should become the new public OpenSSL RNG API and the old RAND_METHOD API should be deprecated
    in the long run. This transition does not need to be rushed, but it would be good if there were early consensus
    on the road map. I am thinking of a smooth transition with a phase of coexistence and a compatibility layer
    mapping the default RAND_METHOD to the default public RAND_DRBG instance. (This compatibility layer already exists,
    it's the 'RAND_OpenSSL()' method.)

Historical Background
=====================

As Rich already mentioned in his blog post, the RAND_DRBG isn't new. It's been a part of OpenSSL for a long time, hidden in the FIPS 2.0 Object Module.

I have been working with the FIPS DRBG for quite a while now, using a FIPS-capable OpenSSL 1.0.2x crypto library. The reason why our company switched to the FIPS DRBG is that one of our products runs on a small hardware device which does not have a reliable entropy source, but the product has to meet high security standards, in particular w.r.t. its RNG. So we decided to use the SmartCard RNG as primary entropy source for a deterministic AES-CTR based RNG and use /dev/urandom as additional input. Reseeding should occur on every generate request. Using the FIPS DRBG, these requirements were easily met, because the API gives such fine-grained control over reseeding and adding additional entropy.

The DRBG was well documented: its design in NIST SP800-90A (now: NIST SP800-90Ar1) and its API in the OpenSSL FIPS 2.0 User Guide. The implementation was thoroughly tested and assessed during the FIPS certification process. So the only minor obstacle was that we had to patch the crypto library (not the FIPS object module) in order to get public access to the FIPS_drbg_*() methods.

I always considered the DRBG API more mature than the good old RAND_METHOD API, and I wondered why the DRBG code lay forgotten for so many years in the FIPS 2.0 object module sources and was never ported to master.

When in June of this year the thread "[openssl-dev] Work on a new RNG for OpenSSL" popped up (https://mta.openssl.org/pipermail/openssl-dev/2017-June), I closely watched the discussion, and when John Denker suggested having a look at NIST SP800-90A, I was electrified:

    > Constructive suggestion:  If you want to see what a RNG looks
    > like when designed by cryptographers, take a look at:
    >   Elaine Barker and John Kelsey,
    >   "Recommendation for Random Number Generation Using Deterministic Random Bit Generators"
    >   http://csrc.nist.gov/publications/nistpubs/800-90A/SP800-90A.pdf
    > 
    > That design may look complicated, but if you think you can
    > leave out some of the blocks in their diagram, proceed with
    > caution.  Every one of those blocks is there for a reason.

    <https://mta.openssl.org/pipermail/openssl-dev/2017-June/009423.html>

From his mail and the reaction to it, I had the impression that nobody seemed to remember the fact that the DRBG code was already present in the FIPS object module. The hidden treasure seemed forgotten! When I seized the opportunity and proposed to port the FIPS DRBG code into master, Rich Salz liked my idea and immediately started working on PR #3789:

       <https://mta.openssl.org/pipermail/openssl-dev/2017-June/009439.html>
       <https://mta.openssl.org/pipermail/openssl-dev/2017-June/009440.html>
       <https://github.com/openssl/openssl/pull/3789>

I am very grateful to Rich for picking up the idea so fast and giving it so much speed and momentum. He has done a lot of work to convince others and defend the idea against initial scepticism by voices objecting to the seemingly new and unknown API. In the middle of August, when the first bulk of work was finished, Rich wrote a detailed blog post to advertise the new DRBG and explain his work <https://www.openssl.org/blog/blog/2017/08/12/random> and the discussion restarted:

     "[openssl-dev] Work on a new RNG for OpenSSL"  (see https://mta.openssl.org/pipermail/openssl-dev/2017-August)

It quickly became evident to users that the DRBG API had promising features, so they started asking for public access to this new API. The driving force was the question about how to do seeding and reseeding right. Among others, Uri Blumenthal was a dedicated advocate of making the DRBG API public (https://mta.openssl.org/pipermail/openssl-dev/2017-August/009594.html).

But the OpenSSL members currently seem to be reluctant to make the API public right away. I understand Rich's viewpoint that this decision requires a thorough discussion, because a public interface can't be easily changed and wrong decisions in the early phase can become a heavy burden.

Nevertheless, I agree with Uri Blumenthal that the DRBG API should be made public. Here are some of my arguments for it.

The DRBG API supports multiple instances and chaining
=====================================================

The NIST DRBG standard had chaining of multiple DRBG instances in mind from the very beginning, see for example footnote [4] on page 25 of NIST SP800-90Ar1:

    > Entropy input may be obtained from an entropy source or an NRBG, both of which provide fresh entropy.
    > Entropy input could also be obtained from a DRBG that has access to an entropy source or NRBG.

The original OpenSSL FIPS DRBG implementation did not support chaining, but this support has been added by Rich during the DRBG port.
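
The chaining idea can be sketched with a small, self-contained toy model. This is not OpenSSL code: all 'toy_*' names are hypothetical, the mixing function is a non-cryptographic stand-in, and the root instance uses fixed bytes where a real one would read an entropy source. The only point is the data flow, in which a chained instance obtains its seed material from its parent's generate operation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy, NON-cryptographic mixing step (FNV-1a style); a stand-in only. */
static uint64_t toy_mix(uint64_t h, const unsigned char *in, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        h ^= in[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Hypothetical DRBG instance; 'parent' is NULL for the root of the chain. */
struct toy_drbg {
    uint64_t state;
    struct toy_drbg *parent;
    unsigned int requests_served;   /* generate calls answered so far */
};

/* Fills 'out' with toy output and advances the state. */
static void toy_generate(struct toy_drbg *drbg, unsigned char *out, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        unsigned char ctr = (unsigned char)i;

        drbg->state = toy_mix(drbg->state, &ctr, 1);
        out[i] = (unsigned char)(drbg->state >> 56);
    }
    drbg->requests_served++;
}

/*
 * Reseeds 'drbg'. A chained instance pulls its seed material from its
 * parent; the root would read a real entropy source here, which this
 * sketch replaces with fixed bytes (placeholder, NOT entropy).
 */
static void toy_reseed(struct toy_drbg *drbg)
{
    unsigned char seed[16];
    size_t i;

    if (drbg->parent != NULL) {
        toy_generate(drbg->parent, seed, sizeof(seed));
    } else {
        for (i = 0; i < sizeof(seed); i++)
            seed[i] = 0xA5;
    }
    drbg->state = toy_mix(drbg->state, seed, sizeof(seed));
}
```

Reseeding a child whose 'parent' points at the root causes exactly one generate request on the root, while reseeding the root itself touches no other instance.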

The DRBG API has a highly flexible concept for seeding and reseeding
====================================================================

As mentioned previously, the DRBG has a callback mechanism, with callbacks like get_entropy() and get_adin(), which makes it easy to fine-tune the default instantiation process by either adding additional randomness input or changing the entropy source entirely. The callbacks are primarily intended for obtaining randomness during instantiation and reseeding. There is a clear concept for reseeding, which can be adjusted by changing the reseed_interval: normally, the DRBG reseeds itself automatically whenever the reseed_interval has been reached.

In addition, there is a way for the DRBG consumer to add his own unpredictability when requesting random bytes: by adding additional input 'adin' to the RAND_DRBG_generate() call:

    int RAND_DRBG_generate(RAND_DRBG *drbg, unsigned char *out, size_t outlen,
                           int prediction_resistance,
                           const unsigned char *adin, size_t adinlen)

So why are there so many ways to add randomness and additional input? And what is the difference?

* The get_entropy() and get_adin() callbacks are used by the DRBG itself to _pull_ unpredictable data from some backend entropy source (which can also be a chained DRBG which is connected to some entropy source) during instantiation or reseeding.
* The 'adin' argument can be used by the DRBG consumer to _push_ unpredictable input into the DRBG when generating random output.
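
The pull/push distinction described in the two items above can be illustrated with a self-contained toy model. It is only a sketch under stated assumptions: the 'toy_*' names and the simplified callback signature are hypothetical (they are not the RAND_DRBG prototypes), the mixing function is not cryptographic, and the entropy callback returns fixed bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy, NON-cryptographic mixing step (FNV-1a style); a stand-in only. */
static uint64_t toy_mix(uint64_t h, const unsigned char *in, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        h ^= in[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

struct toy_drbg;

/* Pull-style callback: invoked by the DRBG, never by the consumer. */
typedef size_t (*toy_get_entropy_fn)(struct toy_drbg *drbg,
                                     unsigned char *out, size_t max_len);

struct toy_drbg {
    uint64_t state;
    toy_get_entropy_fn get_entropy;
    unsigned int pulls;              /* how often get_entropy was invoked */
};

/* Placeholder source returning fixed bytes; a real one returns entropy. */
static size_t toy_fixed_source(struct toy_drbg *drbg,
                               unsigned char *out, size_t max_len)
{
    size_t n = max_len < 16 ? max_len : 16;

    drbg->pulls++;
    memset(out, 0xA5, n);
    return n;
}

/* PULL: seed material is fetched through the callback during a reseed. */
static int toy_reseed(struct toy_drbg *drbg)
{
    unsigned char buf[32];
    size_t n = drbg->get_entropy(drbg, buf, sizeof(buf));

    if (n == 0)
        return 0;
    drbg->state = toy_mix(drbg->state, buf, n);
    return 1;
}

/* PUSH: optional 'adin' is mixed in as part of the generate request. */
static int toy_generate(struct toy_drbg *drbg,
                        unsigned char *out, size_t outlen,
                        const unsigned char *adin, size_t adinlen)
{
    size_t i;

    if (adin != NULL && adinlen > 0)
        drbg->state = toy_mix(drbg->state, adin, adinlen);
    for (i = 0; i < outlen; i++) {
        unsigned char ctr = (unsigned char)i;

        drbg->state = toy_mix(drbg->state, &ctr, 1);
        out[i] = (unsigned char)(drbg->state >> 56);
    }
    return 1;
}
```

In this model the callback runs only inside toy_reseed(), whereas the consumer's contribution travels with the toy_generate() call itself; there is no third entry point for adding randomness out of band.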

Note that the DRBG consumer has no possibility to push out-of-band randomness into the DRBG. Adding randomness is always coupled with a generate or (re-)seed operation. This is a very important difference between the RAND_DRBG and the RAND_METHOD API and one of the reasons why it's so hard to do reseeding right in both the RAND and RAND_DRBG API simultaneously. In fact, currently the 'RAND_add()/RAND_bytes()' pattern is broken, as the next section explains.

The 'RAND_add()/RAND_bytes()' pattern is broken
===============================================

In OpenSSL, the classical way for the RNG consumer to add his own randomness is to call 'RAND_add()' before calling 'RAND_bytes()'. If the new 'RAND_OpenSSL()' method (the "compatibility layer" hiding the public RAND_DRBG instance) is the default, then this no longer works as expected:

The reason is that a call to 'RAND_add()' adds the provided randomness only to a global buffer ('rand_bytes'), from which it will be pulled during the next reseed. But no reseed is triggered. So the next RAND_bytes() call will be unaffected by the RAND_add(), which is not what the consumer expected. (The same holds for 'RAND_seed()', since 'drbg_seed()' only calls into 'drbg_add()'.)
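
A toy model makes the effect easy to see. The sketch below is not the OpenSSL implementation: the 'toy_*' names are hypothetical, the mixing function is not cryptographic, and the way the side buffer accumulates input is an assumption made for simplicity. It only reproduces the behaviour described above, namely that added randomness is parked and reaches the generator state at the next reseed and not before.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy, NON-cryptographic mixing step (FNV-1a style); a stand-in only. */
static uint64_t toy_mix(uint64_t h, const unsigned char *in, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        h ^= in[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Hypothetical model: DRBG state plus a side buffer for added randomness. */
struct toy_rng {
    uint64_t state;        /* internal DRBG state */
    uint64_t pending;      /* models the global buffer filled by RAND_add() */
    int has_pending;
};

/* Models RAND_add(): parks the input, leaves the DRBG state untouched. */
static void toy_rand_add(struct toy_rng *rng,
                         const unsigned char *buf, size_t len)
{
    rng->pending = toy_mix(rng->pending, buf, len);
    rng->has_pending = 1;
}

/* Models a reseed: the only place where parked input reaches the state. */
static void toy_reseed(struct toy_rng *rng)
{
    if (rng->has_pending) {
        rng->state = toy_mix(rng->state,
                             (const unsigned char *)&rng->pending,
                             sizeof(rng->pending));
        rng->pending = 0;
        rng->has_pending = 0;
    }
}

/* Models RAND_bytes() for one 64-bit value: uses and advances state only. */
static uint64_t toy_rand_u64(struct toy_rng *rng)
{
    static const unsigned char step = 0x01;

    rng->state = toy_mix(rng->state, &step, 1);
    return rng->state;
}
```

Two generators started from the same state produce the same next value even if toy_rand_add() was called on only one of them; their outputs diverge only after toy_reseed() has run.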

Reseeding of DRBGs occurs only at the following occasions:

* immediately after a 'fork()' (new)
* if the 'reseed_counter' exceeds the 'reseed_interval'
* if 'RAND_DRBG_generate()' is called requesting 'prediction_resistance'
* if 'RAND_DRBG_reseed()' is called explicitly
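
The four occasions listed above can be written down as a small decision function. This is a simplified sketch, not the OpenSSL code: the 'toy_*' names are hypothetical, fork detection is modelled by comparing a process id passed in by the caller (an assumption that keeps the example testable without calling fork()), and the strict '>' comparison follows the wording "exceeds" used above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of the state relevant for reseeding. */
struct toy_drbg {
    unsigned int reseed_counter;   /* generate calls since the last (re)seed */
    unsigned int reseed_interval;  /* allowed generate calls between reseeds */
    long seed_pid;                 /* process id at the last (re)seed */
    unsigned int reseeds;          /* reseeds performed, for illustration */
};

/* Explicit reseed; a real DRBG would pull fresh entropy here. */
static void toy_reseed(struct toy_drbg *drbg, long current_pid)
{
    drbg->reseed_counter = 0;
    drbg->seed_pid = current_pid;
    drbg->reseeds++;
}

/* Returns 1 if a generate request has to reseed first, 0 otherwise. */
static int toy_needs_reseed(const struct toy_drbg *drbg, long current_pid,
                            int prediction_resistance)
{
    if (drbg->seed_pid != current_pid)                /* after a fork() */
        return 1;
    if (drbg->reseed_counter > drbg->reseed_interval) /* interval exceeded */
        return 1;
    if (prediction_resistance)                        /* caller asked for it */
        return 1;
    return 0;
}

/* Generate request: reseeds automatically when one of the rules applies. */
static void toy_generate(struct toy_drbg *drbg, long current_pid,
                         int prediction_resistance)
{
    if (toy_needs_reseed(drbg, current_pid, prediction_resistance))
        toy_reseed(drbg, current_pid);
    /* ... output would be produced here ... */
    drbg->reseed_counter++;
}
```

Note that nothing in this decision function reacts to randomness that was merely added beforehand, which is why a preceding 'RAND_add()' has no immediate effect.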

*Note:* Currently it looks like the situation is even worse: if 'RAND_add()' is called multiple times before a reseed occurs, then the result of the previous call is overwritten.

Reseeding the 'DRBG' whenever the user calls 'RAND_add()' does not seem a good solution. It would be too expensive, in particular if system entropy is pulled for reseeding. Of course it is possible to fix this issue, but the DRBG provides for a much simpler solution: it lets the consumer contribute to the entropy of the internal state by providing additional input. If the user input contains entropy, that's fine, if it's "snake oil", no harm. The additional input is mixed into the internal state in just the same way as the entropy buffer using the 'ctr_df()' derivation function. One might think of the 'entropy' input as trusted randomness and 'adin' as untrusted randomness.

For this reason, I would like to see the 'RAND_add()/RAND_bytes()' pattern deprecated and the 'RAND_DRBG_generate() with additional input' pattern advertised instead.

The DRBG API supports different implementations
===============================================

Well, it _supported_ them, until recently. But that's not irreversible.

The DRBG concept, as laid out in the NIST standard, provides a generic framework for deterministic RNGs (the acronym DRBG stands for Deterministic Random Bit Generator). It deals with general questions like how to instantiate and reseed the RNG, where it gets its entropy from, etc.

The standard proposes three concrete implementations, Hash_DRBG, HMAC_DRBG, and CTR_DRBG. In the FIPS code, all three were implemented, and the genericity was achieved using a data union and a set of five function pointers, reminiscent of a vtable in C++:

    struct drbg_ctx_st
    {
        ...

        /* Implementation specific structures */
        union 
            {
            DRBG_HASH_CTX hash;
            DRBG_HMAC_CTX hmac;
            DRBG_CTR_CTX  ctr;
            } d;
        /* Initialise PRNG and setup callbacks below */
        int (*init)(DRBG_CTX *ctx, int nid, int security, unsigned int flags);
        /* Instantiate PRNG */
        int (*instantiate)(DRBG_CTX *ctx,
                    const unsigned char *ent, size_t entlen,
                    const unsigned char *nonce, size_t noncelen,
                    const unsigned char *pers, size_t perslen);
        /* reseed */
        int (*reseed)(DRBG_CTX *ctx,
                    const unsigned char *ent, size_t entlen,
                    const unsigned char *adin, size_t adinlen);
        /* generate output */
        int (*generate)(DRBG_CTX *ctx,
                    unsigned char *out, size_t outlen,
                    const unsigned char *adin, size_t adinlen);
        /* uninstantiate */
        int (*uninstantiate)(DRBG_CTX *ctx);

        ...
    };

This part of the code was removed during the DRBG port, because currently CTR_DRBG is the only implementation. I would like to suggest restoring this 'polymorphic' implementation, to ease adding new implementations such as a CHACHA20_DRBG in the future.
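
To show how little machinery such a dispatch needs, here is a minimal, self-contained sketch in the spirit of the struct quoted above. All 'toy_*' names are hypothetical, only two of the five operations are modelled, and the two "implementations" are trivial placeholders that have nothing to do with the real CTR_DRBG or Hash_DRBG algorithms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_type { TOY_TYPE_CTR, TOY_TYPE_HASH };

struct toy_drbg {
    /* Implementation specific state, like the union 'd' above. */
    union {
        struct { uint8_t counter; } ctr;
        struct { uint32_t v; } hash;
    } d;
    /* Per-implementation operations, installed by toy_drbg_init(). */
    int (*reseed)(struct toy_drbg *drbg,
                  const unsigned char *ent, size_t entlen);
    int (*generate)(struct toy_drbg *drbg,
                    unsigned char *out, size_t outlen);
};

/* Placeholder "CTR" flavour: emits a running counter. NOT a CTR_DRBG. */
static int toy_ctr_reseed(struct toy_drbg *drbg,
                          const unsigned char *ent, size_t entlen)
{
    drbg->d.ctr.counter = 0;
    if (entlen > 0)
        drbg->d.ctr.counter = ent[0];
    return 1;
}

static int toy_ctr_generate(struct toy_drbg *drbg,
                            unsigned char *out, size_t outlen)
{
    size_t i;

    for (i = 0; i < outlen; i++)
        out[i] = ++drbg->d.ctr.counter;
    return 1;
}

/* Placeholder "hash" flavour (32-bit FNV-1a steps). NOT a Hash_DRBG. */
static int toy_hash_reseed(struct toy_drbg *drbg,
                           const unsigned char *ent, size_t entlen)
{
    size_t i;

    drbg->d.hash.v = 2166136261u;
    for (i = 0; i < entlen; i++)
        drbg->d.hash.v = (drbg->d.hash.v ^ ent[i]) * 16777619u;
    return 1;
}

static int toy_hash_generate(struct toy_drbg *drbg,
                             unsigned char *out, size_t outlen)
{
    size_t i;

    for (i = 0; i < outlen; i++) {
        drbg->d.hash.v = (drbg->d.hash.v ^ (uint32_t)i) * 16777619u;
        out[i] = (unsigned char)(drbg->d.hash.v >> 24);
    }
    return 1;
}

/* Selects an implementation by installing its function pointers. */
static int toy_drbg_init(struct toy_drbg *drbg, enum toy_type type)
{
    switch (type) {
    case TOY_TYPE_CTR:
        drbg->reseed = toy_ctr_reseed;
        drbg->generate = toy_ctr_generate;
        return 1;
    case TOY_TYPE_HASH:
        drbg->reseed = toy_hash_reseed;
        drbg->generate = toy_hash_generate;
        return 1;
    }
    return 0;   /* unknown type */
}

/* Generic front end: callers never name a concrete implementation. */
static int toy_drbg_generate(struct toy_drbg *drbg,
                             unsigned char *out, size_t outlen)
{
    return drbg->generate(drbg, out, outlen);
}
```

With this layout, supporting an additional generator type means adding one union member, one pair of functions and one 'case' label; the generic front end and its callers stay unchanged.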

The DRBG API is well documented and tested
==========================================

The entire DRBG API is part of the OpenSSL FIPS 2.0 Module and as such is well tested and well documented. Most of the content for the still-to-be-written manual pages can be taken from the FIPS User Guide, starting with textual modifications like

    FIPS_drbg_xxx(...)  ->  RAND_DRBG_xxx(...)
    DRBG_CTX *dctx      ->  RAND_DRBG *dctx

and taking the new typedefs into consideration. Here is for example a comparison of the API function to install the callbacks:

    FIPS DRBG:

        int FIPS_drbg_set_callbacks(DRBG_CTX *dctx,
            size_t (*get_entropy)(DRBG_CTX *ctx,  <args> ),
            void (*cleanup_entropy)(DRBG_CTX *ctx, <args> ),
            size_t entropy_blocklen,
            size_t (*get_nonce)(DRBG_CTX *ctx, <args> ),
            void (*cleanup_nonce)(DRBG_CTX *ctx, <args>)
        );

    RAND_DRBG:

        typedef size_t (*RAND_DRBG_get_entropy_fn)(RAND_DRBG *ctx, <args> );
        typedef void (*RAND_DRBG_cleanup_entropy_fn)(RAND_DRBG *ctx, <args> );
        typedef size_t (*RAND_DRBG_get_nonce_fn)(RAND_DRBG *ctx,  <args> );
        typedef void (*RAND_DRBG_cleanup_nonce_fn)(RAND_DRBG *ctx,  <args> );

        int RAND_DRBG_set_callbacks(RAND_DRBG *dctx,
                                    RAND_DRBG_get_entropy_fn get_entropy,
                                    RAND_DRBG_cleanup_entropy_fn cleanup_entropy,
                                    RAND_DRBG_get_nonce_fn get_nonce,
                                    RAND_DRBG_cleanup_nonce_fn cleanup_nonce);

Conclusion
==========

I see no reason why the RAND_DRBG shouldn't be made public as soon as possible, keeping the API as close as possible to the original FIPS DRBG API (FIPS 3.0 is upcoming!). In a second step, the current compatibility binding from RAND_METHOD to RAND_DRBG could be deprecated and phased out smoothly.

Looking forward to receiving your comments. (But please be patient with me, I'm currently on physical rehab after a surgery.)

Matthias St. Pierre
