[openssl-dev] [openssl.org #4116] [PATCH] Reimplement non-asm OPENSSL_cleanse()

Brian Smith via RT rt at openssl.org
Sat Oct 31 23:01:22 UTC 2015


On Sat, Oct 31, 2015 at 11:50 AM, Alessandro Ghedini via RT <rt at openssl.org>
wrote:

> In any case memset_s is not available anywhere anyway, so that doesn't
> really matter.
>

It is available in some places, e.g.
https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man3/memset_s.3.html
.


> > * Otherwise, SecureZeroMemory, when SecureZeroMemory is available.
> > * Otherwise, if a flag OPENSSL_REQUIRE_SECURE_ZERO is set, fail.
>
> In 99% of the cases (e.g. Linux with glibc or any *BSD) that would fail, so I
> don't see the point in that.
>

The point is to let the person building OpenSSL say "I want the build to
fail if there isn't a secure way to zero memory, because I'm expecting
there to always be one in my configuration." Alternatively, there could be
an OPENSSL_USE_MEMSET_S flag that says "always use memset_s and never
anything else."
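
A minimal sketch of the selection order described above, to make the idea concrete. It is not OpenSSL's code: |sketch_cleanse| is a hypothetical name, the two OPENSSL_* macros are the flags proposed in this message rather than existing build options, and the final volatile loop is only a best-effort fallback with no guarantee behind it.

```c
/* Must precede <string.h> so memset_s is declared where Annex K exists. */
#define __STDC_WANT_LIB_EXT1__ 1
#include <stddef.h>
#include <string.h>

#if defined(_WIN32)
# include <windows.h>
#endif

/* Hypothetical stand-in for OPENSSL_cleanse. */
void sketch_cleanse(void *ptr, size_t len)
{
#if defined(OPENSSL_USE_MEMSET_S) || defined(__STDC_LIB_EXT1__)
    /* C11 Annex K: the implementation may not elide this call. */
    (void)memset_s(ptr, len, 0, len);
#elif defined(_WIN32)
    SecureZeroMemory(ptr, len);
#elif defined(__OpenBSD__)
    explicit_bzero(ptr, len);
#elif defined(OPENSSL_REQUIRE_SECURE_ZERO)
# error "no platform-provided secure zeroing primitive in this configuration"
#else
    /* Best-effort fallback only; the compiler is not obliged to keep it. */
    volatile unsigned char *p = (volatile unsigned char *)ptr;

    while (len--)
        *p++ = 0;
#endif
}
```

Building with -DOPENSSL_REQUIRE_SECURE_ZERO on a platform that matches none of the earlier branches stops at the #error, which is the "fail the build" behavior being asked for.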


> > Note in particular that the C compiler is allowed to completely defeat the
> > purpose of the function unless SecureZeroMemory or memset_s is used, even
> > if you use "volatile" or other tricks.
>
> I don't think that is true (regarding the volatile pointer). But assuming that
> a broken compiler decided to optimize that call away, what's stopping it to
> optimize the call to the asm implementation as well? Also, such broken compiler
> probably wouldn't know about C11 either, so even if memset_s() was available it
> could optimize that as well.
>

Such optimizations are legal. Otherwise, C11 wouldn't have defined
|memset_s|. And, the entire purpose of |memset_s| is to disable such
optimizations.
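
To illustrate the kind of optimization in question, here is a small sketch. |toy_checksum| and its contents are made up for illustration. The final |memset| writes to a local array that is never read again, so an optimizing compiler may treat it as a dead store and drop it without changing the observable behavior of the program.

```c
#include <string.h>

/* Hypothetical example: uses a throwaway "secret" on the stack. */
unsigned char toy_checksum(const unsigned char *data, size_t len)
{
    unsigned char key[32];
    unsigned char sum = 0;
    size_t i;

    for (i = 0; i < sizeof key; i++)
        key[i] = (unsigned char)(i * 7u + 1u);  /* stand-in for secret data */

    for (i = 0; i < len; i++)
        sum ^= (unsigned char)(data[i] ^ key[i % sizeof key]);

    /*
     * key is dead after this point. Removing this call does not change
     * the function's result, so the compiler is allowed to remove it.
     */
    memset(key, 0, sizeof key);
    return sum;
}
```

Whether a given compiler actually removes the call depends on the compiler, its version, and the optimization level; comparing the generated assembly at -O0 and -O2 is one way to check.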


> I don't know how compilers are supposed to treat memset_s(), but if I define a
> memset_s() function which just calls memset() internally, GCC (v5.2.1 20151028)
> optimizes that away just as it does with plain memset(), so libc
> implementations would probably need to adopt "tricks" to avoid the optimization
> as well.
>

Right. It has to be built into the compiler.


> FWIW OpenSSH implements the portable explicit_bzero() using the volatile
> pointer as well (unless memset_s() is detected at build time).


That is similar to what I'm suggesting.
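
For reference, a sketch of what such a fallback can look like, assuming "the volatile pointer" means a volatile function pointer to |memset|. The names |memset_ptr| and |sketch_explicit_bzero| are hypothetical, and this is not copied from OpenSSH. It only makes elision less likely; it is not a guarantee.

```c
#include <stddef.h>
#include <string.h>

/*
 * Because the pointer object is volatile, the compiler has to load it
 * at each call and cannot assume it still points at memset.
 */
static void *(*const volatile memset_ptr)(void *, int, size_t) = memset;

/* Hypothetical fallback for platforms with no native primitive. */
void sketch_explicit_bzero(void *ptr, size_t len)
{
    memset_ptr(ptr, 0, len);
}
```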


> On OpenBSD it
> just uses OpenBSD's explicit_bzero() which is implemented using memset() and a
> weak function pointer.


It is a good idea to detect OpenBSD at compile time and use
|explicit_bzero|, just like for |SecureZeroMemory| on Windows.

Don't pay much attention to what tricks OpenBSD's |explicit_bzero| uses.
Those tricks are not guaranteed to work for anything other than that one
function on OpenBSD.


> But that (as used in LibreSSL) seems to have problems in
> relation to LTO, unless optimizations are specifically disabled:
> https://github.com/libressl-portable/openbsd/issues/5


Right, this is more evidence that the only correct implementation is to use
a compiler-provided |memset_s|, |explicit_bzero|, or |SecureZeroMemory|. It is
not possible to implement your own in a way that is guaranteed to work. If
you need to implement your own.



> On the other hand BoringSSL uses memset() with an explicit memory barrier
>

The explicit memory barrier is bad for performance.

And, it also isn't guaranteed to work. I don't speak for the BoringSSL
team, but in correspondence with them, they don't seem to care much if
OPENSSL_cleanse works or if it is used.
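
A sketch of the general "memset plus barrier" pattern, assuming the barrier in question is a compiler-level one expressed with GCC/Clang extended inline asm. |sketch_cleanse_barrier| is a hypothetical name and this is not BoringSSL's source. The asm statement is a compiler extension, not standard C, which is part of why nothing guarantees the result.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name; illustrates the pattern only. */
void sketch_cleanse_barrier(void *ptr, size_t len)
{
    memset(ptr, 0, len);
#if defined(__GNUC__) || defined(__clang__)
    /*
     * Empty asm that takes ptr as an input and declares a memory
     * clobber, so the compiler must assume the zeroed bytes may be read.
     */
    __asm__ __volatile__("" : : "r"(ptr) : "memory");
#endif
}
```

On compilers without this extension the function degrades to a plain |memset|, which can be optimized away.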


> > The primary purpose of the assembly language implementations is to reduce the
> > possibility that the C compiler will do the weird things that C compilers love
> > to do.
>
> According to the changelog and git log, the primary purpose of the asm
> implementations was to improve performance (see commit b2dba9b). Using the
> volatile pointer implementation would IMO make these optimizations useless,
> hence the proposal to drop them and make things simpler.


Sorry. Instead of "primary purpose" (which implies intent), I should have
said "primary advantage". An assembly language implementation is more
likely to work than a C implementation because the C compiler generally
won't analyze externally-assembled code, so it has to assume that the
externally-assembled code has side effects, so it must call the
externally-assembled code. In theory, it is possible for the assembler, C
compiler, and the linker to work together during LTO and figure out that
the assembly language implementation doesn't have any side effects other
than zeroing memory, but that seems unlikely--much less likely than the C
compiler subverting any trick.

Note that sometimes I notice places that OpenSSL doesn't call
OPENSSL_cleanse when it seems like it would be warranted to be consistent
with other code. ecdh_compute_key [1] is one example. Generally, I don't
expect OpenSSL to (securely) zero memory, so it doesn't matter much to me.
But, if it matters to others, then this is something that would require a
substantial amount of auditing and fixing.

[1]
https://github.com/openssl/openssl/blob/965a1cb92e4774ca2f74dad9e060aa7b2d80c77d/crypto/ecdh/ech_ossl.c#L82

Cheers,
Brian
-- 
https://briansmith.org/


