[openssl-dev] Removing gcm128_context->H for non-1-bit builds

Andy Polyakov appro at openssl.org
Sat Jun 11 17:08:31 UTC 2016


>>> Could somebody who understands the assembly code (probably Andy)
>>> modify it to use symbolic names for the offsets that are used to
>>> access Xi, H, Htable? If so, then I can write the patch to
>>> conditionally exclude `H` on platforms that don't need it after
>>> `CRYPTO_gcm128_init` finishes executing.
> 
>> But going the
>> line of taking into consideration all corner cases is a stretch and
>> should be weighed against 16 out of ~380[!] bytes waste. I'd say it's
>> not worth it.
> 
> I see it both as an *optimization* and also a way to ensure
> *correctness*. In particular, if the code doesn't expect H to be there
> in configurations that don't use H, then some tricks that people might
> use (in particular, a trick I am using) become safer.
> 
> In particular, notice that in the gcm128_context structure, there are
> three kinds of state (again, only talking about non-s390x, non-1-bit
> platforms):
> 
> 1. State that is only used in the _init function: H.
> 
> 2. State that needs to be preserved in between
> authenticated-and-encrypted messages. This is `Htable`, `EK0`,
> `gmult`, `ghash`, `block`, etc.
> 
> 3. State that needs to be preserved only between the time you start an
> authenticated-and-encrypted message and the time you end it. This
> includes `len`, `EKi`, `mres`, `ares`, etc. currently. In theory this
> could also include `gmult`, `ghash` and `block`, if the code were
> refactored to recompute them for each message and/or if things like
> the OPENSSL_STATIC_ARMCAP-type optimization allowed one to omit them
> from the structure completely in some configurations where there is no
> way they could vary at runtime. Also, Htable is 256 bytes on its own
> (on the platforms I care about), but actually in some
> platforms/configurations not all of Htable is used.
> 
> In my code, after I call the _init function, I extract out all the
> numbers in category #2 and store them in my per-connection context
> structure. Then, when I need to encrypt/decrypt a message, I
> construct a full gcm128_context *on the stack*, zero it out, and then
> fill in the values from category #2. Then I encrypt/decrypt the
> message, and then throw away the gcm128_context.

In other words we *are* talking about super-custom code with very
special needs. As already mentioned, it would be next to impossible to
justify customization of OpenSSL to accommodate overly specific
requirements. And given the above description it shouldn't actually be
needed; not even the previously posted patch facilitating omission of H
should be required. I mean, given knowledge about the cases when H is not
used, you can omit it from your compressed state and leave it zeroed on
the stack, right? *Or* [given that memory is seemingly at a premium] you can
choose to preserve H in your private structure, omit Htable[!] and
initialize the latter in the on-stack structure on a per-call basis, per call
to *your* super-custom subroutine that is. But in case you choose to
omit H, here is the "manifest". With the exception of s390x and c64xplus, none
of the modes/asm/ghash-*.pl modules use H or are even dependent on
relative position of Xi, H and Htable. The s390x module depends on the relative
position of H and Htable and does use both. The c64xplus module is dependent
on the relative position of H and Htable but uses H alone, i.e. Htable is
not used at all and its space is wasted. Then there is
modes/asm/aesni-gcm-x86_64.pl, which does rely on the relative position of
all three fields, Xi, H and Htable (offsets relative to Xi are hard-coded),
but it does not use H. It might be appropriate to mention that
*historically* modes/asm/ghash-armv4.pl *was* dependent on relative
position of H and Htable and was using H alone in NEON code path, but
not anymore.
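
To make the first option concrete, here is a minimal sketch of keeping only
Htable in a compressed state and rebuilding a zeroed context on the stack, with
the offsets relative to Xi given symbolic names. Everything prefixed demo_ or
DEMO_ is hypothetical: demo_gcm_ctx is a simplified stand-in, not the real
gcm128_context, and the offsets 16 and 32 are simply what this demo layout
produces. The other category #2 fields would be carried the same way as Htable.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 128-bit table entry; not the OpenSSL type. */
typedef struct { uint64_t hi, lo; } demo_u128;

/* Hypothetical, simplified stand-in for gcm128_context (assumed layout). */
typedef struct {
    uint8_t   Yi[16], EKi[16], EK0[16];
    uint64_t  len[2];
    uint8_t   Xi[16];
    uint8_t   H[16];        /* assumed to be read only during init */
    demo_u128 Htable[16];   /* 256 bytes */
    unsigned  mres, ares;
} demo_gcm_ctx;

/* Symbolic names for offsets relative to Xi, instead of hard-coded numbers. */
#define DEMO_OFF_H      (offsetof(demo_gcm_ctx, H)      - offsetof(demo_gcm_ctx, Xi))
#define DEMO_OFF_HTABLE (offsetof(demo_gcm_ctx, Htable) - offsetof(demo_gcm_ctx, Xi))

/* Fail the build if the demo layout drifts from what asm would assume. */
_Static_assert(DEMO_OFF_H == 16, "H must directly follow Xi");
_Static_assert(DEMO_OFF_HTABLE == 32, "Htable must directly follow H");

/* Compressed per-connection state: Htable only, H deliberately omitted. */
typedef struct {
    demo_u128 Htable[16];
} demo_saved_state;

/* Extract the long-lived state after the (not shown) init step. */
static void demo_save(demo_saved_state *out, const demo_gcm_ctx *ctx)
{
    memcpy(out->Htable, ctx->Htable, sizeof(out->Htable));
}

/* Rebuild a per-message context: zero everything, then restore Htable. */
static void demo_restore(demo_gcm_ctx *ctx, const demo_saved_state *in)
{
    memset(ctx, 0, sizeof(*ctx));   /* H, Xi, len, mres, ares start zeroed */
    memcpy(ctx->Htable, in->Htable, sizeof(ctx->Htable));
}

/* Helper for checking that H was indeed left zeroed. */
static int demo_h_is_zero(const demo_gcm_ctx *ctx)
{
    static const uint8_t zero[16] = {0};
    return memcmp(ctx->H, zero, sizeof(zero)) == 0;
}
```

This is only safe on configurations where nothing reads H after init, which is
exactly what the list above is meant to help you determine.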


