[openssl-project] Help deciding on PR 6341 (facilitate reading PKCS#12 objects in OSSL_STORE)
openssl-users at dukhovni.org
Sat Jun 2 18:27:42 UTC 2018
> On Jun 2, 2018, at 2:36 AM, Richard Levitte <levitte at openssl.org> wrote:
>> Canonicalize when importing for use with the store API.
>> Not sure whether wchar_t though, just octet string in UTF-8 seems saner.
> Dunno about that, really. The aim, to quote David W, is to make it
> *hard* for applications to get it wrong, and we all know that an octet
> string is merely an octet string...
Octet strings are by *definition* not wide characters; they are an
opaque string of *octets* (an array of uint8). The purpose of
wchar_t and friends is to process non-ASCII *character strings*,
with the wide versions of strlen(), strchr(), ... We do none of
this. We pass the opaque input to a key-derivation function that
treats it as an opaque octet string.
> We cannot know with absolute certainty that it's UTF-8 encoded.
Indeed someone could pass us an octet string that is not derived
from the UTF-8 encoding of some actual character string entered
by a user. That does not matter. What matters is that all
user input is canonically encoded, in a *platform-independent*
way. And for that the application is responsible for converting
user input to UTF-8. If the application does not do it right,
it will get incorrect (fail to decrypt) or non-portable (fail
to decrypt in the future on other platforms) behaviour.
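
To make the portability point concrete, here is a minimal sketch in
Python (not OpenSSL code). It assumes PBKDF2 from PKCS#5 as the
key-derivation function; the salt, the iteration count and the helper
name derive_key() are made up for illustration. The same typed
password, encoded two different ways by the application, reaches the
KDF as two different octet strings and so yields two different keys:

```python
import hashlib

# Hypothetical parameters, chosen only for this illustration.
SALT = b"example-salt"
ITERATIONS = 1000


def derive_key(password_octets: bytes) -> bytes:
    """Derive a 32-byte key; the KDF sees only octets, never characters."""
    return hashlib.pbkdf2_hmac(
        "sha256", password_octets, SALT, ITERATIONS, dklen=32
    )


typed = "p\u00e4ssword"  # what the user typed (contains U+00E4)

# Canonical, platform-independent encoding chosen by the application.
key_utf8 = derive_key(typed.encode("utf-8"))

# A locale-specific encoding of the very same characters.
key_latin1 = derive_key(typed.encode("latin-1"))

keys_match = key_utf8 == key_latin1
print("keys match:", keys_match)
```

A file encrypted under key_utf8 would fail to decrypt on a platform
that handed the library the Latin-1 octets, even though the user
typed the same thing.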
> The way I saw it is that UTF-8
> really means Unicode, and a way to codify that is wchar_t.
NO. That's not the point. UTF-8 yields a canonical encoding
of what the user typed to an opaque octet string. That
encoding is the application's responsibility. We must not
treat the password as a character string; that's not portable.
> openssl-users> That is the password is an opaque byte string, not a character
> openssl-users> string in the platform's encoding of i18n strings.
> Here is, unfortunately, where standards differ. PKCS#12 has a
> requirement that makes the pass phrase anything but opaque.
OK, looking at:
we see that PKCS#5 v2.1 sensibly defines passwords as opaque strings
in some unspecified standard encoding (ASCII or UTF-8 for example).
PKCS#12, however, sadly requires a 16-bit BMPString encoding
(instead of UTF-8), presumably for backwards compatibility.
> With that, the characters have meaning and need to be interpreted
> correctly to form a standard compliant BMPString.
Well, in that case for PKCS#12 we must require a well-formed
UTF-8 input, which we can convert to BMPString without any
need for locale-specific information. The ASN.1 library
presumably can convert from UTF-8 to BMP, or code can be
added to do that if missing.
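
As a rough sketch of that conversion, written in Python rather than
against the OpenSSL ASN.1 code: the helper name utf8_to_pkcs12_bmp()
is hypothetical, and rejecting characters outside the Basic
Multilingual Plane is an assumption of this sketch, not a statement
of what OpenSSL does. The big-endian 16-bit units and the two-octet
NUL terminator follow the password formatting described in the
PKCS#12 specification (RFC 7292, Appendix B.1). No locale is
consulted anywhere:

```python
def utf8_to_pkcs12_bmp(password_utf8: bytes) -> bytes:
    """Convert well-formed UTF-8 octets to a NUL-terminated BMPString.

    Raises UnicodeDecodeError (a ValueError subclass) on malformed
    UTF-8, and ValueError on characters that do not fit in one
    16-bit unit.
    """
    # Strict decoding: malformed input is an error, never guessed at.
    text = password_utf8.decode("utf-8")

    # Assumption of this sketch: refuse anything outside the BMP,
    # since a BMPString character is a single 16-bit unit.
    for ch in text:
        if ord(ch) > 0xFFFF:
            raise ValueError("character outside the BMP: U+%04X" % ord(ch))

    # Two octets per character, big-endian, plus a two-octet terminator.
    return text.encode("utf-16-be") + b"\x00\x00"


bmp_ascii = utf8_to_pkcs12_bmp(b"ab")
bmp_accent = utf8_to_pkcs12_bmp(b"\xc3\xa9")  # UTF-8 for U+00E9
print(bmp_ascii.hex(), bmp_accent.hex())
```

The input is the same canonical UTF-8 octet string on every platform,
so the BMPString that comes out is the same on every platform too.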
> (it would have been smarter to have the PKCS12 routines take wchar_t
> strings rather than char strings... hindsight is what it is...)
No, wchar_t is not defined to be a 16-bit BMPString compatible
encoding. It is AFAIK a platform-specific string representation
that is not canonical.
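
A small Python sketch of why that matters: ctypes reports the width
of the platform's wchar_t (typically 2 octets on Windows and 4 on
Linux and most other Unix systems), and the two explicit encodings
below merely stand in for those two in-memory layouts on a
little-endian machine. The variable names are illustrative only:

```python
import ctypes

# Width in octets of the C wchar_t type on the platform running this.
WCHAR_SIZE = ctypes.sizeof(ctypes.c_wchar)

text = "\u00e9"  # a single character, U+00E9

# Stand-ins for the two common wchar_t layouts (little-endian).
as_16bit_units = text.encode("utf-16-le")
as_32bit_units = text.encode("utf-32-le")

# The canonical encoding, identical on every platform.
as_utf8 = text.encode("utf-8")

print("wchar_t is", WCHAR_SIZE, "octets here")
print(as_16bit_units.hex(), as_32bit_units.hex(), as_utf8.hex())
```

Feeding the raw wchar_t buffer to a key-derivation function would
therefore give different octets, and different keys, depending on
where the code was built.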