[openssl-dev] [PATCH] Support broken PKCS#12 key generation.

David Woodhouse dwmw2 at infradead.org
Mon Aug 29 17:40:59 UTC 2016


On Mon, 2016-08-29 at 15:25 +0200, Andy Polyakov wrote:
> First of all. *Everything* that is said below (and what modifications in
> question are about) applies to non-ASCII passwords. ASCII-only passwords
> are not affected by this and keep working as they were.
> 
> > 
> > > 
> > > commit 647ac8d3d7143e3721d55e1f57730b6f26e72fc9
> > > 
> > > OpenSSL versions before 1.1.0 didn't convert non-ASCII
> > > UTF8 PKCS#12 passwords to Unicode correctly.
> > > 
> > > To correctly decrypt older files, if MAC verification fails
> > > with the supplied password attempt to use the broken format
> > > which is compatible with earlier versions of OpenSSL.
> > > 
> > > Reviewed-by: Richard Levitte <levitte at openssl.org>
> > 
> > Hm, this sounds like something that other crypto libraries also ought
> > to try to work around. 
> > 
> > Please could you elaborate on the specific problem, and/or show a test
> > case?
> 
> Note that there is 80-test_pkcs12.t that refers to shibboleth.pfx.

Thanks.

> > I'm not quite sure how to interpret the patch itself. You pass the
> > password through OPENSSL_asc2uni() and then OPENSSL_uni2utf8() — which
> > is essentially converting ISO8859-1 to UTF-8.
> 
> You make it sound like these two are called one after another. But
> that's not what happens. It *tries* to access data with passwords
> converted with OPENSSL_asc2uni *or* OPENSSL_utf82uni, effectively
> independently of each other, in order to see which one is right. So that
> you can *transparently* access old broken data, as well as
> specification-compliant one.

Hm... at line 541 of pkcs12.c we call PKCS12_verify_mac(…mpass…) with
the original provided password, which is in the locale-native character
set, is it not? No attempt is made to convert it to anything known.

Then if that *fails* we do indeed convert it via OPENSSL_asc2uni() and
OPENSSL_uni2utf8() to make 'badpass' and try again:

        } else if (!PKCS12_verify_mac(p12, mpass, -1)) {
            /*
             * May be UTF8 from previous version of OpenSSL:
             * convert to a UTF8 form which will translate
             * to the same Unicode password.
             */
            unsigned char *utmp;
            int utmplen;
            utmp = OPENSSL_asc2uni(mpass, -1, NULL, &utmplen);
            if (utmp == NULL)
                goto end;
            badpass = OPENSSL_uni2utf8(utmp, utmplen);
            OPENSSL_free(utmp);
            if (!PKCS12_verify_mac(p12, badpass, -1)) {
                BIO_printf(bio_err, "Mac verify error: invalid password?\n");
                ERR_print_errors(bio_err);
                goto end;
            } else {
                BIO_printf(bio_err, "Warning: using broken algorithm\n");
                if (!twopass)
                    cpass = badpass;
            }
        }
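
Put another way: that pair of calls composes to a plain ISO8859-1 to
UTF-8 transcode of whatever bytes were supplied. Here's a quick
standalone sketch of mine, using the real OPENSSL_asc2uni() and
OPENSSL_uni2utf8() calls, which reproduces the 'badpass' bytes for a
UTF-8 input (the same "naïve" example I use below):

    #include <stdio.h>
    #include <openssl/pkcs12.h>

    int main(void)
    {
        const char *pass = "na\xc3\xafve";   /* "naïve" as UTF-8 */
        unsigned char *utmp;
        char *badpass;
        int utmplen, i;

        /* Zero-expands each byte, i.e. treats the input as ISO8859-1 */
        utmp = OPENSSL_asc2uni(pass, -1, NULL, &utmplen);
        if (utmp == NULL)
            return 1;

        /* Re-encodes those 16-bit code units as UTF-8 */
        badpass = OPENSSL_uni2utf8(utmp, utmplen);
        OPENSSL_free(utmp);
        if (badpass == NULL)
            return 1;

        for (i = 0; badpass[i] != '\0'; i++)
            printf("%02x ", (unsigned char)badpass[i]);
        printf("\n");            /* prints: 6e 61 c3 83 c2 af 76 65 */
        OPENSSL_free(badpass);
        return 0;
    }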

The shibboleth.pfx test seems to pass on the *first* call to
PKCS12_verify_mac() — it *isn't* testing this fallback. If I add a
space to the end of the correct password and stick a breakpoint on
PKCS12_verify_mac(), I see the following calls:

#0  PKCS12_verify_mac (p12=0x956e40, 
    pass=0x956a30 "σύνθημα γνώρισμα ", passlen=-1)
    at crypto/pkcs12/p12_mutl.c:152
#1  0x0000000000425567 in pkcs12_main (argc=0, argv=0x7fffffffdd90)
    at apps/pkcs12.c:541


And then it converts that string from ISO8859-1 (which it wasn't) to
UTF-8, and calls PKCS12_verify_mac() again:

#0  PKCS12_verify_mac (p12=0x956e40,
    pass=0x9597e0 "Ï\302\203Ï\302\215νθημα γνÏ\302\216Ï\302\201ιÏ\302\203μα ",
    passlen=-1) at crypto/pkcs12/p12_mutl.c:152
#1  0x00000000004255fc in pkcs12_main (argc=0, argv=0x7fffffffdd90)
    at apps/pkcs12.c:554

> > 
> > So, if my password is "naïve". In UTF-8 that's 6e 61 c3 af 76 65, which
> > is the correct sequence of bytes to use for the password?
> 
> 00 6e 00 61 00 ef 00 76 00 65, big-endian UTF-16.

Is that conversion done inside PKCS12_verify_mac()? Because we
definitely aren't passing UTF-16BE into PKCS12_verify_mac().

> > 
> > And you now convert that sequence of bytes to 6e 61 c3 83 c2 af 76 65
> > by assuming it's ISO8859-1 (which would be 'naïve') and converting to UTF-8?
> 
> I don't follow. Why would it have to be converted to this sequence?

That's what it's doing. Changing the example above and showing the same
breakpoints as they get hit again...

Starting program: /home/dwmw2/git/openssl/apps/openssl pkcs12 -noout -password pass:naïve -in test/shibboleth.pfx
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Breakpoint 1, PKCS12_verify_mac (p12=0x956e30, pass=0x956a30 "naïve", 
    passlen=-1) at crypto/pkcs12/p12_mutl.c:152
152	    if (p12->mac == NULL) {
(gdb) x/7bx pass
0x956a30:	0x6e	0x61	0xc3	0xaf	0x76	0x65	0x00
(gdb) c
Continuing.

Breakpoint 1, PKCS12_verify_mac (p12=0x956e30, pass=0x959960 "naïve", 
    passlen=-1) at crypto/pkcs12/p12_mutl.c:152
152	    if (p12->mac == NULL) {
(gdb) x/8bx pass
0x959960:	0x6e	0x61	0xc3	0x83	0xc2	0xaf	0x76	0x65


> > 
> > So... what was the bug that was actually being worked around?
> 
> 6e 61 c3 af 76 65 was expanded to 00 6e 00 61 00 c3 00 af 00 76 00 65,
> in violation of specification.

OK, that makes sense, because PKCS12_verify_mac() is going to convert
that last byte sequence I showed above into UTF-16BE, making it:
  006e 0061 00c3 00af 0076 0065
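
To see that step in isolation, here's a quick sketch of mine feeding
those double-encoded bytes through the real OPENSSL_utf82uni():

    #include <stdio.h>
    #include <openssl/pkcs12.h>

    int main(void)
    {
        /* The double-encoded bytes from the trace above */
        const char *bad = "\x6e\x61\xc3\x83\xc2\xaf\x76\x65";
        unsigned char *uni;
        int unilen, i;

        /* The new-style conversion: input is assumed to be UTF-8 */
        uni = OPENSSL_utf82uni(bad, -1, NULL, &unilen);
        if (uni == NULL)
            return 1;

        /* Expect 006e 0061 00c3 00af 0076 0065, plus a trailing 0000
         * if the terminator is counted in unilen as it is for
         * OPENSSL_asc2uni() */
        for (i = 0; i < unilen; i += 2)
            printf("%02x%02x ", uni[i], uni[i + 1]);
        printf("\n");
        OPENSSL_free(uni);
        return 0;
    }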

> > 
> > That
> > older versions were converting from the local charset to UTF-8 twice in
> > a row? So you've implemented a "convert ISO8859-1 to UTF-8" fallback
> > which will cope with that in *some* locales but not all...?
> 
> Yeah, something like that. Note that if you have created PKCS#12 file
> with a non-UTF8 locale, it's not given that you can read it with another
> locale, UTF-8 or not. This was *always* the case. And that's *not* what
> we try to address here. We assume that you don't change locale when you
> upgrade OpenSSL version. Ultimate goal is to make it compliant with
> specification and therefore interoperable with other compliant
> implementations. But we can't just switch, because then it stops
> interoperating with older OpenSSL versions.

Hm, words fail me.

Well, that's not entirely true. But *polite* words fail me...


Let me try to understand this... you have always ignored, and still
ignore, the actual LC_CTYPE, which tells you the character set in which
the password was provided by the user.

You *used* to convert it through your horribly misnamed
OPENSSL_asc2uni() function, which actually converts ISO8859-1 to
UTF-16BE by simply zero-expanding it, inserting a 0x00 before each
byte of the original. So effectively you were "assuming" the input was
ISO8859-1.
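
Modulo error handling, that's nothing more than this (my illustrative
sketch, not OpenSSL's actual code):

    #include <stdlib.h>

    /*
     * What OPENSSL_asc2uni() effectively does. ISO8859-1 maps 1:1 onto
     * U+0000..U+00FF, so zero-expansion *is* an ISO8859-1 to UTF-16BE
     * conversion.
     */
    static unsigned char *iso8859_1_to_utf16be(const unsigned char *in,
                                               size_t len, size_t *outlen)
    {
        size_t i;
        unsigned char *out = malloc(2 * (len + 1));

        if (out == NULL)
            return NULL;
        for (i = 0; i < len; i++) {
            out[2 * i] = 0x00;       /* high byte: always zero */
            out[2 * i + 1] = in[i];  /* low byte: the ISO8859-1 byte */
        }
        out[2 * len] = out[2 * len + 1] = 0x00;  /* UTF-16 NUL terminator */
        *outlen = 2 * (len + 1);
        return out;
    }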

Now you assume it's UTF-8, and convert it with OPENSSL_utf82uni().
(And now what happens when the locale input *isn't* valid UTF-8,
because it's a legacy 8-bit charset? That conversion should fail,
right?)

Your latest workaround (the one from which this thread started)
addresses the first problem *purely* for the case of a UTF-8 locale.
It checks for the "we treated UTF-8 as if it were ISO8859-1" case,
which is what leads to the 006e 0061 00c3 00af 0076 0065 example you
gave above.

But you *haven't* actually implemented any workaround for the whole
set of other "we treated locale XXXXX as if it were ISO8859-1" bugs
that your original code had. Or the whole set of "we treated locale
XXXXX as if it were UTF-8" bugs that the new code has.



So for other applications to try to read OpenSSL's PKCS#12 files, what
we need to do is first convert the sane Unicode rendition of the
password into the native locale charset (e.g. Windows-1252), then take
that bytestream, *pretend* it's ISO8859-1, and convert it to UTF-16BE
to check the MAC. Then if that fails, take the same Windows-1252
bytestream, *pretend* it's UTF-8, and convert *that* to UTF-16BE to
see if it works.
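
In code, that fallback ladder looks roughly like this, including the
spec-compliant first attempt which I walk through below. To be clear,
every helper named here is hypothetical; iso8859_1_to_utf16be() would
behave like the zero-expansion sketch above:

    #include <stdlib.h>

    /* Hypothetical helpers, not a real API: */
    struct pkcs12_file;
    extern unsigned char *locale_to_utf16be(const unsigned char *, size_t,
                                            size_t *);
    extern unsigned char *iso8859_1_to_utf16be(const unsigned char *, size_t,
                                               size_t *);
    extern unsigned char *utf8_to_utf16be(const unsigned char *, size_t,
                                          size_t *);
    extern int mac_matches(struct pkcs12_file *, const unsigned char *, size_t);

    static int try_password(struct pkcs12_file *p12,
                            const unsigned char *pass, size_t passlen)
    {
        unsigned char *(*conv[3])(const unsigned char *, size_t, size_t *) = {
            locale_to_utf16be,     /* 1. the spec-compliant conversion  */
            iso8859_1_to_utf16be,  /* 2. old OpenSSL: assumed ISO8859-1 */
            utf8_to_utf16be,       /* 3. new OpenSSL: assumes UTF-8     */
        };
        unsigned char *bmp;
        size_t i, bmplen;
        int ok = 0;

        for (i = 0; i < 3 && !ok; i++) {
            bmp = conv[i](pass, passlen, &bmplen);
            if (bmp == NULL)       /* e.g. the bytes weren't valid UTF-8 */
                continue;
            ok = mac_matches(p12, bmp, bmplen);
            free(bmp);
        }
        return ok;
    }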

So... let's have a password 'nałve', in an ISO8859-2 environment where
that is represented by the bytes 6e 61 b3 76 65.

First I should try the correct version according to the spec:
 006e 0061 0142 0076 0065

Then we try the "OpenSSL assumed it was ISO8859-1" version:
 006e 0061 00b3 0076 0065

And finally we try the "OpenSSL assumed it was UTF-8" version:
 006e 0061 ... actually you *can't* convert that byte sequence "from"
UTF-8, since it isn't valid UTF-8 (0xb3 is a continuation byte with no
lead byte before it). So what will new OpenSSL do here, again?
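
For reference, 0xb3 is 10110011 in binary: a continuation byte which
no lead byte introduced, so any strict decoder must reject the
sequence. A trivial illustrative checker of mine makes the point:

    #include <stddef.h>

    /*
     * Minimal UTF-8 well-formedness check (illustrative only; it
     * ignores overlong encodings and surrogates). For 6e 61 b3 76 65
     * it fails at the bare 0xb3.
     */
    static int is_utf8(const unsigned char *s, size_t len)
    {
        size_t i = 0;

        while (i < len) {
            int extra;

            if (s[i] < 0x80)
                extra = 0;                        /* ASCII */
            else if (s[i] >= 0xc2 && s[i] <= 0xdf)
                extra = 1;                        /* 2-byte sequence */
            else if (s[i] >= 0xe0 && s[i] <= 0xef)
                extra = 2;                        /* 3-byte sequence */
            else if (s[i] >= 0xf0 && s[i] <= 0xf4)
                extra = 3;                        /* 4-byte sequence */
            else
                return 0;       /* bare continuation byte, e.g. 0xb3 */

            for (i++; extra-- > 0; i++)
                if (i >= len || (s[i] & 0xc0) != 0x80)
                    return 0;   /* truncated or malformed sequence */
        }
        return 1;
    }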


> Is this helpful?

On the whole, I wish I hadn't asked... but I think we need to sort it out.

For a start, *please* can you kill those horribly misnamed
OPENSSL_asc2uni() and OPENSSL_utf82uni() functions? Especially the
former. Having a function which is misnamed both in what it takes as
input *and* in what it gives as output is particularly impressive.

It was a stupid thing for Windows to do, to misuse the term 'Unicode'
to mean UCS-2LE (and later UTF-16LE, when it retroactively changed its
mind). For you to then abuse 'uni' to mean UTF-16BE, something
*different* to what Windows abuses the term for, is really not very
helpful at all! :)


-- 
dwmw2