[openssl-dev] [openssl.org #4698] PEM parsing incorrect; whitespace in PEM crashes parser

Wed Oct 5 18:38:49 UTC 2016

One more reference: https://tools.ietf.org/html/rfc4648#section-3.3
describes the considerations for 'non-base64 characters'.

Short form: MIME requires that they be ignored. RFC 7468 says they SHOULD
be ignored. RFC 4648 says 'reject, unless the referencing spec says
otherwise' (which 7468 does).
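
To make the two policies concrete, here is a minimal sketch in Python. It
is not OpenSSL code; the function names are made up for illustration, and
the only library call relied on is the standard base64.b64decode with
validate=True.

```python
import base64
import binascii
import re

# Anything outside the RFC 4648 base64 alphabet ('=' padding is kept).
NON_ALPHABET = re.compile(rb"[^A-Za-z0-9+/=]")

def decode_strict(data: bytes) -> bytes:
    """RFC 4648 default: reject input containing non-alphabet characters."""
    return base64.b64decode(data, validate=True)

def decode_lenient(data: bytes) -> bytes:
    """MIME / RFC 7468 style: drop non-alphabet characters, then decode."""
    return base64.b64decode(NON_ALPHABET.sub(b"", data), validate=True)

if __name__ == "__main__":
    wrapped = b"aGVs\r\n bG8="  # "hello", with a CRLF and a space inside
    print(decode_lenient(wrapped))
    try:
        decode_strict(wrapped)
    except binascii.Error as exc:
        print("strict decoder rejected input:", exc)
```

The lenient variant filters first and still validates afterwards, so
whatever remains has to be well-formed base64.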

I wrote previously that MIME's limit on line length is 72; according to
4648 3.1 it's actually 76. Sorry. The point is, it's NOT 64 (which is what
PEM specifies). (The 65 in OpenSSL must include the end-of-line.)

Note that all 3 constants are (deliberately) a multiple of 4, meaning that
a 4-character group (which decodes to 3 bytes) can't span lines. However,
this is not true in the wild; end-of-line can appear anywhere. (Again,
MUAs, web browsers and embedded devices that re-wrap text are the most
frequent offenders.)
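
A small Python sketch of why this matters (illustrative only; the helper
names are hypothetical and this is not how OpenSSL is structured). A decoder
that handles each line on its own works when lines are a multiple of 4
characters, and breaks as soon as something re-wraps the text at another
width. Removing the line structure before decoding avoids the problem.

```python
import base64
import binascii

def wrap(text, width):
    """Split text into lines of at most `width` characters."""
    return [text[i:i + width] for i in range(0, len(text), width)]

def decode_per_line(lines):
    # Fragile: assumes every line holds only whole 4-character groups.
    return b"".join(base64.b64decode(line, validate=True) for line in lines)

def decode_joined(lines):
    # Tolerates re-wrapping: drop the line structure first, then decode.
    return base64.b64decode("".join(lines), validate=True)

payload = bytes(range(60))                           # 60 bytes of test data
encoded = base64.b64encode(payload).decode("ascii")  # 80 characters, no padding

pem_style = wrap(encoded, 64)   # 64 is a multiple of 4
rewrapped = wrap(encoded, 70)   # 70 is not; a group straddles the line break

if __name__ == "__main__":
    assert decode_per_line(pem_style) == payload
    assert decode_joined(rewrapped) == payload
    try:
        decode_per_line(rewrapped)
    except binascii.Error as exc:
        print("per-line decode failed on re-wrapped input:", exc)
```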

Here's the full text of 3.3:

>    Base encodings use a specific, reduced alphabet to encode binary
>    data.  Non-alphabet characters could exist within base-encoded data,
>    caused by data corruption or by design.  Non-alphabet characters may
>    be exploited as a "covert channel", where non-protocol data can be
>    sent for nefarious purposes.  Non-alphabet characters might also be
>    sent in order to exploit implementation errors leading to, e.g.,
>    buffer overflow attacks.
>
>    Implementations MUST reject the encoded data if it contains
>    characters outside the base alphabet when interpreting base-encoded
>    data, unless the specification referring to this document explicitly
>    states otherwise.  Such specifications may instead state, as MIME
>    does, that characters outside the base encoding alphabet should
>    simply be ignored when interpreting data ("be liberal in what you
>    accept").  Note that this means that any adjacent carriage return/
>    line feed (CRLF) characters constitute "non-alphabet characters" and
>    are ignored.  Furthermore, such specifications MAY ignore the pad
>    character, "=", treating it as non-alphabet data, if it is present
>    before the end of the encoded data.  If more than the allowed number
>    of pad characters is found at the end of the string (e.g., a base 64
>    string terminated with "==="), the excess pad characters MAY also be
>    ignored.
>
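
The two MAY clauses about the pad character can be sketched the same way.
This is only one possible reading, in Python, with a hypothetical function
name: treat every "=" as non-alphabet data, then restore whatever padding
the remaining length requires.

```python
import base64

def decode_ignoring_stray_pads(text: str) -> bytes:
    # Drop all pad characters, wherever they appear.
    body = text.replace("=", "")
    # Re-add the padding implied by the length of what is left.
    body += "=" * (-len(body) % 4)
    # A leftover length of 1 mod 4 is still invalid and raises binascii.Error.
    return base64.b64decode(body, validate=True)
```

With this, both "aGVsbG8===" (excess pads at the end) and "aGVs=bG8=" (a pad
before the end) decode to the same bytes as "aGVsbG8=". Whether that much
leniency is desirable is a separate question.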

Timothe Litt
ACM Distinguished Engineer
--------------------------
This communication may not represent the ACM or my employer's views,
if any, on the matters discussed. 

-- 
Ticket here: http://rt.openssl.org/Ticket/Display.html?id=4698
Please log in as guest with password guest if prompted