Obtaining SSL certificate info from SSL object - BUG?

  • John Nagle

    Obtaining SSL certificate info from SSL object - BUG?

    The Python SSL object offers two methods for obtaining
    the info from an SSL certificate, "server()" and "issuer()".
    The actual values in the certificate are a series of name/value
    pairs in ASN.1 binary format. But what "server()" and "issuer()"
    return are strings, with the pairs separated by "/". The
    documentation at "http://docs.python.org/lib/ssl-objects.html"
    says "Returns a string containing the ASN.1 distinguished name identifying the
    server's certificate. (See below for an example showing what distinguished names
    look like.)" There is, however, no "below".

    What you actually get back looks like this, which is Google's certificate:

    "/C=US/ST=California/L=Mountain View/O=Google Inc/CN=www.google.com"

    So, no problem; just split on "/", right?

    Unfortunately, "/" is a legal character in certificate values.

    Worse, this isn't just a theoretical problem. Verisign's issuer
    information reads:

    "/O=VeriSign Trust Network/OU=VeriSign, Inc./OU=VeriSign International
    Server CA - Class 3/OU=www.verisign.com/CPS Incorp.by Ref. LIABILITY LTD.(c)97
    VeriSign".

    Note that

    "OU=Terms of use at www.verisign.com/rpa (c)00"

    with a "/" in the middle of the value field. So you hit this
    problem on every cert issued by Verisign. Oops.
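
To make the ambiguity concrete, here is a minimal sketch in plain Python. The sample string is a shortened one assembled from the values quoted above, and the variable names are only illustrative. It splits on "/" and shows the value being cut in two:

```python
# Shortened sample in the "/"-separated format, assembled from values quoted above.
issuer = "/O=VeriSign Trust Network/OU=Terms of use at www.verisign.com/rpa (c)00"

# Naive approach: split on "/" and expect every piece to be "key=value".
pieces = issuer.split("/")[1:]   # drop the empty string before the leading "/"

for piece in pieces:
    key, sep, value = piece.partition("=")
    if sep:
        print("pair:    ", (key, value))
    else:
        # No "=" at all: this piece is really the tail of the previous value.
        print("orphaned:", repr(piece))
```

The last piece has no key of its own, so a caller cannot tell from the string alone whether it belongs to the preceding OU.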

    Nor does there seem to be a way to get at the certificate itself
    from within Python. There was some discussion of this in 2002 at



    when someone wrote: "Furthermore, while the server and issuer are exposed
    through undocumented attributes, the server_cert is not. So there is no way to
    validate the cert manually, short of rewriting socketmodule.c. This is one case
    where the batteries included have been sitting on the shelf too long."

    Clearly, "server()" and "issuer()" should return lists, not strings. That
    would resolve the ambiguity. ASN.1 is a representation for lists, and
    hammering those lists into strings loses information.

    Is there a workaround for this? Without rebuilding Python
    and becoming incompatible?

    John Nagle
    Animats
  • Paul Rubin

    #2
    Re: Obtaining SSL certificate info from SSL object - BUG?

    John Nagle <nagle@animats.com> writes:
    > Is there a workaround for this? Without rebuilding Python
    > and becoming incompatible?

    I've parsed certs by calling openssl in a subprocess. Maybe that's
    not what you wanted to hear. If you're really industrious you might
    be able to extend the tlslite cert parsing code (written in pure
    Python) to get those fields out.


    • Donn Cave

      #3
      Re: Obtaining SSL certificate info from SSL object - BUG?

      In article <453D95EA.1020602@animats.com>,
      John Nagle <nagle@animats.com> wrote:
      > The Python SSL object offers two methods for obtaining
      > the info from an SSL certificate, "server()" and "issuer()".
      > The actual values in the certificate are a series of name/value
      > pairs in ASN.1 binary format. But what "server()" and "issuer()"
      > return are strings, with the pairs separated by "/". The
      > documentation at "http://docs.python.org/lib/ssl-objects.html"
      > says "Returns a string containing the ASN.1 distinguished name
      > identifying the server's certificate.
      > ....
      > "/O=VeriSign Trust Network/OU=VeriSign, Inc./OU=VeriSign International
      > Server CA - Class 3/OU=www.verisign.com/CPS Incorp.by Ref. LIABILITY
      > LTD.(c)97 VeriSign".
      >
      > Note that
      >
      > "OU=Terms of use at www.verisign.com/rpa (c)00"
      >
      > with a "/" in the middle of the value field.
      > ....
      > Is there a workaround for this? Without rebuilding Python
      > and becoming incompatible?

      As a practical matter, I think it's fairly safe to assume
      there will be no values that include "/" in a context that
      really looks like part of an X.500-style distinguished name.

      So if you parse out that string in those terms, and require
      each of those key = value pairs to have reasonable values -
      key has no embedded spaces, value has non-zero length - then
      you should be OK. Re-join any invalid component to its
      predecessor's value.
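
A minimal sketch of that heuristic in Python, assuming the string always starts with "/" as in the examples in this thread. The function name `parse_oneline_dn` is made up for illustration:

```python
def parse_oneline_dn(dn):
    """Heuristically split a "/"-separated DN string into (key, value) pairs.

    A piece counts as a pair only if it looks like key=value, the key is
    non-empty and has no embedded spaces, and the value is non-empty.
    Any other piece is re-joined (with its "/") to the previous pair's value.
    """
    pairs = []
    for piece in dn.split("/")[1:]:      # skip the empty string before the leading "/"
        key, sep, value = piece.partition("=")
        if sep and key and " " not in key and value:
            pairs.append((key, value))
        elif pairs:
            prev_key, prev_value = pairs[-1]
            pairs[-1] = (prev_key, prev_value + "/" + piece)
        else:
            raise ValueError("DN does not start with a key=value pair: %r" % dn)
    return pairs


issuer = ("/O=VeriSign Trust Network/OU=VeriSign, Inc."
          "/OU=VeriSign International Server CA - Class 3"
          "/OU=www.verisign.com/CPS Incorp.by Ref. LIABILITY LTD.(c)97 VeriSign")

for pair in parse_oneline_dn(issuer):
    print(pair)
```

This recovers the four VeriSign entries, but a value that itself contains text shaped like "/key=value" would still be split into separate pairs.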

      Donn Cave, donn@u.washington.edu


      • John Nagle

        #4
        Re: Obtaining SSL certificate info from SSL object - BUG?

        Donn Cave wrote:
        > In article <453D95EA.1020602@animats.com>,
        > John Nagle <nagle@animats.com> wrote:
        >
        >> The Python SSL object offers two methods for obtaining
        >> the info from an SSL certificate, "server()" and "issuer()".
        >> The actual values in the certificate are a series of name/value
        >> pairs in ASN.1 binary format. But what "server()" and "issuer()"
        >> return are strings, with the pairs separated by "/". The
        >> documentation at "http://docs.python.org/lib/ssl-objects.html"
        >> says "Returns a string containing the ASN.1 distinguished name
        >> identifying the server's certificate.
        >
        > ...
        >
        >> "/O=VeriSign Trust Network/OU=VeriSign, Inc./OU=VeriSign International
        >> Server CA - Class 3/OU=www.verisign.com/CPS Incorp.by Ref. LIABILITY
        >> LTD.(c)97 VeriSign".
        >>
        >> Note that
        >>
        >> "OU=Terms of use at www.verisign.com/rpa (c)00"
        >>
        >> with a "/" in the middle of the value field.
        >
        > ...
        >
        >> Is there a workaround for this? Without rebuilding Python
        >> and becoming incompatible?
        >
        > As a practical matter, I think it's fairly safe to assume
        > there will be no values that include "/" in a context that
        > really looks like part of an X.500-style distinguished name.

        Actually, we've just discovered an exploit. By
        ordering a low-level certificate with a "/" in the right
        place, you can create the illusion (at least for flawed
        implementations like this one) that the certificate
        belongs to someone else. Just order a certificate from
        GoDaddy, enter something like this in the "Name" field

        "Myphonyname/C=US/ST=California/L=San Jose/O=eBay Inc./OU=Site
        Operations/CN=signin.ebay.com"

        and Python code will be spoofed into thinking you're eBay.
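
A minimal sketch of why the flattened string cannot tell the two cases apart. The `oneline` helper here is a hypothetical stand-in that only mimics the unescaped "/key=value" layout (it is not OpenSSL's function), and it assumes the "Name" field ends up as the CN entry:

```python
def oneline(entries):
    """Flatten (key, value) entries into the "/"-separated form, with no escaping."""
    return "".join("/%s=%s" % (key, value) for key, value in entries)


# One entry whose value contains embedded "/key=value" text.
spoofed = [
    ("CN", "Myphonyname/C=US/ST=California/L=San Jose/O=eBay Inc."
           "/OU=Site Operations/CN=signin.ebay.com"),
]

# Seven genuinely separate entries.
separate = [
    ("CN", "Myphonyname"),
    ("C", "US"),
    ("ST", "California"),
    ("L", "San Jose"),
    ("O", "eBay Inc."),
    ("OU", "Site Operations"),
    ("CN", "signin.ebay.com"),
]

print(oneline(spoofed))
print(oneline(spoofed) == oneline(separate))
```

Two different entry lists flatten to the same string, so no parser working only from that string can recover which one the certificate actually contained.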

        Fortunately, browsers don't use Python code.

        The actual bug is in

        python/trunk/Modules/_ssl.c

        at

        if ((self->server_cert = SSL_get_peer_certificate(self->ssl))) {
            X509_NAME_oneline(X509_get_subject_name(self->server_cert),
                              self->server, X509_NAME_MAXLEN);
            X509_NAME_oneline(X509_get_issuer_name(self->server_cert),
                              self->issuer, X509_NAME_MAXLEN);

        The "X509_NAME_oneline" function takes an X509_NAME structure, which is
        the certificate system's representation of a list, and flattens it
        into a printable string. This is a debug function, not one for use in
        production code. The SSL documentation for "X509_NAME_oneline" says:

        "The functions X509_NAME_oneline() and X509_NAME_print() are legacy
        functions which produce a non standard output form, they don't handle
        multi character fields and have various quirks and inconsistencies.
        Their use is strongly discouraged in new applications."

        What OpenSSL callers are supposed to do is call X509_NAME_entry_count()
        to get the number of entries in an X509_NAME structure, then get each
        entry with X509_NAME_get_entry(). A few more calls will obtain
        the name/value pair from the entry, as UTF8 strings, which should
        be converted to Python Unicode strings.

        X509_NAME_oneline() doesn't handle Unicode; it converts non-ASCII
        values to "\xnn" format. Again, it's for debug output only.

        So what's needed are two new functions for Python's SSL sockets to
        replace "issuer" and "server". The new functions should return
        lists of Unicode strings representing the key/value pairs.
        (A list is needed, not a dictionary; two strings with the same key
        are both possible and common.)
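
A minimal sketch of the list-versus-dictionary point, using the VeriSign issuer entries quoted earlier in the thread as a list of pairs. The list layout is only an assumption about what such a function might return:

```python
# Issuer entries from the VeriSign example, as a list of (key, value) pairs.
issuer = [
    ("O", "VeriSign Trust Network"),
    ("OU", "VeriSign, Inc."),
    ("OU", "VeriSign International Server CA - Class 3"),
    ("OU", "www.verisign.com/CPS Incorp.by Ref. LIABILITY LTD.(c)97 VeriSign"),
]

as_dict = dict(issuer)   # later "OU" entries overwrite earlier ones

print(len(issuer))       # entries kept by the list
print(len(as_dict))      # keys that survive in the dictionary
print([value for key, value in issuer if key == "OU"])
```

The dictionary keeps only the last "OU" value, while the list preserves all three in certificate order.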

        The reason this now matters is that new "high assurance" certs,
        the ones that tell you how much a site can be trusted, are now being
        deployed, and to use them effectively, you need that info. Support for
        them is in Internet Explorer 7, so they're going to be widespread soon.
        Python needs to catch up.

        I'll submit a bug report.

        John Nagle
        Animats


        • Paul Rubin

          #5
          Re: Obtaining SSL certificate info from SSL object - BUG?

          John Nagle <nagle@animats.com> writes:
          > The reason this now matters is that new "high assurance" certs,
          > the ones that tell you how much a site can be trusted, are now being
          > deployed,

          Oh my, I hadn't heard about this. They come up with new scams all the
          time. I guess I'll check for info. It sounds sort of like the terror
          alert system, which tells us how scared to be on any particular day ;-)


          • John Nagle

            #6
            Re: Obtaining SSL certificate info from SSL object - BUG?

            Paul Rubin wrote:
            > John Nagle <nagle@animats.com> writes:
            >
            >> The reason this now matters is that new "high assurance" certs,
            >> the ones that tell you how much a site can be trusted, are now being
            >> deployed,
            >
            > Oh my, I hadn't heard about this. They come up with new scams all the
            > time. I guess I'll check for info. It sounds sort of like the terror
            > alert system, which tells us how scared to be on any particular day ;-)

            Anyway, I've submitted it as a Python bug report:

            [1583946] SSL "issuer" and "server" functions problems - security

            And for the record, here's a workaround: do a split with this
            regular expression:

            import re

            pparsecertstringre = re.compile(
                r"""(?:/)(\w(?:\w|))(?:=)""")

            You'll get lists of the form

            ['', key1, value1, key2, value2 ...]

            This isn't totally unspoofable, and won't work for Unicode certs,
            but it works for the few dozen common certs I've run through it.
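
A minimal usage sketch of that split. It assumes the stray space inside the pattern as posted is a formatting artifact, so keys are one or two word characters followed directly by "=". The names `dn_key_re` and `split_dn` are made up for illustration:

```python
import re

# Same pattern as above, assuming no space before "=": "/" + 1-2 word chars + "=".
dn_key_re = re.compile(r"(?:/)(\w(?:\w|))(?:=)")


def split_dn(dn):
    """Split a "/"-separated DN string into a list of (key, value) pairs."""
    parts = dn_key_re.split(dn)          # ['', key1, value1, key2, value2, ...]
    return list(zip(parts[1::2], parts[2::2]))


issuer = ("/O=VeriSign Trust Network/OU=VeriSign, Inc."
          "/OU=VeriSign International Server CA - Class 3"
          "/OU=www.verisign.com/CPS Incorp.by Ref. LIABILITY LTD.(c)97 VeriSign")

for pair in split_dn(issuer):
    print(pair)
```

The "/CPS" inside the last value is left alone because it is not followed by "=". Keys longer than two characters would not be recognized by this pattern, and a value containing something like "/CN=" is still split.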

            John Nagle
            Animats


            • Heikki Toivonen

              #7
              Re: Obtaining SSL certificate info from SSL object - BUG?

              John Nagle wrote:
              > The Python SSL object offers two methods for obtaining
              > the info from an SSL certificate, "server()" and "issuer()".
              > The actual values in the certificate are a series of name/value
              > pairs in ASN.1 binary format. But what "server()" and "issuer()"
              > return are strings, with the pairs separated by "/". The

              Is it an option for you to use 3rd party libraries (please note that the
              Python stdlib SSL library does not do certificate validation etc. which
              you'd typically want in a production application)?

              With M2Crypto you could do something like this:

              from M2Crypto import SSL

              # Connect and fetch the peer certificate.
              ctx = SSL.Context()
              conn = SSL.Connection(ctx)
              conn.connect(('www.verisign.com', 443))
              cert = conn.get_peer_cert()

              # Issuer and subject names as text.
              print(cert.get_issuer().as_text())
              print(cert.get_subject().as_text())

              # Not every certificate carries a subjectAltName extension.
              try:
                  print(cert.get_ext('subjectAltName').get_value())
              except LookupError:
                  print('no subjectAltName')
              try:
                  print(cert.get_subject().CN)
              except AttributeError:
                  print('no commonName')

              Please note, however, that if you need the server name because you want
              to validate that you connected to the server you intended to, it would
              be better to let M2Crypto do it for you or use the M2Crypto SSL.Checker
              class explicitly yourself.

              Other Python crypto libraries probably have equivalent APIs.

              --
              Heikki Toivonen


              • Michael Ströder

                #8
                Re: Obtaining SSL certificate info from SSL object - BUG?

                John Nagle wrote:
                > The Python SSL object offers two methods for obtaining
                > the info from an SSL certificate, "server()" and "issuer()".
                > The actual values in the certificate are a series of name/value
                > pairs in ASN.1 binary format. But what "server()" and "issuer()"
                > return are strings, with the pairs separated by "/". The
                > documentation at "http://docs.python.org/lib/ssl-objects.html"
                > says "Returns a string containing the ASN.1 distinguished name
                > identifying the server's certificate. (See below for an example showing
                > what distinguished names look like.)" There is, however, no "below".
                >
                > What you actually get back looks like this, which is Google's certificate:
                >
                > "/C=US/ST=California/L=Mountain View/O=Google Inc/CN=www.google.com"
                >
                > So, no problem; just split on "/", right?
                >
                > Unfortunately, "/" is a legal character in certificate values.

                You hit a really serious problem: There's no completely well-defined
                string representation format for distinguished names used in X.509
                certificates. The format above is what OpenSSL used in the beginning.
                Yuck! IMO this is also a security problem in some cases.

                The best thing would be to stick to RFC 4514 (formerly RFC 2253:
                Lightweight Directory Access Protocol (LDAP): String Representation of
                Distinguished Names). It defines a UTF-8-based string representation.

                Play around with OpenSSL's command-line option 'nameopt':

                openssl x509 -inform der -in VSIGN1.CER -subject -issuer -noout
                subject= /C=US/O=VeriSign, Inc./OU=Class 1 Public Primary Certification Authority
                issuer= /C=US/O=VeriSign, Inc./OU=Class 1 Public Primary Certification Authority

                openssl x509 -inform der -in VSIGN1.CER -subject -issuer -noout -nameopt rfc2253
                subject= OU=Class 1 Public Primary Certification Authority,O=VeriSign\, Inc.,C=US
                issuer= OU=Class 1 Public Primary Certification Authority,O=VeriSign\, Inc.,C=US

                Guess the second is what Python SSL object also should return. No idea
                whether this is available at OpenSSL's API level.
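
A minimal sketch, in plain Python, of how a list of name/value pairs could be written in that RFC 4514 style. It is a simplification: multi-valued RDNs and OID-named attributes are not handled, and the function names are made up for illustration:

```python
def escape_value(value):
    """Escape an attribute value following the RFC 4514 string rules (simplified)."""
    out = []
    last = len(value) - 1
    for i, ch in enumerate(value):
        if ch in '"+,;<>\\':
            out.append("\\" + ch)            # always-special characters
        elif ch == "\x00":
            out.append("\\00")               # NUL is written as a hex pair
        elif (i == 0 and ch in " #") or (i == last and ch == " "):
            out.append("\\" + ch)            # leading space or "#", trailing space
        else:
            out.append(ch)
    return "".join(out)


def to_rfc4514(entries):
    """Format (key, value) entries, given in certificate order, as one string.

    The string form lists the last entry first, so the list is reversed.
    """
    return ",".join("%s=%s" % (key, escape_value(value))
                    for key, value in reversed(entries))


subject = [
    ("C", "US"),
    ("O", "VeriSign, Inc."),
    ("OU", "Class 1 Public Primary Certification Authority"),
]
print(to_rfc4514(subject))
```

Because the comma inside "VeriSign, Inc." is escaped, the separators stay unambiguous, which is exactly what the "/" format lacks.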

                Ciao, Michael.


                • Michael Ströder

                  #9
                  Re: Obtaining SSL certificate info from SSL object - BUG?

                  Donn Cave wrote:
                  > In article <453D95EA.1020602@animats.com>,
                  > John Nagle <nagle@animats.com> wrote:
                  >>
                  >> Note that
                  >>
                  >> "OU=Terms of use at www.verisign.com/rpa (c)00"
                  >>
                  >> with a "/" in the middle of the value field.
                  >
                  > ...
                  >
                  >> Is there a workaround for this? Without rebuilding Python
                  >> and becoming incompatible?
                  >
                  > As a practical matter, I think it's fairly safe to assume
                  > there will be no values that include "/" in a context that
                  > really looks like part of an X.500-style distinguished name.
                  >
                  > So if you parse out that string in those terms, and require
                  > each of those key = value pairs to have reasonable values -
                  > key has no embedded spaces, value has non-zero length - then
                  > you should be OK. Re-join any invalid component to its
                  > predecessor's value.

                  Don't make such assumptions when parsing DNs!
                  It's a major PITA in the long run.

                  Ciao, Michael.


                  • John Nagle

                    #10
                    Re: Obtaining SSL certificate info from SSL object - BUG?

                    Michael Ströder wrote:
                    > John Nagle wrote:
                    >
                    >> The Python SSL object offers two methods for obtaining
                    >> the info from an SSL certificate, "server()" and "issuer()".
                    >> The actual values in the certificate are a series of name/value
                    >> pairs in ASN.1 binary format. But what "server()" and "issuer()"
                    >> return are strings, with the pairs separated by "/". The
                    >> documentation at "http://docs.python.org/lib/ssl-objects.html"
                    >> says "Returns a string containing the ASN.1 distinguished name
                    >> identifying the server's certificate. (See below for an example showing
                    >> what distinguished names look like.)" There is, however, no "below".
                    >>
                    >> What you actually get back looks like this, which is Google's certificate:
                    >>
                    >> "/C=US/ST=California/L=Mountain View/O=Google Inc/CN=www.google.com"
                    >>
                    >> So, no problem; just split on "/", right?
                    >>
                    >> Unfortunately, "/" is a legal character in certificate values.
                    >
                    > You hit a really serious problem: There's no completely well-defined
                    > string representation format for distinguished names used in X.509
                    > certificates. The format above is what OpenSSL used in the beginning.
                    > Yuck! IMO this is also a security problem in some cases.
                    >
                    > The best thing would be to stick to RFC 4514 (formerly RFC 2253:
                    > Lightweight Directory Access Protocol (LDAP): String Representation of
                    > Distinguished Names). It defines a UTF-8-based string representation.
                    > ....
                    > Guess the second is what Python SSL object also should return. No idea
                    > whether this is available at OpenSSL's API level.

                    That's exactly what I suggested in my Python bug report update.

                    OpenSSL has all the right functions. Almost.

                    OpenSSL has "X509_NAME_oneline()" which is deprecated, which Python
                    is using, and which uses "/" as a delimiter without escaping "/" in
                    content.

                    OpenSSL also has "X509_NAME_print_ex", which does the right
                    thing - outputs a UTF8 string in RFC 2253 format, with all the
                    right escapes and Unicode compatibility if you ask for Unicode
                    output.

                    Unfortunately, "X509_NAME_print_ex" is set up to output to
                    an I/O port, not a string. There's no comparable function in
                    OpenSSL to edit that info to a string.

                    All the right machinery to do the job is in

                    openssl/crypto/asn1/a_strex.c

                    but they ran into a classic C problem. They have code designed
                    to output to a stream of infinite length, and don't have a way
                    to get the target length down to the copy function. Take a look at
                    "send_mem_chars" in that file, which is turned off. If it were
                    used, it would have buffer overflow potential. This could be
                    fixed, but it's a pain. It's local to that file, though;
                    someone who owns that code could fix it in an hour.

                    X509_NAME_oneline(), the deprecated function, is in a
                    completely separate file and doesn't handle the hard cases at all.

                    The same problem was reported in Apache mod_ssl back in 2004. See



                    And it had to be fixed in OpenCA. See



                    Also, there may be an exploitable bug in MySQL that depends on this. See



                    Get the OpenSSL people to fix their API, and the Python fix will
                    be a one-line change.


                    John Nagle


                    • John Nagle

                      #11
                      Re: Obtaining SSL certificate info from SSL object - proposal

                      John Nagle wrote:
                      > Michael Ströder wrote:
                      >
                      >> John Nagle wrote:
                      >>
                      >>> The Python SSL object offers two methods for obtaining
                      >>> the info from an SSL certificate, "server()" and "issuer()".
                      >>> The actual values in the certificate are a series of name/value
                      >>> pairs in ASN.1 binary format. But what "server()" and "issuer()"
                      >>> return are strings, with the pairs separated by "/". The
                      >>> documentation at "http://docs.python.org/lib/ssl-objects.html"
                      >>> says "Returns a string containing the ASN.1 distinguished name
                      >>> identifying the server's certificate. (See below for an example showing
                      >>> what distinguished names look like.)" There is, however, no "below".

                      Since I really need this, I'm looking at modifying the Python SSL
                      interface to SSL objects by adding a function "certificate()" which
                      returns an X.509 certificate in the following format:

                      SSL certificates are trees, represented in a format, "ASN.1", which
                      allows storing numbers, strings, and flags.
                      Fields are identified by names or by assigned "OID numbers"
                      (see RFC 2459).

                      The tree is returned as tuples. The first element of the tuple
                      is always a string giving the name of the field, and the second
                      element is a string, Boolean, or number giving the value, or
                      a list of more tuples. The result is a tree, which will
                      resemble the tree typically displayed by browsers displaying
                      SSL certificates.

                      The top tuple's field name is the domain for which the certificate
                      applies.

                      Note that it is straightforward to implement "issuer" and "subject"
                      using "certificate", which provides a way out of the current problems
                      with those fields.

                      Example:

                      ('www.google.com',
                       ('Certificate',
                        [('Version', 3),
                         ('Serial Number',
                          '4B:A5:AE:59:DE:DD:1C:C7:80:7C:89:22:91:F0:E2:43'),
                         ('Certificate Signature Algorithm',
                          'PKCS #1 MD5 With RSA Encryption'),
                         ('Issuer',
                          [('CN', 'Thawte SGC CA'),
                           ('O', 'Thawte Consulting (Pty) Ltd.'),
                           ('C', 'ZA')]),
                         ('Validity',
                          [('Not Before', '5/15/2006 23:18:11 PM GMT'),
                           ('Not After', '5/15/2007 23:18:11 PM GMT')]),
                         ('Subject',
                          [('CN', 'www.google.com'),
                           ('O', 'Google Inc'),
                           ('L', 'Mountain View'),
                           ('ST', 'California'),
                           ('C', 'US')]),
                         ('Subject Public Key Info',
                          [('Subjects Public Key Algorithm',
                            'PKCS #1 RSA Encryption'),
                           ('Subjects Public Key',
                            '30 81 89 02 81 81 00 e6 c5 c6 8d cd 0b a3 03 04 '
                            'dc ae cc c9 46 be bd cc 9d bc 73 34 48 fe d3 75 '
                            '64 d0 c9 c9 76 27 72 0f a9 96 1a 3b 81 f3 14 f6 '
                            'ae 90 56 e7 19 d2 73 68 a7 85 a4 ae ca 24 14 30 '
                            '00 ba e8 36 5d 81 73 3a 71 05 8f b1 af 11 87 da '
                            '5c f1 3e bf 53 51 84 6f 44 0e b7 e8 26 d7 2f b2 '
                            '6f f2 f2 5d df a7 cf 8c a5 e9 1e 6f 30 48 94 21 '
                            '0b 01 ad ba 0e 71 01 0d 10 ef bf ee 2c d3 8d fe '
                            '54 a8 fe d3 97 8f cb 02 03 01 00 01')]),
                         ('Certificate Signature Algorithm',
                          'PKCS #1 MD5 With RSA Encryption'),
                         ('Certificate Signature Value',
                          '57 4b bc a4 43 e7 e0 01 92 a0 96 35 f9 18 08 88 '
                          '1d 7b 70 19 8f f9 36 b2 05 3a 05 ca 14 59 4d 24 '
                          '0e e5 8a af 4e 87 5a f7 1c 2a 96 8f cb 61 40 9e '
                          'd2 b4 38 40 21 24 c1 4f 1f cb 13 4a 8f 95 02 df '
                          '91 3d d6 40 eb 11 6f 9b 10 a1 6f ce 91 5e 30 f6 '
                          '6d 13 5e 15 a4 2e c2 18 9e 00 c3 d8 32 67 47 fc '
                          'b8 1e 9a d9 9a 8e cc ff 7c 12 b7 03 bf 52 20 cf '
                          '21 f4 f3 77 dd 12 15 f0 94 fa 90 d5 e3 59 68 81')]
                       ))
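
A minimal sketch of how "issuer" and "subject" might be pulled out of such a tree. The "certificate()" function is only a proposal, so the tree below is a hand-written, trimmed copy of the example, and `find_field` is a made-up helper name:

```python
def find_field(node, name):
    """Depth-first search of a (name, value) tuple tree for the first field called `name`."""
    field_name, value = node
    if field_name == name:
        return value
    if isinstance(value, tuple):
        children = [value]       # a single nested (name, value) tuple
    elif isinstance(value, list):
        children = value         # a list of nested (name, value) tuples
    else:
        children = []            # leaf: string, Boolean, or number
    for child in children:
        found = find_field(child, name)
        if found is not None:
            return found
    return None


# Trimmed copy of the example tree above.
cert = ("www.google.com",
        ("Certificate",
         [("Version", 3),
          ("Issuer", [("CN", "Thawte SGC CA"),
                      ("O", "Thawte Consulting (Pty) Ltd."),
                      ("C", "ZA")]),
          ("Subject", [("CN", "www.google.com"),
                       ("O", "Google Inc"),
                       ("L", "Mountain View"),
                       ("ST", "California"),
                       ("C", "US")])]))

print(find_field(cert, "Issuer"))
print(find_field(cert, "Subject"))
```

Each result is a list of (key, value) pairs, so values containing "/" and repeated keys survive intact.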

                      Comments?

                      John Nagle



                      • Martin v. Löwis

                        #12
                        Re: Obtaining SSL certificate info from SSL object - proposal

                        John Nagle wrote:
                        > SSL certificates are trees, represented in a format, "ASN.1", which
                        > allows storing numbers, strings, and flags.
                        > Fields are identified by names or by assigned "OID numbers"
                        > (see RFC 2459).
                        >
                        > The tree is returned as tuples. The first element of the tuple
                        > is always a string giving the name of the field, and the second
                        > element is a string, Boolean, or number giving the value, or
                        > a list of more tuples. The result is a tree, which will
                        > resemble the tree typically displayed by browsers displaying
                        > SSL certificates.

                        That looks like a bad choice of interface to me. If you want to expose
                        the entire certificate, you should do that as a single byte
                        string, encoded in DER. The way you are representing it, you are losing
                        information (e.g. whether the string type was IA5String,
                        PrintableString, or UTF8String), and I thought your complaint was that
                        the current interfaces lose information, so you should not add an
                        interface that makes the same mistake it tries to overcome.
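
A minimal sketch of the information in question: in DER, every string carries a tag byte naming its ASN.1 type, which a plain Python string does not record. This decoder handles only a single string with a short-form length, and the function name is made up for illustration:

```python
# Universal ASN.1 tag numbers for the string types mentioned above.
STRING_TYPES = {
    0x0C: ("UTF8String", "utf-8"),
    0x13: ("PrintableString", "ascii"),
    0x16: ("IA5String", "ascii"),
}


def decode_der_string(data):
    """Decode one DER-encoded string (short-form length only) into (type_name, text)."""
    tag, length = data[0], data[1]
    if length >= 0x80:
        raise ValueError("long-form lengths are not handled in this sketch")
    if tag not in STRING_TYPES:
        raise ValueError("unsupported tag 0x%02x" % tag)
    type_name, encoding = STRING_TYPES[tag]
    return type_name, data[2:2 + length].decode(encoding)


# The same text "US" encoded with two different string types.
print(decode_der_string(b"\x13\x02US"))
print(decode_der_string(b"\x0c\x02US"))
```

Both inputs decode to the text "US", and only the DER bytes say which string type the certificate used.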

                        Regards,
                        Martin
