Problems with email.Generator.Generator

  • Chris Withers

    #16
    Re: Problems with email.Generator.Generator

    Chris Withers wrote:
    print msg.as_string()
    >
    MIME-Version: 1.0
    Content-Type: text/plain; charset; charset="utf-8"
                              ^^^^^^^
    Actually, even this isn't correct as you can see above...
    charset = Charset('utf-8')
    msg = MIMEText('','plain',None)
    msg.set_payload(u'Some text with chars that need encoding:\xa3', charset)
    >
    Traceback (most recent call last):
      File "C:\test_encoding.py", line 5, in ?
        msg.set_payload(u'Some text with chars that need encoding:\xa3', charset)
      File "c:\python24\lib\email\Message.py", line 218, in set_payload
        self.set_charset(charset)
      File "c:\python24\lib\email\Message.py", line 260, in set_charset
        self._payload = charset.body_encode(self._payload)
      File "c:\python24\lib\email\Charset.py", line 366, in body_encode
        return email.base64MIME.body_encode(s)
      File "c:\python24\lib\email\base64MIME.py", line 136, in encode
        enc = b2a_base64(s[i:i + max_unencoded])
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xa3' in position 40: ordinal not in range(128)
    ....and I'm still left with this problem...

    Has no-one ever successfully generated a correctly formatted email with
    email.MIMEText where the message includes non-ascii characters?!

    Chris

    --
    Simplistix - Content Management, Zope & Python Consulting
    - http://www.simplistix.co.uk


    • Chris Withers

      #17
      Problems with email.Generator.Generator - Solved?

      Chris Withers wrote:
      Has no-one ever successfully generated a correctly formatted email with
      email.MIMEText where the message includes non-ascii characters?!
      I'm guessing not ;-)

      Well, I think I have a winner, but it required me to subclass MIMEText:

      from email.Charset import Charset, QP
      from email.MIMEText import MIMEText as OriginalMIMEText
      from email.MIMENonMultipart import MIMENonMultipart

      class MIMEText(OriginalMIMEText):

          def __init__(self, _text, _subtype='plain', _charset='us-ascii'):
              # MIMENonMultipart.__init__ needs the charset name as a plain string
              if isinstance(_charset, Charset):
                  cs = _charset.input_charset
              else:
                  cs = _charset
              # encode unicode up front; set_payload expects an encoded string
              if isinstance(_text, unicode):
                  _text = _text.encode(cs)
              MIMENonMultipart.__init__(self, 'text', _subtype,
                                        **{'charset': cs})
              self.set_payload(_text, _charset)

      charset = Charset('utf-8')
      charset.body_encoding = QP
      txt = u'Some text with chars that need encoding:\xa3'
      msg = MIMEText(txt, 'plain', charset)
      print msg.as_string()

      Which gives:

      Content-Type: text/plain; charset="utf-8"
      MIME-Version: 1.0
      Content-Transfer-Encoding: quoted-printable

      Some text with chars that need encoding:=C2=A3

      It also works with non-QP charsets.

      The subclass is needed because MIMENonMultipart.__init__ cannot handle
      a charset which isn't a simple string. Since it's needed for that
      reason, it seems like the right place to encode any incoming unicode.
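
For readers on Python 3.5 or later, the same output seems to be reachable without a subclass. The sketch below is an assumption about Python 3 behaviour rather than anything from this thread: it uses the renamed modules email.charset and email.mime.text, and relies on MIMEText accepting both a Charset instance and a str body directly.

```python
from email.charset import Charset, QP
from email.mime.text import MIMEText

# Assumption: Python 3.5+, where text is str (unicode) and MIMEText encodes
# it itself; module names differ from the Python 2.4 ones used in the thread.
charset = Charset('utf-8')
charset.body_encoding = QP  # ask for quoted-printable instead of base64

txt = 'Some text with chars that need encoding:\xa3'
msg = MIMEText(txt, 'plain', charset)

print(msg.as_string())
```

On Python 2.4, the version discussed here, the subclass above is still needed.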

      So, by my count, there are two bugs:

      1. email.MIMEText.MIMEText can't take a real Charset object to its
      __init__ method.

      2. email.Message.Message.set_payload has no clue about unicode.

      Does that sound fair? If so, should I open SF issues for them?

      cheers,

      Chris

      --
      Simplistix - Content Management, Zope & Python Consulting
      - http://www.simplistix.co.uk


      • Max M

        #18
        Re: Problems with email.Generator.Generator

        Chris Withers wrote:
        Chris Withers wrote:
        >print msg.as_string()
        >>
        >MIME-Version: 1.0
        >Content-Type: text/plain; charset; charset="utf-8"
                                   ^^^^^^^
        Actually, even this isn't correct as you can see above...
        >
        >charset = Charset('utf-8')
        >msg = MIMEText('','plain',None)
        >msg.set_payload(u'Some text with chars that need encoding:\xa3', charset)
        Has no-one ever successfully generated a correctly formatted email with
        email.MIMEText where the message includes non-ascii characters?!
        What is the problem with encoding the message as utf-8 before setting
        the payload? That has always worked for me.


        pl = u'Some text with chars that need encoding:\xa3'.encode('utf-8')
        msg.set_payload(pl, charset)

        From the docs:

        """
        The payload is either a string in the case of simple message objects or
        a list of Message objects for MIME container documents (e.g. multipart/*
        and message/rfc822)
        """

        Message objects are always encoded strings. I don't remember seeing that
        it should be possible to use a unicode string as a message.

        The charset passed in set_payload(pl, charset) is the charset the
        string *is* encoded in. Not the charset it *should* be encoded in.
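
This point can be checked with a small sketch. It is written for Python 3 (an assumption, since the thread is about Python 2.4), where the old byte string corresponds to bytes. As far as I can tell, set_payload keeps the bytes it is given and only records the charset label, so a wrong label produces a message that cannot be decoded. The variable names are illustrative.

```python
from email.message import Message

text = 'Some text with chars that need encoding:\xa3'

# Encode first, then declare the charset the bytes are already in.
good = Message()
good.set_payload(text.encode('utf-8'), 'utf-8')
round_trip = good.get_payload(decode=True).decode(good.get_content_charset())

# Label latin-1 bytes as utf-8: nothing is re-encoded, so the label is wrong.
bad = Message()
bad.set_payload(text.encode('latin-1'), 'utf-8')
raw = bad.get_payload(decode=True)
try:
    raw.decode(bad.get_content_charset())
    mislabelled = False
except UnicodeDecodeError:
    mislabelled = True

print(round_trip == text, mislabelled)
```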

        --

        hilsen/regards Max M, Denmark


        IT's Mad Science

        Phone: +45 66 11 84 94
        Mobile: +45 29 93 42 96


        • Peter Otten

          #19
          Re: Problems with email.Generator.Generator

          Chris Withers wrote:
          Okay, more out of desperation than anything else, let's try this:
          >
          from email.Charset import Charset,QP
          from email.MIMEText import MIMEText
          from StringIO import StringIO
          from email import Generator,Message
          Generator.StringIO = Message.StringIO = StringIO
          charset = Charset('utf-8')
          charset.body_encoding = QP
          msg = MIMEText(u'Some text with chars that need encoding: \xa3','plain')
          msg.set_charset(charset)
          print repr(msg.as_string())
          u'MIME-Version: 1.0\nContent-Transfer-Encoding: 8bit\nContent-Type:
          text/plain; charset="utf-8"\n\nSome text with chars that need encoding:
          \xa3'
          >
          Yay! No unicode error, but also no use:
          >
            File "c:\python24\lib\smtplib.py", line 692, in sendmail
              (code,resp) = self.data(msg)
            File "c:\python24\lib\smtplib.py", line 489, in data
              self.send(q)
            File "c:\python24\lib\smtplib.py", line 316, in send
              self.sock.sendall(str)
            File "<string>", line 1, in sendall
          UnicodeEncodeError: 'ascii' codec can't encode character u'\xa3' in position 297: ordinal not in range(128)
          Yes, it seemed to work with your original example, but of course you have to
          encode unicode somehow before sending it through a wire. A severe case of
          peephole debugging, sorry. I've looked into the email package source once
          more, but I fear that to understand the relevant parts you have to
          understand it wholesale.

          As Max suggested, your safest choice is probably passing in utf-8 instead of
          unicode.
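
The remark about the wire can be illustrated as follows. This is a sketch for Python 3 (an assumption, since the thread predates it): once the body has a charset and a transfer encoding, the serialized message should be plain ASCII, which is what a socket or SMTP connection can carry safely.

```python
from email.mime.text import MIMEText

text = 'Some text with chars that need encoding: \xa3'

# Assumption: Python 3. Passing 'utf-8' makes MIMEText encode the body and
# pick a transfer encoding (base64 by default for utf-8).
msg = MIMEText(text, 'plain', 'utf-8')

wire = msg.as_bytes()  # the bytes that would be handed to the socket
is_ascii = all(b < 128 for b in wire)

print(msg['Content-Transfer-Encoding'], is_ascii)
```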

          Peter


          • Chris Withers

            #20
            Re: Problems with email.Generator.Generator

            Max M wrote:
            From the docs:
            >
            """
            The payload is either a string in the case of simple message objects or
            a list of Message objects for MIME container documents (e.g. multipart/*
            and message/rfc822)
            """
            Where'd you find that? I must have missed it in my digging :-S
            Message objects are always encoded strings. I don't remember seeing that
            it should be possible to use a unicode string as a message.
            Yes, I guess I just find that surprising in today's "everything should
            be unicode" world.
            The charset passed in set_payload(pl, charset) is the charset the
            string *is* encoded in. Not the charset it *should* be encoded in.
            Indeed, although there's still the bug that while set_payload can accept
            a Charset instance for its _charset parameter, the __init__ method for
            MIMENonMultipart cannot.

            Incidentally, here's the class I finally ended up with:

            from email.Charset import Charset
            from email.MIMEText import MIMEText as OriginalMIMEText
            from email.MIMENonMultipart import MIMENonMultipart

            class MTText(OriginalMIMEText):

                def __init__(self, _text, _subtype='plain', _charset='us-ascii'):
                    if not isinstance(_charset, Charset):
                        _charset = Charset(_charset)
                    if isinstance(_text, unicode):
                        _text = _text.encode(_charset.input_charset)
                    MIMENonMultipart.__init__(self, 'text', _subtype,
                                              **{'charset': _charset.input_charset})
                    self.set_payload(_text, _charset)

            cheers,

            Chris

            --
            Simplistix - Content Management, Zope & Python Consulting
            - http://www.simplistix.co.uk


            • Max M

              #21
              Re: Problems with email.Generator.Generator

              Chris Withers wrote:
              Max M wrote:
              > From the docs:
              >>
              >"""
              >The payload is either a string in the case of simple message objects
              >or a list of Message objects for MIME container documents (e.g.
              >multipart/* and message/rfc822)
              >"""
              >
              Where'd you find that? I must have missed it in my digging :-S

              End of third paragraph:




              --

              hilsen/regards Max M, Denmark


              IT's Mad Science

              Phone: +45 66 11 84 94
              Mobile: +45 29 93 42 96
