Extracting email attachment when is_multipart() is False

This topic is closed.
  • Davor Cengija

    Extracting email attachment when is_multipart() is False

    I need to write a script that extracts the attachment from a text
    file which is saved as a MIME mail message. Unfortunately,
    Message.is_multipart() returns False, so msg.get_payload() returns the
    complete message. What I need is the attachment only. Is it possible to do
    that with the standard email package, without actual string-level parsing?

    This is what my file/message looks like:

    ====== start here ========
    This is a multi-part message in MIME format.

    ------=_NextPart_000_0026_01C3B347.DBEA9660
    Content-Type: text/plain;
        charset="us-ascii"
    Content-Transfer-Encoding: 7bit

    CONTENT

    signature, etc

    ------=_NextPart_000_0026_01C3B347.DBEA9660
    Content-Type: application/octet-stream;
        name="filename.csv"
    Content-Transfer-Encoding: 7bit
    Content-Disposition: attachment;
        filename="filename.csv"

    10012;20031118;292.67;4
    101;23;19.98;2;39.96
    102;24;21.89;4;87.56

    ------=_NextPart_000_0026_01C3B347.DBEA9660--

    ====== end here ========

    So, I obviously need this part only:

    10012;20031118;292.67;4
    101;23;19.98;2;39.96
    102;24;21.89;4;87.56

    Python 2.3.2 on Windows.

    Thanks and regards,

    Davor
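
The behaviour described in this post can be reproduced with a short sketch. It is written for Python 3 rather than the poster's Python 2.3.2, and the boundary "XYZ" and the shortened sample text are made up for illustration; only email.message_from_string(), is_multipart(), get_content_type() and get_payload() come from the standard email package.

```python
import email

# Headerless text shaped like the sample in the post. The boundary "XYZ" is a
# made-up stand-in, not the poster's real boundary string.
raw = (
    "This is a multi-part message in MIME format.\n"
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "CONTENT\n"
    "\n"
    "--XYZ\n"
    "Content-Type: application/octet-stream\n"
    'Content-Disposition: attachment; filename="filename.csv"\n'
    "\n"
    "10012;20031118;292.67;4\n"
    "\n"
    "--XYZ--\n"
)

msg = email.message_from_string(raw)

# The first line is not a header line, so the parser sees no top-level
# Content-Type and treats the entire text as the body of a single part.
print(msg.is_multipart())          # False
print(msg.get_content_type())      # text/plain (the default type)
print("--XYZ" in msg.get_payload())  # True: delimiters are still in the payload
```

Without a top-level Content-Type header naming the boundary, the parser has no way to know the text is multipart, which is why get_payload() hands back the whole message.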


  • John J. Lee

    #2
    Re: Extracting email attachment when is_multipart() is False

    "Davor Cengija" <dcengija_IQ_Filter@inet.hr> writes:

    > I need to write a script which should extract the attachment from a text
    > file, which is saved as MIME mail message. Unfortunately,
    > Message.is_multipart() returns False so msg.get_payload() returns the
    [...]
    > This is how my file/message looks like:
    >
    > ====== start here ========
    > This is a multi-part message in MIME format.
    >
    > ------=_NextPart_000_0026_01C3B347.DBEA9660
    > Content-Type: text/plain;
    [...]

    You seem to be missing the RFC 822 headers (From, To, Subject, etc.).


    John


    • Davor Cengija

      #3
      Re: Extracting email attachment when is_multipart() is False

      John J. Lee wrote:
      > "Davor Cengija" <dcengija_IQ_Filter@inet.hr> writes:
      >> This is a multi-part message in MIME format.
      >>
      >> ------=_NextPart_000_0026_01C3B347.DBEA9660
      >> Content-Type: text/plain;
      > [...]
      >
      > You seem to be missing the RFC 822 headers (From, To, Subject, etc.).

      Yes, that's true. The question is whether it's easier to write a parser for
      that kind of message or to force the message-producing application to output
      the headers as well. We'll see...

      Thanks



      • Derrick 'dman' Hudson

        #4
        Re: Extracting email attachment when is_multipart() is False

        On Wed, 26 Nov 2003 08:06:15 +0100, Davor Cengija wrote:
        > John J. Lee wrote:
        >> "Davor Cengija" <dcengija_IQ_Filter@inet.hr> writes:
        >>> This is a multi-part message in MIME format.
        >>>
        >>> ------=_NextPart_000_0026_01C3B347.DBEA9660
        >>> Content-Type: text/plain;
        >> [...]
        >>
        >> You seem to be missing the RFC 822 headers (From, To, Subject, etc.).
        >
        > Yes, that's true. The question is whether it's easier to write a parser for
        > that kind of message or to force the message-producing application to output
        > the headers as well. We'll see...

        You have a third option, which I would try if you can't get the
        message producer to do it correctly: slap some RFC822 headers on the
        beginning, and then ignore them in the parsed message object. After
        all, if the rest of the data is correctly formatted, use the existing
        tested MIME parser. Prepending some "bogus" RFC822 headers would be
        rather trivial to do.

        -D

        --
        "Piracy is not a technological issue. It's a behavior issue."
        --Steve Jobs

        www: http://dman13.dyndns.org/~dman/ jabber: dman@dman13.dyndns.org
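
The "third option" in this reply (prepend headers, then let the existing MIME parser do the work) might look roughly like the sketch below. It targets Python 3, not the Python 2.3.2 used in the thread. The names BOUNDARY, body, fake_headers and attachments are illustrative, and the sketch assumes the boundary string is known in advance: it is the delimiter line from the sample minus its leading "--".

```python
import email

# Assumption: the boundary is known. The delimiter lines in the sample are
# "--" followed by this string.
BOUNDARY = "----=_NextPart_000_0026_01C3B347.DBEA9660"

# The headerless text from the original post, rebuilt line by line.
body = "\n".join([
    "This is a multi-part message in MIME format.",
    "",
    "--" + BOUNDARY,
    "Content-Type: text/plain;",
    '\tcharset="us-ascii"',
    "Content-Transfer-Encoding: 7bit",
    "",
    "CONTENT",
    "",
    "signature, etc",
    "",
    "--" + BOUNDARY,
    "Content-Type: application/octet-stream;",
    '\tname="filename.csv"',
    "Content-Transfer-Encoding: 7bit",
    "Content-Disposition: attachment;",
    '\tfilename="filename.csv"',
    "",
    "10012;20031118;292.67;4",
    "101;23;19.98;2;39.96",
    "102;24;21.89;4;87.56",
    "",
    "--" + BOUNDARY + "--",
    "",
])

# "Bogus" top-level headers: just enough to tell the parser that the text is
# multipart and which boundary separates the parts.
fake_headers = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="%s"\n'
    "\n"
) % BOUNDARY

msg = email.message_from_string(fake_headers + body)

# Collect every part that carries a filename; decode=True returns bytes.
attachments = {}
for part in msg.walk():
    filename = part.get_filename()
    if filename:
        attachments[filename] = part.get_payload(decode=True)

print(msg.is_multipart())
print(attachments["filename.csv"].decode("ascii"))
```

The added headers never need to be inspected afterwards; they only exist so that the parser splits the body at the boundary.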


        • John J. Lee

          #5
          Re: Extracting email attachment when is_multipart() is False

          Derrick 'dman' Hudson <dman@dman13.dyndns.org> writes:
          > On Wed, 26 Nov 2003 08:06:15 +0100, Davor Cengija wrote:
          [...]
          > >> You seem to be missing the RFC 822 headers (From, To, Subject, etc.).
          > >
          > > Yes, that's true. The question is whether it's easier to write a parser for
          > > that kind of message or to force the message-producing application to output
          > > the headers as well. We'll see...
          >
          > You have a third option, which I would try if you can't get the
          > message producer to do it correctly: slap some RFC822 headers on the
          > beginning, and then ignore them in the parsed message object. After
          [...]

          Or read the docs & code for the email module, to figure out how to
          persuade it to take the messages without the headers.


          John
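
The thread does not say how the email module could be persuaded to accept headerless input. One possible reading, offered as an assumption rather than as what was meant here, is to leave the file untouched and feed the parser a synthetic Content-Type line first, using email.parser.FeedParser from the Python 3 standard library. The helper name parse_headerless, the sample text and its boundary "XYZ123" are all hypothetical, and the regular expression assumes the boundary contains no spaces and that the first line starting with "--" is a delimiter.

```python
import re
from email.parser import FeedParser


def parse_headerless(text):
    """Parse multipart MIME text that lacks top-level headers (hypothetical helper)."""
    # Assumption: the first line of the form "--<token>" is the first delimiter,
    # and the boundary token contains no whitespace.
    match = re.search(r"^--(\S+)[ \t]*\r?$", text, re.MULTILINE)
    if match is None:
        raise ValueError("no MIME boundary delimiter found")
    boundary = match.group(1)

    parser = FeedParser()
    # Feed a minimal header block first, then the unmodified text.
    parser.feed('Content-Type: multipart/mixed; boundary="%s"\n\n' % boundary)
    parser.feed(text)
    return parser.close()


# Made-up sample; "XYZ123" and "data.csv" are illustrative values.
sample = (
    "This is a multi-part message in MIME format.\n"
    "\n"
    "--XYZ123\n"
    "Content-Type: text/plain\n"
    "\n"
    "hello\n"
    "\n"
    "--XYZ123\n"
    "Content-Type: application/octet-stream\n"
    'Content-Disposition: attachment; filename="data.csv"\n'
    "\n"
    "1;2;3\n"
    "\n"
    "--XYZ123--\n"
)

msg = parse_headerless(sample)
csv_part = msg.get_payload()[1]      # second sub-part of the multipart message
print(csv_part.get_filename())       # data.csv
print(csv_part.get_payload(decode=True))
```

Sniffing the boundary avoids hard-coding it, at the cost of trusting that the first "--" line really is a MIME delimiter.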
