struct unpack newline

This topic is closed.
  • grant@idscape.co.za

    struct unpack newline

    Hi All,

    I am pretty new to Python and am having a problem
    interpreting binary data using struct.unpack.
    I am reading a file containing binary packed data,
    opened with "rb". All the values come through
    fine when using (integer1,) = struct.unpack(' l', line[86:90]),
    except when line[86:90] contains "carriage-return" or "linefeed",
    which are valid binary packed values. Error = unpack
    string size does not match format. It seems that
    struct, instead of reading 4 bytes for line[86:90],
    only reads 2 bytes if the second byte is CR or LF.

    Thanks
    Grant

  • Fredrik Lundh

    #2
    Re: struct unpack newline

    grant@idscape.co.za wrote:

    > I am pretty new to python and am having a problem
    > interpreting binary data using struct.unpack.
    > I am reading a file containing binary packed data
    > using open with "rb". All the values are coming through
    > fine when using (integer1,) = struct.unpack(' l', line[86:90])
    > except when line[86:90] contains "carriage-return" "linefeed"
    > which are valid binary packed values. Error = unpack
    > string size does not match format. It seems that
    > struct, instead of reading 4 bytes for line[86:90]
    > only reads 2 bytes if the second byte is CR or LF.

    verifying that struct doesn't care about newlines is of course
    pretty trivial:

    >>> import struct
    >>> struct.unpack(" l", "\0\0\0\0")
    (0,)
    >>> struct.unpack(" l", "\0\r\0\0")
    (3328,)
    >>> struct.unpack(" l", "\0\n\0\0")
    (2560,)

    have you verified that len(line) really is what you think?

    >>> struct.unpack(" l", "\0\r\n\0")
    (658688,)
    >>> struct.unpack(" l", "\0\n\0")
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    struct.error: unpack str size does not match format

    </F>
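The length check suggested above can be sketched as follows. This is a minimal illustration in Python 3 syntax (bytes literals), not code from the thread: the helper name `unpack_int32_at` is hypothetical, and the format `"<l"` (little-endian, 4 bytes) is an assumption chosen because it reproduces the values shown in the session above.

```python
import struct

def unpack_int32_at(buf, offset):
    """Unpack a little-endian 4-byte integer at offset, checking the length first."""
    # Slicing past the end of a buffer does not fail; it silently returns fewer bytes.
    chunk = buf[offset:offset + 4]
    if len(chunk) != 4:
        raise ValueError(f"expected 4 bytes at offset {offset}, got {len(chunk)}")
    (value,) = struct.unpack("<l", chunk)
    return value

full_record = bytes(86) + b"\x00\r\n\x00"  # 90 bytes, CR and LF inside the integer
short_record = bytes(86) + b"\x00\n"       # 88 bytes, as if the record was cut after the LF

print(unpack_int32_at(full_record, 86))    # 658688
print(len(short_record[86:90]))            # 2
```

With the short record, the slice holds only 2 bytes, which is what makes struct.unpack complain about the size; the CR and LF bytes themselves unpack fine.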




    • Richard Brodie

      #3
      Re: struct unpack newline


      <grant@idscape.co.za> wrote in message news:1117189007.980813.182830@g14g2000cwa.googlegroups.com...

      > except when line[86:90] contains "carriage-return" "linefeed"
      > which are valid binary packed values.

      You probably don't want to be reading binary data a
      line at a time, if that's what you're doing.
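To see why line-at-a-time reading goes wrong here, the following sketch iterates over an in-memory binary stream the same way "for line in infile" iterates over a file opened with "rb". It uses Python 3 syntax; the 100-byte record layout and the `"<l"` format are assumptions made for illustration only.

```python
import io
import struct

# A hypothetical 100-byte record with a 4-byte integer at offset 86.
# struct.pack("<l", 2560) is b"\x00\n\x00\x00", so an LF byte sits at offset 87.
payload = bytes(86) + struct.pack("<l", 2560) + bytes(10)

# Iterating a binary stream still splits after every b"\n" byte.
lines = list(io.BytesIO(payload))
first = lines[0]

print(len(payload), len(first))  # 100 88
print(first[86:90])              # b'\x00\n'
```

The first "line" ends right after the LF inside the integer, so the slice [86:90] contains only 2 bytes.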



      • grant@idscape.co.za

        #4
        Re: struct unpack newline

        Hi,

        Thanks for the tip regarding checking the length of line. I discovered
        that on the problem record it was short by a few bytes. After changing
        the read method from "for line in.." to "infile.read(n)" my problem was
        solved.
        What concerns me, though, is that although the file is opened in binary
        mode, "for line.." has a problem reading the file correctly.

        Thanks
        Grant
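The switch to infile.read(n) described above might look roughly like this. It is a sketch in Python 3 syntax under stated assumptions: the record size of 100 bytes, the helper name `iter_records`, and the `"<l"` format are all hypothetical, since the thread does not give the real file layout.

```python
import io
import struct

RECORD_SIZE = 100  # hypothetical fixed record length in bytes

def iter_records(stream, record_size=RECORD_SIZE):
    """Yield fixed-size records from a binary stream, ignoring CR/LF bytes entirely."""
    while True:
        record = stream.read(record_size)
        if not record:
            return  # clean end of file
        if len(record) != record_size:
            raise ValueError(f"truncated record: got {len(record)} bytes")
        yield record

# Three sample records whose integer field (offset 86) contains CR and/or LF bytes.
data = b"".join(
    bytes(86) + struct.pack("<l", n) + bytes(10) for n in (2560, 3328, 658688)
)

values = [struct.unpack("<l", rec[86:90])[0] for rec in iter_records(io.BytesIO(data))]
print(values)  # [2560, 3328, 658688]
```

With a real file, the stream would come from open(path, "rb") instead of io.BytesIO.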




        • Peter Otten

          #5
          Re: struct unpack newline

          grant@idscape.co.za wrote:

          > what concerns me though is that although the file is opened in binary
          > mode,
          > "for line.." has a problem reading the file correctly.

          There is _no_ correct way of splitting a file containing binary data into
          lines, because binary data may contain newline bytes that do not indicate a
          new line. E.g.

          >>> struct.unpack(" l", "\r\n\r\n")
          (168626701,)

          How should Python know whether it just encountered two empty lines or the
          integer 168626701?

          Peter
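The ambiguity can be shown from both sides with the same four bytes. This is a small sketch in Python 3 syntax; the `"<l"` format is an assumption that matches the integer quoted above.

```python
import struct

data = b"\r\n\r\n"

# Read as packed binary data: one little-endian 4-byte integer.
as_int = struct.unpack("<l", data)[0]

# Read as text-like lines: two empty lines.
as_lines = data.splitlines()

print(as_int)    # 168626701
print(as_lines)  # [b'', b'']
```

Both readings are valid for the same bytes, so only the code that knows the record layout can decide which one applies.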


          • grant@idscape.co.za

            #6
            Re: struct unpack newline

            Good point. Hadn't thought of that.

            Thanks
            Grant
