Yet another binary file question

  • KevHill
    New Member
    • Jun 2007
    • 6

    Yet another binary file question

    I know people here have answered very similar questions before, and I can find some of those answers, but I'm looking for a comprehensive overview of how to deal with binary files.

    My basic problem is that I am reverse engineering a binary file from a particular program. Sometimes I need to do odd things, like get a byte offset from a 3-byte chunk which has big-endian bytes, but little-endian bits within each byte, or use a multi-byte segment as a bit-mask to determine which fields are present.

    I think I could write my own convoluted code that would take an array of bytes, convert each byte to decimal using ord(), convert that to binary using code I've found here, flip the order when needed, and then convert back to decimal, but it all seems rather clunky (for example, why have a decimal intermediate?). Also, I feel like someone has probably already written this somewhere, and probably more elegantly than I could.

    Can anyone point me in the right direction? Thanks.

    Oh, and while I've seen struct, I can't seem to figure out how to make it unpack the way I would like. For example, as above, sometimes I need to deal with 3 unsigned bytes, which doesn't work with H (short, 2 bytes) or L (long, 4 bytes).
  • bartonc
    Recognized Expert Expert
    • Sep 2006
    • 6478

    #2
    Originally posted by KevHill
    <snip>Oh, and while I've seen struct, I can't seem to figure out how to make it unpack the way I would like. For example, as above, sometimes I need to deal with 3 unsigned bytes, which doesn't work with H (short, 2 bytes) or L (long, 4 bytes).
    More later, but briefly, what's wrong with B (unsigned char)? As in 'BBB'.


    • KevHill
      New Member
      • Jun 2007
      • 6

      #3
      Originally posted by bartonc
      More later, but briefly, what's wrong with B (unsigned char)? As in 'BBB'.
      For some reason I assumed it would actually spit back a char and not a number, silly me.

      However, it's really not any better than ord() because it doesn't combine the bytes

      for example unpack('BB', '\xDA\x01') would give back (218, 1) instead of 55809

      maybe if I can't figure out anything else I'll just add on an extra '\x00' and then treat it as an unsigned long...
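
      The padding idea can be sketched roughly as follows. This is only a minimal illustration, assuming the 3-byte chunk is big-endian, so the zero byte goes in front; for little-endian data it would go at the end and the format would be '<I'. The helper name unpack_uint24_be is made up for this example and is not part of struct.

```python
import struct

def unpack_uint24_be(chunk):
    """Read a 3-byte big-endian unsigned integer by padding it to 4 bytes.

    Hypothetical helper for illustration; struct has no 3-byte format code.
    """
    if len(chunk) != 3:
        raise ValueError("expected exactly 3 bytes")
    # Prepend a zero byte so the value fits struct's 4-byte '>I' format.
    return struct.unpack('>I', b'\x00' + chunk)[0]

print(unpack_uint24_be(b'\x00\xda\x01'))  # 55809
```

      The same value could also be built from unpack('BBB', ...) with shifts, as in (a << 16) | (b << 8) | c.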


      • bartonc
        Recognized Expert Expert
        • Sep 2006
        • 6478

        #4
        Originally posted by KevHill
        For some reason I assumed it would actually spit back a char and not a number, silly me.

        However, it's really not any better than ord() because it doesn't combine the bytes

        for example unpack('BB', '\xDA\x01') would give back (218, 1) instead of 55809

        maybe if I can't figure out anything else I'll just add on an extra '\x00' and then treat it as an unsigned long...
        Maybe I'm not following along, or maybe it's simpler than you are seeing:[CODE=python]
        >>> val = '\xDA\x01'
        >>> import struct
        >>> valasbytes = struct.unpack('BB', val)
        >>> valasbytes
        (218, 1)
        >>> valasword = struct.unpack('>H', val)
        >>> valasword
        (55809,)
        >>> [/CODE]In the case of padding, you may need something along these lines:[CODE=python]
        >>> valasuint = struct.unpack('>I', "\x00\x00%s" % val)
        >>> valasuint
        (55809L,)
        >>> [/CODE]
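
        For the odder cases in the original question (big-endian bytes with the bits inside each byte reversed, and a multi-byte value used as a bit-mask for which fields are present), plain integer shifts and masks may be enough, with no decimal or string intermediate. What follows is a rough sketch, not a known recipe for any particular file format: the function names are invented for illustration, and the mapping of bit 0 to the first field is an assumption that would have to be checked against the real file.

```python
def reverse_bits(byte):
    """Reverse the bit order within a single byte value (0-255)."""
    result = 0
    for _ in range(8):
        # Move the lowest bit of `byte` onto the low end of `result`.
        result = (result << 1) | (byte & 1)
        byte >>= 1
    return result

def uint_from_bit_reversed_bytes(chunk):
    """Combine bytes most-significant-first, reversing the bits of each byte."""
    value = 0
    for b in bytearray(chunk):  # bytearray yields ints on Python 2 and 3
        value = (value << 8) | reverse_bits(b)
    return value

def fields_present(mask, names):
    """Return the names whose bit is set in mask (assumes names[0] is bit 0)."""
    return [name for i, name in enumerate(names) if mask & (1 << i)]

print(reverse_bits(0xDA))                              # 91, i.e. 0x5B
print(uint_from_bit_reversed_bytes(b'\x00\x80\x80'))   # 257
print(fields_present(0x05, ['a', 'b', 'c']))           # ['a', 'c']
```

        If the bit reversal is needed often, a 256-entry lookup table built once from reverse_bits would avoid repeating the loop for every byte.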
