Parsing a struct with bytes

  • ssubbarayan

    Parsing a struct with bytes

    Dear all,
    I developed the following program:

    #include <stdio.h>
    #include <stdlib.h>

    void parsebytes(unsigned char* data);

    struct info
    {
        unsigned char day;
        unsigned char month;
        short year;
    };

    struct info info1;
    struct info info2;

    int
    main(int argc, char *argv[])
    {
        info1.day=12;
        info1.month=8;
        info1.year=2007;

        parsebytes((unsigned char*)&info1);
        system("PAUSE");
        return EXIT_SUCCESS;
    }

    void parsebytes(unsigned char* data)
    {
        printf("day is %d\n", data[0]);
        printf("month is %d\n", data[1]);
        printf("year is %d\n", ((data[2] << 8) | data[3]));
    }

    The above program gives the proper values of 12 and 8 for day and
    month, but for the year I always get junk. What should be done to
    correct this, and where have I gone wrong?

    Looking forward to your replies, and thanks in advance,

    Regards,
    s.subbarayan

  • Jens Thoms Toerring

    #2
    Re: Parsing a struct with bytes

    ssubbarayan <ssubba@gmail.com> wrote:
    > I developed the following program:
    > #include <stdio.h>
    > #include <stdlib.h>
    > void parsebytes(unsigned char* data);
    > struct info
    > {
    >     unsigned char day;
    >     unsigned char month;
    >     short year;
    > };
    > struct info info1;
    > struct info info2;
    > int
    > main(int argc, char *argv[])
    > {
    >     info1.day=12;
    >     info1.month=8;
    >     info1.year=2007;
    >     parsebytes((unsigned char*)&info1);
    >     system("PAUSE");
    >     return EXIT_SUCCESS;
    > }
    > void parsebytes(unsigned char* data)
    > {
    >     printf("day is %d\n", data[0]);
    >     printf("month is %d\n", data[1]);
    >     printf("year is %d\n", ((data[2] << 8) | data[3]));
    > }
    > The above program gives the proper values of 12 and 8 for day and
    > month, but for the year I always get junk. What should be done to
    > correct this, and where have I gone wrong?

    There are at least two aspects that lead to problems. First
    of all, you can't assume that the members of a structure
    follow each other directly without any "spacing" in
    between. A compiler is allowed to put as many "padding bytes"
    as it wants between the members of a structure. This normally
    happens due to alignment issues: some types of variables
    can't start at arbitrary addresses, and the compiler must make
    sure that those members start at allowed addresses.
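
The padding point can be checked directly. The following is a minimal sketch, assuming the struct is the `struct info` from the original post; `offsetof` from <stddef.h> reports where each member really starts, so nothing has to be guessed. The helper name `padding_before_year` is made up for this illustration.

```c
#include <stddef.h>  /* offsetof, size_t */

struct info
{
    unsigned char day;
    unsigned char month;
    short year;
};

/* Number of padding bytes between the end of 'month' and the start of
   'year'. The result is implementation-specific: 0 is common for this
   layout, but nothing in the language guarantees it. */
size_t padding_before_year(void)
{
    size_t end_of_month = offsetof(struct info, month) + sizeof(unsigned char);
    return offsetof(struct info, year) - end_of_month;
}
```

Indexing the raw bytes at `offsetof(struct info, year)` rather than at a hard-coded 2 keeps the code correct even if the compiler does insert padding.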

    The second problem is that you make some assumptions about
    the way a short int is stored in memory which might be wrong.
    You assume that a short int consists of exactly two bytes and that
    the most-significant byte is stored at a lower address than the
    least-significant byte. Both assumptions can be correct on your
    machine, but they are not correct in general. A short int must
    hold at least 16 bits, i.e. two 8-bit bytes, but it can be longer
    (and there are also machines with more bits in a byte, e.g. 16
    bits, where a short int may be stored in a single byte). And the
    assumption about the ordering of the two bytes is, assuming that
    two 8-bit bytes are used, only true on big-endian machines; on
    many (little-endian) machines the least-significant byte is stored
    at the lower address.
    Regards, Jens
    --
    \ Jens Thoms Toerring ___ jt@toerring.de
    \______________ ____________ http://toerring.de
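
Both points can be sidestepped when the bytes come from a struct in the program's own memory. The sketch below, using the `struct info` from the original post, probes the host byte order and copies the year out with `memcpy` at the offset the compiler actually chose. The function names `host_is_little_endian` and `year_from_bytes` are made up for this illustration, and the approach is only valid for bytes produced by the same compiler on the same machine, not for data from a file or network.

```c
#include <stddef.h>  /* offsetof */
#include <string.h>  /* memcpy */

struct info
{
    unsigned char day;
    unsigned char month;
    short year;
};

/* Returns 1 if the least-significant byte of a short is stored first
   (little-endian host), 0 otherwise. */
int host_is_little_endian(void)
{
    unsigned short probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Recover 'year' from the raw bytes of a struct info. offsetof accounts
   for any padding, and memcpy puts the bytes back in native order. */
short year_from_bytes(const unsigned char *data)
{
    short year;

    memcpy(&year, data + offsetof(struct info, year), sizeof year);
    return year;
}
```

On a little-endian host with a two-byte short, the original `(data[2] << 8) | data[3]` reads the bytes in the opposite order, which is why the year comes out wrong while day and month do not.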


    • Bartc

      #3
      Re: Parsing a struct with bytes


      "Chad" <cdalten@gmail.com> wrote in message
      news:d5246df9-8f65-4b55-a1d5-9b11369f6455@r35g2000prm.googlegroups.com...
      > On Aug 11, 6:25 am, Richard Heathfield <r...@see.sig.invalid> wrote:
      >>> info1.year=2007;
      >> Let me guess: 55047?
      > How did you know that it was 55047?

      That's what you get when you swap the 2 bytes of 2007.

      --
      Bartc
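
The arithmetic behind that answer: 2007 is 0x07D7 in hex, so its two bytes are 0x07 and 0xD7. Put together the other way round they give 0xD707, which is 215 * 256 + 7 = 55047. A small sketch, assuming 8-bit bytes (the name `swap16` is made up for this illustration):

```c
#include <stdint.h>

/* Exchange the high and low bytes of a 16-bit value. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)(((unsigned)v << 8) | ((unsigned)v >> 8));
}
```

Here `swap16(2007)` yields 55047, and `swap16(55047)` yields 2007 again.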


      • Chad

        #4
        Re: Parsing a struct with bytes

        On Aug 11, 8:03 am, "Bartc" <b...@freeuk.com> wrote:
        > "Chad" <cdal...@gmail.com> wrote in message
        > news:d5246df9-8f65-4b55-a1d5-9b11369f6455@r35g2000prm.googlegroups.com...
        >> On Aug 11, 6:25 am, Richard Heathfield <r...@see.sig.invalid> wrote:
        >>>> info1.year=2007;
        >>> Let me guess: 55047?
        >> How did you know that it was 55047?
        >
        > That's what you get when you swap the 2 bytes of 2007.

        But yet

        printf("day is %d\n", data[0]);
        printf("month is %d\n", data[1]);

        Produced the 'correct' values. Do I dare ask why?

        Chad


        • Richard Heathfield

          #5
          Re: Parsing a struct with bytes

          Chad said:
          > On Aug 11, 8:03 am, "Bartc" <b...@freeuk.com> wrote:
          >> "Chad" <cdal...@gmail.com> wrote in message
          >> news:d5246df9-8f65-4b55-a1d5-9b11369f6455@r35g2000prm.googlegroups.com...
          >>> On Aug 11, 6:25 am, Richard Heathfield <r...@see.sig.invalid> wrote:
          >>>>> info1.year=2007;
          >>>> Let me guess: 55047?
          >>> How did you know that it was 55047?
          >>
          >> That's what you get when you swap the 2 bytes of 2007.
          >
          > But yet
          >
          > printf("day is %d\n", data[0]);
          > printf("month is %d\n", data[1]);
          >
          > Produced the 'correct' values. Do I dare ask why.

          info1.day and info1.month are unsigned chars, which are by definition a
          single byte wide. It's tricky to get a single byte the wrong way round.
          (Yes, it can be done, with practice - but it's tricky.)

          --
          Richard Heathfield <http://www.cpax.org.uk >
          Email: -http://www. +rjh@
          Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
          "Usenet is a strange place" - dmr 29 July 1999


          • Ben Bacarisse

            #6
            Re: Parsing a struct with bytes

            ssubbarayan <ssubba@gmail.com> writes:

            > On Aug 12, 12:50 pm, Nick Keighley <nick_keighley_nos...@hotmail.com>
            > wrote:
            <snip>
            >> and so? Is that not what you wanted? I repeat, what are you
            >> trying to do? What output do you want?
            >>
            >> --
            >> Nick Keighley- Hide quoted text -

            Please edit out sig blocks like this.

            > I was expecting it to print 2007. But due to wrong byte swapping, it was
            > showing 55047 instead of 2007.
            > The idea behind asking this question is, I have got a stream of data in
            > different data types, and an already existing function receives it and
            > parses it by bytes. I was trying to experiment with the structure and
            > see if I could extract it byte-wise and get correct values. In case I am
            > successful, I would go ahead and implement the same in our product.

            This sounds like you have concluded that you should not use this
            method, but your approach is 100% right. In most cases, a program
            that is to read and interpret some external, binary, data format
            should read it in bytes and put these together using shifts (or
            arithmetic). The format of the data usually dictates which numbered
            byte is the least significant, and you use this to determine how to
            put the value back together.

            The "other" way -- where the program just takes the data and "overlays"
            it onto an in-memory object -- is not portable. It can even break
            between compiler releases if structure padding is changed, for
            example. I wanted to assure you that you have probably got it right,
            despite getting the wrong answer!

            --
            Ben.
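
A minimal sketch of the byte-at-a-time approach described above, assuming a hypothetical 4-byte record laid out as day, month, then a 16-bit year with the most-significant byte first. That layout and the names `read_u16_be`, `read_u16_le` and `parse_info` are assumptions for the example, not something given in the thread.

```c
#include <stdint.h>

struct info
{
    unsigned char day;
    unsigned char month;
    short year;
};

/* Build a 16-bit value from two bytes, most-significant byte first. */
uint16_t read_u16_be(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

/* Build a 16-bit value from two bytes, least-significant byte first. */
uint16_t read_u16_le(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[1] << 8) | p[0]);
}

/* Decode the assumed record format: buf[0] = day, buf[1] = month,
   buf[2..3] = year, big-endian. Years above SHRT_MAX would convert
   in an implementation-defined way, which is fine for calendar years. */
struct info parse_info(const unsigned char *buf)
{
    struct info out;

    out.day = buf[0];
    out.month = buf[1];
    out.year = (short)read_u16_be(buf + 2);
    return out;
}
```

Because each value is built arithmetically from individual bytes, the result is the same on big-endian and little-endian hosts, and struct padding never enters into it. If the data format stores the least-significant byte first, `read_u16_le` is the one to use.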


            • Keith Thompson

              #7
              Re: Parsing a struct with bytes

              Bart <bc@freeuk.com> writes:
              > On Aug 12, 9:44 am, ssubbarayan <ssu...@gmail.com> wrote:
              [...]
              >> I was expecting it to print 2007. But due to wrong byte swapping, it was
              >> showing 55047 instead of 2007.
              >> The idea behind asking this question is, I have got a stream of data in
              >> different data types, and an already existing function receives it and
              >> parses it by bytes.
              >
              > In this particular case you might know the year must be, say, 1900 to
              > 2100. Any byte swapping will give year values outside this range
              > (except for year 2056 which is unaffected). In that case just reverse
              > the two bytes when the year is not already 1900 to 2100.

              Sure, that will work in this particular case, but it's not a great
              technique in general. Tomorrow you might have to handle data values
              that are still plausible after being byte-swapped.

              A more general approach is either to know in advance what the byte
              ordering of the input file happens to be, or, if that's not feasible,
              to have something in the file with a known value that will
              unambiguously tell you the byte ordering.

              For example, a file might contain fixed fields containing the values
              0x0102 (16 bits) and 0x01020304 (32 bits). Examining the values of
              those fields should tell you what adjustments you need to perform for
              other 16-bit and 32-bit fields.

              Note that there are byte orderings other than big-endian and
              little-endian. For a 32-bit value, 0x02010403 is a possibility on
              some systems (but not on any systems you're particularly likely to
              encounter).

              For reasonable portability, the only values you probably need to worry
              about are (0x0102, 0x01020304) and (0x0201, 0x04030201), as long as
              you treat any other values as an error.

              --
              Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
              Nokia
              "We must do something. This is something. Therefore, we must do this."
              -- Antony Jay and Jonathan Lynn, "Yes Minister"
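
The marker-field idea above could be sketched as follows. This assumes a hypothetical file format in which the writer stored 0x0102 as a 16-bit field and 0x01020304 as a 32-bit field in its own native byte order; the enum and the function name `detect_order` are made up for this illustration. The reader loads both fields in its own native order and compares them with the two expected pairs, treating anything else as an error.

```c
#include <stdint.h>
#include <string.h>  /* memcpy */

enum byte_order { ORDER_SAME, ORDER_SWAPPED, ORDER_UNKNOWN };

/* marker16 points at the 2 raw bytes of the 16-bit marker field,
   marker32 at the 4 raw bytes of the 32-bit marker field. */
enum byte_order detect_order(const unsigned char *marker16,
                             const unsigned char *marker32)
{
    uint16_t m16;
    uint32_t m32;

    /* Read both fields as this host would read them natively. */
    memcpy(&m16, marker16, sizeof m16);
    memcpy(&m32, marker32, sizeof m32);

    if (m16 == 0x0102 && m32 == 0x01020304)
        return ORDER_SAME;      /* other fields can be used as they are */
    if (m16 == 0x0201 && m32 == 0x04030201)
        return ORDER_SWAPPED;   /* other fields need their bytes reversed */
    return ORDER_UNKNOWN;       /* e.g. 0x02010403: reject the file */
}
```

A reader would call this once on the marker fields and then either use the remaining 16-bit and 32-bit fields directly, reverse their bytes, or refuse the input.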

