Accessing individual bytes of an integer

  • Daniel Lidström

    Accessing individual bytes of an integer

    Hello!

    I want to work with individual bytes of integers. I know that ints are
    32-bit and will always be. Sometimes I want to work with the entire
    32-bits, and other times I want to modify just the first 8-bits for
    example. For me, I think it would be best if I can declare the 32-bits
    like this:

    unsigned char bits[4];

    When I want to treat this as a 32-bits integer, can I do something
    like this?

    unsigned int bits32 = *((unsigned int*)bits);

    I'm unsure of the syntax. I don't need to work in-place so to speak. It is
    fine to work with a copy.

    Thanks in advance!

    --
    Daniel

  • Andrew Koenig

    #2
    Re: Accessing individual bytes of an integer

    "Daniel Lidström" <someone@microsoft.com> wrote in message
    news:pan.2005.02.17.18.16.21.819630@microsoft.com...

    > When I want to treat this as a 32-bits integer, can I do something
    > like this?
    >
    > unsigned int bits32 = *((unsigned int*)bits);

    Yes you can, but you have absolutely no assurance as to what the results
    will be :-)

    What's wrong with

    (bits>>n) & 0xff

    where n is 0, 8, 16, or 24?
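
A minimal sketch of the shift-and-mask expression above, assuming the value is held in a `std::uint32_t` (from `<cstdint>`, C++11) rather than in the `bits` array. The helper name `byte_at` is hypothetical and does not come from the thread.

```cpp
#include <cstdint>

// Return byte number `index` of `value`, where index 0 is the least
// significant byte. `index` must be in the range 0..3.
// The result depends only on the numeric value, not on how the
// platform lays the integer out in memory.
inline unsigned char byte_at(std::uint32_t value, unsigned index)
{
    return static_cast<unsigned char>((value >> (8u * index)) & 0xffu);
}
```

For example, `byte_at(0x12345678u, 0)` yields `0x78` and `byte_at(0x12345678u, 3)` yields `0x12` on any platform.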



    • MatrixV

      #3
      Re: Accessing individual bytes of an integer

      "Daniel Lidström" <someone@microsoft.com> wrote in message
      news:pan.2005.02.17.18.16.21.819630@microsoft.com...
      > Hello!
      >
      > I want to work with individual bytes of integers. I know that ints are
      > 32-bit and will always be. Sometimes I want to work with the entire
      > 32-bits, and other times I want to modify just the first 8-bits for
      > example. For me, I think it would be best if I can declare the 32-bits
      > like this:
      >
      > unsigned char bits[4];
      >
      > When I want to treat this as a 32-bits integer, can I do something
      > like this?
      >
      > unsigned int bits32 = *((unsigned int*)bits);
      >
      > I'm unsure of the syntax. I don't need to work in-place so to speak. It is
      > fine to work with a copy.
      >
      > Thanks in advance!
      >
      > --
      > Daniel

      Byte order aside, you are correct.
      A better way is to use a union like:
      union xxx
      {
      unsigned char bits[4];
      unsigned int i;
      };



      • Thomas Matthews

        #4
        Re: Accessing individual bytes of an integer

        MatrixV wrote:
        > "Daniel Lidström" <someone@microsoft.com> wrote in message
        > news:pan.2005.02.17.18.16.21.819630@microsoft.com...
        >
        >>Hello!
        >>
        >>I want to work with individual bytes of integers. I know that ints are
        >>32-bit and will always be. Sometimes I want to work with the entire
        >>32-bits, and other times I want to modify just the first 8-bits for
        >>example. For me, I think it would be best if I can declare the 32-bits
        >>like this:
        >>
        >>unsigned char bits[4];
        >>
        >>When I want to treat this as a 32-bits integer, can I do something
        >>like this?
        >>
        >>unsigned int bits32 = *((unsigned int*)bits);
        >>
        >>I'm unsure of the syntax. I don't need to work in-place so to speak. It is
        >>fine to work with a copy.
        >>
        >>Thanks in advance!
        >>
        >>--
        >>Daniel
        >
        > Byte order aside, you are correct.
        > A better way is to use a union like:
        > union xxx
        > {
        > unsigned char bits[4];
        > unsigned int i;
        > };

        How about this:
        union xxx
        {
        unsigned char bytes[sizeof(unsigned int)];
        unsigned int i;
        };
        This makes no assumptions about how many bytes are
        in an integer.


        --
        Thomas Matthews

        C++ newsgroup welcome message:

        C++ Faq: http://www.parashift.com/c++-faq-lite
        C Faq: http://www.eskimo.com/~scs/c-faq/top.html
        alt.comp.lang.learn.c-c++ faq:

        Other sites:
        http://www.josuttis.com -- C++ STL Library book
        http://www.sgi.com/tech/stl -- Standard Template Library


        • Andrew Koenig

          #5
          Re: Accessing individual bytes of an integer

          "MatrixV" <training@kcollege.com> wrote in message
          news:37k6j3F5efaltU1@individual.net...

          > Byte order aside, you are correct.
          > A better way is to use a union like:
          > union xxx
          > {
          > unsigned char bits[4];
          > unsigned int i;
          > };

          Not really. When you use a union, you have no assurance about the effect
          that giving a value to one member of a union will have on other members.



          • Old Wolf

            #6
            Re: Accessing individual bytes of an integer

            MatrixV wrote:
            > "Daniel Lidström" <someone@microsoft.com> wrote:
            > >
            > > I want to work with individual bytes of integers. I know that ints are
            > > 32-bit and will always be. Sometimes I want to work with the entire
            > > 32-bits, and other times I want to modify just the first 8-bits for
            > > example. For me, I think it would be best if I can declare the 32-bits
            > > like this:
            > >
            > > unsigned char bits[4];
            > >
            > > When I want to treat this as a 32-bits integer, can I do something
            > > like this?
            > >
            > > unsigned int bits32 = *((unsigned int*)bits);

            Bad - if 'bits' is not correctly aligned for an int, then
            you have undefined behaviour.

            > > I'm unsure of the syntax. I don't need to work in-place so to speak.
            > > It is fine to work with a copy.

            You can work in-place with:
            unsigned int bits32;
            and then to access the chars:
            ((unsigned char *)&bits32)[0]
            etc. Note that the contents of the chars could be anything
            (eg. big endian, little endian, or something more exotic),
            and if you modify one of those chars then you aren't guaranteed
            to have anything sensible left in bits32.

            If you don't want to work in-place then you could memcpy
            between the int and the char (with the same caveats I mentioned
            already).
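
A minimal sketch of the memcpy route, assuming a 32-bit value held in `std::uint32_t` (from `<cstdint>`, C++11) and the four-element `bits` array from the original question. The helper names `load_u32` and `store_u32` are hypothetical. As noted above, which array element ends up holding which part of the number still depends on the platform's byte order.

```cpp
#include <cstdint>
#include <cstring>

// Copy the four stored bytes into an integer object.
// No pointer cast is involved, so the alignment of `bytes` does not matter.
inline std::uint32_t load_u32(const unsigned char (&bytes)[4])
{
    std::uint32_t value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}

// Copy the integer's object representation out into the byte array.
inline void store_u32(std::uint32_t value, unsigned char (&bytes)[4])
{
    std::memcpy(bytes, &value, sizeof value);
}
```

Storing a value and loading it back on the same machine returns the original value; the individual bytes in between are only meaningful once the byte order is known.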

            To work portably (assuming a 32-bit int and 8-bit char),
            you can use bit-shifts and masks to extract the four bytes
            and replace them. A good compiler would optimise this code
            into a single instruction, if it could.
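
A sketch of the "replace" half of the shift-and-mask approach, under the same assumption of a 32-bit value in `std::uint32_t` and 8-bit bytes. The helper name `with_byte` is hypothetical.

```cpp
#include <cstdint>

// Return `value` with byte number `index` replaced by `byte`.
// Index 0 is the least significant byte; `index` must be in 0..3.
inline std::uint32_t with_byte(std::uint32_t value, unsigned index, unsigned char byte)
{
    const unsigned shift = 8u * index;
    // Mask covering the eight bits that are being replaced.
    const std::uint32_t mask = static_cast<std::uint32_t>(0xffu) << shift;
    // Clear the old byte, then OR in the new one at the same position.
    return (value & ~mask) | ((static_cast<std::uint32_t>(byte) & 0xffu) << shift);
}
```

For example, `with_byte(0x12345678u, 0, 0xab)` gives `0x123456ab` regardless of how the platform stores the integer in memory.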

            > Byte order aside, you are correct.
            > A better way is to use a union like:
            > union xxx
            > {
            > unsigned char bits[4];
            > unsigned int i;
            > };

            Undefined behaviour if you access a member of a union that
            wasn't the one you just set.


            • Clark S. Cox III

              #7
              Re: Accessing individual bytes of an integer

              On 2005-02-17 15:38:50 -0500, "Andrew Koenig" <ark@acm.org> said:

              > "MatrixV" <training@kcollege.com> wrote in message
              > news:37k6j3F5efaltU1@individual.net...
              >
              >> Byte order aside, you are correct.
              >> A better way is to use a union like:
              >> union xxx
              >> {
              >> unsigned char bits[4];
              >> unsigned int i;
              >> };
              >
              > Not really. When you use a union, you have no assurance about the
              > effect that giving a value to one member of a union will have on other
              > members.

              You do when one of them is an array of unsigned char.

              --
              Clark S. Cox, III
              clarkcox3@gmail.com


              • Ioannis Vranos

                #8
                Re: Accessing individual bytes of an integer

                Daniel Lidström wrote:

                > Hello!
                >
                > I want to work with individual bytes of integers. I know that ints are
                > 32-bit and will always be. Sometimes I want to work with the entire
                > 32-bits, and other times I want to modify just the first 8-bits for
                > example. For me, I think it would be best if I can declare the 32-bits
                > like this:
                >
                > unsigned char bits[4];
                >
                > When I want to treat this as a 32-bits integer, can I do something
                > like this?
                >
                > unsigned int bits32 = *((unsigned int*)bits);


                Yes, but not like this, because the array bits is not initialised.


                > I'm unsure of the syntax. I don't need to work in-place so to speak. It is
                > fine to work with a copy.


                What you can do is read an unsigned int or any other POD type as a
                sequence of unsigned chars (or plain chars), that is, bytes, copy it
                byte by byte to another unsigned char sequence (which includes possible
                padding bits), and treat the new char sequence as another unsigned int.


                The following example uses an int and is portable:


                #include <iostream>

                int main()
                {
                    int integer=0;

                    unsigned char *puc= reinterpret_cast<unsigned char *>(&integer);


                    unsigned char otherInt[sizeof(integer)];

                    // Read integer byte by byte and copy it to otherInt
                    for(unsigned i=0; i<sizeof(integer); ++i)
                        otherInt[i]= puc[i];


                    // We treat the new unsigned char sequence as an int
                    int *p= reinterpret_cast<int *>(otherInt);

                    // Assign another value to the integer otherInt!
                    *p=7;

                    std::cout<<*p<<"\n";
                }




                --
                Ioannis Vranos



                • Jack Klein

                  #9
                  Re: Accessing individual bytes of an integer

                  On Thu, 17 Feb 2005 19:16:24 +0100, Daniel Lidström
                  <someone@microsoft.com> wrote in comp.lang.c++:

                  > Hello!
                  >
                  > I want to work with individual bytes of integers. I know that ints are
                  > 32-bit and will always be.

                  No, you don't. You just think you do. But you are mistaken.
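
The assumption being challenged here can at least be made explicit, so that the code refuses to compile where it does not hold. The following is a sketch that assumes a C++11 compiler (for `static_assert` and `<cstdint>`); the typedef name `word32` is hypothetical.

```cpp
#include <climits>
#include <cstdint>
#include <limits>

// Compilation stops here on platforms where the assumptions are false.
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");
static_assert(std::numeric_limits<unsigned int>::digits == 32,
              "this code assumes a 32-bit unsigned int");

// Alternatively, ask for the width by name instead of relying on int.
// std::uint32_t is only provided on platforms that have such a type.
typedef std::uint32_t word32;
```

With the typedef, the rest of the code can use `word32` and no longer depends on how wide `unsigned int` happens to be.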

                  --
                  Jack Klein
                  Home: http://JK-Technology.Com
                  FAQs for
                  comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
                  comp.lang.c++ http://www.parashift.com/c++-faq-lite/
                  alt.comp.lang.learn.c-c++


                  • Andrew Koenig

                    #10
                    Re: Accessing individual bytes of an integer

                    >> Not really. When you use a union, you have no assurance about the effect
                    >> that giving a value to one member of a union will have on other members.
                    >
                    > You do when one of them is an array of unsigned char.

                    Can you show me where in the C++ standard it says that? The text that I
                    think is relevant can be found in subclause 9.5:

                    In a union, at most one of the data members can be active at any time, that
                    is, the value of at most one of the data members can be stored in a union at
                    any time. [Note: one special guarantee is made in order to simplify the use
                    of unions: If a POD-union contains several POD-structs that share a common
                    initial sequence (9.2), and if an object of this POD-union type contains one
                    of the POD-structs, it is permitted to inspect the common initial sequence
                    of any of POD-struct members; see 9.2. ]

                    I think that "Only one of the data members can be active at any time" is
                    pretty clear, and the one exception to that rule says nothing about array of
                    unsigned character.






                    • Larry Brasfield

                      #11
                      Re: Accessing individual bytes of an integer

                      "Ioannis Vranos" <ivr@remove.this.grad.com> wrote in message
                      news:1108680624.856546@athnrd02...
                      ....
                      > What you can do is read an unsigned int or any other POD type as a sequence of unsigned chars (or plain chars), that is, bytes,
                      > copy it byte by byte to another unsigned char sequence (which includes possible padding bits), and treat the new char sequence as
                      > another unsigned int.
                      >
                      >
                      > The following example uses an int and is portable:

                      I have to disagree with your "is portable" claim.
                      Comments inserted below in your code.

                      > #include <iostream>
                      >
                      > int main()
                      > {
                      >     int integer=0;
                      >
                      >     unsigned char *puc= reinterpret_cast<unsigned char *>(&integer);
                      >
                      >
                      >     unsigned char otherInt[sizeof(integer)];
                      >
                      >     // Read integer byte by byte and copy it to otherInt
                      >     for(unsigned i=0; i<sizeof(integer); ++i)
                      >         otherInt[i]= puc[i];
                      >
                      >
                      >     // We treat the new unsigned char sequence as an int
                      >     int *p= reinterpret_cast<int *>(otherInt);

                      There is no assurance that the attempt to access an int
                      at the starting address of otherInt will succeed. On some
                      machines, it could produce an alignment fault.

                      >     // Assign another value to the integer otherInt!
                      >     *p=7;

                      The above access could also produce an alignment fault.

                      >     std::cout<<*p<<"\n";
                      > }

                      I think "works on some platforms" versus "can fault on
                      some platforms" is a good example of "not portable".
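
To make the alignment concern concrete, here is a small sketch that tests whether an address is suitably aligned for an `int`. It assumes C++11 (`alignof`, `std::uintptr_t`) and a platform where converting a pointer to `std::uintptr_t` yields the plain address, which is implementation-defined. The helper name `is_aligned_for_int` is hypothetical.

```cpp
#include <cstdint>

// True if the address `p` is a multiple of the alignment an int requires.
// Relies on the pointer-to-integer conversion giving the numeric address,
// which is common but not guaranteed by the language.
inline bool is_aligned_for_int(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(int) == 0;
}
```

A plain `unsigned char` array has no such guarantee, so a check like this can fail for it. Passing the check only rules out the alignment fault; it does not by itself make the cast in the quoted code well defined.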

                      --
                      --Larry Brasfield
                      email: donotspam_larry_brasfield@hotmail.com
                      Above views may belong only to me.



                      • Ron Natalie

                        #12
                        Re: Accessing individual bytes of an integer

                        Andrew Koenig wrote:

                        > I think that "Only one of the data members can be active at any time" is
                        > pretty clear, and the one exception to that rule says nothing about array of
                        > unsigned character.

                        I believe he is referring to the passage at 3.10 p 15:
                        If a program attempts to access the stored value of an object through an lvalue of other than one of the following
                        types the behavior is undefined (footnote 48):
                        — the dynamic type of the object,
                        — a cv-qualified version of the dynamic type of the object,
                        — a type that is the signed or unsigned type corresponding to the dynamic type of the object,
                        — a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of
                        the object,
                        — an aggregate or union type that includes one of the aforementioned types among its members (including,
                        recursively, a member of a subaggregate or contained union),
                        — a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
                        — a char or unsigned char type.


                        • Ioannis Vranos

                          #13
                          Re: Accessing individual bytes of an integer

                          Andrew Koenig wrote:

                          > Can you show me where in the C++ standard it says that? The text that I
                          > think is relevant can be found in subclause 9.5:
                          >
                          > In a union, at most one of the data members can be active at any time, that
                          > is, the value of at most one of the data members can be stored in a union at
                          > any time. [Note: one special guarantee is made in order to simplify the use
                          > of unions: If a POD-union contains several POD-structs that share a common
                          > initial sequence (9.2), and if an object of this POD-union type contains one
                          > of the POD-structs, it is permitted to inspect the common initial sequence
                          > of any of POD-struct members; see 9.2. ]
                          >
                          > I think that "Only one of the data members can be active at any time" is
                          > pretty clear, and the one exception to that rule says nothing about array of
                          > unsigned character.


                          However the interesting part is that the entire union can be read as an
                          unsigned char/plain char array, as is the case with all POD types.




                          --
                          Ioannis Vranos



                          • Ioannis Vranos

                            #14
                            Re: Accessing individual bytes of an integer

                            Larry Brasfield wrote:

                            > There is no assurance that the attempt to access an int
                            > at the starting address of otherInt will succeed. On some
                            > machines, it could produce an alignment fault.

                            > I think "works on some platforms" versus "can fault on
                            > some platforms" is a good example of "not portable".


                            The standard guarantees that we can read a POD type as an array of
                            unsigned chars/plain chars and copy its contents to another array of
                            unsigned chars/plain chars of the same size, and that the new array is
                            an exact copy of the initial POD object.


                            That's why you can copy byte by byte or use memcpy() for this, an entire
                            array of ints for example. The same applies to an individual int.
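
The byte-copy guarantee described here can be sketched without ever treating the byte buffer itself as an `int`: the bytes are copied back into a real object of the original type. This assumes C++11 (`<type_traits>`); the function name `round_trip` is hypothetical.

```cpp
#include <cstring>
#include <type_traits>

// Copy `source` into a plain byte buffer, then from the buffer into a
// fresh object of the same type, and return that object.
template <typename T>
T round_trip(const T& source)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "byte-wise copying needs a trivially copyable type");
    unsigned char buffer[sizeof(T)];
    std::memcpy(buffer, &source, sizeof(T));  // object -> bytes
    T copy;
    std::memcpy(&copy, buffer, sizeof(T));    // bytes -> object
    return copy;
}
```

Because the final access goes through an object of type `T`, the alignment of `buffer` never comes into play.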




                            --
                            Ioannis Vranos



                            • Old Wolf

                              #15
                              Re: Accessing individual bytes of an integer

                              Ron Natalie wrote:
                              > Andrew Koenig wrote:
                              >
                              > > I think that "Only one of the data members can be active at any time" is
                              > > pretty clear, and the one exception to that rule says nothing about
                              > > array of unsigned character.
                              >
                              > I believe he is referring to the passage at 3.10 p 15:
                              > If a program attempts to access the stored value of an object through
                              > an lvalue of other than one of the following
                              > types the behavior is undefined (footnote 48):
                              > - the dynamic type of the object,
                              > - a cv-qualified version of the dynamic type of the object,
                              > - a type that is the signed or unsigned type corresponding to the
                              >   dynamic type of the object,
                              > - a type that is the signed or unsigned type corresponding to a
                              >   cv-qualified version of the dynamic type of the object,
                              > - an aggregate or union type that includes one of the
                              >   aforementioned types among its members (including,
                              >   recursively, a member of a subaggregate or contained union),
                              > - a type that is a (possibly cv-qualified) base class type of the
                              >   dynamic type of the object,
                              > - a char or unsigned char type.

                              That passage is irrelevant to this situation. It says
                              "If (conditions) then the behaviour is undefined". The union
                              example in question does not meet (conditions). QED.

                              To put it another way, the passage you quoted doesn't say that
                              the behaviour is defined for those listed bullet points.
