Accessing individual bytes of an integer

**Andrew Koenig** · Jul 23 '05, 12:58 AM

Re: Accessing individual bytes of an integer

"Daniel Lidström" <someone@microsoft.com> wrote in message
news:pan.2005.02.17.18.16.21.819630@microsoft.com...

> When I want to treat this as a 32-bits integer, can I do something
> like this?
>
> unsigned int bits32 = *((unsigned int*)bits);

Yes you can, but you have absolutely no assurance as to what the results
will be :-)

What's wrong with

(bits>>n) & 0xff

where n is 0, 8, 16, or 24?
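
For illustration, a minimal sketch of the shift-and-mask idea, assuming a 32-bit unsigned value. The helper name `byte_of` is made up for this example and is not from the original post.

```cpp
#include <cstdint>

// Hypothetical helper: returns byte `index` of `value`, where index 0 is the
// least significant byte. The result is the same on little-endian and
// big-endian machines because it works on the value, not on memory.
inline std::uint8_t byte_of(std::uint32_t value, unsigned index)
{
    return static_cast<std::uint8_t>((value >> (8u * index)) & 0xffu);
}
```

With this, `byte_of(0x12345678u, 0)` yields 0x78 and `byte_of(0x12345678u, 3)` yields 0x12.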

**MatrixV** · Jul 23 '05, 12:58 AM

Re: Accessing individual bytes of an integer

"Daniel Lidström" <someone@microsoft.com> wrote in message
news:pan.2005.02.17.18.16.21.819630@microsoft.com...
> Hello!
>
> I want to work with individual bytes of integers. I know that ints are
> 32-bit and will always be. Sometimes I want to work with the entire
> 32-bits, and other times I want to modify just the first 8-bits for
> example. For me, I think it would be best if I can declare the 32-bits
> like this:
>
> unsigned char bits[4];
>
> When I want to treat this as a 32-bits integer, can I do something
> like this?
>
> unsigned int bits32 = *((unsigned int*)bits);
>
> I'm unsure of the syntax. I don't need to work in-place so to speak. It is
> fine to work with a copy.
>
> Thanks in advance!
>
> --
> Daniel

Unconsidering the byte sequence, you are correct.
A better way is using a union like:
union xxx
{
unsigned char bits[4];
unsigned int i;
};

**Thomas Matthews** · Jul 23 '05, 12:58 AM

Re: Accessing individual bytes of an integer

MatrixV wrote:
> "Daniel Lidström" <someone@microsoft.com> wrote in message
> news:pan.2005.02.17.18.16.21.819630@microsoft.com...
>
>>Hello!
>>
>>I want to work with individual bytes of integers. I know that ints are
>>32-bit and will always be. Sometimes I want to work with the entire
>>32-bits, and other times I want to modify just the first 8-bits for
>>example. For me, I think it would be best if I can declare the 32-bits
>>like this:
>>
>>unsigned char bits[4];
>>
>>When I want to treat this as a 32-bits integer, can I do something
>>like this?
>>
>>unsigned int bits32 = *((unsigned int*)bits);
>>
>>I'm unsure of the syntax. I don't need to work in-place so to speak. It is
>>fine to work with a copy.
>>
>>Thanks in advance!
>>
>>--
>>Daniel
>
> Unconsidering the byte sequence, you are correct.
> A better way is using a union like:
> union xxx
> {
> unsigned char bits[4];
> unsigned int i;
> };

How about this:
union xxx
{
unsigned char bytes[sizeof(unsigned int)];
unsigned int i;
};
This makes no assumptions about how many bytes are
in an integer.

--
Thomas Matthews

C++ newsgroup welcome message:

http://www.slack.net/~shiva/welcome.txt

C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.comeaucomputing.com/learn/faq/

Other sites:
http://www.josuttis.com -- C++ STL Library book
http://www.sgi.com/tech/stl -- Standard Template Library

**Andrew Koenig** · Jul 23 '05, 12:58 AM

Re: Accessing individual bytes of an integer

"MatrixV" <training@kcollege.com> wrote in message
news:37k6j3F5efaltU1@individual.net...

> Unconsidering the byte sequence, you are correct.
> A better way is using a union like:
> union xxx
> {
> unsigned char bits[4];
> unsigned int i;
> };

Not really. When you use a union, you have no assurance about the effect
that giving a value to one member of a union will have on other members.

**Old Wolf** · Jul 23 '05, 12:58 AM

Re: Accessing individual bytes of an integer

MatrixV wrote:
> "Daniel Lidström" <someone@microsoft.com> wrote in message
> >
> > I want to work with individual bytes of integers. I know that ints are
> > 32-bit and will always be. Sometimes I want to work with the entire
> > 32-bits, and other times I want to modify just the first 8-bits for
> > example. For me, I think it would be best if I can declare the 32-bits
> > like this:
> >
> > unsigned char bits[4];
> >
> > When I want to treat this as a 32-bits integer, can I do something
> > like this?
> >
> > unsigned int bits32 = *((unsigned int*)bits);

Bad - if 'bits' is not correctly aligned for an int, then
you have undefined behaviour.

> > I'm unsure of the syntax. I don't need to work in-place so to speak.
> > It is fine to work with a copy.

You can work in-place with:
unsigned int bits32;
and then to access the chars:
((unsigned char *)&bits32)[0]
etc. Note that the contents of the chars could be anything
(eg. big endian, little endian, or something more exotic),
and if you modify one of those chars then you aren't guaranteed
to have anything sensible left in bits32.
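
As a rough sketch of the in-place view described above, the following reads one byte of an unsigned int through an unsigned char pointer. The helper name `byte_at_offset` is hypothetical, and which byte of the value ends up at which offset depends on the platform.

```cpp
#include <cstddef>

// Hypothetical helper: returns the byte stored `offset` bytes into the
// object representation of `value`. Reading through unsigned char* is
// permitted; the mapping from offset to significance is platform-specific.
// The caller must keep `offset` below sizeof(unsigned int).
inline unsigned char byte_at_offset(const unsigned int& value, std::size_t offset)
{
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
    return p[offset];
}
```

On a little-endian machine with a 32-bit int, offset 0 of 0x12345678 holds 0x78; on a big-endian one it holds 0x12.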

If you don't want to work in-place then you could memcpy
between the int and the char (with the same caveats I mentioned
already).

To work portably (assuming a 32-bit int and 8-bit char),
you can use bit-shifts and masks to extract the four bytes
and replace them. A good compiler would optimise this code
into a single instruction, if it could.
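
A minimal sketch of replacing one byte with shifts and masks, under the same assumptions (32-bit value, 8-bit bytes). The name `with_byte` is invented for illustration.

```cpp
#include <cstdint>

// Hypothetical helper: returns `value` with byte `index` (0 = least
// significant) replaced by `byte`. Works on the value only, so the
// result does not depend on byte order in memory.
inline std::uint32_t with_byte(std::uint32_t value, unsigned index, std::uint8_t byte)
{
    const unsigned shift = 8u * index;
    const std::uint32_t mask = static_cast<std::uint32_t>(0xffu) << shift;
    // Clear the target byte, then OR in the new one at the same position.
    return (value & ~mask) | (static_cast<std::uint32_t>(byte) << shift);
}
```

For example, `with_byte(0x12345678u, 0, 0xab)` gives 0x123456ab.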

> Unconsidering the byte sequence, you are correct.
> A better way is using a union like:
> union xxx
> {
> unsigned char bits[4];
> unsigned int i;
> };

Undefined behaviour if you access a member of a union that
wasn't the one you just set.

**Clark S. Cox III** · Jul 23 '05, 12:58 AM

Re: Accessing individual bytes of an integer

On 2005-02-17 15:38:50 -0500, "Andrew Koenig" <ark@acm.org> said:

> "MatrixV" <training@kcollege.com> wrote in message
> news:37k6j3F5efaltU1@individual.net...
>
>> Unconsidering the byte sequence, you are correct.
>> A better way is using a union like:
>> union xxx
>> {
>> unsigned char bits[4];
>> unsigned int i;
>> };
>
> Not really. When you use a union, you have no assurance about the
> effect that giving a value to one member of a union will have on other
> members.

You do when one of them is an array of unsigned char.

--
Clark S. Cox, III
clarkcox3@gmail.com

**Ioannis Vranos** · Jul 23 '05, 12:59 AM

Re: Accessing individual bytes of an integer

Daniel Lidström wrote:

> Hello!
>
> I want to work with individual bytes of integers. I know that ints are
> 32-bit and will always be. Sometimes I want to work with the entire
> 32-bits, and other times I want to modify just the first 8-bits for
> example. For me, I think it would be best if I can declare the 32-bits
> like this:
>
> unsigned char bits[4];
>
> When I want to treat this as a 32-bits integer, can I do something
> like this?
>
> unsigned int bits32 = *((unsigned int*)bits);

Yes but not like this because array bits is not initialised.

> I'm unsure of the syntax. I don't need to work in-place so to speak. It is
> fine to work with a copy.

What you can do is read an unsigned int or any other POD type as a
sequence of unsigned chars (or plain chars) - that is bytes, copy it
byte by byte to another unsigned char sequence (which includes possible
padding bits), and deal the new char sequence as another unsigned int.

The following example uses an int and is portable:

#include <iostream>

int main()
{
    int integer=0;

    unsigned char *puc= reinterpret_cast<unsigned char *>(&integer);

    unsigned char otherInt[sizeof(integer)];

    // Read integer byte by byte and copy it to otherInt
    for(unsigned i=0; i<sizeof(integer); ++i)
        otherInt[i]= puc[i];

    // We treat the new unsigned char sequence as an int
    int *p= reinterpret_cast<int *>(otherInt);

    // Assign another value to the integer otherInt!
    *p=7;

    std::cout<<*p<<"\n";
}

--
Ioannis Vranos

http://www23.brinkster.com/noicys

**Jack Klein** · Jul 23 '05, 12:59 AM

Re: Accessing individual bytes of an integer

On Thu, 17 Feb 2005 19:16:24 +0100, Daniel Lidström
<someone@microsoft.com> wrote in comp.lang.c++:

> Hello!
>
> I want to work with individual bytes of integers. I know that ints are
> 32-bit and will always be.

No, you don't. You just think you do. But you are mistaken.
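
One way to stop relying on the width of int, sketched here on the assumption of a C++11 compiler that provides `std::uint32_t` (an optional type, so this fails to compile where it is missing). The alias name `word32` is made up for the example.

```cpp
#include <climits>
#include <cstdint>

// Fail at compile time, rather than misbehave at run time, if the
// assumptions do not hold on the target platform.
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");
static_assert(sizeof(std::uint32_t) == 4, "this code assumes a 4-byte uint32_t");

// Hypothetical alias: an unsigned type that is exactly 32 bits wide,
// unlike unsigned int, whose width is implementation-defined.
using word32 = std::uint32_t;
```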

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html

**Andrew Koenig** · Jul 23 '05, 01:00 AM

Re: Accessing individual bytes of an integer

>> Not really. When you use a union, you have no assurance about the effect
>> that giving a value to one member of a union will have on other members.

> You do when one of them is an array of unsigned char.

Can you show me where in the C++ standard it says that? The text that I
think is relevant can be found in subclause 9.5:

In a union, at most one of the data members can be active at any time, that
is, the value of at most one of the data members can be stored in a union at
any time. [Note: one special guarantee is made in order to simplify the use
of unions: If a POD-union contains several POD-structs that share a common
initial sequence (9.2), and if an object of this POD-union type contains one
of the POD-structs, it is permitted to inspect the common initial sequence
of any of POD-struct members; see 9.2. ]

I think that "Only one of the data members can be active at any time" is
pretty clear, and the one exception to that rule says nothing about array of
unsigned character.

**Larry Brasfield** · Jul 23 '05, 01:00 AM

Re: Accessing individual bytes of an integer

"Ioannis Vranos" <ivr@remove.this.grad.com> wrote in message
news:1108680624.856546@athnrd02...
....
> What you can do is read an unsigned int or any other POD type as a
> sequence of unsigned chars (or plain chars) - that is bytes, copy it
> byte by byte to another unsigned char sequence (which includes possible
> padding bits), and deal the new char sequence as another unsigned int.
>
> The following example uses an int and is portable:

I have to disagree with your "is portable" claim.
Comments inserted below in your code.

> #include <iostream>
>
> int main()
> {
>     int integer=0;
>
>     unsigned char *puc= reinterpret_cast<unsigned char *>(&integer);
>
>     unsigned char otherInt[sizeof(integer)];
>
>     // Read integer byte by byte and copy it to otherInt
>     for(unsigned i=0; i<sizeof(integer); ++i)
>         otherInt[i]= puc[i];
>
>     // We treat the new unsigned char sequence as an int
>     int *p= reinterpret_cast<int *>(otherInt);

There is no assurance that the attempt to access an int
at the starting address of otherInt will succeed. On some
machines, it could produce an alignment fault.

>     // Assign another value to the integer otherInt!
>     *p=7;

The above access could also produce an alignment fault.

>     std::cout<<*p<<"\n";
> }

I think "works on some platforms" versus "can fault on
some platforms" is a good example of "not portable".
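
To make the alignment concern concrete, here is a small sketch that tests whether an address is suitably aligned for an unsigned int. The helper name `aligned_for_uint` is hypothetical, and the check assumes that `std::uintptr_t` exists and that converting a pointer to it yields the numeric address, which is true on common platforms but not guaranteed by the standard.

```cpp
#include <cstdint>

// Hypothetical helper: true if `p` is suitably aligned for an unsigned int.
// Assumption: the pointer-to-integer conversion gives the numeric address.
inline bool aligned_for_uint(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(unsigned int) == 0;
}
```

A plain unsigned char array carries no alignment guarantee beyond that of char, so an address inside it may fail this check on machines where unsigned int needs stricter alignment.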

--
--Larry Brasfield
email: donotspam_larry_brasfield@hotmail.com
Above views may belong only to me.

**Ron Natalie** · Jul 23 '05, 01:00 AM

Re: Accessing individual bytes of an integer

Andrew Koenig wrote:

> I think that "Only one of the data members can be active at any time" is
> pretty clear, and the one exception to that rule says nothing about array of
> unsigned character.

I believe he is referring to the passage at 3.10 p 15:
If a program attempts to access the stored value of an object through an
lvalue of other than one of the following types the behavior is
undefined (footnote 48):
- the dynamic type of the object,
- a cv-qualified version of the dynamic type of the object,
- a type that is the signed or unsigned type corresponding to the dynamic
  type of the object,
- a type that is the signed or unsigned type corresponding to a
  cv-qualified version of the dynamic type of the object,
- an aggregate or union type that includes one of the aforementioned types
  among its members (including, recursively, a member of a subaggregate or
  contained union),
- a type that is a (possibly cv-qualified) base class type of the dynamic
  type of the object,
- a char or unsigned char type.

**Ioannis Vranos** · Jul 23 '05, 01:00 AM

Re: Accessing individual bytes of an integer

Andrew Koenig wrote:

> Can you show me where in the C++ standard it says that? The text that I
> think is relevant can be found in subclause 9.5:
>
> In a union, at most one of the data members can be active at any time, that
> is, the value of at most one of the data members can be stored in a union at
> any time. [Note: one special guarantee is made in order to simplify the use
> of unions: If a POD-union contains several POD-structs that share a common
> initial sequence (9.2), and if an object of this POD-union type contains one
> of the POD-structs, it is permitted to inspect the common initial sequence
> of any of POD-struct members; see 9.2. ]
>
> I think that "Only one of the data members can be active at any time" is
> pretty clear, and the one exception to that rule says nothing about array of
> unsigned character.

However the interesting part is that the entire union can be read as an
unsigned char/plain char array, as is the case with all POD types.

--
Ioannis Vranos

http://www23.brinkster.com/noicys

**Ioannis Vranos** · Jul 23 '05, 01:00 AM

Re: Accessing individual bytes of an integer

Larry Brasfield wrote:

> There is no assurance that the attempt to access an int
> at the starting address of otherInt will succeed. On some
> machines, it could produce an alignment fault.

> I think "works on some platforms" versus "can fault on
> some platforms" is a good example of "not portable".

The standard guarantees that we can both read a POD type as an array of
unsigned chars/plain chars, copy its contents to another array of
unsigned chars/plain chars of the same size, and the new array is an
exact copy of the initial POD object.

That's why you can copy byte by byte or use memcpy() for this, an entire
array of ints for example. The same applies to an individual int.
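
A minimal sketch of the copy-through-bytes guarantee described above, using memcpy in both directions so that no int is ever accessed through a pointer into the char array. The function name `round_trip` is invented for this example.

```cpp
#include <cstring>

// Hypothetical helper: copies `in` into a byte buffer and then back into a
// fresh unsigned int. Both steps use memcpy, so the alignment of the byte
// buffer does not matter.
inline unsigned int round_trip(unsigned int in)
{
    unsigned char bytes[sizeof in];
    std::memcpy(bytes, &in, sizeof in);    // object -> bytes

    unsigned int out;
    std::memcpy(&out, bytes, sizeof out);  // bytes -> object
    return out;
}
```

The value that comes back equals the value that went in, which is the property the byte-by-byte copy relies on.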

--
Ioannis Vranos

http://www23.brinkster.com/noicys

**Old Wolf** · Jul 23 '05, 01:00 AM

Re: Accessing individual bytes of an integer

Ron Natalie wrote:
> Andrew Koenig wrote:
>
> > I think that "Only one of the data members can be active at any time" is
> > pretty clear, and the one exception to that rule says nothing about
> > array of unsigned character.
>
> I believe he is referring to the passage at 3.10 p 15:
> If a program attempts to access the stored value of an object through
> an lvalue of other than one of the following types the behavior is
> undefined (footnote 48):
> - the dynamic type of the object,
> - a cv-qualified version of the dynamic type of the object,
> - a type that is the signed or unsigned type corresponding to the
>   dynamic type of the object,
> - a type that is the signed or unsigned type corresponding to a
>   cv-qualified version of the dynamic type of the object,
> - an aggregate or union type that includes one of the aforementioned
>   types among its members (including, recursively, a member of a
>   subaggregate or contained union),
> - a type that is a (possibly cv-qualified) base class type of the
>   dynamic type of the object,
> - a char or unsigned char type.

That passage is irrelevant to this situation. It says
"If (conditions) then the behaviour is undefined". The union
example in question does not meet (conditions). QED.

To put it another way, the passage you quoted doesn't say that
the behaviour is defined for those listed bullet points.