signed and unsigned char

**Dan Pop** · Nov 14 '05, 03:28 AM

Re: signed and unsigned char

In <c0gehn$ek0$1@chessie.cirr.com> Christopher Benson-Manica <ataru@nospam.cyberspace.org> writes:

>signed char str_a[]="Hello, world!\n";
>unsigned char str_b[]="Hello, world!\n";
>
>what is the difference, if any, between the following two statements?
>
>printf( "%s", str_a );
>printf( "%s", str_b );

None.

s If no l length modifier is present, the argument shall
be a pointer to the initial element of an array of
character type.
^^^^^^^^^^^^^^
Both str_a and str_b are arrays of character type.
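A minimal sketch of the point above, assuming a hosted C99 implementation. The helper name same_output and the 64-byte buffers are illustrative and not part of the original post. It formats both arrays with "%s" and compares the results:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if "%s" renders both arrays identically.
   Assumes each string is shorter than 64 characters. */
int same_output(const signed char *a, const unsigned char *b)
{
    char buf_a[64];
    char buf_b[64];

    assert(a != NULL && b != NULL);
    snprintf(buf_a, sizeof buf_a, "%s", a);   /* pointer to signed char   */
    snprintf(buf_b, sizeof buf_b, "%s", b);   /* pointer to unsigned char */
    return strcmp(buf_a, buf_b) == 0;
}
```

With str_a and str_b declared as in the question, same_output(str_a, str_b) should return 1.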

>If there is a difference, what is the best way to compare *str_a with
>0xFF? (On my implementation, unadorned char is signed, and so I'm
>using
>
>if( *str_a == (signed char)0xFF ) ...
>
>to quiet compiler warnings.)

It's not clear what exactly you want to achieve here. If you want to see
if the respective character value has a certain representation, the most
portable approach is to use a pointer to unsigned char:

if( *(unsigned char *)str_a == 0xFF ) ...

This works even if this pattern is a trap representation for the type
signed char.
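As a sketch, the representation check can be wrapped in a small function. The name byte_is_ff is hypothetical, and the constant 0xFF assumes CHAR_BIT == 8, as the post does:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the byte at p has the representation
   0xFF. Reading an object through an unsigned char lvalue is always
   permitted, and unsigned char has no trap representations. */
int byte_is_ff(const signed char *p)
{
    assert(p != NULL);
    return *(const unsigned char *)p == 0xFF;
}
```

On a two's complement implementation with 8-bit bytes, this returns 1 for a signed char holding -1 and 0 for every other value.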

OTOH, if you want to check that your character has a certain value,
simply compare against that value:

if( *str_a == -1 ) ...

Comparing an object with a value it cannot possibly take, as in your
example, doesn't make much sense a priori, so you have to explain your
exact intentions.
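To illustrate why the comparison with 0xFF cannot succeed, here is a sketch of a value comparison. The name value_equals is hypothetical. The operand *p is promoted to int before the comparison, so it can only take values in the range SCHAR_MIN..SCHAR_MAX:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: compares the value (not the representation) of *p
   with v. *p is promoted to int first, so the left-hand side always lies
   in SCHAR_MIN..SCHAR_MAX. */
int value_equals(const signed char *p, int v)
{
    assert(p != NULL);
    return *p == v;
}
```

When CHAR_BIT is 8, SCHAR_MAX is 127, so value_equals(p, 0xFF) is 0 for every signed char, while value_equals(p, -1) is 1 exactly when *p holds -1.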

BTW, if str_a were an array of plain char, you would have the following solution:

if( *str_a == '\xff' ) ...

but this is still not guaranteed to work if this bit pattern is a trap
representation for plain char.
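A sketch of the plain char variant, with the hypothetical name is_xff. The constant '\xff' has type int, and its value is that of a char holding that code converted to int, so it matches whatever value a plain char with that code has, whether plain char is signed or unsigned:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the plain char at p equals '\xff'.
   Both sides are derived from type char, so the signedness of plain char
   does not affect the result. */
int is_xff(const char *p)
{
    assert(p != NULL);
    return *p == '\xff';
}
```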

As other people have already mentioned, (signed char)0xFF is useless for
your purpose, in a portability context, because the result need not be
the signed char value corresponding to that bit pattern. Casts really
are *conversion* operators and not devices for silencing the compilers.
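One way to obtain the signed char value that corresponds to a given bit pattern, without going through a conversion, is to copy the representation. This is only a sketch: the name value_of_pattern is hypothetical, and the caveat above still applies, since reading the result is undefined if the pattern is a trap representation for signed char.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns the signed char whose object representation
   is the byte at bits. memcpy copies the representation unchanged; no
   integer conversion takes place. Reading s is undefined if the pattern is
   a trap representation for signed char. */
signed char value_of_pattern(const unsigned char *bits)
{
    signed char s;

    assert(bits != NULL);
    memcpy(&s, bits, sizeof s);
    return s;
}
```

On a two's complement implementation with 8-bit bytes, the pattern 0xFF yields -1. Other representations would yield a different value for the same pattern, which is why the cast is not a substitute.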

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Dan.Pop@ifh.de

**Eric Sosman** · Nov 14 '05, 03:28 AM

Re: signed and unsigned char

Richard Bos wrote:
>
> Eric Sosman <Eric.Sosman@sun.com> wrote:
>
> > > Christopher Benson-Manica wrote:
> > > > (On my implementation, unadorned char is signed, and so I'm
> > > > using
> > > >
> > > > if( *str_a == (signed char)0xFF ) ...
> >
> > Undefined behavior, I think.
>
> Implementation-defined, surely?

Harrumph. I guess so, but the distinction seems not
to be very important. When `(signed char)0xFF' is evaluated

"... either the result is implementation-defined or an
implementation-defined signal is raised." (6.3.1.3/3)

On the face of it, that's implementation-defined behavior and
not undefined behavior. But what if the implementation takes
the second alternative and raises a signal? If a function has
been installed to handle the signal

"If and when the function returns, if the value of _sig_
is [...] or any other implementation-defined value
corresponding to a computational exception, the behavior
is undefined; [...]" (7.14.1.1/3)

So if there's a handler, it cannot return without invoking
undefined behavior. I guess that means it must call abort()
or _Exit() or run an infinite loop; all of these have defined
effects, but are sufficiently unfortunate that they ought to
be avoided just about as strenuously as undefined behavior.
No nasal demons, surely, but no happy outcome either.

If there's not a handler, the implementation-defined signal
is treated as if one of SIG_IGN or SIG_DFL had been set up (the
choice is implementation-defined). If the handling is equivalent
to SIG_IGN, I think we're back in U.B. territory again: we're
told that we'll get *either* a result *or* a signal, not both.
Thus, we can't count on getting a result of any kind if a signal
is raised and ignored; the Standard doesn't specify any behavior,
so the behavior is undefined by omission (cf. 3.4.3).

In the SIG_DFL case, the handling of the implementation-defined
signal is implementation-defined, not undefined. But right here
in the documentation I see

"The default handling for SIGBITROT causes demons
to fly out of your nose." (DS 9000 programmer's
manual, courtesy Armed Response Technologies)

... which is not undefined behavior, but might seem so to a
casual observer. ;-)

Summary:

- You're right: `(signed char)0xFF' produces implementation-
defined, not undefined, behavior. My apologies.

- ... but since the I.B. is just about as unpredictable as
U.B., the programmer would be well-advised to avoid it.

- The *real* solution, I think, is to use `unsigned' types
whenever you want to deal with bits as bits. To ask the
question "Does this byte have all its bits set?", one
should not use potentially signed arithmetic. To answer
the question "Does this byte have the value 42?", either
signed or unsigned arithmetic will do.

- And, of course, all this is just another c.l.c exercise
in taking a census on a pinhead. We know perfectly well
that two's complement has won the game and extinguished
its competitors, right? And we're certain that it's the
ultimate in integer representations, and will never ever
be supplanted, right? Computer design is immune to the
vagaries of fashion, right?

(Ahem.) "Right?"

(I know you're out there; I can hear you breathing. C'mon,
stand up and be counted -- in two's complement ...)

--
Eric.Sosman@sun.com

**Dan Pop** · Nov 14 '05, 03:28 AM

Re: signed and unsigned char

In <402CF3D3.3AE7334F@sun.com> Eric Sosman <Eric.Sosman@sun.com> writes:

>Richard Bos wrote:
>>
>> Eric Sosman <Eric.Sosman@sun.com> wrote:
>>
>> > > Christopher Benson-Manica wrote:
>> > > > (On my implementation, unadorned char is signed, and so I'm
>> > > > using
>> > > >
>> > > > if( *str_a == (signed char)0xFF ) ...
>> >
>> > Undefined behavior, I think.
>>
>> Implementation-defined, surely?
>
> Harrumph. I guess so, but the distinction seems not
>to be very important. When `(signed char)0xFF' is evaluated
>
> "... either the result is implementation-defined or an
> implementation-defined signal is raised." (6.3.1.3/3)
>
>On the face of it, that's implementation-defined behavior and
>not undefined behavior. But what if the implementation takes
>the second alternative and raises a signal?

It won't, for backward compatibility with C89, which doesn't allow any
signal to be raised because of this. Breaking perfectly correct C89
code is not an option any serious implementor is going to adopt, *if* it
can be avoided.

This is a typical case where C99 fixed something that wasn't broken in
C89. And the person responsible for it couldn't produce a *convincing*
rationale...

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Dan.Pop@ifh.de

**Michael Wojcik** · Nov 14 '05, 03:29 AM

Re: signed and unsigned char

In article <lnad3nq3td.fsf@nuthaus.mib.org>, Keith Thompson <kst-u@mib.org> writes:
>
> In ASCII, all such characters happen to have values in the range
> 32..126. In EBCDIC, if I recall correctly, some basic characters have
> codes greater than 127; I think this implies that in an implementation
> that uses EBCDIC as its execution character set, type char must be
> unsigned (assuming CHAR_BIT==8).

You recall correctly; the EBCDIC decimal digits, for example, are 0xF0
through 0xF9. It hadn't occurred to me earlier that this implied that
an EBCDIC implementation where CHAR_BIT==8 would have to make plain
char unsigned, but I suppose it would.

(I could check an EBCDIC implementation or two if anyone's curious, but
of course that wouldn't prove anything one way or the other.)
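For anyone who does want to check an implementation, here is a sketch of the two relevant tests. The function names are hypothetical. The Standard guarantees that members of the basic execution character set have nonnegative values when stored in a char, which is the constraint that rules out a signed 8-bit plain char under EBCDIC:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if plain char is signed on this
   implementation, 0 if it is unsigned. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Hypothetical helper: returns 1 if every character in s has a
   nonnegative value when stored in a plain char, 0 otherwise. */
int all_codes_nonnegative(const char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if (*s < 0) {
            return 0;
        }
    }
    return 1;
}
```

On a conforming implementation, all_codes_nonnegative("0123456789") must return 1 regardless of the character set. As noted, a single result from plain_char_is_signed() says nothing about what other implementations do.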

--
Michael Wojcik michael.wojcik@microfocus.com

Viewers are bugs for famous brands.
-- unknown subtitler, Jackie Chan's _Thunderbolt_

**Larry Jones** · Nov 14 '05, 03:30 AM

Re: signed and unsigned char

Michael Wojcik <mwojcik@newsguy.com> wrote:
>
> (I could check an EBCDIC implementation or two if anyone's curious, but
> of course that wouldn't prove anything one way or the other.)

For what it's worth, every EBCDIC implementation I've ever seen -- and
I've seen a few -- has had plain char unsigned.

-Larry Jones

The problem with the future is that it keeps turning into the present.
-- Hobbes
