signed and unsigned char

  • Dan Pop

    #16
    Re: signed and unsigned char

    In <c0gehn$ek0$1@chessie.cirr.com> Christopher Benson-Manica <ataru@nospam.cyberspace.org> writes:

    >signed char str_a[]="Hello, world!\n";
    >unsigned char str_b[]="Hello, world!\n";
    >
    >what is the difference, if any, between the following two statements?
    >
    >printf( "%s", str_a );
    >printf( "%s", str_b );

    None.

    s If no l length modifier is present, the argument shall
    be a pointer to the initial element of an array of
    character type.
    ^^^^^^^^^^^^^^
    Both str_a and str_b are arrays of character type.
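
A minimal sketch of the point above, assuming a hosted C99 implementation. The function name `same_percent_s_output` and the 32-byte buffers are illustrative choices, not something from the thread. It formats both arrays with `%s` and compares the results instead of printing them.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Formats both arrays with "%s" and returns 1 when the results match. */
int same_percent_s_output(void)
{
    signed char str_a[] = "Hello, world!\n";
    unsigned char str_b[] = "Hello, world!\n";
    char out_a[32];
    char out_b[32];
    int len_a = snprintf(out_a, sizeof out_a, "%s", str_a);
    int len_b = snprintf(out_b, sizeof out_b, "%s", str_b);

    /* Both results must have fit in the (hypothetical) 32-byte buffers. */
    assert(len_a >= 0 && (size_t)len_a < sizeof out_a);
    assert(len_b >= 0 && (size_t)len_b < sizeof out_b);

    return len_a == len_b && strcmp(out_a, out_b) == 0;
}
```

Both arrays hold the same bytes, so the two formatted strings should compare equal.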

    >If there is a difference, what is the best way to compare *str_a with
    >0xFF? (On my implementation, unadorned char is signed, and so I'm
    >using
    >
    >if( *str_a == (signed char)0xFF ) ...
    >
    >to quiet compiler warnings.)

    It's not clear what exactly you want to achieve here. If you want to see
    if the respective character value has a certain representation, the most
    portable approach is to use a pointer to unsigned char:

    if( *(unsigned char *)str_a == 0xFF ) ...

    This works even if this pattern is a trap representation for the type
    signed char.
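
The representation test above can be wrapped in a small helper. This is only a sketch; the name `byte_has_pattern` is hypothetical. It reads the byte through a pointer to unsigned char, which is the step that makes the comparison independent of how signed char interprets the bits.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when the byte at p has the object representation `pattern`.
   Any object may be read through a pointer to unsigned char, and
   unsigned char has no trap representations. */
int byte_has_pattern(const void *p, unsigned char pattern)
{
    assert(p != NULL);
    return *(const unsigned char *)p == pattern;
}
```

For the array in the question, a call such as `byte_has_pattern(str_a, 0xFF)` would ask whether the first byte has all of its low eight bits set.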

    OTOH, if you want to check that your character has a certain value,
    simply compare against that value:

    if( *str_a == -1 ) ...

    Comparing an object with a value it cannot possibly take, as in your
    example, doesn't make much sense a priori, so you have to explain your
    exact intentions.

    BTW, if str_a were an array of plain char, you would have the following solution:

    if( *str_a == '\xff' ) ...

    but it is still not guaranteed to work if this bit pattern is a trap
    representation for plain char.

    As other people have already mentioned, (signed char)0xFF is useless for
    your purpose, in a portability context, because the result need not be
    the signed char value corresponding to that bit pattern. Casts really
    are *conversion* operators and not devices for silencing the compilers.
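
To make the conversion-versus-representation distinction concrete, here is a hedged sketch. All three function names are hypothetical. `signed_char_from_bits` copies a bit pattern without any integer conversion, while `to_signed_char_checked` performs a conversion only when the value is representable, so the implementation-defined case is never reached.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Returns the signed char whose object representation is `bits`.
   memcpy copies the byte unchanged; no integer conversion takes place.
   Caveat: if `bits` is a trap representation for signed char, using
   the result is still undefined. */
signed char signed_char_from_bits(unsigned char bits)
{
    signed char result;

    memcpy(&result, &bits, 1);
    return result;
}

/* Returns 1 when `value` is representable as a signed char. */
int fits_in_signed_char(int value)
{
    return value >= SCHAR_MIN && value <= SCHAR_MAX;
}

/* Converts only representable values, so the cast below is a plain
   value-preserving conversion. */
signed char to_signed_char_checked(int value)
{
    assert(fits_in_signed_char(value));
    return (signed char)value;
}
```

With 8-bit chars, 0xFF does not fit in signed char, which is why `(signed char)0xFF` falls into the implementation-defined case.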

    Dan
    --
    Dan Pop
    DESY Zeuthen, RZ group
    Email: Dan.Pop@ifh.de

    • Eric Sosman

      #17
      Re: signed and unsigned char

      Richard Bos wrote:
      >
      > Eric Sosman <Eric.Sosman@sun.com> wrote:
      >
      > > > Christopher Benson-Manica wrote:
      > > > > (On my implementation, unadorned char is signed, and so I'm
      > > > > using
      > > > >
      > > > > if( *str_a == (signed char)0xFF ) ...
      > >
      > > Undefined behavior, I think.
      >
      > Implementation-defined, surely?

      Harrumph. I guess so, but the distinction seems not
      to be very important. When `(signed char)0xFF' is evaluated

      "... either the result is implementation-defined or an
      implementation-defined signal is raised." (6.3.1.3/3)

      On the face of it, that's implementation-defined behavior and
      not undefined behavior. But what if the implementation takes
      the second alternative and raises a signal? If a function has
      been installed to handle the signal

      "If and when the function returns, if the value of _sig_
      is [...] or any other implementation-defined value
      corresponding to a computational exception, the behavior
      is undefined; [...]" (7.14.1.1/3)

      So if there's a handler, it cannot return without invoking
      undefined behavior. I guess that means it must call abort()
      or _Exit() or run an infinite loop; all of these have defined
      effects, but are sufficiently unfortunate that they ought to
      be avoided just about as strenuously as undefined behavior.
      No nasal demons, surely, but no happy outcome either.
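
A sketch of a handler that never returns, as described above. SIGFPE is used purely as a stand-in, since the signal an implementation might raise for this conversion is not named in the thread. The function names are hypothetical.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>

/* Handler that never returns: _Exit() is one of the few functions a
   signal handler may call, and it terminates the program at once. */
static void fatal_signal_handler(int sig)
{
    (void)sig;
    _Exit(EXIT_FAILURE);
}

/* Installs the handler for SIGFPE, chosen here only as a stand-in for
   whatever signal the implementation documents.
   Returns 0 on success and -1 if signal() reports an error. */
int install_fatal_handler(void)
{
    return signal(SIGFPE, fatal_signal_handler) == SIG_ERR ? -1 : 0;
}
```

The program still ends when the signal arrives; the handler only makes sure it ends in a defined way.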

      If there's not a handler, the implementation-defined signal
      is treated as if one of SIG_IGN or SIG_DFL had been set up (the
      choice is implementation-defined). If the handling is equivalent
      to SIG_IGN, I think we're back in U.B. territory again: we're
      told that we'll get *either* a result *or* a signal, not both.
      Thus, we can't count on getting a result of any kind if a signal
      is raised and ignored; the Standard doesn't specify any behavior,
      so the behavior is undefined by omission (cf. 3.4.3).

      In the SIG_DFL case, the handling of the implementation-defined
      signal is implementation-defined, not undefined. But right here
      in the documentation I see

      "The default handling for SIGBITROT causes demons
      to fly out of your nose." (DS 9000 programmer's
      manual, courtesy Armed Response Technologies)

      ... which is not undefined behavior, but might seem so to a
      casual observer. ;-)

      Summary:

      - You're right: `(signed char)0xFF' produces implementation-
      defined, not undefined, behavior. My apologies.

      - ... but since the I.B. is just about as unpredictable as
      U.B., the programmer would be well-advised to avoid it.

      - The *real* solution, I think, is to use `unsigned' types
      whenever you want to deal with bits as bits. To ask the
      question "Does this byte have all its bits set?", one
      should not use potentially signed arithmetic. To answer
      the question "Does this byte have the value 42?", either
      signed or unsigned arithmetic will do.

      - And, of course, all this is just another c.l.c exercise
      in taking a census on a pinhead. We know perfectly well
      that two's complement has won the game and extinguished
      its competitors, right? And we're certain that it's the
      ultimate in integer representations, and will never ever
      be supplanted, right? Computer design is immune to the
      vagaries of fashion, right?

      (Ahem.) "Right?"

      (I know you're out there; I can hear you breathing. C'mon,
      stand up and be counted -- in two's complement ...)
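
The third summary point, treating bits as bits through unsigned types, might look like the following sketch. The name `count_all_ones_bytes` is hypothetical. It answers "does this byte have all its bits set?" for every byte of a buffer without any signed arithmetic.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Counts the bytes in buf[0..len) that have every bit set.
   Each byte is examined as an unsigned char, so the result does not
   depend on whether plain char is signed. */
size_t count_all_ones_bytes(const void *buf, size_t len)
{
    const unsigned char *bytes = buf;
    size_t count = 0;
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (bytes[i] == UCHAR_MAX)
            count++;
    }
    return count;
}
```

Comparing against `UCHAR_MAX` rather than a literal 0xFF keeps the test meaningful even where `CHAR_BIT` is larger than 8.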

      --
      Eric.Sosman@sun.com

      • Dan Pop

        #18
        Re: signed and unsigned char

        In <402CF3D3.3AE7334F@sun.com> Eric Sosman <Eric.Sosman@sun.com> writes:

        >Richard Bos wrote:
        >>
        >> Eric Sosman <Eric.Sosman@sun.com> wrote:
        >>
        >> > > Christopher Benson-Manica wrote:
        >> > > > (On my implementation, unadorned char is signed, and so I'm
        >> > > > using
        >> > > >
        >> > > > if( *str_a == (signed char)0xFF ) ...
        >> >
        >> > Undefined behavior, I think.
        >>
        >> Implementation-defined, surely?
        >
        > Harrumph. I guess so, but the distinction seems not
        >to be very important. When `(signed char)0xFF' is evaluated
        >
        > "... either the result is implementation-defined or an
        > implementation-defined signal is raised." (6.3.1.3/3)
        >
        >On the face of it, that's implementation-defined behavior and
        >not undefined behavior. But what if the implementation takes
        >the second alternative and raises a signal?

        It won't, for backward compatibility with C89, which doesn't allow any
        signal to be raised because of this. Breaking perfectly correct C89
        code is not an option any serious implementor is going to adopt, *if* it
        can be avoided.

        This is a typical case where C99 fixed something that wasn't broken in
        C89. And the person responsible for it couldn't produce a *convincing*
        rationale...

        Dan
        --
        Dan Pop
        DESY Zeuthen, RZ group
        Email: Dan.Pop@ifh.de

        • Michael Wojcik

          #19
          Re: signed and unsigned char


          In article <lnad3nq3td.fsf@nuthaus.mib.org>, Keith Thompson <kst-u@mib.org> writes:
          >
          > In ASCII, all such characters happen to have values in the range
          > 32..126. In EBCDIC, if I recall correctly, some basic characters have
          > codes greater than 127; I think this implies that in an implementation
          > that uses EBCDIC as its execution character set, type char must be
          > unsigned (assuming CHAR_BIT==8).

          You recall correctly; the EBCDIC decimal digits, for example, are 0xF0
          through 0xF9. It hadn't occurred to me earlier that this implied that
          an EBCDIC implementation where CHAR_BIT==8 would have to make plain
          char unsigned, but I suppose it would.

          (I could check an EBCDIC implementation or two if anyone's curious, but
          of course that wouldn't prove anything one way or the other.)
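
Such a check could be as small as the sketch below. The function names are hypothetical. It reports whether plain char is unsigned on the implementation at hand and confirms that the decimal digits are nonnegative as plain char, which the C standard requires for members of the basic execution character set.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 when plain char behaves as an unsigned type here. */
int plain_char_is_unsigned(void)
{
    return CHAR_MIN == 0;
}

/* Returns 1 when every decimal digit is nonnegative as a plain char. */
int digits_are_nonnegative(void)
{
    const char digits[] = "0123456789";
    int i;

    for (i = 0; digits[i] != '\0'; i++) {
        if (digits[i] < 0)
            return 0;
    }
    return 1;
}

/* A conforming implementation should never fail this assertion. */
void check_basic_digits(void)
{
    assert(digits_are_nonnegative());
}
```

As noted above, the result from one implementation is only an observation about that implementation and does not settle the general question.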

          --
          Michael Wojcik michael.wojcik@microfocus.com

          Viewers are bugs for famous brands.
          -- unknown subtitler, Jackie Chan's _Thunderbolt_

          • Larry Jones

            #20
            Re: signed and unsigned char

            Michael Wojcik <mwojcik@newsguy.com> wrote:
            >
            > (I could check an EBCDIC implementation or two if anyone's curious, but
            > of course that wouldn't prove anything one way or the other.)

            For what it's worth, every EBCDIC implementation I've ever seen -- and
            I've seen a few -- has had plain char unsigned.

            -Larry Jones

            The problem with the future is that it keeps turning into the present.
            -- Hobbes
