Sort of mystified from an earlier thread

  • Chad

    Sort of mystified from an earlier thread

    This was taken from the following:



    And I quote:

    "Well, that's also ok for char**, since string literals are of type
    char * in c. The general idea still stands, though.

    The thing that irritates me is that despite all this, it's _trivial_
    to violate const in C without resorting to all this.

    const char foo[] = "mystring";
    char *constviol = strchr(foo, *foo);"

    What I don't get is that 'const char f[]="mystring"' is defined
    as a char, but the prototype is defined as the following:

    char *strchr(const char *s, int c);

    When foo gets de-referenced (ie *foo), how come the compiler doesn't
    complain about the difference between 'int' and 'char'?

    Thanks in advance.

    Chad

  • Arctic Fidelity

    #2
    Re: Sort of mystified from an earlier thread

    On Sat, 29 Oct 2005 21:51:46 -0400, Chad <cdalten@gmail.com> wrote:

    > "Well, that's also ok for char**, since string literals are of type
    > char * in c. The general idea still stands, though.
    >
    > The thing that irritates me is that despite all this, it's _trivial_
    > to violate const in C without resorting to all this.
    >
    > const char foo[] = "mystring";
    > char *constviol = strchr(foo, *foo);"
    >
    > What I don't get is that 'const char f[]="mystring"' is defined
    > as a char, but the prototype is defined as the following:
    >
    > char *strchr(const char *s, int c);
    >
    > When foo gets de-referenced (ie *foo), how come the compiler doesn't
    > complain about the difference between 'int' and 'char'?

    I have actually been wondering about this as well. I know that char is an
    integer type, but I still would have thought that int and char would have
    brought up some kind of warning or what not. I'm not sure I understand
    how that all does its thing.

    - Arctic

    --
    Using Opera's revolutionary e-mail client: http://www.opera.com/mail/


    • Chad

      #3
      Re: Sort of mystified from an earlier thread

      Arctic Fidelity wrote:
      > On Sat, 29 Oct 2005 21:51:46 -0400, Chad <cdalten@gmail.com> wrote:
      >
      > > "Well, that's also ok for char**, since string literals are of type
      > > char * in c. The general idea still stands, though.
      > >
      > > The thing that irritates me is that despite all this, it's _trivial_
      > > to violate const in C without resorting to all this.
      > >
      > > const char foo[] = "mystring";
      > > char *constviol = strchr(foo, *foo);"
      > >
      > > What I don't get is that 'const char f[]="mystring"' is defined
      > > as a char, but the prototype is defined as the following:
      > >
      > > char *strchr(const char *s, int c);
      > >
      > > When foo gets de-referenced (ie *foo), how come the compiler doesn't
      > > complain about the difference between 'int' and 'char'?
      >
      > I have actually been wondering about this as well. I know that char is an
      > integer type, but I still would have thought that int and char would have
      > brought up some kind of warning or what not. I'm not sure I understand
      > how that all does its thing.
      >
      > - Arctic
      >
      > --
      > Using Opera's revolutionary e-mail client: http://www.opera.com/mail/

      Outside of the sloppy wording, here is my best guess on what is going
      on.

      When we go *foo, we are getting the first character of the string, 'm'.
      This would be the same as if we had done something like

      char internal_string = 'm';

      Then the char would be automatically converted to int (for the strchr
      int c parameter). This might explain why the GNU compiler didn't
      complain even when I had warning flags enabled.


      • pete

        #4
        Re: Sort of mystified from an earlier thread

        Chad wrote:
        >
        > This was taken from the following:
        >
        > http://groups.google.com/group/comp....3e9afae83d061c
        >
        > And I quote:
        >
        > "Well, that's also ok for char**, since string literals are of type
        > char * in c. The general idea still stands, though.
        >
        > The thing that irritates me is that despite all this, it's _trivial_
        > to violate const in C without resorting to all this.
        >
        > const char foo[] = "mystring";
        > char *constviol = strchr(foo, *foo);"
        >
        > What I don't get is that 'const char f[]="mystring"' is defined
        > as a char, but the prototype is defined as the following:
        >
        > char *strchr(const char *s, int c);
        >
        > When foo gets de-referenced (ie *foo), how come the compiler doesn't
        > complain about the difference between 'int' and 'char'?

        Because there's no problem converting a char to an int,
        unless (CHAR_MAX > INT_MAX),
        which doesn't seem to be the case on any hosted system.

        --
        pete
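As a minimal sketch of the call being discussed (the helper name `find_first_char` is made up for illustration; only `strchr` and the `foo` snippet come from the thread), the following shows that the `const char` value `*s` is accepted where the prototype asks for an `int`, and that the result comes back as a plain `char *`:

```c
#include <string.h>

/* Hypothetical helper, not from the thread: search s for its own first
   character, exactly as the quoted snippet does with foo. */
char *find_first_char(const char *s)
{
    /* *s has type const char. Because the prototype declares the second
       parameter as int, the value is converted to int as if by assignment.
       No cast is needed and no diagnostic is required. */
    return strchr(s, *s);
}
```

Calling `find_first_char(foo)` with the `foo` array from the question should return a pointer to `foo[0]`. That pointer is no longer const-qualified, which is the const hole the quoted poster was complaining about; actually writing through it would be undefined behaviour because `foo` itself is defined const.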


        • pete

          #5
          Re: Sort of mystified from an earlier thread

          pete wrote:
          >
          > Chad wrote:
          > >
          > > This was taken from the following:
          > >
          > > http://groups.google.com/group/comp....3e9afae83d061c
          > >
          > > And I quote:
          > >
          > > "Well, that's also ok for char**, since string literals are of type
          > > char * in c. The general idea still stands, though.
          > >
          > > The thing that irritates me is that despite all this, it's _trivial_
          > > to violate const in C without resorting to all this.
          > >
          > > const char foo[] = "mystring";
          > > char *constviol = strchr(foo, *foo);"
          > >
          > > What I don't get is that 'const char f[]="mystring"' is defined
          > > as a char, but the prototype is defined as the following:
          > >
          > > char *strchr(const char *s, int c);
          > >
          > > When foo gets de-referenced (ie *foo), how come the compiler doesn't
          > > complain about the difference between 'int' and 'char'?
          >
          > Because there's no problem converting a char to an int,
          > unless (CHAR_MAX > INT_MAX)
          > which doesn't seem to be the case in any hosted systems.

          There's also

          N869
          6.3.1 Arithmetic operands
          6.3.1.1 Boolean, characters, and integers

          [#2] The following may be used in an expression wherever an
          int or unsigned int may be used:
          -- An object or expression with an integer type whose
          integer conversion rank is less than the rank of int
          and unsigned int.

          --
          pete
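One way to observe the integer promotions quoted above is to look at the size of an expression's type. This is only a sketch; the function names are invented for the example. Unary plus and binary plus both promote a `char` operand, and `sizeof` reports the size of the promoted type without evaluating the operand:

```c
#include <stddef.h>

/* Size of the type of +c: unary plus applies the integer promotions. */
size_t promoted_char_size(char c)
{
    return sizeof(+c);
}

/* Size of the type of c + c: both operands are promoted before adding. */
size_t char_sum_size(char c)
{
    return sizeof(c + c);
}
```

Both functions should return `sizeof(int)` (which equals `sizeof(unsigned int)`, so the result is the same even on an implementation where char promotes to unsigned int). For the `strchr` call itself the promotion rule is not strictly what applies: with a prototype in scope, the argument is converted to the parameter type as if by assignment, which for char to int gives the same value.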


          • Greg Comeau

            #6
            Re: Sort of mystified from an earlier thread

            In article <1130637106.638913.16850@g14g2000cwa.googlegroups.com>,
            Chad <cdalten@gmail.com> wrote:
            >...const char foo[] = "mystring";
            >char *constviol = strchr(foo, *foo);"
            >
            >What I don't get is that 'const char f[]="mystring"' is defined
            >as a char,

            No, it's defined as a const char[9]. I'm assuming you mean
            *f but that's a const char.

            > but the prototype is defined as the following:
            >
            >char *strchr(const char *s, int c);

            Correct.

            >When foo gets de-referenced (ie *foo), how come the compiler doesn't
            >complain about the difference between 'int' and 'char'?

            Because it is one of the implicit conversions. You can do this with
            no problem:

            const char c = 'x';
            int i;

            i = c;

            A similar thing happens during argument passing, since the prototype
            specifically says the argument should be an int.

            Why it is an int is a separate story, but no doubt has to do
            with the fact that routines such as getchar return ints (as
            they, for better or worse, accommodate the returned character
            _and_ signals such as EOF).
            --
            Greg Comeau / Celebrating 20 years of Comeauity!
            Comeau C/C++ ONLINE ==> http://www.comeaucomputing.com/tryitout
            World Class Compilers: Breathtaking C++, Amazing C99, Fabulous C90.
            Comeau C/C++ with Dinkumware's Libraries... Have you tried it?
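The point about getchar returning int can be sketched as follows. The helper names are hypothetical; the only library facts relied on are that `EOF` is a negative int and that the stream functions report a character as an unsigned char converted to int:

```c
#include <stdio.h>

/* Hypothetical helper: the value fgetc/getchar would report for a byte,
   i.e. the byte as an unsigned char widened to int. Always non-negative. */
int as_stream_value(unsigned char byte)
{
    return (int)byte;
}

/* Nonzero if a stream value is a real character rather than EOF. */
int is_real_char(int stream_value)
{
    return stream_value != EOF;
}
```

Because every real byte, including 0xFF, maps to a non-negative int, it can never be confused with `EOF`. Storing the result of getchar in a plain char first would lose that distinction, which is presumably why character arguments in the library (such as the second parameter of `strchr`) are typed as int as well.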


            • Old Wolf

              #7
              Re: Sort of mystified from an earlier thread

              Arctic Fidelity wrote:
              > Chad wrote:
              >
              >> const char foo[] = "mystring";
              >> char *constviol = strchr(foo, *foo);"
              >>
              >> char *strchr(const char *s, int c);
              >>
              >> When foo gets de-referenced (ie *foo), how come the compiler doesn't
              >> complain about the difference between 'int' and 'char'?
              >
              > I have actually been wondering about this as well.

              In C there is an implicit conversion from char to int.
              This means that if there is a context expecting an int, but you
              supply a char, then C will silently convert the char to an int.

              Some people use this to call C a "weakly typed" language, and
              say C has "holes in its type system". However those people are
              usually Lisp trolls.

              This means that the following code works:

              char c = 5;
              int i = c;
              /* now 'i' has the value of 5 */

              If C did not have this implicit conversion then you would have
              to write something ugly like:

              int i = (int)c;

              To me, this is less type-safe than the real situation, as it
              encourages the use of casts.

              C also has an implicit conversion from int to char:

              int i = 5;
              char c = i;
              /* now 'c' has a value of 5. */

              But if 'i' had a value that couldn't be held by a char, then
              we would have undefined behaviour (to cut a long story short).
              Some compilers will issue a warning when you do a so-called
              "narrowing conversion" like this.

              C in fact has implicit conversions between all of the integral
              and floating point types, with silent UB if the value can't
              be represented.

              By contrast, Java has implicit widening conversions, but no
              implicit narrowing conversions. Java trolls often bring this up.


              • Keith Thompson

                #8
                Re: Sort of mystified from an earlier thread

                "Old Wolf" <oldwolf@inspire.net.nz> writes:
                [...]
                > C also has an implicit conversion from int to char:
                >
                > int i = 5;
                > char c = i;
                > /* now 'c' has a value of 5. */
                >
                > But if 'i' had a value that couldn't be held by a char, then
                > we would have undefined behaviour (to cut a long story short).

                Cutting a long story short never works around here. 8-)}

                > Some compilers will issue a warning when you do a so-called
                > "narrowing conversion" like this.
                >
                > C in fact has implicit conversions between all of the integral
                > and floating point types, with silent UB if the value can't
                > be represented.

                Actually, overflow on a conversion has different rules than overflow
                on an arithmetic operator. For arithmetic operators, overflow on a
                signed integer type invokes undefined behavior. For conversion, it
                either yields an implementation-defined result or raises an
                implementation-defined signal (the latter is new in C99).

                So, given
                int i = <whatever>;
                char c = i;
                the implicit conversion of i to type char doesn't cause undefined
                behavior -- and if plain char is unsigned, it yields a well-defined
                result.

                --
                Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
                San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
                We must do something. This is something. Therefore, we must do this.
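A small sketch of the distinction drawn above, with helper names invented for the example. Conversion to an unsigned type is fully defined (the value is reduced modulo one more than the type's maximum), while conversion of an out-of-range value to a signed type is implementation-defined, so the second helper checks the range instead of converting blindly:

```c
#include <limits.h>

/* Converting any int to unsigned char is well defined: the result is
   the value reduced modulo UCHAR_MAX + 1. */
unsigned char to_uchar(int i)
{
    return (unsigned char)i;
}

/* Converting an out-of-range int to signed char gives an
   implementation-defined result (or, in C99, may raise an
   implementation-defined signal), so test the range beforehand. */
int fits_in_schar(int i)
{
    return i >= SCHAR_MIN && i <= SCHAR_MAX;
}
```

For instance, `to_uchar(-1)` should yield `UCHAR_MAX` on every conforming implementation, whereas what `(signed char)(SCHAR_MAX + 1)` yields has to be looked up in the compiler's documentation.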


                • Old Wolf

                  #9
                  Re: Sort of mystified from an earlier thread

                  Keith Thompson wrote:
                  > "Old Wolf" <oldwolf@inspire.net.nz> writes:
                  >> But if 'i' had a value that couldn't be held by a char, then
                  >> we would have undefined behaviour (to cut a long story short).
                  >
                  > Cutting a long story short never works around here. 8-)}
                  >
                  > It either yields an implementation-defined result or raises an
                  > implementation-defined signal (the latter is new in C99).

                  An implementation-defined signal might as well be UB, in practice.
                  I think the only thing you can do safely in a signal handler is set
                  a flag, or call exit(). What is the status of the char after the
                  signal handler returns? If it's indeterminate, then the subsequent
                  use of it will cause UB.


                  • Keith Thompson

                    #10
                    Re: Sort of mystified from an earlier thread

                    "Old Wolf" <oldwolf@inspire.net.nz> writes:
                    > Keith Thompson wrote:
                    >> "Old Wolf" <oldwolf@inspire.net.nz> writes:
                    >>> But if 'i' had a value that couldn't be held by a char, then
                    >>> we would have undefined behaviour (to cut a long story short).
                    >>
                    >> Cutting a long story short never works around here. 8-)}
                    >>
                    >> It either yields an implementation-defined result or raises an
                    >> implementation-defined signal (the latter is new in C99).
                    >
                    > An implementation-defined signal might as well be UB, in practice.
                    > I think the only thing you can do safely in a signal handler is set
                    > a flag, or call exit(). What is the status of the char after the
                    > signal handler returns? If it's indeterminate, then the subsequent
                    > use of it will cause UB.

                    So you set a flag in the signal handler; if the flag is set, you don't
                    look at the variable. It's not going to have anything useful in it
                    anyway.

                    I don't know of any implementation that takes advantage of the new
                    permission to raise a signal on overflow, and since the signal is
                    implementation-defined, you can't use it portably.

                    BTW, I think type char is guaranteed not to have any trap
                    representations, so an indeterminate value will just be one of the
                    values in the range CHAR_MIN..CHAR_MAX.

                    --
                    Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
                    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
                    We must do something. This is something. Therefore, we must do this.
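The "set a flag in the handler" pattern might look roughly like this. It is a sketch under loud assumptions: no implementation is known to raise a signal on a failed conversion, so `SIGINT` delivered through `raise()` stands in for whatever signal an implementation might document, and all names are made up for the example:

```c
#include <signal.h>

/* Written by the handler. In ISO C, volatile sig_atomic_t is the object
   type a handler can portably assign to. */
static volatile sig_atomic_t conversion_failed = 0;

static void on_signal(int sig)
{
    (void)sig;              /* unused */
    conversion_failed = 1;  /* only set the flag, nothing else */
}

/* Stand-in for an operation that raises a signal (assumption: SIGINT via
   raise(), purely for demonstration). Returns 1 if the handler set the
   flag, 0 if it did not, and -1 if the handler could not be installed
   or the signal could not be raised. */
int run_with_flag(void)
{
    conversion_failed = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    if (raise(SIGINT) != 0)
        return -1;
    signal(SIGINT, SIG_DFL);  /* restore default handling */
    return conversion_failed ? 1 : 0;
}
```

A caller would check the return value and, if the flag was set, simply not look at the converted variable, as suggested above.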


                    • Jordan Abel

                      #11
                      Re: Sort of mystified from an earlier thread

                      On 2005-10-31, Old Wolf <oldwolf@inspire.net.nz> wrote:
                      > Keith Thompson wrote:
                      >> "Old Wolf" <oldwolf@inspire.net.nz> writes:
                      >>> But if 'i' had a value that couldn't be held by a char, then
                      >>> we would have undefined behaviour (to cut a long story short).
                      >>
                      >> Cutting a long story short never works around here. 8-)}
                      >>
                      >> It either yields an implementation-defined result or raises an
                      >> implementation-defined signal (the latter is new in C99).
                      >
                      > An implementation-defined signal might as well be UB, in practice.
                      > I think the only thing you can do safely in a signal handler is set
                      > a flag, or call exit().

                      Such a signal would most likely be SIGFPE [I can't imagine what
                      else it would be], but the point is that it's something you can look
                      up in the implementation's documentation to find out what it does,
                      and that it'll do the same thing every time.

                      > What is the status of the char after the signal handler returns?
                      > If it's indeterminate, then the subsequent use of it will cause
                      > UB.


                      • Jordan Abel

                        #12
                        Re: Sort of mystified from an earlier thread

                        On 2005-10-31, Keith Thompson <kst-u@mib.org> wrote:
                        > "Old Wolf" <oldwolf@inspire.net.nz> writes:
                        >> Keith Thompson wrote:
                        >>> "Old Wolf" <oldwolf@inspire.net.nz> writes:
                        >>>> But if 'i' had a value that couldn't be held by a char, then
                        >>>> we would have undefined behaviour (to cut a long story short).
                        >>>
                        >>> Cutting a long story short never works around here. 8-)}
                        >>>
                        >>> It either yields an implementation-defined result or raises an
                        >>> implementation-defined signal (the latter is new in C99).
                        >>
                        >> An implementation-defined signal might as well be UB, in practice.
                        >> I think the only thing you can do safely in a signal handler is set
                        >> a flag, or call exit(). What is the status of the char after the
                        >> signal handler returns? If it's indeterminate, then the subsequent
                        >> use of it will cause UB.
                        >
                        > So you set a flag in the signal handler; if the flag is set, you don't
                        > look at the variable. It's not going to have anything useful in it
                        > anyway.
                        >
                        > I don't know of any implementation that takes advantage of the new
                        > permission to raise a signal on overflow, and since the signal is
                        > implementation-defined, you can't use it portably.
                        >
                        > BTW, I think type char is guaranteed not to have any trap
                        > representations, so an indeterminate value will just be one of the
                        > values in the range CHAR_MIN..CHAR_MAX.

                        what about 100000000 on a signed-magnitude system?

                        unsigned types are guaranteed not to have any trap representations.
                        signed types are not. and char's signed-ness is implementation-specified.


                        • Keith Thompson

                          #13
                          Re: Sort of mystified from an earlier thread

                          Jordan Abel <jmabel@purdue.edu> writes:
                          > On 2005-10-31, Keith Thompson <kst-u@mib.org> wrote:
                          [...]
                          >> BTW, I think type char is guaranteed not to have any trap
                          >> representations, so an indeterminate value will just be one of the
                          >> values in the range CHAR_MIN..CHAR_MAX.
                          >
                          > what about 100000000 on a signed-magnitude system?
                          >
                          > unsigned types are guaranteed not to have any trap representations.
                          > signed types are not. and char's signed-ness is implementation-specified.

                          unsigned char is specifically guaranteed not to have trap
                          representations (it's represented using a pure binary notation).
                          Other unsigned types have no such guarantee; they can have padding
                          bits. (I think the presence of padding bits allows, but does not
                          require the existence of trap representations.)

                          But C99 6.2.6.1p5 says:

                          Certain object representations need not represent a value of the
                          object type. If the stored value of an object has such a
                          representation and is read by an lvalue expression that does not
                          have character type, the behavior is undefined. If such a
                          representation is produced by a side effect that modifies all or
                          any part of the object by an lvalue expression that does not have
                          character type, the behavior is undefined. Such a representation
                          is called a _trap representation_.

                          This is the definition of "trap representation" (the term is in
                          italics). I think the "does not have character type" wording implies
                          that plain char, even if it's signed, cannot have any trap
                          representations, but I'd be happier if I could find a clearer
                          statement to that effect.

                          Assume a 2's-complement signed representation for plain char, with
                          CHAR_BIT==8. Without the above statement, the binary value 10000000
                          could be a trap representation; with it, it must represent the value
                          -128.

                          --
                          Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
                          San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
                          We must do something. This is something. Therefore, we must do this.
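To see which value a given bit pattern represents in a signed character type, one can copy the raw byte into the object with `memcpy`, which works on the object representation and involves no value conversion. This is a sketch that assumes a two's-complement implementation with `CHAR_BIT == 8`; the function name is made up for the example:

```c
#include <string.h>

/* Copy a raw byte pattern into a signed char and return its value.
   Assumption: two's complement, CHAR_BIT == 8. */
int schar_from_bits(unsigned char bits)
{
    signed char sc;
    memcpy(&sc, &bits, 1);  /* copies the representation, not the value */
    return sc;
}
```

Under those assumptions the pattern 10000000 (0x80) reads back as -128, 11111111 (0xFF) as -1, and 01111111 (0x7F) as 127. The pattern with the sign bit set and all value bits zero is the only one the standard lets a two's-complement implementation treat as a trap representation for wider signed types.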
