Allocating memory for strings

  • Win Sock

    Allocating memory for strings

    Hi All,
    somebody told me this morning that the following is leagal.

    char *a = "Hello wrold";

    The memory is automatically allocated on the fly. Is this correct?

  • santosh

    #2
    Re: Allocating memory for strings

    Win Sock wrote:
    > Hi All,
    > somebody told me this morning that the following is leagal.
    >
    > char *a = "Hello wrold";
    >
    > The memory is automatically allocated on the fly. Is this correct?

    The storage for the string is set aside during translation and the
    pointer 'a' is set to point to its beginning.

    If the pointer is not static and no other pointers point to the string,
    then the string becomes irretrievable when 'a' goes out of scope.


    • christian.bau

      #3
      Re: Allocating memory for strings

      On Oct 6, 6:45 pm, Win Sock <nos...@nospam.com> wrote:
      > Hi All,
      > somebody told me this morning that the following is leagal.
      >
      > char *a = "Hello wrold";
      >
      > The memory is automatically allocated on the fly. Is this correct?

      No.

      A string literal like "Hello wrold" works exactly as if you had a
      static array of const char, and got a pointer to that array, cast to
      char* instead of const char*. So

      char *a = "Hello wrold";

      works exactly the same as

      const char secret_array [] = "Hello wrold";
      char *a = (char *) secret_array;




      • Joe Wright

        #4
        Re: Allocating memory for strings

        Win Sock wrote:
        > Hi All,
        > somebody told me this morning that the following is leagal.
        >
        > char *a = "Hello wrold";
        >
        > The memory is automatically allocated on the fly. Is this correct?

        Not in the sense of malloc() and friends; calling free(a) is undefined
        behavior. The string literal "Hello wrold" is placed somewhere in memory
        as an anonymous array of char, the address of which is stored in a.

        Spelling? legal and world.

        --
        Joe Wright
        "Everything should be made as simple as possible, but not simpler."
        --- Albert Einstein ---


        • Ben Pfaff

          #5
          Re: Allocating memory for strings

          "christian.bau" <christian.bau@cbau.wanadoo.co.uk> writes:
          > char *a = "Hello wrold";
          >
          > works exactly the same as
          >
          > const char secret_array [] = "Hello wrold";
          > char *a = (char *) secret_array;

          If it's outside any function, yes; otherwise, secret_array must
          be declared static.
          --
          "For those who want to translate C to Pascal, it may be that a lobotomy
          serves your needs better." --M. Ambuhl

          "Here are the steps to create a C-to-Turbo-Pascal translator..." --H. Schildt


          • Keith Thompson

            #6
            Re: Allocating memory for strings

            "christian.bau" <christian.bau@cbau.wanadoo.co.uk> writes:
            > On Oct 6, 6:45 pm, Win Sock <nos...@nospam.com> wrote:
            >> somebody told me this morning that the following is leagal.
            >>
            >> char *a = "Hello wrold";
            >>
            >> The memory is automatically allocated on the fly. Is this correct?
            >
            > No.
            >
            > A string literal like "Hello wrold" works exactly as if you had a
            > static array of const char, and got a pointer to that array, cast to
            > char* instead of const char*. So
            >
            > char *a = "Hello wrold";
            >
            > works exactly the same as
            >
            > const char secret_array [] = "Hello wrold";
            > char *a = (char *) secret_array;

            Except that string literals aren't const. (Attempting to modify a
            string literal invokes undefined behavior, but only because the
            standard explicitly says so.) It would be better if string literals
            *were* const, but that would have broken existing code back in 1989
            when the ANSI standard first introduced the "const" keyword.

            --
            Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
            San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
            "We must do something. This is something. Therefore, we must do this."
            -- Antony Jay and Jonathan Lynn, "Yes Minister"


            • Tor Rustad

              #7
              Re: Allocating memory for strings

              Win Sock wrote:
              > Hi All,
              > somebody told me this morning that the following is leagal.
              >
              > char *a = "Hello wrold";
              >
              > The memory is automatically allocated on the fly. Is this correct?

              String literals may be placed in read-only memory, and it is undefined
              behavior (UB) to alter what a points to. Hence, you should rather use:

              const char *a = "Hello wrold";

              Otherwise, the compiler might not catch this error:

              $ cat -n main.c
                   1  #include <stdio.h>
                   2
                   3
                   4  int main(void)
                   5  {
                   6      char *a = "Hello";
                   7      const char *b = "Hello";
                   8
                   9      printf("%s %s\n", a, b);
                  10
                  11      a[0]='\0';
                  12      b[0]='\0';
                  13
                  14      return 0;
                  15  }

              $ gcc -ansi -pedantic -W -Wall main.c
              main.c: In function ‘main’:
              main.c:12: error: assignment of read-only location

              Above, there was no warning about the UB at line 11!

              --
              Tor <torust [at] online [dot] no>

              "There are two ways of constructing a software design. One way is to
              make it so simple that there are obviously no deficiencies. And the
              other way is to make it so complicated that there are no obvious
              deficiencies"


              • santosh

                #8
                Re: Allocating memory for strings

                Tor Rustad wrote:
                > Win Sock wrote:
                >> Hi All,
                >> somebody told me this morning that the following is leagal.
                >>
                >> char *a = "Hello wrold";
                >>
                >> The memory is automatically allocated on the fly. Is this correct?
                >
                > String literals may be placed in read-only memory, and it is undefined
                > behavior (UB) to alter what a points to. Hence, you should rather
                > use:
                >
                > const char *a = "Hello wrold";
                >
                > Otherwise, the compiler might not catch this error:
                >
                > $ cat -n main.c
                >      1  #include <stdio.h>
                >      2
                >      3
                >      4  int main(void)
                >      5  {
                >      6      char *a = "Hello";
                >      7      const char *b = "Hello";
                >      8
                >      9      printf("%s %s\n", a, b);
                >     10
                >     11      a[0]='\0';
                >     12      b[0]='\0';
                >     13
                >     14      return 0;
                >     15  }
                >
                > $ gcc -ansi -pedantic -W -Wall main.c
                > main.c: In function ‘main’:
                > main.c:12: error: assignment of read-only location
                >
                > Above, there was no warning about the UB at line 11!

                Interestingly, if the const qualifier is removed, compilation succeeds
                under gcc, but the executable terminates with a segmentation fault.
                This indicates that gcc places the strings in read-only storage.

                On the other hand under the lcc-linux32 compiler nothing unexpected
                happens. Apparently string literals are _not_ placed into read-only
                storage by lcc-linux32.


                • Ben Pfaff

                  #9
                  Re: Allocating memory for strings

                  Keith Thompson <kst-u@mib.org> writes:
                  > "christian.bau" <christian.bau@cbau.wanadoo.co.uk> writes:
                  >> char *a = "Hello wrold";
                  >>
                  >> works exactly the same as
                  >>
                  >> const char secret_array [] = "Hello wrold";
                  >> char *a = (char *) secret_array;
                  >
                  > Except that string literals aren't const. (Attempting to modify a
                  > string literal invokes undefined behavior, but only because the
                  > standard explicitly says so.)

                  I think that's why Christian included the cast to char *. With
                  the cast, the effect is the same.
                  --
                  char a[]="\n .CJacehknorstu";int putchar(int);int main(void){unsigned long b[]
                  ={0x67dffdff,0x9aa9aa6a,0xa77ffda9,0x7da6aa6a,0xa67f6aaa,0xaa9aa9f6,0x11f6},*p
                  =b,i=24;for(;p+=!*p;*p/=4)switch(0[p]&3)case 0:{return 0;for(p--;i--;i--)case+
                  2:{i++;if(i)break;else default:continue;if(0)case 1:putchar(a[i&15]);break;}}}


                  • Tor Rustad

                    #10
                    Re: Allocating memory for strings

                    santosh wrote:

                    [...]
                    > Interestingly, if the const qualifier is removed, compilation succeeds
                    > under gcc, but the executable terminates with a segmentation fault.
                    > This indicates that gcc places the strings in read-only storage.

                    Yes.

                    > On the other hand under the lcc-linux32 compiler nothing unexpected
                    > happens. Apparently string literals are _not_ placed into read-only
                    > storage by lcc-linux32.

                    Did you get a warning with lcc-linux32? I don't have the lcc-linux32
                    compiler, but it could use the same storage location for those two
                    string literals, which is why I purposely used "Hello" for both.

                    So something "unexpected" could still happen.

                    Note that splint issues two warnings for the sample code I posted.

                    --
                    Tor <torust [at] online [dot] no>


                    • santosh

                      #11
                      Re: Allocating memory for strings

                      Tor Rustad wrote:
                      > santosh wrote:
                      >
                      > [...]
                      >
                      >> Interestingly, if the const qualifier is removed, compilation succeeds
                      >> under gcc, but the executable terminates with a segmentation fault.
                      >> This indicates that gcc places the strings in read-only storage.
                      >
                      > Yes.
                      >
                      >> On the other hand under the lcc-linux32 compiler nothing unexpected
                      >> happens. Apparently string literals are _not_ placed into read-only
                      >> storage by lcc-linux32.
                      >
                      > Did you get a warning with lcc-linux32? I don't have the lcc-linux32
                      > compiler, but it could use the same storage location for those two
                      > string literals, which is why I purposely used "Hello" for both.
                      >
                      > So something "unexpected" could still happen.
                      >
                      > Note that splint issues two warnings for the sample code I posted.

                      Actually, for the sake of brevity I omitted to mention that the test
                      program was not the one you provided, but a similar one I wrote. Below is
                      its source:

                      #include <stdio.h>

                      int main(void)
                      {
                          char *a = "Hello ";
                          char *b = "world!\n";

                          printf("%s%s", a, b);
                          *a = *b;        /* undefined behaviour: writes to a string literal */
                          *b = *(a+2);    /* likewise */
                          printf("%s%s", a, b);
                          return 0;
                      }

                      $ gcc -Wall -Wextra -ansi -pedantic -o t11_gcc t11.c
                      $ ./t11_gcc
                      Hello world!
                      Segmentation fault
                      $

                      $ lcc -ansic t11.c
                      $ gcc -o t11_lcc t11.o
                      [This is needed because lcc-linux32 does not yet do linking]
                      $ ./t11_lcc
                      Hello world!
                      wello lorld!
                      $

                      This seems to indicate that gcc places the string literals in read-only
                      storage while lcc-linux32 doesn't. Both are of course perfectly
                      conforming behaviour and the difference in behaviour is merely a QoI
                      issue.


                      • Tor Rustad

                        #12
                        Re: Allocating memory for strings

                        santosh wrote:
                        > Tor Rustad wrote:
                        >
                        >> santosh wrote:
                        >>
                        >> [...]
                        >>
                        >>> Interestingly, if the const qualifier is removed, compilation succeeds
                        >>> under gcc, but the executable terminates with a segmentation fault.
                        >>> This indicates that gcc places the strings in read-only storage.
                        >> Yes.
                        >>
                        >>> On the other hand under the lcc-linux32 compiler nothing unexpected
                        >>> happens. Apparently string literals are _not_ placed into read-only
                        >>> storage by lcc-linux32.
                        >> Did you get a warning with lcc-linux32? I don't have the lcc-linux32
                        >> compiler, but it could use the same storage location for those two
                        >> string literals, which is why I purposely used "Hello" for both.
                        >>
                        >> So something "unexpected" could still happen.
                        >>
                        >> Note that splint issues two warnings for the sample code I posted.
                        >
                        > Actually, for the sake of brevity I omitted to mention that the test
                        > program was not the one you provided, but a similar one I wrote. Below is
                        > its source:
                        >
                        > #include <stdio.h>
                        >
                        > int main(void)
                        > {
                        >     char *a = "Hello ";
                        >     char *b = "world!\n";
                        >
                        >     printf("%s%s", a, b);
                        >     *a = *b;
                        >     *b = *(a+2);
                        >     printf("%s%s", a, b);
                        >     return 0;
                        > }
                        >
                        > $ gcc -Wall -Wextra -ansi -pedantic -o t11_gcc t11.c
                        > $ ./t11_gcc
                        > Hello world!
                        > Segmentation fault
                        > $
                        >
                        > $ lcc -ansic t11.c
                        > $ gcc -o t11_lcc t11.o
                        > [This is needed because lcc-linux32 does not yet do linking]
                        > $ ./t11_lcc
                        > Hello world!
                        > wello lorld!
                        > $
                        >
                        > This seems to indicate that gcc places the string literals in read-only
                        > storage while lcc-linux32 doesn't. Both are of course perfectly
                        > conforming behaviour and the difference in behaviour is merely a QoI
                        > issue.

                        Could you try e.g. this:

                        char *a = "Hello";
                        char *b = "Hello";

                        and check whether lcc uses the same storage location?

                        --
                        Tor <torust [at] online [dot] no>


                        • santosh

                          #13
                          Re: Allocating memory for strings

                          Tor Rustad wrote:
                          > santosh wrote:
                          >> Tor Rustad wrote:
                          >>
                          >>> santosh wrote:
                          >>>
                          >>> [...]
                          >>>
                          >>>> Interestingly, if the const qualifier is removed, compilation
                          >>>> succeeds under gcc, but the executable terminates with a
                          >>>> segmentation fault. This indicates that gcc places the strings in
                          >>>> read-only storage.
                          >>> Yes.
                          >>>
                          >>>> On the other hand under the lcc-linux32 compiler nothing unexpected
                          >>>> happens. Apparently string literals are _not_ placed into read-only
                          >>>> storage by lcc-linux32.
                          >>> Did you get a warning with lcc-linux32? I don't have the lcc-linux32
                          >>> compiler, but it could use the same storage location for those two
                          >>> string literals, which is why I purposely used "Hello" for both.
                          >>>
                          >>> So something "unexpected" could still happen.
                          >>>
                          >>> Note that splint issues two warnings for the sample code I posted.
                          >>
                          >> Actually, for the sake of brevity I omitted to mention that the test
                          >> program was not the one you provided, but a similar one I wrote. Below
                          >> is its source:
                          >>
                          >> #include <stdio.h>
                          >>
                          >> int main(void)
                          >> {
                          >>     char *a = "Hello ";
                          >>     char *b = "world!\n";
                          >>
                          >>     printf("%s%s", a, b);
                          >>     *a = *b;
                          >>     *b = *(a+2);
                          >>     printf("%s%s", a, b);
                          >>     return 0;
                          >> }
                          >>
                          >> $ gcc -Wall -Wextra -ansi -pedantic -o t11_gcc t11.c
                          >> $ ./t11_gcc
                          >> Hello world!
                          >> Segmentation fault
                          >> $
                          >>
                          >> $ lcc -ansic t11.c
                          >> $ gcc -o t11_lcc t11.o
                          >> [This is needed because lcc-linux32 does not yet do linking]
                          >> $ ./t11_lcc
                          >> Hello world!
                          >> wello lorld!
                          >> $
                          >>
                          >> This seems to indicate that gcc places the string literals in
                          >> read-only storage while lcc-linux32 doesn't. Both are of course
                          >> perfectly conforming behaviour and the difference in behaviour is
                          >> merely a QoI issue.
                          >
                          > Could you try e.g. this:
                          >
                          > char *a = "Hello";
                          > char *b = "Hello";
                          >
                          > and check whether lcc uses the same storage location?

                          With your changes to the above program, and an additional line of the
                          form:

                          printf("a = %p\tb = %p\n", (void *)a, (void *)b);

                          I get for gcc:

                          $ ./t11_gcc
                          a = 0x80484e4 b = 0x80484e4
                          Segmentation fault
                          $

                          and for lcc:

                          $ ./t11_lcc
                          a = 0x80495dc b = 0x80495dc
                          HelloHellolellolello
                          $

                          So the same storage location is being used for both strings by both
                          compilers with the difference that gcc seems to be placing them in
                          read-only storage while lcc-linux32 places them in modifiable storage.

                          Of course since the code invokes undefined behaviour any result
                          is "correct."


                          • Tor Rustad

                            #14
                            Re: Allocating memory for strings

                            santosh wrote:
                            > Tor Rustad wrote:
                            > [...]
                            >> Could you try e.g. this:
                            >>
                            >> char *a = "Hello";
                            >> char *b = "Hello";
                            >>
                            >> and check whether lcc uses the same storage location?
                            >
                            > With your changes to the above program, and an additional line of the
                            > form:
                            >
                            > printf("a = %p\tb = %p\n", (void *)a, (void *)b);
                            >
                            > I get for gcc:
                            >
                            > $ ./t11_gcc
                            > a = 0x80484e4 b = 0x80484e4
                            > Segmentation fault
                            > $
                            >
                            > and for lcc:
                            >
                            > $ ./t11_lcc
                            > a = 0x80495dc b = 0x80495dc
                            > HelloHellolellolello
                            > $
                            >
                            > So the same storage location is being used for both strings by both
                            > compilers with the difference that gcc seems to be placing them in
                            > read-only storage while lcc-linux32 places them in modifiable storage.

                            Thanks santosh, I think this last example program illustrates quite well
                            to the OP the dangers of modifying string literals. The "unexpected" can
                            happen, including with the current version of the lcc compiler.

                            > Of course since the code invokes undefined behaviour any result
                            > is "correct."

                            Let us just call the result undefined, like 0/0 is in mathematics. :)

                            --
                            Tor <torust [at] online [dot] no>


                            • Charles Richmond

                              #15
                              Re: Allocating memory for strings

                              santosh wrote:
                              > Win Sock wrote:
                              >
                              >> Hi All,
                              >> somebody told me this morning that the following is leagal.
                              >>
                              >> char *a = "Hello wrold";
                              >>
                              >> The memory is automatically allocated on the fly. Is this correct?
                              >
                              > The storage for the string is set aside during translation and the
                              > pointer 'a' is set to point to its beginning.
                              >
                              > If the pointer is not static and no other pointers point to the string,
                              > then the string becomes irretrievable when 'a' goes out of scope.

                              If the pointer a goes out of scope or is set to a different
                              value, the string may *not* be irretrievable. Some compilers
                              store only *one* copy of each string literal and use that
                              one copy everywhere the literal appears in the source.

                              So it is possible that this literal may be accessed in other
                              ways than through pointer a.

                              --
                              +----------------------------------------------------------------+
                              | Charles and Francis Richmond richmond at plano dot net |
                              +----------------------------------------------------------------+
