confused: vector<char*> and malloc

  • Richard

    confused: vector<char*> and malloc

    vector<char*> m_Text;
    m_Text.resize(1);
    char* foo = "FOO";
    char* bar = "BAR";
    char* foobar = (char*)malloc(strlen(foo) + strlen(bar) + 1);
    if (foobar)
    {
        strcpy(foobar, foo);
        strcat(foobar, bar);
    }
    m_Text[0] = foobar;

    // Will m_Text[0] get freed when m_Text goes out of scope? If not, should I
    call free(m_Text[0]) in the destructor?



  • Victor Bazarov

    #2
    Re: confused: vector<char*> and malloc

    Richard wrote:
    > vector<char*> m_Text;
    > m_Text.resize(1);

    This makes 'm_Text' one element long. Since 'm_Text' was empty before
    that, the resize adds one pointer to it and makes that pointer null.

    > char* foo = "FOO";
    > char* bar = "BAR";
    > char* foobar = (char*)malloc(strlen(foo) + strlen(bar) + 1);
    > if (foobar)
    > {
    > strcpy(foobar, foo);
    > strcat(foobar, bar);
    > }

    At this point 'foobar' is a pointer to 7 character array, allocated in
    the free store. The array has 'F', 'O', 'O', 'O', 'B', 'A', 'R', '\0'
    in it, in sequence.

    > m_Text[0] = foobar;

    This makes the value of the only element in the vector 'm_Text' the
    same as the pointer to that 7-character array.

    > // Will m_Text[0] get freed when m_Text goes out of scope?

    No.

    > If not, should I
    > call free(m_Text[0]) in the destructor?

    Probably. Assuming you're talking about the destructor of the class in
    which 'm_Text' is a data member.
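
A minimal sketch of what such a destructor could look like. The class name TextHolder and the helpers AddJoined, At and Size are made up for illustration; only the member name m_Text comes from the original post. Copying is disabled here on the assumption that two objects must never free the same pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical owner class; only the member name m_Text comes from the post.
class TextHolder {
public:
    TextHolder() {}

    // The vector destroys its char* elements, not the memory they point to,
    // so every malloc'ed block has to be released by hand.
    ~TextHolder() {
        for (std::size_t i = 0; i != m_Text.size(); ++i)
            std::free(m_Text[i]);  // free() on a null pointer does nothing
    }

    // Stores a malloc'ed copy of a followed by b.
    // Returns false if malloc fails.
    bool AddJoined(const char* a, const char* b) {
        assert(a && b);
        // Grow the vector first: if push_back throws, nothing is leaked.
        m_Text.push_back(static_cast<char*>(0));
        char* joined = static_cast<char*>(
            std::malloc(std::strlen(a) + std::strlen(b) + 1));
        if (!joined) {
            m_Text.pop_back();
            return false;
        }
        std::strcpy(joined, a);
        std::strcat(joined, b);
        m_Text.back() = joined;
        return true;
    }

    const char* At(std::size_t i) const { return m_Text[i]; }
    std::size_t Size() const { return m_Text.size(); }

private:
    // Copying would make two objects free the same pointers, so forbid it
    // (declared but not defined, the pre-C++11 way).
    TextHolder(const TextHolder&);
    TextHolder& operator=(const TextHolder&);

    std::vector<char*> m_Text;
};
```

With this, a call such as holder.AddJoined("FOO", "BAR") leaves "FOOBAR" in the vector, and the block is released when holder goes out of scope.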

    V


    • Kyle

      #3
      Re: confused: vector<char*> and malloc

      Richard wrote:
      > vector<char*> m_Text;
      > m_Text.resize(1);
      > char* foo = "FOO";
      > char* bar = "BAR";
      > char* foobar = (char*)malloc(strlen(foo) + strlen(bar) + 1);

      malloc is C; why don't you use new?

      > if (foobar)
      > {
      > strcpy(foobar, foo);
      > strcat(foobar, bar);
      > }
      > m_Text[0] = foobar;
      >
      > // Will m_Text[0] get freed when m_Text goes out of scope? If not, should I
      > call free(m_Text[0]) in the destructor?

      Try this if you don't want to manage memory on your own:

      vector<string> m_Text;

      char* foo = "FOO";
      char* bar = "BAR";

      string foobar(foo);
      foobar.append( bar );

      m_Text.push_back( foobar );


      • Ali Çehreli

        #4
        Re: confused: vector<char*> and malloc

        "Kyle" <invalid@e.mail.com> wrote in message
        news:ddtj91$c16$1@nemesis.news.tpi.pl...
        > Richard wrote:
        >> vector<char*> m_Text;
        >> m_Text.resize(1);
        >> char* foo = "FOO";
        >> char* bar = "BAR";
        >> char* foobar = (char*)malloc(strlen(foo) + strlen(bar) + 1);
        >
        > malloc is C; why don't you use new?

        malloc is C++ too.

        It is used to allocate raw memory in C++. Since new (and new[]) allocates
        space and constructs object(s), new (and new[]) is not suitable when there
        is not enough information to construct the object(s).
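
To make the "allocate raw memory, construct later" point concrete, here is a small sketch. The function name RawStorageDemo is made up for illustration, and using a std::string as the object is only an assumption to keep the example short; in real code a plain std::string would do the job without any of this.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <string>

typedef std::string String;

// Builds a string in malloc'ed storage, reads its length, then tears it down.
// Returns the length of the constructed string, or -1 if malloc failed.
long RawStorageDemo(const char* text) {
    assert(text != 0);

    // Step 1: raw memory only. No String object exists yet.
    void* raw = std::malloc(sizeof(String));
    if (!raw)
        return -1;

    // Step 2: construct the object in that memory once the value is known.
    String* s = 0;
    try {
        s = new (raw) String(text);  // placement new: constructs, does not allocate
    } catch (...) {
        std::free(raw);              // the constructor threw: give the memory back
        throw;
    }
    long length = static_cast<long>(s->size());

    // Step 3: undo both steps in reverse order.
    s->~String();    // run the destructor explicitly
    std::free(raw);  // then release the raw memory
    return length;
}
```

An ordinary new expression does steps 1 and 2 in one go, which is why it needs the constructor arguments up front.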

        > Try this if you don't want to manage memory on your own:
        >
        > vector<string> m_Text;

        Great advice!

        Ali


        • Richard

          #5
          Re: confused: vector<char*> and malloc

          "Kyle" <invalid@e.mail.com> wrote in message
          news:ddtj91$c16$1@nemesis.news.tpi.pl...
          > Richard wrote:
          > > vector<char*> m_Text;
          > > m_Text.resize(1);
          > > char* foo = "FOO";
          > > char* bar = "BAR";
          > > char* foobar = (char*)malloc(strlen(foo) + strlen(bar) + 1);
          >
          > malloc is C; why don't you use new?
          >
          > > if (foobar)
          > > {
          > > strcpy(foobar, foo);
          > > strcat(foobar, bar);
          > > }
          > > m_Text[0] = foobar;
          > >
          > > // Will m_Text[0] get freed when m_Text goes out of scope? If not, should I
          > > call free(m_Text[0]) in the destructor?
          >
          > Try this if you don't want to manage memory on your own:
          >
          > vector<string> m_Text;
          >
          > char* foo = "FOO";
          > char* bar = "BAR";
          >
          > string foobar(foo);
          > foobar.append( bar );
          >
          > m_Text.push_back( foobar );

          That leads me to another question. Is it common practice to pass std::string
          as a function argument? Is there much overhead?

          void Foo(string& str)

          If I went with vector<string>, would I need to convert all my functions that
          currently use char* as an argument?

          Thanks.



          • Victor Bazarov

            #6
            Re: confused: vector<char*> and malloc

            Richard wrote:
            > [...] Is it common practice to pass std::string
            > as a function argument?

            By value, no. By reference, or by reference to const, yes.

            > Is there much overhead?

            There can be, just like with any other user-defined type (UDT).

            > void Foo(string& str)
            >
            > If I went with vector<string>, would I need to convert all my functions that
            > currently use char* as an argument?

            Yes, most likely. BTW, if your functions that currently use 'char*' do
            not change the characters in the arrays, you should declare 'char const*'
            as the argument type. Then you don't have to change much, but you will
            need to use 'c_str' member:

            void my_function(char const*);
            ...
            my_function(v[i].c_str());
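
Spelled out a little more, as a sketch only: legacy_length stands in for one of your existing read-only 'char*' functions and total_length for new code written against vector<string>. Both names are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Existing C-style function: it only reads the characters,
// so its parameter can be char const* and it needs no other change.
std::size_t legacy_length(char const* text) {
    assert(text != 0);
    return std::strlen(text);
}

// New-style function: taking the vector by reference to const
// means no strings are copied for the call.
std::size_t total_length(std::vector<std::string> const& v) {
    std::size_t total = 0;
    for (std::size_t i = 0; i != v.size(); ++i)
        total += legacy_length(v[i].c_str());  // c_str() bridges to the old interface
    return total;
}
```

The pointer returned by c_str() is only good until the string is modified or destroyed, so the old functions should use it and not keep it.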

            V


            • Richard

              #7
              Re: confused: vector<char*> and malloc

              "Victor Bazarov" <v.Abazarov@comAcast.net> wrote in message
              news:X%sMe.30145$Tf5.10167@newsread1.mlpsca01.us.to.verio.net...
              > Richard wrote:
              > > [...] Is it common practice to pass std::string
              > > as a function argument?
              >
              > By value, no. By a reference, or by a reference to const, yes.
              >
              > > Is there much overhead?
              >
              > There can be. Just like with any other UDT.
              >
              > > void Foo(string& str)
              > >
              > > If I went with vector<string>, would I need to convert all my functions that
              > > currently use char* as an argument?
              >
              > Yes, most likely. BTW, if your functions that currently use 'char*' do
              > not change the characters in the arrays, you should declare 'char const*'
              > as the argument type. Then you don't have to change much, but you will
              > need to use 'c_str' member:
              >
              > void my_function(char const*);
              > ...
              > my_function(v[i].c_str());
              >
              > V


              My array of strings is very large. I did a small test using char* vs string
              and the results were really bad. I must be doing something wrong:

              vector<char*> test;
              test.resize(10000000);
              for(int t = 0; t != 10000000; ++t)
              {
                  test[t] = "TEST 123";
              }

              That takes up a minimal amount of memory if using char*. But if you change
              it to vector<string> test; it takes up around 1000 times more memory! What
              am I doing wrong?



              • Victor Bazarov

                #8
                Re: confused: vector<char*> and malloc

                Richard wrote:
                > My array of strings is very large. I did a small test using char* vs string
                > and the results were really bad. I must be doing something wrong:
                >
                > vector<char*> test;
                > test.resize(10000000);
                > for(int t = 0; t != 10000000; ++t)
                > {
                > test[t] = "TEST 123";
                > }
                >
                > That takes up a minimal amount of memory if using char*.

                Yeah... You have a vector whose elements are all the same. No extra
                memory is allocated, just 10 million pointers. The memory consumption
                (not considering the overhead for dynamic memory management) is quite easy
                to calculate:

                sizeof(vector<char*>) +
                sizeof(char*) * test.size() + sizeof("TEST 123")

                (which should give about 40000000 bytes with 4-byte pointers, give or
                take a few bytes)

                > But if you change
                > it to vector<string> test; it takes up around 1000 times more memory! What
                > am I doing wrong?

                I am not sure, to be honest with you. Every 'string' maintains its own
                array of char internally. When you resize the 'test' vector to contain
                ten million 'string' objects, it first puts a bunch of empty ones
                there, and then when you assign those values in the loop, every string
                in the vector needs to allocate its own small array (and possibly a bit
                larger than asked for), which may lead to severe fragmentation of memory,
                especially considering that a temporary is probably created to accommodate
                your "TEST 123" literal... The final memory cost should be around

                sizeof(vector<string>) +
                sizeof(string) * test.size() +
                sizeof("TEST 123") * test.size()

                It is hard to believe it takes "around 1000 times more memory". The
                string objects themselves are not that big, so you might be looking at
                four, maybe ten, times the memory consumption, but definitely not a
                thousand times. Are you running on a 64-bit machine? 1000 times more
                with a vector of 10 million pointers is beyond what a 32-bit machine can
                handle, that's for sure.
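
The two estimates can be turned into code if you want to plug in your own platform's sizes. This is only a sketch: the function names are made up, and like the formulas it ignores heap bookkeeping and any extra capacity a string may reserve.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Estimate for the char* version: count pointers plus one shared literal.
std::size_t EstimatePointerBytes(std::size_t count) {
    return sizeof(std::vector<char*>)
         + sizeof(char*) * count
         + sizeof("TEST 123");  // 9 bytes, including the terminating '\0'
}

// Estimate for the string version: one object and one private copy per element.
std::size_t EstimateStringBytes(std::size_t count) {
    const std::size_t per_element = sizeof(std::string) + sizeof("TEST 123");
    // Guard against overflow of std::size_t for absurdly large counts.
    assert(count <= (static_cast<std::size_t>(-1)
                     - sizeof(std::vector<std::string>)) / per_element);
    return sizeof(std::vector<std::string>) + per_element * count;
}
```

Dividing EstimateStringBytes(10000000) by EstimatePointerBytes(10000000) gives the expected ratio for a particular compiler and library; the exact figure depends on sizeof(string) and sizeof(char*) there.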

                V


                • Default User

                  #9
                  Re: confused: vector<char*> and malloc

                  Richard wrote:

                  > My array of strings is very large. I did a small test using char* vs
                  > string and the results were really bad. I must be doing something
                  > wrong:
                  >
                  > vector<char*> test;
                  > test.resize(10000000);
                  > for(int t = 0; t != 10000000; ++t)
                  > {
                  > test[t] = "TEST 123";
                  > }
                  >
                  > That takes up a minimal amount of memory if using char*. But if you
                  > change it to vector<string> test; it takes up around 1000 times more
                  > memory! What am I doing wrong?


                  You stuffed 10 million copies of a pointer to the same piece of memory
                  into the vector. In the real application, you would presumably have a
                  different string in each slot in the vector.

                  To do a fairer comparison:

                  vector<char*> test;
                  test.resize(10000000);
                  char* tmp = "TEST 123";

                  for(int t = 0; t != 10000000; ++t)
                  {
                      test[t] = new char[9];
                      strcpy(test[t], tmp);
                  }




                  Brian


                  • Nick Keighley

                    #10
                    Re: confused: vector<char*> and malloc

                    Victor Bazarov wrote:
                    > Richard wrote:

                    > > vector<char*> m_Text;
                    > > m_Text.resize(1);

                    <snip>

                    > > char* foo = "FOO";
                    > > char* bar = "BAR";
                    > > char* foobar = (char*)malloc(strlen(foo) + strlen(bar) + 1);
                    > > if (foobar)
                    > > {
                    > > strcpy(foobar, foo);
                    > > strcat(foobar, bar);
                    > > }
                    >
                    > At this point 'foobar' is a pointer to 7 character array, allocated in
                    > the free store. The array has 'F', 'O', 'O', 'O', 'B', 'A', 'R', '\0'
                    > in it, in sequence.

                    'F', 'O', 'O', 'B', 'A', 'R', '\0'

                    <snip>


                    • Stuart MacMartin

                      #11
                      Re: confused: vector<char*> and malloc

                      > It is hard to believe it takes "around 1000 times more memory".

                      The pointer case:

                      sizeof(vector<char*>) + sizeof(char*) * test.size() + sizeof("TEST 123")
                      = 16 + 4 * 10,000,000 + 9

                      The string case: each string, since it's given a const char*, will
                      make a copy, I assume (sorry, I don't use std::string but that would be
                      reasonable behaviour: share string only if it can reference count the
                      memory)

                      So we have:
                      sizeof(vector<string>) + sizeof(string) * test.size()
                      + sizeof("TEST 123") * test.size() + heap overhead * test.size()

                      Sorry, don't know size of string, but perhaps 12 bytes (ref count,
                      length, alloc length)
                      On PC, request for 8 bytes requires 40 bytes including heap overhead.
                      I don't recall the overhead on linux, but it's less. Perhaps this uses
                      24 bytes.

                      So our guess of memory usage is:
                      16 + (12 + 40) * 10,000,000 on PC, or about 13 times the amount of
                      memory needed by your single pointer case.

                      If you are seeing something significantly different then something is
                      odd.
                      Perhaps 1,000,000 vs. 10,000,000: easy typo.
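
For anyone who wants to check the arithmetic, here it is as a sketch. Every constant is one of the guesses from this post (16-byte vector, 4-byte pointer, 12-byte string, 40 bytes per small heap block), not a measured value, and the names are made up.

```cpp
#include <cassert>

// Every constant here is a guess taken from the post, not a measurement.
const unsigned long kCount          = 10000000UL;  // elements in the vector
const unsigned long kVectorBytes    = 16;  // guessed size of the vector object
const unsigned long kPointerBytes   = 4;   // sizeof(char*) on a 32-bit PC
const unsigned long kLiteralBytes   = 9;   // sizeof("TEST 123"), including '\0'
const unsigned long kStringBytes    = 12;  // guessed sizeof(string)
const unsigned long kHeapBlockBytes = 40;  // guessed cost of one small allocation

// 16 + 4 * 10,000,000 + 9
unsigned long PointerCaseBytes() {
    return kVectorBytes + kPointerBytes * kCount + kLiteralBytes;
}

// 16 + (12 + 40) * 10,000,000
unsigned long StringCaseBytes() {
    return kVectorBytes + (kStringBytes + kHeapBlockBytes) * kCount;
}

// How many times more memory the string case needs under these guesses.
double Ratio() {
    assert(PointerCaseBytes() != 0);
    return static_cast<double>(StringCaseBytes())
         / static_cast<double>(PointerCaseBytes());
}
```

That works out to 40,000,025 bytes against 520,000,016 bytes, a ratio just under 13.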

                      Stuart


                      • Stuart MacMartin

                        #12
                        Re: confused: vector<char*> and malloc

                        >> malloc is C; why don't you use new?
                        > malloc is C++ too.

                        malloc is there for compatibility. It doesn't handle objects so is less
                        general.

                        It requires a corresponding free() whereas new requires corresponding
                        delete or delete []. Since you need to use new elsewhere in your code,
                        why risk using the wrong free/delete call? You just confuse the issue
                        by intermixing malloc and new and this can cause a bug that takes
                        months to find (experience talking). What if you change a structure to
                        something requiring a destructor (e.g. you change a const char* member
                        variable to a string)? Do you want to go through all your code to
                        change malloc/free to new/delete, assuming you even notice the problem?

                        malloc is dangerous.

                        Stuart


                        • Old Wolf

                          #13
                          Re: confused: vector<char*> and malloc

                          Richard wrote:
                          >
                          > My array of strings is very large. I did a small test using char* vs string
                          > and the results were really bad. I must be doing something wrong:
                          >
                          > vector<char*> test;
                          > test.resize(10000000);
                          > for(int t = 0; t != 10000000; ++t)
                          > {
                          > test[t] = "TEST 123";
                          > }
                          >
                          > That takes up a minimal amount of memory if using char*. But if you change
                          > it to vector<string> test; it takes up around 1000 times more memory! What
                          > am I doing wrong?

                          You are comparing apples with oranges. This test program maintains
                          one string and keeps 10 million pointers to it. If you modify one
                          string then they will all change. But with the string example, there
                          are 10 million fat pointers and 10 million strings.

                          I doubt it takes 1000 times more memory, unless you have 40 gigabytes
                          of RAM, as Victor pointed out.

                          Change your test program to:

                          vector<char*> test;
                          test.resize(10000000);
                          for(int t = 0; t != 10000000; ++t)
                          {
                              test[t] = (char *)malloc(9);
                              std::strcpy( test[t], "TEST 123" );
                          }

                          and then compare the memory usage. (You will probably find
                          this example slightly smaller than the string example, but
                          not by a lot).

                          Finally, what compiler do you use? Many standard libraries
                          use SSO (Small String Optimisation), meaning that if the
                          string data is smaller than sizeof(string), then it actually
                          stores the entire string internally, without having to
                          allocate more memory. In this case, the string example
                          might even use less memory than the malloc example.
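
One rough way to see whether your library does this is to check where a string keeps its characters. The sketch below is an assumption-laden probe, not a documented interface: the names StoredInline and CheckLongStringIsNotInline are made up, and the test only looks at whether the data pointer falls inside the string object itself.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Returns true if the character data of s lives inside the string object
// itself, which is what a small-string optimisation would do.
bool StoredInline(const std::string& s) {
    const char* begin = reinterpret_cast<const char*>(&s);
    const char* end = begin + sizeof(std::string);
    const char* data = s.data();
    // std::less gives a well-defined order even for unrelated pointers.
    std::less<const char*> before;
    return !before(data, begin) && before(data, end);
}

// Sanity check: a string far longer than sizeof(std::string)
// can never fit inside the object.
void CheckLongStringIsNotInline() {
    std::string big(1000, 'x');
    assert(!StoredInline(big));
}
```

Calling StoredInline(std::string("TEST 123")) then tells you which case applies on your setup; the answer differs between libraries and versions.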


                          • Ali Çehreli

                            #14
                            Re: confused: vector<char*> and malloc

                            "Stuart MacMartin" <sjm@igs.net> wrote in message
                            news:1124551155.857118.190770@g14g2000cwa.googlegroups.com...
                            >>> malloc is C; why don't you use new?
                            >> malloc is C++ too.
                            >
                            > malloc is there for compatibility. It doesn't handle objects so is less
                            > general.

                            That's why I said that it's used to allocate raw memory. A quote from my
                            e-mail:

                            <quote>
                            It is used to allocate raw memory in C++.
                            </quote>

                            > It requires a corresponding free() whereas new requires corresponding
                            > delete or delete []. Since you need to use new elsewhere in your code,
                            > why risk using the wrong free/delete call?

                            Because like any good C++ code, my code doesn't contain a single delete or
                            delete[]. Dynamic objects are handled by smart pointers (the RAII idiom).
                            Code like mine is immune to such problems.

                            Having said that, I don't use malloc either.
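
For code that does have to own a malloc'ed buffer, the same RAII idea still applies. A sketch, assuming a C++11 compiler for std::unique_ptr; FreeDeleter, MallocString and Join are names invented for this example.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <vector>

// Deleter that releases malloc'ed memory with free() instead of delete.
struct FreeDeleter {
    void operator()(char* p) const { std::free(p); }
};

// Owning pointer to a malloc'ed, null-terminated character buffer.
typedef std::unique_ptr<char, FreeDeleter> MallocString;

// Returns an owning pointer to a malloc'ed copy of a followed by b.
// The result is null if malloc fails.
MallocString Join(const char* a, const char* b) {
    assert(a && b);
    MallocString joined(static_cast<char*>(
        std::malloc(std::strlen(a) + std::strlen(b) + 1)));
    if (joined) {
        std::strcpy(joined.get(), a);
        std::strcat(joined.get(), b);
    }
    return joined;
}
```

A std::vector<MallocString> filled with text.push_back(Join("FOO", "BAR")) then frees every buffer when the vector is destroyed, with no free() call in user code and no chance of pairing malloc with delete.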

                            > You just confuse the issue
                            > by intermixing malloc and new and this can cause a bug that takes
                            > months to find (experience talking). What if you change a structure to
                            > something requiring a destructor (e.g. you change a const char* member
                            > variable to a string).

                            No, malloc is for raw buffers only. structs are not necessarily raw buffers.
                            For example, I don't keep POD structs around either. Probably all of them
                            have constructors.

                            > malloc is dangerous.

                            Sorry, I never heard that one before and I don't agree with the statement.
                            (A search for "malloc is dangerous" on Google finds only your statement.) I
                            agree that we can do bad things with malloc, but malloc would not be
                            dangerous alone.

                            Ali
