Efficient strcat implementation.

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Mahesh

    Efficient strcat implementation.

    Hi,

    I am looking for efficient string cancatination c code. I did it using
    ptr [using default method] but my code i think is not efficient.

    Please help.

    Thanks a lot
  • Ian Collins

    #2
    Re: Efficient strcat implementation.

    Mahesh wrote:
    Hi,
    >
    I am looking for efficient string cancatination c code. I did it using
    ptr [using default method] but my code i think is not efficient.
    >
    Please help.
    >
    Show us what you have.

    --
    Ian Collins.

    Comment

    • Joachim Schmitz

      #3
      Re: Efficient strcat implementation.

      Mahesh wrote:
      I am looking for efficient string cancatination c code. I did it using
      ptr [using default method] but my code i think is not efficient.
      Just use the strcat your implementation provides for you. On any decent
      implementation it is pretty unlikely that you can come up with something
      more efficient.

      Bye, Jojo


      Comment

      • Ben Pfaff

        #4
        Re: Efficient strcat implementation.

        Mahesh <mmp.cse@gmail. comwrites:
        I am looking for efficient string cancatination c code. I did it using
        ptr [using default method] but my code i think is not efficient.
        One possible implementation:
        strcpy(strchr(d st, '\0'), src);

        But here is what the GNU libc manual says about strcat:

        ----------------------------------------------------------------------

        Programmers using the `strcat' or `wcscat' function (or the
        following `strncat' or `wcsncar' functions for that matter) can easily
        be recognized as lazy and reckless. In almost all situations the
        lengths of the participating strings are known (it better should be
        since how can one otherwise ensure the allocated size of the buffer is
        sufficient?) Or at least, one could know them if one keeps track of the
        results of the various function calls. But then it is very inefficient
        to use `strcat'/`wcscat'. A lot of time is wasted finding the end of
        the destination string so that the actual copying can start. This is a
        common example:

        /* This function concatenates arbitrarily many strings. The last
        parameter must be `NULL'. */
        char *
        concat (const char *str, ...)
        {
        va_list ap, ap2;
        size_t total = 1;
        const char *s;
        char *result;

        va_start (ap, str);
        /* Actually `va_copy', but this is the name more gcc versions
        understand. */
        __va_copy (ap2, ap);

        /* Determine how much space we need. */
        for (s = str; s != NULL; s = va_arg (ap, const char *))
        total += strlen (s);

        va_end (ap);

        result = (char *) malloc (total);
        if (result != NULL)
        {
        result[0] = '\0';

        /* Copy the strings. */
        for (s = str; s != NULL; s = va_arg (ap2, const char *))
        strcat (result, s);
        }

        va_end (ap2);

        return result;
        }

        This looks quite simple, especially the second loop where the strings
        are actually copied. But these innocent lines hide a major performance
        penalty. Just imagine that ten strings of 100 bytes each have to be
        concatenated. For the second string we search the already stored 100
        bytes for the end of the string so that we can append the next string.
        For all strings in total the comparisons necessary to find the end of
        the intermediate results sums up to 5500! If we combine the copying
        with the search for the allocation we can write this function more
        efficient:

        char *
        concat (const char *str, ...)
        {
        va_list ap;
        size_t allocated = 100;
        char *result = (char *) malloc (allocated);

        if (result != NULL)
        {
        char *newp;
        char *wp;

        va_start (ap, str);

        wp = result;
        for (s = str; s != NULL; s = va_arg (ap, const char *))
        {
        size_t len = strlen (s);

        /* Resize the allocated memory if necessary. */
        if (wp + len + 1 result + allocated)
        {
        allocated = (allocated + len) * 2;
        newp = (char *) realloc (result, allocated);
        if (newp == NULL)
        {
        free (result);
        return NULL;
        }
        wp = newp + (wp - result);
        result = newp;
        }

        wp = mempcpy (wp, s, len);
        }

        /* Terminate the result string. */
        *wp++ = '\0';

        /* Resize memory to the optimal size. */
        newp = realloc (result, wp - result);
        if (newp != NULL)
        result = newp;

        va_end (ap);
        }

        return result;
        }

        With a bit more knowledge about the input strings one could fine-tune
        the memory allocation. The difference we are pointing to here is that
        we don't use `strcat' anymore. We always keep track of the length of
        the current intermediate result so we can safe us the search for the
        end of the string and use `mempcpy'. Please note that we also don't
        use `stpcpy' which might seem more natural since we handle with
        strings. But this is not necessary since we already know the length of
        the string and therefore can use the faster memory copying function.
        The example would work for wide characters the same way.

        Whenever a programmer feels the need to use `strcat' she or he
        should think twice and look through the program whether the code cannot
        be rewritten to take advantage of already calculated results. Again: it
        is almost always unnecessary to use `strcat'.

        --
        "Some people *are* arrogant, and others read the FAQ."
        --Chris Dollin

        Comment

        • Ben Bacarisse

          #5
          Re: Efficient strcat implementation.

          Ben Pfaff <blp@cs.stanfor d.eduwrites:
          Mahesh <mmp.cse@gmail. comwrites:
          >
          >I am looking for efficient string cancatination c code. I did it using
          >ptr [using default method] but my code i think is not efficient.
          >
          One possible implementation:
          strcpy(strchr(d st, '\0'), src);
          >
          But here is what the GNU libc manual says about strcat:
          <snip>
          char *
          concat (const char *str, ...)
          {
          va_list ap;
          size_t allocated = 100;
          char *result = (char *) malloc (allocated);
          >
          if (result != NULL)
          {
          char *newp;
          char *wp;
          >
          va_start (ap, str);
          >
          wp = result;
          for (s = str; s != NULL; s = va_arg (ap, const char *))
          {
          size_t len = strlen (s);
          >
          /* Resize the allocated memory if necessary. */
          if (wp + len + 1 result + allocated)
          {
          allocated = (allocated + len) * 2;
          newp = (char *) realloc (result, allocated);
          if (newp == NULL)
          {
          free (result);
          return NULL;
          }
          wp = newp + (wp - result);
          result = newp;
          It is probably worth pointing out (since there was a thread about this
          recently) that the above should really use something like:

          size_t offset = wp - result;
          allocated = (allocated + len) * 2;
          newp = (char *) realloc (result, allocated);
          if (newp == NULL)
          {
          free (result);
          return NULL;
          }
          wp = newp + offset;
          result = newp;

          to avoid the UB of using the old pointer.
          }
          >
          wp = mempcpy (wp, s, len);
          }
          >
          /* Terminate the result string. */
          *wp++ = '\0';
          >
          /* Resize memory to the optimal size. */
          newp = realloc (result, wp - result);
          if (newp != NULL)
          result = newp;
          >
          va_end (ap);
          }
          >
          return result;
          }
          --
          Ben.

          Comment

          • Richard

            #6
            Re: Efficient strcat implementation.

            Ben Bacarisse <ben.usenet@bsb .me.ukwrites:
            Ben Pfaff <blp@cs.stanfor d.eduwrites:
            >
            >Mahesh <mmp.cse@gmail. comwrites:
            >>
            >>I am looking for efficient string cancatination c code. I did it using
            >>ptr [using default method] but my code i think is not efficient.
            >>
            >One possible implementation:
            > strcpy(strchr(d st, '\0'), src);
            >>
            >But here is what the GNU libc manual says about strcat:
            <snip>
            > char *
            > concat (const char *str, ...)
            > {
            > va_list ap;
            > size_t allocated = 100;
            > char *result = (char *) malloc (allocated);
            >>
            > if (result != NULL)
            > {
            > char *newp;
            > char *wp;
            >>
            > va_start (ap, str);
            >>
            > wp = result;
            > for (s = str; s != NULL; s = va_arg (ap, const char *))
            > {
            > size_t len = strlen (s);
            >>
            > /* Resize the allocated memory if necessary. */
            > if (wp + len + 1 result + allocated)
            > {
            > allocated = (allocated + len) * 2;
            > newp = (char *) realloc (result, allocated);
            > if (newp == NULL)
            > {
            > free (result);
            > return NULL;
            > }
            > wp = newp + (wp - result);
            > result = newp;
            >
            It is probably worth pointing out (since there was a thread about this
            recently) that the above should really use something like:
            >
            size_t offset = wp - result;
            allocated = (allocated + len) * 2;
            newp = (char *) realloc (result, allocated);
            Why do you cast the realloc? Is it different in that respect from
            malloc?

            Comment

            • Ben Bacarisse

              #7
              Re: Efficient strcat implementation.

              Richard <devr_@gmail.co mwrites:
              Ben Bacarisse <ben.usenet@bsb .me.ukwrites:
              >
              >Ben Pfaff <blp@cs.stanfor d.eduwrites:
              >>But here is what the GNU libc manual says about strcat:
              ><snip>
              <snip>
              >> char *result = (char *) malloc (allocated);
              >> newp = (char *) realloc (result, allocated);
              <snip>
              >
              Why do you cast the realloc?
              I don't. The code's original author did for reasons one can only
              speculate on.
              Is it different in that respect from malloc?
              No (and they also did it for the malloc call).

              --
              Ben.

              Comment

              • jacob navia

                #8
                Re: Efficient strcat implementation.

                Ben Pfaff wrote:
                Mahesh <mmp.cse@gmail. comwrites:
                >
                >I am looking for efficient string cancatination c code. I did it using
                >ptr [using default method] but my code i think is not efficient.
                >
                One possible implementation:
                strcpy(strchr(d st, '\0'), src);
                >
                But here is what the GNU libc manual says about strcat:
                >
                ----------------------------------------------------------------------
                >
                Programmers using the `strcat' or `wcscat' function (or the
                following `strncat' or `wcsncar' functions for that matter) can easily
                be recognized as lazy and reckless. In almost all situations the
                lengths of the participating strings are known (it better should be
                since how can one otherwise ensure the allocated size of the buffer is
                sufficient?) Or at least, one could know them if one keeps track of the
                results of the various function calls. But then it is very inefficient
                to use `strcat'/`wcscat'. A lot of time is wasted finding the end of
                the destination string so that the actual copying can start. This is a
                common example:
                >
                /* This function concatenates arbitrarily many strings. The last
                parameter must be `NULL'. */
                char *
                concat (const char *str, ...)
                {
                va_list ap, ap2;
                size_t total = 1;
                const char *s;
                char *result;
                >
                va_start (ap, str);
                /* Actually `va_copy', but this is the name more gcc versions
                understand. */
                __va_copy (ap2, ap);
                >
                /* Determine how much space we need. */
                for (s = str; s != NULL; s = va_arg (ap, const char *))
                total += strlen (s);
                >
                va_end (ap);
                >
                result = (char *) malloc (total);
                if (result != NULL)
                {
                result[0] = '\0';
                >
                /* Copy the strings. */
                for (s = str; s != NULL; s = va_arg (ap2, const char *))
                strcat (result, s);
                }
                >
                va_end (ap2);
                >
                return result;
                }
                >
                This looks quite simple, especially the second loop where the strings
                are actually copied. But these innocent lines hide a major performance
                penalty. Just imagine that ten strings of 100 bytes each have to be
                concatenated. For the second string we search the already stored 100
                bytes for the end of the string so that we can append the next string.
                For all strings in total the comparisons necessary to find the end of
                the intermediate results sums up to 5500! If we combine the copying
                with the search for the allocation we can write this function more
                efficient:
                >
                char *
                concat (const char *str, ...)
                {
                va_list ap;
                size_t allocated = 100;
                char *result = (char *) malloc (allocated);
                >
                if (result != NULL)
                {
                char *newp;
                char *wp;
                >
                va_start (ap, str);
                >
                wp = result;
                for (s = str; s != NULL; s = va_arg (ap, const char *))
                {
                size_t len = strlen (s);
                >
                /* Resize the allocated memory if necessary. */
                if (wp + len + 1 result + allocated)
                {
                allocated = (allocated + len) * 2;
                newp = (char *) realloc (result, allocated);
                if (newp == NULL)
                {
                free (result);
                return NULL;
                }
                wp = newp + (wp - result);
                result = newp;
                }
                >
                wp = mempcpy (wp, s, len);
                }
                >
                /* Terminate the result string. */
                *wp++ = '\0';
                >
                /* Resize memory to the optimal size. */
                newp = realloc (result, wp - result);
                if (newp != NULL)
                result = newp;
                >
                va_end (ap);
                }
                >
                return result;
                }
                >
                With a bit more knowledge about the input strings one could fine-tune
                the memory allocation. The difference we are pointing to here is that
                we don't use `strcat' anymore. We always keep track of the length of
                the current intermediate result so we can safe us the search for the
                end of the string and use `mempcpy'. Please note that we also don't
                use `stpcpy' which might seem more natural since we handle with
                strings. But this is not necessary since we already know the length of
                the string and therefore can use the faster memory copying function.
                The example would work for wide characters the same way.
                >
                Whenever a programmer feels the need to use `strcat' she or he
                should think twice and look through the program whether the code cannot
                be rewritten to take advantage of already calculated results. Again: it
                is almost always unnecessary to use `strcat'.
                >
                The problem is not a lazy programmer but
                zero terminated strings!

                Using counted strings is much more efficient


                --
                jacob navia
                jacob at jacob point remcomp point fr
                logiciels/informatique

                Comment

                • Willem

                  #9
                  Re: Efficient strcat implementation.

                  jacob wrote:
                  ) The problem is not a lazy programmer but
                  ) zero terminated strings!
                  )
                  ) Using counted strings is much more efficient

                  In some cases, yes.


                  SaSW, Willem
                  --
                  Disclaimer: I am in no way responsible for any of the statements
                  made in the above text. For all I know I might be
                  drugged or something..
                  No I'm not paranoid. You all think I'm paranoid, don't you !
                  #EOT

                  Comment

                  • Ben Pfaff

                    #10
                    Re: Efficient strcat implementation.

                    Richard <devr_@gmail.co mwrites:
                    Ben Bacarisse <ben.usenet@bsb .me.ukwrites:
                    >
                    >Ben Pfaff <blp@cs.stanfor d.eduwrites:
                    >> newp = (char *) realloc (result, allocated);
                    >>
                    >It is probably worth pointing out (since there was a thread about this
                    >recently) that the above should really use something like:
                    >>
                    > size_t offset = wp - result;
                    > allocated = (allocated + len) * 2;
                    > newp = (char *) realloc (result, allocated);
                    >
                    Why do you cast the realloc? Is it different in that respect from
                    malloc?
                    I don't recommend casting the result of realloc() or malloc(),
                    for the same reasons. I assume that Ben B. cast the result
                    because the code in the GNU C library manual did so. I don't
                    know whether the authors of that code had a good reason to do so;
                    I would guess not.
                    --
                    char a[]="\n .CJacehknorstu" ;int putchar(int);in t main(void){unsi gned long b[]
                    ={0x67dffdff,0x 9aa9aa6a,0xa77f fda9,0x7da6aa6a ,0xa67f6aaa,0xa a9aa9f6,0x11f6} ,*p
                    =b,i=24;for(;p+ =!*p;*p/=4)switch(0[p]&3)case 0:{return 0;for(p--;i--;i--)case+
                    2:{i++;if(i)bre ak;else default:continu e;if(0)case 1:putchar(a[i&15]);break;}}}

                    Comment

                    • CBFalconer

                      #11
                      Re: Efficient strcat implementation.

                      Ben Pfaff wrote:
                      >
                      .... snip ...
                      >
                      Whenever a programmer feels the need to use `strcat' she or he
                      should think twice and look through the program whether the code
                      cannot be rewritten to take advantage of already calculated
                      results. Again: it is almost always unnecessary to use `strcat'.
                      Besides which the use of strlcat and strlcpy is much easier and
                      more accurate. These are non-standard functions. One public
                      domain implementation, with source and documentation, in standard
                      C, is available at:

                      <http://cbfalconer.home .att.net/download/>

                      --
                      [mail]: Chuck F (cbfalconer at maineline dot net)
                      [page]: <http://cbfalconer.home .att.net>
                      Try the download section.


                      ** Posted from http://www.teranews.com **

                      Comment

                      • Ben Pfaff

                        #12
                        Re: Efficient strcat implementation.

                        CBFalconer <cbfalconer@yah oo.comwrites:
                        Ben Pfaff wrote:
                        >Whenever a programmer feels the need to use `strcat' she or he
                        >should think twice and look through the program whether the code
                        >cannot be rewritten to take advantage of already calculated
                        >results. Again: it is almost always unnecessary to use `strcat'.
                        I was quoting the GNU C library manual when I wrote that.
                        --
                        int main(void){char p[]="ABCDEFGHIJKLM NOPQRSTUVWXYZab cdefghijklmnopq rstuvwxyz.\
                        \n",*q="kl BIcNBFr.NKEzjwC IxNJC";int i=sizeof p/2;char *strchr();int putchar(\
                        );while(*q){i+= strchr(p,*q++)-p;if(i>=(int)si zeof p)i-=sizeof p-1;putchar(p[i]\
                        );}return 0;}

                        Comment

                        • Mahesh

                          #13
                          Re: Efficient strcat implementation.

                          On Apr 22, 4:10 am, Ian Collins <ian-n...@hotmail.co mwrote:
                          Maheshwrote:
                          Hi,
                          >
                          I am looking forefficientstr ingcancatinatio n c code. I did it using
                          ptr [using default method] but my code i think is notefficient.
                          >
                          Please help.
                          >
                          Show us what you have.
                          >
                          --
                          Ian Collins.
                          here is my code: its very very simple prog.

                          #include <stdio.h>
                          #include <malloc.h>
                          #include <string.h>
                          int main(void)
                          {
                          char *src = "String";
                          char *dest = "Concatenat ion Code";
                          char *temp;
                          char *temp1;
                          temp = (char*)calloc(1 , sizeof(src)+siz eof(dest)+2);
                          temp1 = temp;
                          while(*src != '\0' && (*temp++ = *src++));
                          *(temp++) = ' ';
                          while((*temp++ = *dest++));
                          printf("Final Result = %s \n", temp1);
                          }

                          Comment

                          • Mahesh

                            #14
                            Re: Efficient strcat implementation.

                            On Apr 22, 11:34 am, Ben Pfaff <b...@cs.stanfo rd.eduwrote:
                            Mahesh<mmp....@ gmail.comwrites :
                            I am looking forefficientstr ingcancatinatio n c code. I did it using
                            ptr [using default method] but my code i think is notefficient.
                            >
                            One possible implementation:
                            strcpy(strchr(d st, '\0'), src);
                            >
                            But here is what the GNU libc manual says about strcat:
                            >
                            ----------------------------------------------------------------------
                            >
                            Programmers using the `strcat' or `wcscat' function (or the
                            following `strncat' or `wcsncar' functions for that matter) can easily
                            be recognized as lazy and reckless. In almost all situations the
                            lengths of the participating strings are known (it better should be
                            since how can one otherwise ensure the allocated size of the buffer is
                            sufficient?) Or at least, one could know them if one keeps track of the
                            results of the various function calls. But then it is very inefficient
                            to use `strcat'/`wcscat'. A lot of time is wasted finding the end of
                            the destinationstri ngso that the actual copying can start. This is a
                            common example:
                            >
                            /* This function concatenates arbitrarily many strings. The last
                            parameter must be `NULL'. */
                            char *
                            concat (const char *str, ...)
                            {
                            va_list ap, ap2;
                            size_t total = 1;
                            const char *s;
                            char *result;
                            >
                            va_start (ap, str);
                            /* Actually `va_copy', but this is the name more gcc versions
                            understand. */
                            __va_copy (ap2, ap);
                            >
                            /* Determine how much space we need. */
                            for (s = str; s != NULL; s = va_arg (ap, const char *))
                            total += strlen (s);
                            >
                            va_end (ap);
                            >
                            result = (char *) malloc (total);
                            if (result != NULL)
                            {
                            result[0] = '\0';
                            >
                            /* Copy the strings. */
                            for (s = str; s != NULL; s = va_arg (ap2, const char *))
                            strcat (result, s);
                            }
                            >
                            va_end (ap2);
                            >
                            return result;
                            }
                            >
                            This looks quite simple, especially the second loop where the strings
                            are actually copied. But these innocent lines hide a major performance
                            penalty. Just imagine that ten strings of 100 bytes each have to be
                            concatenated. For the secondstringwe search the already stored 100
                            bytes for the end of thestringso that we can append the nextstring.
                            For all strings in total the comparisons necessary to find the end of
                            the intermediate results sums up to 5500! If we combine the copying
                            with the search for the allocation we can write this function moreefficient:
                            >
                            char *
                            concat (const char *str, ...)
                            {
                            va_list ap;
                            size_t allocated = 100;
                            char *result = (char *) malloc (allocated);
                            >
                            if (result != NULL)
                            {
                            char *newp;
                            char *wp;
                            >
                            va_start (ap, str);
                            >
                            wp = result;
                            for (s = str; s != NULL; s = va_arg (ap, const char *))
                            {
                            size_t len = strlen (s);
                            >
                            /* Resize the allocated memory if necessary. */
                            if (wp + len + 1 result + allocated)
                            {
                            allocated = (allocated + len) * 2;
                            newp = (char *) realloc (result, allocated);
                            if (newp == NULL)
                            {
                            free (result);
                            return NULL;
                            }
                            wp = newp + (wp - result);
                            result = newp;
                            }
                            >
                            wp = mempcpy (wp, s, len);
                            }
                            >
                            /* Terminate the resultstring. */
                            *wp++ = '\0';
                            >
                            /* Resize memory to the optimal size. */
                            newp = realloc (result, wp - result);
                            if (newp != NULL)
                            result = newp;
                            >
                            va_end (ap);
                            }
                            >
                            return result;
                            }
                            >
                            With a bit more knowledge about the input strings one could fine-tune
                            the memory allocation. The difference we are pointing to here is that
                            we don't use `strcat' anymore. We always keep track of the length of
                            the current intermediate result so we can safe us the search for the
                            end of thestringand use `mempcpy'. Please note that we also don't
                            use `stpcpy' which might seem more natural since we handle with
                            strings. But this is not necessary since we already know the length of
                            thestringand therefore can use the faster memory copying function.
                            The example would work for wide characters the same way.
                            >
                            Whenever a programmer feels the need to use `strcat' she or he
                            should think twice and look through the program whether the code cannot
                            be rewritten to take advantage of already calculated results. Again: it
                            is almost always unnecessary to use `strcat'.
                            >
                            --
                            "Some people *are* arrogant, and others read the FAQ."
                            --Chris Dollin
                            Thanks a lot Chris and all who assisted me in this regard. Much Glad
                            about response.

                            Comment

                            • santosh

                              #15
                              Re: Efficient strcat implementation.

                              Mahesh wrote:
                              On Apr 22, 4:10 am, Ian Collins <ian-n...@hotmail.co mwrote:
                              >Maheshwrote:
                              Hi,
                              >>
                              I am looking forefficientstr ingcancatinatio n c code. I did it using
                              ptr [using default method] but my code i think is notefficient.
                              >>
                              Please help.
                              >>
                              >Show us what you have.
                              >>
                              >--
                              >Ian Collins.
                              >
                              here is my code: its very very simple prog.
                              >
                              #include <stdio.h>
                              #include <malloc.h>
                              This is not a standardised header. Malloc and it's related functions are
                              defined, as per the standard, in stdlib.h. So the above line is
                              unnecessary on any ANSI conforming compiler; indeed it's non-portable.
                              #include <string.h>
                              int main(void)
                              {
                              char *src = "String";
                              char *dest = "Concatenat ion Code";
                              char *temp;
                              char *temp1;
                              temp = (char*)calloc(1 , sizeof(src)+siz eof(dest)+2);
                              This is not what you want. Firstly in C you don't need to cast the
                              return value as there is implicit conversion between void * and any
                              other object pointer type. Secondly you want strlen instead of sizeof.
                              sizeof will only give you the size of the pointers src and dest.

                              temp = calloc(strlen(s rc)+strlen(dest )+1, 1);
                              temp1 = temp;
                              while(*src != '\0' && (*temp++ = *src++));
                              The first test is unnecessary. Or you could simply use strcpy or memcpy
                              for this purpose. If you do remove the check against '\0' then don't
                              forget to decrement temp by one after the loop is done.
                              *(temp++) = ' ';
                              Why?
                              while((*temp++ = *dest++));
                              printf("Final Result = %s \n", temp1);
                              }

                              Comment

                              Working...