Copy algorithm with insert iterators...

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • deancoo

    Copy algorithm with insert iterators...

    I have gotten into the habit of often using copy along with an insert
    iterator. There are scenarios where I process quite a lot of data this way.
    Can someone give me a general feel as to how much of a performance hit I'm
    taking using this technique versus using 'copy' to copy directly into a
    container with elements in place?

    Thanks,
    d


  • Cy Edmunds

    #2
    Re: Copy algorithm with insert iterators...

    "deancoo" <s2cuts555@yaho o.ca> wrote in message
    news:k7KWd.1182 3$KI2.9259@clgr ps12...[color=blue]
    >I have gotten into the habit of often using copy along with an insert
    >iterator. There are scenarios where I process quite a lot of data this
    >way. Can someone give me a general feel as to how much of a performance hit
    >I'm taking using this technique versus using 'copy' to copy directly into a
    >container with elements in place?
    >
    > Thanks,
    > d
    >[/color]

    As always, if you have critical performance requirements you should do
    timing studies. However, you asked for a general feel. Generally it's not
    bad. For instance with std::vector you can do

    std::vector<dou ble> v(10000);
    std::copy(ptr, ptr+10000, &v[0]);

    which will save doing 10000 push_back operations, but now you have to
    initialize 10000 doubles to 0.0 just before overwriting them. The
    alternative you are using which is presumably

    std::vector<dou ble> v;
    v.reserve(10000 ); // don't forget this
    std::copy(ptr, ptr+10000, std::back_inser ter(v));

    does a little bookkeeping on each step -- probably incrementing a "last"
    pointer -- but initializes using a copy constructor. If you don't reserve
    enough memory before your std::copy the object will have to reallocate
    memory several times.

    --
    Cy



    Comment

    • Andrew Koenig

      #3
      Re: Copy algorithm with insert iterators...

      "Cy Edmunds" <cedmunds@spaml ess.rochester.r r.com> wrote in message
      news:wnKWd.7718 5$H05.39061@twi ster.nyroc.rr.c om...[color=blue]
      > "deancoo" <s2cuts555@yaho o.ca> wrote in message
      > news:k7KWd.1182 3$KI2.9259@clgr ps12...[/color]
      [color=blue]
      > The alternative you are using which is presumably
      >
      > std::vector<dou ble> v;
      > v.reserve(10000 ); // don't forget this
      > std::copy(ptr, ptr+10000, std::back_inser ter(v));
      >
      > does a little bookkeeping on each step -- probably incrementing a "last"
      > pointer -- but initializes using a copy constructor. If you don't reserve
      > enough memory before your std::copy the object will have to reallocate
      > memory several times.[/color]

      I guess the real question is why you don't do this:

      v.insert(v.end( ), ptr, ptr+10000);

      That way, vector::insert knows how many elements will be inserted before it
      starts its work, so (at least in principle) it can allocate the right amount
      of memory once and be done with it.


      Comment

      • Cy Edmunds

        #4
        Re: Copy algorithm with insert iterators...

        "Andrew Koenig" <ark@acm.org> wrote in message
        news:PILWd.3382 87$w62.160215@b gtnsc05-news.ops.worldn et.att.net...[color=blue]
        > "Cy Edmunds" <cedmunds@spaml ess.rochester.r r.com> wrote in message
        > news:wnKWd.7718 5$H05.39061@twi ster.nyroc.rr.c om...[color=green]
        >> "deancoo" <s2cuts555@yaho o.ca> wrote in message
        >> news:k7KWd.1182 3$KI2.9259@clgr ps12...[/color]
        >[color=green]
        >> The alternative you are using which is presumably
        >>
        >> std::vector<dou ble> v;
        >> v.reserve(10000 ); // don't forget this
        >> std::copy(ptr, ptr+10000, std::back_inser ter(v));
        >>
        >> does a little bookkeeping on each step -- probably incrementing a "last"
        >> pointer -- but initializes using a copy constructor. If you don't reserve
        >> enough memory before your std::copy the object will have to reallocate
        >> memory several times.[/color]
        >
        > I guess the real question is why you don't do this:
        >
        > v.insert(v.end( ), ptr, ptr+10000);
        >
        > That way, vector::insert knows how many elements will be inserted before
        > it starts its work, so (at least in principle) it can allocate the right
        > amount of memory once and be done with it.
        >[/color]


        Your version might be OK depending on the implementation. The function can't
        just subtract the iterators to find the length of the sequence because that
        would assume random access iterators. And if you use that form with
        something other than random access iterators it can't get the sequence
        length even in principle. Using reserve() as I suggested works with any type
        of iterator.

        --
        Cy



        Comment

        • Clark S. Cox III

          #5
          Re: Copy algorithm with insert iterators...

          On 2005-03-06 23:01:14 -0500, "Cy Edmunds"
          <cedmunds@spaml ess.rochester.r r.com> said:
          [color=blue]
          > "Andrew Koenig" <ark@acm.org> wrote in message
          > news:PILWd.3382 87$w62.160215@b gtnsc05-news.ops.worldn et.att.net...[color=green]
          >> "Cy Edmunds" <cedmunds@spaml ess.rochester.r r.com> wrote in message
          >> news:wnKWd.7718 5$H05.39061@twi ster.nyroc.rr.c om...[color=darkred]
          >>> "deancoo" <s2cuts555@yaho o.ca> wrote in message
          >>> news:k7KWd.1182 3$KI2.9259@clgr ps12...[/color]
          >>[color=darkred]
          >>> The alternative you are using which is presumably
          >>>
          >>> std::vector<dou ble> v;
          >>> v.reserve(10000 ); // don't forget this
          >>> std::copy(ptr, ptr+10000, std::back_inser ter(v));
          >>>
          >>> does a little bookkeeping on each step -- probably incrementing a
          >>> "last" pointer -- but initializes using a copy constructor. If you
          >>> don't reserve enough memory before your std::copy the object will have
          >>> to reallocate memory several times.[/color]
          >>
          >> I guess the real question is why you don't do this:
          >>
          >> v.insert(v.end( ), ptr, ptr+10000);
          >>
          >> That way, vector::insert knows how many elements will be inserted
          >> before it starts its work, so (at least in principle) it can allocate
          >> the right amount of memory once and be done with it.
          >>[/color]
          >
          >
          > Your version might be OK depending on the implementation. The function
          > can't just subtract the iterators to find the length of the sequence
          > because that would assume random access iterators. And if you use that
          > form with something other than random access iterators it can't get the
          > sequence length even in principle. Using reserve() as I suggested works
          > with any type of iterator.[/color]

          An implementation could have specializations of insert for various
          types of input iterators. In such a case, it could very well subtract
          the iterators to find the number of items to be inserted. For example:


          class MyContainer
          {
          public:
          typedef ... iterator;
          ...

          private:
          template<typena me InputIterator>
          void insert_impl(ite rator pos, InputIterator first, InputIterator
          last, const std::input_iter ator_tag &)
          {
          ...
          }

          template<typena me ForwardIterator >
          void insert_impl(ite rator pos, ForwardIterator first, ForwardIterator
          last, const std::forward_it erator_tag &)
          {
          //Will handle forward, bidirectional and random-access
          reserve(std::di stance(first, last));
          insert_impl(pos , first, last, const std::input_iter ator_tag())
          }

          public:
          template<typena me InputIterator>
          void insert(iterator pos, InputIterator first, InputIterator last)
          {
          insert_impl(pos , first, last,
          std::iterator_t raits<InputIter ator>::iterator _category());
          }
          };


          --
          Clark S. Cox, III
          clarkcox3@gmail .com

          Comment

          • Pete Becker

            #6
            Re: Copy algorithm with insert iterators...

            Cy Edmunds wrote:[color=blue]
            >
            > Your version might be OK depending on the implementation. The function can't
            > just subtract the iterators to find the length of the sequence because that
            > would assume random access iterators.[/color]

            On the contrary: the function can subtract the iterators because they're
            random access iterators. That's what iterator categories are all about:
            tuning the algorithm based on the category of the iterator.

            --

            Pete Becker
            Dinkumware, Ltd. (http://www.dinkumware.com)

            Comment

            • Andrew Koenig

              #7
              Re: Copy algorithm with insert iterators...

              "Cy Edmunds" <cedmunds@spaml ess.rochester.r r.com> wrote in message
              news:emQWd.9371 2$nC5.6165@twis ter.nyroc.rr.co m...
              [color=blue][color=green]
              >> I guess the real question is why you don't do this:
              >>
              >> v.insert(v.end( ), ptr, ptr+10000);
              >>
              >> That way, vector::insert knows how many elements will be inserted before
              >> it starts its work, so (at least in principle) it can allocate the right
              >> amount of memory once and be done with it.
              >>[/color]
              >
              >
              > Your version might be OK depending on the implementation. The function
              > can't just subtract the iterators to find the length of the sequence
              > because that would assume random access iterators. And if you use that
              > form with something other than random access iterators it can't get the
              > sequence length even in principle. Using reserve() as I suggested works
              > with any type of iterator.[/color]

              If you know that there will be 10000 elements, you can do this:

              v.reserve(v.siz e() + 10000);
              v.insert(v.end( ), ptr, ptr + 10000);

              If you don't, you can't. Well, you can, but you don't know if it will help
              or hurt.

              As for saying that v.insert "can't just subtract the iterators to find the
              length of the sequence," it is not unusual for implementations to test
              (using iterator_catego ry) whether the iterators are random-access and use
              different algorithms if they are. It is actually possible to do this test
              at compile time if you're clever about it.


              Comment

              • Cy Edmunds

                #8
                Re: Copy algorithm with insert iterators...

                "Pete Becker" <petebecker@acm .org> wrote in message
                news:HNudnUbz_5 Coz7HfRVn-1A@rcn.net...[color=blue]
                > Cy Edmunds wrote:[color=green]
                >>
                >> Your version might be OK depending on the implementation. The function
                >> can't just subtract the iterators to find the length of the sequence
                >> because that would assume random access iterators.[/color]
                >
                > On the contrary: the function can subtract the iterators because they're
                > random access iterators. That's what iterator categories are all about:
                > tuning the algorithm based on the category of the iterator.
                >
                > --
                >
                > Pete Becker
                > Dinkumware, Ltd. (http://www.dinkumware.com)[/color]

                The iterators under discussion are not necessarily the iterators from
                std::vector. They could, for instance, be from std::list in which case they
                would NOT be random access iterators. Your own documentation describes the
                method in question as

                void insert(iterator where, InIt first, InIt last);

                which I assume means input iterator.

                --
                Cy



                Comment

                • Cy Edmunds

                  #9
                  Re: Copy algorithm with insert iterators...

                  "Andrew Koenig" <ark@acm.org> wrote in message
                  news:vSZWd.3424 70$w62.324724@b gtnsc05-news.ops.worldn et.att.net...[color=blue]
                  > "Cy Edmunds" <cedmunds@spaml ess.rochester.r r.com> wrote in message
                  > news:emQWd.9371 2$nC5.6165@twis ter.nyroc.rr.co m...
                  >[color=green][color=darkred]
                  >>> I guess the real question is why you don't do this:
                  >>>
                  >>> v.insert(v.end( ), ptr, ptr+10000);
                  >>>
                  >>> That way, vector::insert knows how many elements will be inserted before
                  >>> it starts its work, so (at least in principle) it can allocate the right
                  >>> amount of memory once and be done with it.
                  >>>[/color]
                  >>
                  >>
                  >> Your version might be OK depending on the implementation. The function
                  >> can't just subtract the iterators to find the length of the sequence
                  >> because that would assume random access iterators. And if you use that
                  >> form with something other than random access iterators it can't get the
                  >> sequence length even in principle. Using reserve() as I suggested works
                  >> with any type of iterator.[/color]
                  >
                  > If you know that there will be 10000 elements, you can do this:
                  >
                  > v.reserve(v.siz e() + 10000);
                  > v.insert(v.end( ), ptr, ptr + 10000);
                  >
                  > If you don't, you can't. Well, you can, but you don't know if it will
                  > help or hurt.
                  >
                  > As for saying that v.insert "can't just subtract the iterators to find the
                  > length of the sequence," it is not unusual for implementations to test
                  > (using iterator_catego ry) whether the iterators are random-access and use
                  > different algorithms if they are. It is actually possible to do this test
                  > at compile time if you're clever about it.
                  >
                  >[/color]

                  I know. But my recommended syntax works with an implementation which isn't
                  clever. It also works efficiently if you change the iterator type to
                  something less than a random access iterator. Hence it is more portable and
                  more maintainable than using insert.

                  You originally asked why I didn't just do

                  v.insert(v.end( ), ptr, ptr+10000);

                  That's why.

                  --
                  Cy



                  Comment

                  • Pete Becker

                    #10
                    Re: Copy algorithm with insert iterators...

                    Cy Edmunds wrote:
                    [color=blue]
                    > The iterators under discussion are not necessarily the iterators from
                    > std::vector. They could, for instance, be from std::list in which case they
                    > would NOT be random access iterators.[/color]

                    From your example:
                    [color=blue]
                    > std::copy(ptr, ptr+10000, &v[0]);[/color]

                    Since you're adding 10000 to ptr, it has to be a random access iterator.

                    --

                    Pete Becker
                    Dinkumware, Ltd. (http://www.dinkumware.com)

                    Comment

                    • Andrew Koenig

                      #11
                      Re: Copy algorithm with insert iterators...

                      "Cy Edmunds" <cedmunds@spaml ess.rochester.r r.com> wrote in message
                      news:g%%Wd.1096 63$nC5.66002@tw ister.nyroc.rr. com...
                      [color=blue]
                      > I know. But my recommended syntax works with an implementation which isn't
                      > clever. It also works efficiently if you change the iterator type to
                      > something less than a random access iterator. Hence it is more portable
                      > and more maintainable than using insert.
                      >
                      > You originally asked why I didn't just do
                      >
                      > v.insert(v.end( ), ptr, ptr+10000);
                      >
                      > That's why.[/color]

                      If you're not using a random-access iterator, you can't compute ptr+10000.

                      In any event, I still claim that if we take your statement

                      copy(ptr, ptr+10000, back_inserter(v ));

                      and replace it with

                      v.insert(v.end( ), ptr, ptr+10000);

                      leaving intact any calls to reserve or anything else that precedes it, the
                      result will be the same, the code will never execute less efficiently, and
                      it might be substantially faster on some implementations . The reason for
                      the speed claim is that back_inserter cannot know in advance how many
                      elements are being inserted, but v.insert might.


                      Comment

                      • Cy Edmunds

                        #12
                        Re: Copy algorithm with insert iterators...

                        "Pete Becker" <petebecker@acm .org> wrote in message
                        news:_ZqdnU7ipq MhErHfRVn-oQ@rcn.net...[color=blue]
                        > Cy Edmunds wrote:
                        >[color=green]
                        >> The iterators under discussion are not necessarily the iterators from
                        >> std::vector. They could, for instance, be from std::list in which case
                        >> they would NOT be random access iterators.[/color]
                        >
                        > From your example:
                        >[color=green]
                        >> std::copy(ptr, ptr+10000, &v[0]);[/color]
                        >
                        > Since you're adding 10000 to ptr, it has to be a random access iterator.
                        >
                        > --
                        >
                        > Pete Becker
                        > Dinkumware, Ltd. (http://www.dinkumware.com)[/color]

                        As anyone who actually read the thread (HINT) those were NOT the iterators
                        under discussion.

                        --
                        Cy



                        Comment

                        • Cy Edmunds

                          #13
                          Re: Copy algorithm with insert iterators...

                          "Andrew Koenig" <ark@acm.org> wrote in message
                          news:Fz0Xd.3432 80$w62.271819@b gtnsc05-news.ops.worldn et.att.net...[color=blue]
                          > "Cy Edmunds" <cedmunds@spaml ess.rochester.r r.com> wrote in message
                          > news:g%%Wd.1096 63$nC5.66002@tw ister.nyroc.rr. com...
                          >[color=green]
                          >> I know. But my recommended syntax works with an implementation which
                          >> isn't clever. It also works efficiently if you change the iterator type
                          >> to something less than a random access iterator. Hence it is more
                          >> portable and more maintainable than using insert.
                          >>
                          >> You originally asked why I didn't just do
                          >>
                          >> v.insert(v.end( ), ptr, ptr+10000);
                          >>
                          >> That's why.[/color]
                          >
                          > If you're not using a random-access iterator, you can't compute ptr+10000.[/color]

                          Read it again: "if you change the iterator type to something less than a
                          random access iterator"
                          [color=blue]
                          >
                          > In any event, I still claim that if we take your statement
                          >
                          > copy(ptr, ptr+10000, back_inserter(v ));
                          >
                          > and replace it with
                          >
                          > v.insert(v.end( ), ptr, ptr+10000);
                          >
                          > leaving intact any calls to reserve or anything else that precedes it, the
                          > result will be the same, the code will never execute less efficiently, and
                          > it might be substantially faster on some implementations . The reason for
                          > the speed claim is that back_inserter cannot know in advance how many
                          > elements are being inserted, but v.insert might.
                          >
                          >[/color]

                          I agree provided you call reserve first.

                          --
                          Cy



                          Comment

                          • Pete Becker

                            #14
                            Re: Copy algorithm with insert iterators...

                            Cy Edmunds wrote:[color=blue]
                            >
                            > As anyone who actually read the thread (HINT) those were NOT the iterators
                            > under discussion.
                            >[/color]

                            Sigh. Instead of being snotty (HINT), please re-read your example. Here
                            it is again, along with Andy's suggestion:
                            [color=blue]
                            > "Cy Edmunds" <cedmunds@spaml ess.rochester.r r.com> wrote in message
                            > news:wnKWd.7718 5$H05.39061@twi ster.nyroc.rr.c om...
                            >[color=green][color=darkred]
                            >>> "deancoo" <s2cuts555@yaho o.ca> wrote in message
                            >>> news:k7KWd.1182 3$KI2.9259@clgr ps12...[/color][/color]
                            >
                            >[color=green][color=darkred]
                            >>> The alternative you are using which is presumably
                            >>>
                            >>> std::vector<dou ble> v;
                            >>> v.reserve(10000 ); // don't forget this
                            >>> std::copy(ptr, ptr+10000, std::back_inser ter(v));
                            >>>
                            >>> does a little bookkeeping on each step -- probably incrementing a "last"
                            >>> pointer -- but initializes using a copy constructor. If you don't reserve
                            >>> enough memory before your std::copy the object will have to reallocate
                            >>> memory several times.[/color][/color]
                            >
                            >
                            > I guess the real question is why you don't do this:
                            >
                            > v.insert(v.end( ), ptr, ptr+10000);
                            >
                            > That way, vector::insert knows how many elements will be inserted before it
                            > starts its work, so (at least in principle) it can allocate the right amount
                            > of memory once and be done with it.
                            >
                            >[/color]

                            It is quite clear that ptr is a random access iterator, and in that
                            context Andy's question is perfectly reasonable.

                            You then replied:
                            [color=blue]
                            > Your version might be OK depending on the implementation. The function can't
                            > just subtract the iterators to find the length of the sequence because that
                            > would assume random access iterators. And if you use that form with
                            > something other than random access iterators it can't get the sequence
                            > length even in principle. Using reserve() as I suggested works with any type
                            > of iterator.[/color]

                            Your assertion that "The function can't just subtract the iterators to
                            find the length of the sequence" is simply wrong.

                            I won't try to predict whether you'll respond to this message by
                            acknowledging that you were wrong, or whether you'll come up with some
                            additional bullshit to try to pretend that you didn't say what you said.
                            Either way, I have no further interest in this inane thread.

                            --

                            Pete Becker
                            Dinkumware, Ltd. (http://www.dinkumware.com)

                            Comment

                            • Andrew Koenig

                              #15
                              Re: Copy algorithm with insert iterators...

                              "Cy Edmunds" <cedmunds@spaml ess.rochester.r r.com> wrote in message
                              news:WN0Xd.6131 7$vK5.8522@twis ter.nyroc.rr.co m...[color=blue]
                              > "Andrew Koenig" <ark@acm.org> wrote in message
                              > news:Fz0Xd.3432 80$w62.271819@b gtnsc05-news.ops.worldn et.att.net...[color=green]
                              >> "Cy Edmunds" <cedmunds@spaml ess.rochester.r r.com> wrote in message
                              >> news:g%%Wd.1096 63$nC5.66002@tw ister.nyroc.rr. com...[/color][/color]
                              [color=blue][color=green][color=darkred]
                              >>> I know. But my recommended syntax works with an implementation which
                              >>> isn't clever. It also works efficiently if you change the iterator type
                              >>> to something less than a random access iterator. Hence it is more
                              >>> portable and more maintainable than using insert.[/color][/color][/color]
                              [color=blue][color=green][color=darkred]
                              >>> You originally asked why I didn't just do[/color][/color][/color]
                              [color=blue][color=green][color=darkred]
                              >>> v.insert(v.end( ), ptr, ptr+10000);[/color][/color][/color]
                              [color=blue][color=green][color=darkred]
                              >>> That's why.[/color][/color][/color]
                              [color=blue][color=green]
                              >> If you're not using a random-access iterator, you can't compute
                              >> ptr+10000.[/color][/color]
                              [color=blue]
                              > Read it again: "if you change the iterator type to something less than a
                              > random access iterator"[/color]

                              Your original example was

                              std::vector<dou ble> v;
                              v.reserve(10000 ); // don't forget this
                              std::copy(ptr, ptr+10000, std::back_inser ter(v));

                              which computes ptr+10000. You can't do that unless ptr is a random-access
                              iterator.
                              [color=blue][color=green]
                              >> In any event, I still claim that if we take your statement
                              >>
                              >> copy(ptr, ptr+10000, back_inserter(v ));
                              >>
                              >> and replace it with
                              >>
                              >> v.insert(v.end( ), ptr, ptr+10000);
                              >>
                              >> leaving intact any calls to reserve or anything else that precedes it,
                              >> the result will be the same, the code will never execute less
                              >> efficiently, and it might be substantially faster on some
                              >> implementations . The reason for the speed claim is that back_inserter
                              >> cannot know in advance how many elements are being inserted, but v.insert
                              >> might.[/color][/color]
                              [color=blue]
                              > I agree provided you call reserve first.[/color]

                              My claim is true regardless of whether you call reserve first, as long as
                              the only difference between the two pieces of code being compared is the
                              difference between

                              copy(ptr, ptr+10000, back_inserter(v ));

                              and

                              v.insert(v.end( ), ptr, ptr+10000);

                              If you disagree, please post a code example in which you think that the
                              calls to copy and back_inserter are better than the call to insert, and
                              explain why you think they're better.


                              Comment

                              Working...