is that safe: V.push_back( V[0] );

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • jabbah
    New Member
    • Nov 2007
    • 63

    is that safe: V.push_back( V[0] );

    Hi, I just found, that code that ran under Visual Studio 8 doesnt run under 9 anymore and now im seeking confirmation of whether my code is unsafe and just accidentially worked or what one can expect from a standard conforming implementation of STL.

    The basic thing is that, given a vector V with some elements in it, I want to push_back one of these elements, like this:

    [CODE=c]#include <iostream>
    #include <vector>
    #include <cassert>
    using namespace std;

    int main()
    {
    typedef vector<int> X;
    X x;
    x.push_back( 6 );

    // make a vector V and fill it
    vector<X> V;
    for ( unsigned int i = 0; i < 10; i++ )
    {
    if ( V.size() == V.capacity() )
    cout << "[" << V.size() << "]";

    V.push_back( x );
    cout << ".";
    }
    cout << "\n";


    // push_back some elements
    size_t s = V.size();
    for ( unsigned int i = 0; i < s; i++ )
    {
    if ( V.size() == V.capacity() )
    cout << "[" << V.size() << "]";

    V.push_back( V[0] );
    cout << ".";

    const X& first = V[0];
    size_t size_first = first.size();
    const X& last = *V.rbegin();
    size_t size_last = last.size();
    assert( size_last == size_first ); // this assertion fails under VS9
    }
    cout << "\n";
    }
    [/CODE]

    In the second loop I do V.push_back(V[0]);
    Is it safe to do that?

    It worked in VS8 but in VS9, in iteration i=3, after the push_back, the last element, V[13] is not a copy of the vector containing the number 6, but it is an empty vector, thus the assertion fails.
  • Banfa
    Recognized Expert Expert
    • Feb 2006
    • 9067

    #2
    Here is the problem, when you call push_back if the vector does not have enough capacity then it gets reallocated and any reallocation invalidates all previously obtained iterators, references and pointers.

    However you just got a reference through V[0] so if the push_back causes a reallocation this reference is immediately invalidated, then you tried to copy from it.

    It was probably working previously because the recently freed memory was still hanging around with the same values but accessing that memory is actually undefined behavior which is why it is no longer working.

    If you want to call

    V.push_back(V[0]);

    you need to be absolutely certain that there will be no reallocation, alternatively you need to do something like

    X temp = V[0];
    V.push_back(tem p);

    This actually takes a copy of the object returned by the reference V[0] so it could be quite inefficient if that is a large object.

    Comment

    • jabbah
      New Member
      • Nov 2007
      • 63

      #3
      ok, I understand. thanks banfa.

      regarding the tmp copy: what about

      V.reserve( V.size()+1 );
      V.push_back( V[0] );

      Comment

      • Banfa
        Recognized Expert Expert
        • Feb 2006
        • 9067

        #4
        That should work because you have ensured that you have at least enough memory to do the push_back without reallocation before you call the push_back.

        However what is even better is

        V.reserve( V.size()*2 );

        just before the for since you exactly double the size of the vector by push_back V[0] once for each current entry because that way you reduce the total number of reallocations to 1.

        Comment

        • jabbah
          New Member
          • Nov 2007
          • 63

          #5
          V.reserve( V.size()*2 );
          I thought vector would double the capacity anyway whenever it is about to be exceeded.

          so suppose befor i take action, we have
          V.capacity()==V .size()==a

          then I do one of:
          V.resize( V.size()+1 ), or
          V.resize( V.size()*2 )

          no matter, afterwards we will have
          V.capacity() == 2 * a

          Comment

          • Banfa
            Recognized Expert Expert
            • Feb 2006
            • 9067

            #6
            It doesn't matter if the vector doubles its capacity whenever it is about to be exceeded, although I suspect that is an implementation detail and that all it is required to do is allocated enough memory to satisfy the current operation.

            Anyway this doesn't occur because

            V.reserve(V.siz e()+1);

            ensures that the capacity is never exceeded. I wouldn't use resize in this case. You are not interested in changing the size of the vector only in ensuring it has enough memory to perform the next operation without performing a reallocation. Also if you ran this code

            int size = V.size();
            V.resize(size+1 );
            V.push_back(x);

            After those lines of code you vector V.size() == size+2, because the resize increased the size by 1 and so did the push_back.

            If you do

            V.reserve( V.size()+1 );

            in the loop before the push_back or

            V.reserve( V.size()*2 );

            before the loop you are (nearly) correct at the end of the loop V.size() == 2 * a. V.capacity() will be at least 2 * a but may be larger.

            The point I was making is that V.reserve( V.size()*2 ); before the loop is more efficient than V.reserve( V.size()+1 ); in the loop because then you definitely only get 1 reallocation of the vectors memory where as with V.reserve( V.size()+1 ); in the loop you might get a memory reallocations. Reallocating memory is a slow operation because if involves:
            Allocate memory for new buffer
            Copy data from old buffer to new buffer
            Deallocate memory for old buffer
            so avoiding doing it more than necessary is good for program efficiency.

            Comment

            • jabbah
              New Member
              • Nov 2007
              • 63

              #7
              huh?!
              has a message from me just disappeared?!

              Im pretty sure that the thread went:

              jabbah: Hi, I just found, ....
              banfa: Here is the problem...
              jabbah:ok, I understand....
              banfa: That should work...
              jabbah: I thought vector....
              jabbah: < This Message Seems To Have Disappeard >
              banfa: It doesn't matter...

              can that be?! hm well


              -------------------
              regarding "plus one vs. times two"
              ok, i agree that
              avoiding doing it more than necessary is good for program efficiency.
              however, im convinced, that reallocation is mandated to double the capacity. or maybe not neccessarily "double", ie increase by a factor of two, but still increase it by some constant factor. this must be in order to achieve "Constant amortized time", specifically

              [CODE=cpp]
              for( unsigned int i=0; i < M; ++i ){
              v.reserve(i);
              }
              [/CODE]

              will not do more reallocations than log(M). So, as far as i understand it, the specification of vector ensures, that we dont do more reallocations than necessary. the _specification_ and not just some implementation!




              ------------------------------
              now let me rewrite the message that i believe has been lost.


              going back to the original problem: invalid reference due to reallocation.

              I _do_ understand that the reference is invalid _after_ push_back, ie

              [CODE=c]
              const X& x = V[0];
              V.reserve( ... );
              cout << x; // illegal, because x may have been invalidated
              [/CODE]

              but I do not understand why push_back( V[0] ) shouldnt work. consider the following implementation of push_back:

              Code:
              push_back( const X& x ){
                if ( enough capacity ) {
                  copy x
                } else {
                  allocate new space
                  copy from old space to new space
                  copy x
                  deallocate old space
                }
              }
              This would perfectly well work. its just a matter of the order of the last two lines of code.

              _When_ exactly do i have to stop using references/ptrs/iters? certainly _after_ push_back has terminated, but why cant push_back itself not work with the reference that is valid at the point of calling push_back?

              is the statement

              " V.push_back( V[0] ) is undefined"

              really standard conforming?

              im insisting a) out of interest - i simply want to understand, but also b) out of despair - i have a similar situation where the V.reserve( ... ); prior to V.push_back() doesnt seem to be an option and i really dont feel like making an extra (unnecessary) copy

              Comment

              • jabbah
                New Member
                • Nov 2007
                • 63

                #8
                is the statement
                "V.push_bac k( V[0] ) is undefined"
                really standard conforming?
                BUMPING ... anybody??

                Comment

                • Banfa
                  Recognized Expert Expert
                  • Feb 2006
                  • 9067

                  #9
                  however, im convinced, that reallocation is mandated to double the capacity. or maybe not neccessarily "double", ie increase by a factor of two, but still increase it by some constant factor. this must be in order to achieve "Constant amortized time", specifically
                  I see nothing in the specification that mandates a doubling in size although undoubtedly gcc implements it that way and you are correct in that insert is mandated to work in constant amortized time which may have an effect on the choice of allocation algorithm.

                  _When_ exactly do i have to stop using references/ptrs/iters? certainly _after_ push_back has terminated, but why cant push_back itself not work with the reference that is valid at the point of calling push_back?

                  is the statement

                  " V.push_back( V[0] ) is undefined"

                  really standard conforming?
                  You exactly have to stop using them at the instant the internal buffer is reallocated (and the old one deleted). Since the specification doesn't require a particular implementation the implicit reallocation that can happen inside push_back could happen at any time while the function is executing so you have to stop using references/ptrs/iters before you call the function unless you know that reallocation wont happen because of a prior reserve call. Just because you can write a implementation that would work with the construction you have doesn't make that construction standard conforming since no library writer is constrained to use that implementation

                  The statement is standard conforming.

                  im insisting a) out of interest - i simply want to understand, but also b) out of despair - i have a similar situation where the V.reserve( ... ); prior to V.push_back() doesnt seem to be an option and i really dont feel like making an extra (unnecessary) copy
                  Unfortunately just because you don't feel like doing it doesn't make it unnecessary.

                  Comment

                  • jabbah
                    New Member
                    • Nov 2007
                    • 63

                    #10
                    I see nothing in the specification that mandates a doubling in size although undoubtedly gcc implements it that way and you are correct in that insert is mandated to work in constant amortized time which may have an effect on the choice of allocation algorithm.
                    agreed.


                    Since the specification doesn't require a particular implementation the implicit reallocation that can happen inside push_back could happen at any time while the function is executing so you have to stop using references/ptrs/iters before you call the function
                    I was afraid that this would be your answer. Thanks a lot for clarifying this, Banfa!

                    unless you know that reallocation wont happen because of a prior reserve call.
                    That brings up another idea.

                    Suppose I make sure that there is always at least _one_ element free (because I call V.reserve( V.size() + 1 ) _after_ each push_back). Is that enough?

                    In other words can I be sure, that reallocation will only happen if V.capacity() == V.size() ?

                    Comment

                    • Banfa
                      Recognized Expert Expert
                      • Feb 2006
                      • 9067

                      #11
                      Calling reserve after the push_back probably would work but it would look a little odd so I would clearly comment why it is being done.

                      In other words can I be sure, that reallocation will only happen if V.capacity() == V.size() ?
                      Actually what the standard says is
                      It is guaranteed that no reallocation takes place during insertions that happen after a call to reserve() until the time when an insertion would make the size of the vector greater than the size specified in the most recent call to reserve().
                      This rather odd wording means that what I have quoted from you is not quite true according to the standard.

                      Imagine that you have a vector with size == capacity == N.
                      You call reserve(N+1) but it actually allocates capacity to N+2.
                      You call push_back(), size < reserve (N+1) so no allocation happens, guaranteed by the standard.
                      You call push_back() again, size < capacity (N+2) however it is not < last call to reserve (N+1) so even though the vector has the capacity to hold the data the standard does not guaranteed that no allocation happens.

                      All rather strange, however your plan to call V.reserve( V.size()+1 ) after each push_back does not fall fowl of this because even if reserve did not reallocate any memory (which it wont if the requested size <= current size) you have made a call to reserve with the appropriate value for the new size so the standard guarantees that no allocation happens next time.

                      Comment

                      • jabbah
                        New Member
                        • Nov 2007
                        • 63

                        #12
                        sounds good. thanks banfa

                        Comment

                        • jabbah
                          New Member
                          • Nov 2007
                          • 63

                          #13
                          yak.

                          seems that (the implementation of stl that i use) does one reallocation for each call of
                          v.reserve( v.size() + 1 );
                          i.e. it doesnt grow in the same way as v.push_back() does, but it actually increases the capa by exactly 1.

                          im getting bored. maybe i give in to the extra temp copy
                          Last edited by jabbah; Dec 17 '09, 12:16 PM. Reason: clarification: its not that _i_ have implemented stl

                          Comment

                          Working...