Effective STL Item 4 (size() vs. empty())

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Chris

    #16
    Re: Effective STL Item 4 (size() vs. empty())

    Howard Hinnant wrote:[color=blue]
    > <snip>
    > Now on to the question: What does O(1) list::size benefit?
    >
    > Answer: Both client code, and the implementation of list.
    >[/color]

    One quick question. It would seem to me that if you wanted to allow
    splice, and O(1) list::size, then each list node would have to contain a
    pointer to some kind of central information bank about the list, which
    stores the size. if you didn't have this then you would have no way of
    accessing the size information to update it after a splice out of the
    list. Also you wouldn't even be able to check if a splice had occured.
    You could make the first/last node of the list a different type to every
    other node and then whenever you splice out of a list cycle around to
    it, but that would be expensive.

    Basically, I don't see how you can make a list with O(1) size without
    adding an extra item to every single node, which could potentially make
    a list take 25% more space (for list<int> and sinilar things). That
    isn't to be sniffed at?

    Chris

    Comment

    • Ivan Vecerina

      #17
      Re: Effective STL Item 4 (size() vs. empty())

      "Howard Hinnant" <hinnant@metrow erks.com> wrote in message
      news:hinnant-827BDA.22572430 012005@syrcnyrd rs-02-ge0.nyroc.rr.co m...[color=blue]
      > In article <ctjv25$rr0$1@n ews.hispeed.ch> ,
      > "Ivan Vecerina" <INVALID_use_we bform_instead@v ecerina.com> wrote:
      >[color=green]
      >> Could you list the trade-offs for std::list to store its item count?
      >> I know that splice operations between std::list instances then become
      >> more expensive, and I agree I would not care much about that.
      >> Any other practical obstacle you have been encountering?[/color]
      >
      > Sure.[/color]
      Thank you :)
      [color=blue]
      > There are 5 possible valid splice situations: splicing from self or
      > splicing from another list, crossed with splicing 1 element, some
      > elements, or all elements. The O(1) size leads to an O(some) splice
      > in one of these five situations (best viewed in a mono-spaced font):
      >
      > list::splice complexity with O(1) size
      > +------+-----------------+-----------------+
      > | | from self | from other |
      > +------+-----------------+-----------------+
      > | one | O(1) | O(1) |
      > +------+-----------------+-----------------+
      > | some | O(1) | O(some) |
      > +------+-----------------+-----------------+
      > | all | not valid | O(1) |
      > +------+-----------------+-----------------+
      >
      > In the "splice some from other" case, the splice function must compute:
      >
      > std::distance(f irst, last);
      >
      > where [first, last) represents the range "some" which is to be spliced
      > from other. So the constant on that linear complexity term is really
      > quite small (but of course not zero).[/color]
      I think there is an additional case here: if the source and
      destination lists have different allocators, the items actually
      have to be copied and the 'splice some/all from other' is O(N) anyway
      - so there is no cost in counting them to maintain an O(1) list size.

      I appreciate the thorough list you provide ( all points seem very
      valid, only for the implementation of list::sort do I not see a
      clear benefit ).


      Since we are talking about splice, I dare ask another question:
      I am currently using splice(1 item from self) to move items within
      an std::list. The standard seems to say that iterators to moved
      items are always invalidated by splice. But in the case where the
      source and destination lists are the same, I see no practical reason
      for the iterator to be invalidated (and all the implementations of
      std::list I reviewed seem to agree). And I am using std::list just
      for that reason: to be able to keep iterators to specific items.
      But this does not seem to be guaranteed by the standard (ISO '98).

      Is this a reasonable omission?
      Is there a safe way to move items in an std::list without
      invalidating iterators?
      Also, am I right to believe that list::sort preserves iterators?

      [NB: I asked similar questions in clc++m, with no enlightening reply
      yet: "std::list::spl ice to move item within the same list[...]" ]


      Thank you Howard.
      Kudos for your excellent work, I keep fond memories of the early days
      of CW on the Mac, which was the first C++ lib I used (and enjoyed!).
      Ivan
      --
      http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form


      Comment

      • Ivan Vecerina

        #18
        Re: Effective STL Item 4 (size() vs. empty())

        "Chris" <caj@cs.york.ac .uk> wrote in message
        news:41FDF372.1 0102@cs.york.ac .uk...[color=blue]
        > Howard Hinnant wrote:[color=green]
        >> <snip>
        >> Now on to the question: What does O(1) list::size benefit?
        >>
        >> Answer: Both client code, and the implementation of list.[/color]
        >
        > One quick question. It would seem to me that if you wanted to allow
        > splice, and O(1) list::size, then each list node would have to contain a
        > pointer to some kind of central information bank about the list, which
        > stores the size. if you didn't have this then you would have no way of
        > accessing the size information to update it after a splice out of the
        > list.[/color]

        The splice member functions receives among its parameters a reference
        to the list that owns the items to be transferred. Similarly, all
        functions that add or remove items to and std::list already have a
        reference (or 'this' pointer) to the owning list root/object.

        So there is no need for individual list items (or iterators) to
        store a pointer to the owning list instance - your concerns
        are unwarranted.


        Regards,
        Ivan
        --
        http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form


        Comment

        • Chris Jefferson

          #19
          Re: Effective STL Item 4 (size() vs. empty())

          Ivan Vecerina wrote:[color=blue]
          > "Chris" <caj@cs.york.ac .uk> wrote in message
          > news:41FDF372.1 0102@cs.york.ac .uk...
          >[color=green]
          >>Howard Hinnant wrote:
          >>[color=darkred]
          >>><snip>
          >>>Now on to the question: What does O(1) list::size benefit?
          >>>
          >>>Answer: Both client code, and the implementation of list.[/color]
          >>
          >>One quick question. It would seem to me that if you wanted to allow
          >>splice, and O(1) list::size, then each list node would have to contain a
          >>pointer to some kind of central information bank about the list, which
          >>stores the size. if you didn't have this then you would have no way of
          >>accessing the size information to update it after a splice out of the
          >>list.[/color]
          >
          >
          > The splice member functions receives among its parameters a reference
          > to the list that owns the items to be transferred. Similarly, all
          > functions that add or remove items to and std::list already have a
          > reference (or 'this' pointer) to the owning list root/object.
          >[/color]

          Woops, thats what I get for trying to remember the splice function.
          Apologises!

          Chris

          Comment

          • Chris Jefferson

            #20
            Re: Effective STL Item 4 (size() vs. empty())

            Howard Hinnant wrote:[color=blue]
            > <snip>
            > For my money, the pros outweigh the cons by a significant margin for an
            > O(1) list::size. Not the least of which is that (if standardized)
            > clients could depend upon list::size portably.
            >[/color]
            Thank you for the interest discussion about the pros and cons.
            Personally I actually splice quite a bit, so I would perfer O(1) splice
            and O(n) size, as I would see the O(1) splice as one of the big
            advantages of std::list, and the (fixed) overhead of having and
            maintaining a counter would not be useful for me personally.

            However, one thing I think we can all agree on is that the standard
            should either mandate an O(1) splice or an O(1) size, as at the moment
            people like myself have to make sure we are using a compiler with O(1)
            splice, and other people assume they get O(1) size. I like the idea of
            having splice with an extra parameter, while that wouldn't fix my
            problems, it would I suspect fix most other people's problems. However I
            expect the chances of this ever getting mandated one way or the other is
            slim. It I suspect wouldn't get snuck in as a defect report, so we'd
            have to wait quite a while for a fix anyway.

            Chris[color=blue]
            > -Howard[/color]

            Comment

            • Howard Hinnant

              #21
              Re: Effective STL Item 4 (size() vs. empty())

              In article <ctkt4c$cd3$1@n ews.hispeed.ch> ,
              "Ivan Vecerina" <NOT_VALID_plea se_use_contact_ webform@vecerin a.com>
              wrote:
              [color=blue][color=green]
              > > In the "splice some from other" case, the splice function must compute:
              > >
              > > std::distance(f irst, last);
              > >
              > > where [first, last) represents the range "some" which is to be spliced
              > > from other. So the constant on that linear complexity term is really
              > > quite small (but of course not zero).[/color][/color]
              [color=blue]
              > I think there is an additional case here: if the source and
              > destination lists have different allocators, the items actually
              > have to be copied and the 'splice some/all from other' is O(N) anyway
              > - so there is no cost in counting them to maintain an O(1) list size.[/color]

              <nod> I intentionally ignored the unequal allocators case. I do discuss
              this in a little more detail here:



              This paper is about swap for containers of unequal allocators, but
              splice is a closely related issue.

              I feel that turning splice (and swap) into an element by element copy is
              too dangerous because it silently turns a nothrow operation into a
              throwing operation. This could silently invalidate client code which
              has been crafted with a nothrow splice/swap in mind. If this situation
              occurs, I'd prefer it be as noisy as possible the very first time it
              gets executed instead of silently trying to please.

              In addition to the changed exception guarantees, iterator validity also
              would get silently messed up with an element-by-element copy (see below).
              [color=blue]
              > Since we are talking about splice, I dare ask another question:
              > I am currently using splice(1 item from self) to move items within
              > an std::list. The standard seems to say that iterators to moved
              > items are always invalidated by splice. But in the case where the
              > source and destination lists are the same, I see no practical reason
              > for the iterator to be invalidated (and all the implementations of
              > std::list I reviewed seem to agree). And I am using std::list just
              > for that reason: to be able to keep iterators to specific items.
              > But this does not seem to be guaranteed by the standard (ISO '98).
              >
              > Is this a reasonable omission?
              > Is there a safe way to move items in an std::list without
              > invalidating iterators?
              >
              > [NB: I asked similar questions in clc++m, with no enlightening reply
              > yet: "std::list::spl ice to move item within the same list[...]" ][/color]

              This is a defect in the standard, and has a proposed resolution with WP
              status. WP status means that the committee has voted on the proposed
              resolution and is happy with it, and has applied it to the working paper
              for C++0X.



              The proposed resolution does what you believe is reasonable, and find in
              practice. Although I did not initiate this issue, by coincidence, I did
              write the current proposed wording.
              [color=blue]
              > Also, am I right to believe that list::sort preserves iterators?[/color]

              Yes. Reference 23.1p11:
              [color=blue]
              > Unless otherwise specified (either explicitly or by defining a function in
              > terms of other functions), invoking a container member function or passing a
              > container as an argument to a library function shall not invalidate iterators
              > to, or change the values of, objects within that container.[/color]
              [color=blue]
              > Thank you Howard.
              > Kudos for your excellent work, I keep fond memories of the early days
              > of CW on the Mac, which was the first C++ lib I used (and enjoyed!).[/color]

              Thanks for your kind words.

              -Howard

              Comment

              • Thorsten Ottosen

                #22
                Re: Effective STL Item 4 (size() vs. empty())

                From: "Chris Jefferson" <caj@cs.york.ac .uk>

                | However, one thing I think we can all agree on is that the standard
                | should either mandate an O(1) splice or an O(1) size, as at the moment
                | people like myself have to make sure we are using a compiler with O(1)
                | splice, and other people assume they get O(1) size.

                That was my point: the standard may not be strict about the requirement for
                size(), but
                it does AFAICT mandate an iterator-range splice() to be of linear complexity
                ***. Hence, there is no point in having a O(n) size().

                -Thorsten

                *** I assume that explicit complexity requirements are not allowed to be
                faster, though.
                I guess I could use some training in reading the standard


                Comment

                • Howard Hinnant

                  #23
                  Re: Effective STL Item 4 (size() vs. empty())

                  In article <41fe66f0$0$483 24$14726298@new s.sunsite.dk>,
                  "Thorsten Ottosen" <nesotto@cs.auc .dk> wrote:
                  [color=blue]
                  > That was my point: the standard may not be strict about the requirement for
                  > size(), but
                  > it does AFAICT mandate an iterator-range splice() to be of linear complexity
                  > ***. Hence, there is no point in having a O(n) size().
                  >
                  > -Thorsten
                  >
                  > *** I assume that explicit complexity requirements are not allowed to be
                  > faster, though.
                  > I guess I could use some training in reading the standard[/color]

                  See 17.3.1.3p5:
                  [color=blue]
                  > Complexity requirements specified in the library clauses are upper bounds,
                  > and implementations that provide better complexity guarantees satisfy the
                  > requirements.[/color]

                  -Howard

                  Comment

                  • Thorsten Ottosen

                    #24
                    Re: Effective STL Item 4 (size() vs. empty())


                    "Howard Hinnant" <hinnant@metrow erks.com> wrote in message
                    news:hinnant-ED7782.14432731 012005@syrcnyrd rs-01-ge0.nyroc.rr.co m...
                    | In article <41fe66f0$0$483 24$14726298@new s.sunsite.dk>,
                    | "Thorsten Ottosen" <nesotto@cs.auc .dk> wrote:

                    | > *** I assume that explicit complexity requirements are not allowed to be
                    | > faster, though.
                    | > I guess I could use some training in reading the standard
                    |
                    | See 17.3.1.3p5:
                    |
                    | > Complexity requirements specified in the library clauses are upper bounds,
                    | > and implementations that provide better complexity guarantees satisfy the
                    | > requirements.

                    this is fine as long as it does not involve design tradeoffs---in the case of
                    list
                    it means a drastic change in performance across implementations . This is
                    different from a O(n) std::sort() where there is no tradeoff.

                    Anyway, it does mean that I was wrong and Scott was right.

                    -Thorsten


                    Comment

                    • Phil Staite

                      #25
                      Re: Effective STL Item 4 (size() vs. empty())


                      Consider this typical (SGI) implementation of size() and empty() for
                      std::vector... (paraphrasing from memory, I'm on my linux box now)


                      size_t size() const { return static_cast<siz e_t>( end() - begin() ); }

                      bool empty() const { return begin() == end(); }


                      Seem about the same?

                      What do you know about iterators for say std::vector??? They're really
                      pointers to the contained type. (in every implementation I've looked at)

                      So what you really have in empty() is a simple pointer comparison, the
                      result of which will set an architecture dependent flag (eg. the zero
                      flag in the CPU...) that can be tested. (usually a subtraction and test
                      of the zero flag)

                      But end() - begin() is a pointer difference. That means a subtraction
                      of the two, and a division by sizeof(T) for whatever type is in the
                      vector... This gives you the offset between the pointers in T sized
                      units... Which you can then compare to 0 to set the CPU flag... So, it
                      would appear that the size() idiom requires an additional integral
                      subtraction and division as compared to simply calling empty(). That,
                      IMHO is why we should use empty() when we mean empty(), not size() == 0...

                      This came up at work a couple of weeks ago and I tested it (on SGIs)
                      doing a zillion calls of each, both with empty and non-empty vectors and
                      as expected, the size() calls were slower.

                      Comment

                      • Thorsten Ottosen

                        #26
                        Re: Effective STL Item 4 (size() vs. empty())


                        "Phil Staite" <phil@nospam.co m> wrote in message
                        news:41FF131A.6 090309@nospam.c om...
                        |
                        | Consider this typical (SGI) implementation of size() and empty() for
                        | std::vector... (paraphrasing from memory, I'm on my linux box now)
                        |
                        |
                        | size_t size() const { return static_cast<siz e_t>( end() - begin() ); }
                        |
                        | bool empty() const { return begin() == end(); }
                        |

                        | So what you really have in empty() is a simple pointer comparison, the
                        | result of which will set an architecture dependent flag (eg. the zero
                        | flag in the CPU...) that can be tested. (usually a subtraction and test
                        | of the zero flag)
                        |
                        | But end() - begin() is a pointer difference. That means a subtraction
                        | of the two, and a division by sizeof(T) for whatever type is in the
                        | vector... This gives you the offset between the pointers in T sized
                        | units... Which you can then compare to 0 to set the CPU flag... So, it
                        | would appear that the size() idiom requires an additional integral
                        | subtraction and division as compared to simply calling empty(). That,
                        | IMHO is why we should use empty() when we mean empty(), not size() == 0...

                        This is a good point although the biggest problem arises when size() has O(n)
                        complexity.

                        Howver, for your reason, Eric Niebler's comming BOOST_FOREACH() library
                        definitely uses

                        for( ; !empty( range ); ++range ) { ... }

                        instead of checking agaist size() == 0. Just imagine the performace impact
                        when
                        iterating over a subrange of a list.

                        -Thorsten


                        Comment

                        Working...