Effective STL Item 4 (size() vs. empty())

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Matthias

    Effective STL Item 4 (size() vs. empty())

    Hi,

    I am just reading that book by Scott Meyers. In Item 4 Meyers suggests
    to always use empty() instead of size() when probing for emptyness of
    STL containers. His reasoning is that size() might take linear time on
    some list implementations . That makes sense at first.
    However, he also says this at the very beginning:

    "That being the case [he is referring to size()==0 being equivalent
    to empty()==true], you might wonder why one construct should be
    preferred to the other, especially in view of the fact that empty is
    typically implemented as an inline function that simply returns whether
    size returns 0. [...]
    .... the reason is simple: empty is a constant-time operations for all
    standard containers, but for some list implementations , size may take
    linear time."

    So where is the logic? If empty() only calls size(), and size() is
    implemented to take linear time, empty() obviously needs linear time,
    too, because it does nothing more than calling size(). So if size()
    happens to take linear time, where is the benefit of using empty() instead?

    *confused*

    --
    Regards,
    Matthias
  • Gianni Mariani

    #2
    Re: Effective STL Item 4 (size() vs. empty())

    Matthias wrote:[color=blue]
    > Hi,
    >
    > I am just reading that book by Scott Meyers. In Item 4 Meyers suggests
    > to always use empty() instead of size() when probing for emptyness of
    > STL containers. His reasoning is that size() might take linear time on
    > some list implementations . That makes sense at first.
    > However, he also says this at the very beginning:
    >
    > "That being the case [he is referring to size()==0 being equivalent
    > to empty()==true], you might wonder why one construct should be
    > preferred to the other, especially in view of the fact that empty is
    > typically implemented as an inline function that simply returns whether
    > size returns 0. [...]
    > ... the reason is simple: empty is a constant-time operations for all
    > standard containers, but for some list implementations , size may take
    > linear time."
    >
    > So where is the logic? If empty() only calls size(), and size() is
    > implemented to take linear time, empty() obviously needs linear time,
    > too, because it does nothing more than calling size(). So if size()
    > happens to take linear time, where is the benefit of using empty() instead?[/color]

    I think he explains it clearly.

    std::list::empt y has a special way to determine emptiness (are there any
    objects in the list is easy to determine). However, while
    std::list::size ==0 can give you the same answer, calling it will have
    O(N) time since it needs to count every element. This would not be very
    desirable if you have a few million elements in the list.

    Comment

    • Duane Hebert

      #3
      Re: Effective STL Item 4 (size() vs. empty())


      "Matthias" <nospam@digital raid.com> wrote in message news:ctikpk$dv0 $05$1@news.t-online.com...[color=blue]
      > So where is the logic? If empty() only calls size(), and size() is
      > implemented to take linear time, empty() obviously needs linear time,
      > too, because it does nothing more than calling size(). So if size()
      > happens to take linear time, where is the benefit of using empty() instead?[/color]

      I think one implementation of empty() may be something like
      return size() == 0. But another may be that internally, a class holds
      a member bool that gets set to true when size is decremented to
      0 and false otherwise. Then empty() could be implemented by
      returning this bool.
      I think his basic point is that empty() may
      call size() but may not - in any case, depending on the implementation,
      empty() has possibly less overhead and probably at worst it's the
      same.


      Comment

      • Fabio Fracassi

        #4
        Re: Effective STL Item 4 (size() vs. empty())

        Matthias wrote:
        [color=blue]
        > Hi,
        >[/color]

        [snip]
        [Repeating original with emphasis added:]
        you might wonder why one construct should be
        preferred to the other, especially in view of the fact that empty is
        [!]TYPICALLY[!] implemented as an inline function that simply returns
        whether size returns 0. [...]
        [color=blue]
        > So where is the logic? If empty() only calls size(), and size() is
        > implemented to take linear time, empty() obviously needs linear time,
        > too, because it does nothing more than calling size(). So if size()
        > happens to take linear time, where is the benefit of using empty()
        > instead?
        >
        > *confused*
        >[/color]

        The Key here is typically! Not always! In the cases were it matters empty()
        is obviously NOT implemented in terms of size().

        Hope that helps

        Fabio


        Comment

        • Matthias

          #5
          Re: Effective STL Item 4 (size() vs. empty())

          Thanks everybody. I think I got the point.

          --
          Regards,
          Matthias

          Comment

          • Thorsten Ottosen

            #6
            Re: Effective STL Item 4 (size() vs. empty())


            "Matthias" <nospam@digital raid.com> wrote in message
            news:ctikpk$dv0 $05$1@news.t-online.com...

            | too, because it does nothing more than calling size(). So if size()
            | happens to take linear time, where is the benefit of using empty() instead?

            Though I have great respect for SM and his books, this Item is *not* correct
            as long as we talk about the C++ standard. size() is guaranteed to be O(1)
            also for list. Incidently, this false item sneaked its way into Sutter and
            Alexandrescus new book too.

            br

            Thorsten


            Comment

            • Gianni Mariani

              #7
              Re: Effective STL Item 4 (size() vs. empty())

              Thorsten Ottosen wrote:[color=blue]
              > "Matthias" <nospam@digital raid.com> wrote in message
              > news:ctikpk$dv0 $05$1@news.t-online.com...
              >
              > | too, because it does nothing more than calling size(). So if size()
              > | happens to take linear time, where is the benefit of using empty() instead?
              >
              > Though I have great respect for SM and his books, this Item is *not* correct
              > as long as we talk about the C++ standard. size() is guaranteed to be O(1)
              > also for list. Incidently, this false item sneaked its way into Sutter and
              > Alexandrescus new book too.[/color]

              GCC seems to have a bug then.

              Can you cite where in the standard that is required ?

              Comment

              • Thorsten Ottosen

                #8
                Re: Effective STL Item 4 (size() vs. empty())


                "Gianni Mariani" <gi2nospam@mari ani.ws> wrote in message
                news:15-dnUVHDd2D12DcRV n-1Q@speakeasy.ne t...
                | Thorsten Ottosen wrote:
                | > "Matthias" <nospam@digital raid.com> wrote in message
                | > news:ctikpk$dv0 $05$1@news.t-online.com...
                | >
                | > | too, because it does nothing more than calling size(). So if size()
                | > | happens to take linear time, where is the benefit of using empty()
                instead?
                | >
                | > Though I have great respect for SM and his books, this Item is *not*
                correct
                | > as long as we talk about the C++ standard. size() is guaranteed to be O(1)
                | > also for list. Incidently, this false item sneaked its way into Sutter and
                | > Alexandrescus new book too.
                |
                | GCC seems to have a bug then.

                it probably has many.

                | Can you cite where in the standard that is required ?

                Section 23.2.2 says that list fulfills the container requirements.
                Section 23.1, Table 65 says

                "Those entries marked ''(Note A)'' should have constant complexity."

                Note A applies to container::size ()

                -Thorsten


                Comment

                • Peter Koch Larsen

                  #9
                  Re: Effective STL Item 4 (size() vs. empty())


                  "Thorsten Ottosen" <nesotto@cs.auc .dk> skrev i en meddelelse
                  news:41fd3299$0 $48327$14726298 @news.sunsite.d k...[color=blue]
                  >
                  > "Matthias" <nospam@digital raid.com> wrote in message
                  > news:ctikpk$dv0 $05$1@news.t-online.com...
                  >
                  > | too, because it does nothing more than calling size(). So if size()
                  > | happens to take linear time, where is the benefit of using empty()
                  > instead?
                  >
                  > Though I have great respect for SM and his books, this Item is *not*
                  > correct
                  > as long as we talk about the C++ standard. size() is guaranteed to be O(1)
                  > also for list. Incidently, this false item sneaked its way into Sutter and
                  > Alexandrescus new book too.
                  >
                  > br
                  >
                  > Thorsten
                  >
                  >[/color]
                  Hi Thorsten

                  I do not have a copy of the holy standard, but are you absolutely sure? I
                  believe that size() behaves like e.g. push_back on a vector - that is has an
                  average complexity of O(1).
                  Some list algorithms (splicing) might invalidate the internal size holder,
                  requiring a recalculation when it is required.
                  Apart from that, I believe you will agree with me that list.empty() is
                  clearer than list.size() == 0.

                  Kind regards
                  Peter


                  Comment

                  • Howard Hinnant

                    #10
                    Re: Effective STL Item 4 (size() vs. empty())

                    In article <41fd51d3$0$483 22$14726298@new s.sunsite.dk>,
                    "Thorsten Ottosen" <nesotto@cs.auc .dk> wrote:
                    [color=blue]
                    > | Can you cite where in the standard that is required ?
                    >
                    > Section 23.2.2 says that list fulfills the container requirements.
                    > Section 23.1, Table 65 says
                    >
                    > "Those entries marked ''(Note A)'' should have constant complexity."
                    >
                    > Note A applies to container::size ()[/color]

                    This is a (dirty) trick in the standard. empty() is said to have
                    constant complexity. (period)

                    size() (as Thorsten correctly states) "should have constant complexity".

                    "should" has a specific meaning in a standards document. See:



                    To paraphrase, "should" means we would like it to, but we're not going
                    to require it to. Therefore a linear (or even quadratic or exponential)
                    complexity on size() is standards conforming. And if you really want to
                    be horrified, note that Note A applies also to swap and max_size.

                    In practice, only list::size appears to be vulnerable to this note.

                    You can find vehement supporters for this flexibility for list::size,
                    but I'm not one of them. I believe that not only is this flexibility
                    not needed, it makes list::size useless, and actually dangerous.

                    -Howard

                    Comment

                    • Ivan Vecerina

                      #11
                      Re: Effective STL Item 4 (size() vs. empty())

                      "Howard Hinnant" <hinnant@metrow erks.com> wrote in message
                      news:hinnant-A262DE.18544330 012005@syrcnyrd rs-03-ge0.nyroc.rr.co m...[color=blue]
                      > In article <41fd51d3$0$483 22$14726298@new s.sunsite.dk>,
                      > "Thorsten Ottosen" <nesotto@cs.auc .dk> wrote:
                      >[color=green]
                      >> | Can you cite where in the standard that is required ?
                      >>
                      >> Section 23.2.2 says that list fulfills the container requirements.
                      >> Section 23.1, Table 65 says[/color][/color]
                      ....[color=blue]
                      > To paraphrase, "should" means we would like it to, but we're not going
                      > to require it to. Therefore a linear (or even quadratic or exponential)
                      > complexity on size() is standards conforming. And if you really want to
                      > be horrified, note that Note A applies also to swap and max_size.[/color]
                      Wow. Very instructive.
                      [color=blue]
                      > In practice, only list::size appears to be vulnerable to this note.
                      >
                      > You can find vehement supporters for this flexibility for list::size,
                      > but I'm not one of them. I believe that not only is this flexibility
                      > not needed, it makes list::size useless, and actually dangerous.[/color]

                      Could you list the trade-offs for std::list to store its item count?
                      I know that splice operations between std::list instances then become
                      more expensive, and I agree I would not care much about that.
                      Any other practical obstacle you have been encountering?


                      Kind regards - Ivan
                      --
                      http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form



                      Comment

                      • Thorsten Ottosen

                        #12
                        Re: Effective STL Item 4 (size() vs. empty())


                        "Howard Hinnant" <hinnant@metrow erks.com> wrote in message
                        news:hinnant-A262DE.18544330 012005@syrcnyrd rs-03-ge0.nyroc.rr.co m...
                        | In article <41fd51d3$0$483 22$14726298@new s.sunsite.dk>,
                        | "Thorsten Ottosen" <nesotto@cs.auc .dk> wrote:
                        |
                        | > | Can you cite where in the standard that is required ?
                        | >
                        | > Section 23.2.2 says that list fulfills the container requirements.
                        | > Section 23.1, Table 65 says
                        | >
                        | > "Those entries marked ''(Note A)'' should have constant complexity."
                        | >
                        | > Note A applies to container::size ()
                        |
                        | This is a (dirty) trick in the standard. empty() is said to have
                        | constant complexity. (period)
                        |
                        | size() (as Thorsten correctly states) "should have constant complexity".
                        |
                        | "should" has a specific meaning in a standards document. See:
                        |
                        | http://www.iso.org/iso/en/iso9000-14.../2000rev8.html
                        |
                        | To paraphrase, "should" means we would like it to, but we're not going
                        | to require it to. Therefore a linear (or even quadratic or exponential)
                        | complexity on size() is standards conforming. And if you really want to
                        | be horrified, note that Note A applies also to swap and max_size.

                        ok, yeah, so the standard does not require it.

                        | In practice, only list::size appears to be vulnerable to this note.

                        In Scott's book he discusses how the alternative is between
                        an O(1) size() and a O(n) splice() or conversely. However, the
                        standard nails down the complexity of splice() to O(n). Hence
                        it would be weird to make size() O(n) too.

                        | You can find vehement supporters for this flexibility for list::size,
                        | but I'm not one of them. I believe that not only is this flexibility
                        | not needed, it makes list::size useless, and actually dangerous.

                        well, the question is: are there other operations than splice() which
                        would benefit from a O(n) size()? Otherwise there should be no
                        "should" in the standard.

                        -Thorsten


                        Comment

                        • Stephen Howe

                          #13
                          Re: Effective STL Item 4 (size() vs. empty())

                          > In practice, only list::size appears to be vulnerable to this note.[color=blue]
                          >
                          > You can find vehement supporters for this flexibility for list::size,
                          > but I'm not one of them. I believe that not only is this flexibility
                          > not needed, it makes list::size useless, and actually dangerous.[/color]

                          I don't see a problem. I would "cache" an internal count that used to
                          evaluate list::size().
                          If the internal count is valid it will O(1) to evaluate size(), otherwise
                          O(N).
                          If splice() is never called, size() would always be O(1).
                          And only with some versions of splice() are we in the dark as to the size()
                          afterwards, not all.

                          That gives near enough O(1) behaviour for both size() and splice() and only
                          alternative calls to each is the worse case.

                          Stephen Howe


                          Comment

                          • Howard Hinnant

                            #14
                            Re: Effective STL Item 4 (size() vs. empty())

                            In article <41fd819b$0$483 18$14726298@new s.sunsite.dk>,
                            "Thorsten Ottosen" <nesotto@cs.auc .dk> wrote:
                            [color=blue]
                            > | You can find vehement supporters for this flexibility for list::size,
                            > | but I'm not one of them. I believe that not only is this flexibility
                            > | not needed, it makes list::size useless, and actually dangerous.
                            >
                            > well, the question is: are there other operations than splice() which
                            > would benefit from a O(n) size()? Otherwise there should be no
                            > "should" in the standard.[/color]

                            In article <ctjv25$rr0$1@n ews.hispeed.ch> ,
                            "Ivan Vecerina" <INVALID_use_we bform_instead@v ecerina.com> wrote:
                            [color=blue]
                            > Could you list the trade-offs for std::list to store its item count?
                            > I know that splice operations between std::list instances then become
                            > more expensive, and I agree I would not care much about that.
                            > Any other practical obstacle you have been encountering?[/color]

                            Sure.

                            list::splice is the only function where there is an order of complexity
                            change. However, my last sentence is, I believe, overly alarming, and
                            indeed, an exaggeration.

                            There are 5 possible valid splice situations: splicing from self or
                            splicing from another list, crossed with splicing 1 element, some
                            elements, or all elements. The O(1) size leads to an O(some) splice in
                            one of these five situations (best viewed in a mono-spaced font):

                               list::splice complexity with O(1) size
                            +------+-----------------+-----------------+
                            |      |     from self   |     from other  |
                            +------+-----------------+-----------------+
                            | one  |      O(1)       |      O(1)       |
                            +------+-----------------+-----------------+
                            | some |      O(1)       |     O(some)     |
                            +------+-----------------+-----------------+
                            | all  |    not valid    |      O(1)       |
                            +------+-----------------+-----------------+

                            In the "splice some from other" case, the splice function must compute:

                            std::distance(f irst, last);

                            where [first, last) represents the range "some" which is to be spliced
                            from other. So the constant on that linear complexity term is really
                            quite small (but of course not zero).

                            To combat this, Metrowerks (which has a O(1) list::size) offers yet
                            another splice overload:

                            void splice(iterator position, list& x, iterator first,
                            iterator last, size_type n);

                            Where a precondition is that the last parameter "n" is equal to
                            distance(first, last). It turns out that clients that use the "splice
                            some from other" functionality usually know, or can compute without
                            changing algorithmic complexity, distance(first, last). Furthermore, in
                            debug builds this precondition is checked (along with a host of other
                            checks in debug builds).

                            The result is that distance(first, last) no longer needs to be computed
                            within splice (in release mode) and so "splice some from other" is also
                            now truly O(1).

                            Now on to the question: What does O(1) list::size benefit?

                            Answer: Both client code, and the implementation of list.

                            I have had the opportunity to review much C++ production code using
                            std::list, which I have not written. Of the code I've reviewed,
                            list::size is used much, much more often than list::splice. And
                            list::size is sometimes used in a way that could turn a linear algorithm
                            into a quadratic one (such as the end condition in a for loop). Yet I
                            haven't noticed "splice some from other" being used in a context where
                            it could change the complexity of an algorithm. Note this is not to say
                            it isn't used in such a context. Just that I haven't actually seen it
                            in production code that I've reviewed.

                            Conclusion (anecdotal, not scientific): std::list clients ignorant of
                            this issue are more likely to be bitten by an O(N) list::size than by an
                            O(some) list::splice some from other.

                            In the implementation of std::list, there are some functions that can
                            take good advantage of an O(1) size (though it does not result in a
                            complexity reduction).

                            When list::resize is shrinking, an O(1) size can be used to choose
                            whether it is better to iterate from begin to the start of the erase
                            range, or from end to the start of the erase range. Indeed, I've seen
                            implementations of list::resize that have an O(N) size and the first
                            thing they do is compute distance(begin( ), end()).

                            For operator== and operator!= for list one can check size() first (if it
                            is O(1)) before continuing with the element by element check,
                            potentially making these operations more efficient for common cases.

                            For list assignment (operator=, and assign overloads), the exception
                            safety guarantee can be increased from from basic to strong for the case
                            that T's assignment has a nothrow guarantee (at no extra expense).
                            Trying to do this with an O(N) size is computationally not practical.

                            list::sort is simpler to implement, and potentially slightly faster if
                            one can use an O(1) size to efficiently bisect the list in preparation
                            for a merge sort.

                            The cost of updating the size data within list to support an O(1) size
                            does not significantly effect any function, much less change its
                            complexity (except of course for the "splice some from other" function
                            already stated).

                            For my money, the pros outweigh the cons by a significant margin for an
                            O(1) list::size. Not the least of which is that (if standardized)
                            clients could depend upon list::size portably.

                            -Howard

                            Comment

                            • Howard Hinnant

                              #15
                              Re: Effective STL Item 4 (size() vs. empty())

                              In article <41fda673$0$314 $cc9e4d1f@news. dial.pipex.com> ,
                              "Stephen Howe" <sjhoweATdialDO TpipexDOTcom> wrote:
                              [color=blue][color=green]
                              > > In practice, only list::size appears to be vulnerable to this note.
                              > >
                              > > You can find vehement supporters for this flexibility for list::size,
                              > > but I'm not one of them. I believe that not only is this flexibility
                              > > not needed, it makes list::size useless, and actually dangerous.[/color]
                              >
                              > I don't see a problem. I would "cache" an internal count that used to
                              > evaluate list::size().
                              > If the internal count is valid it will O(1) to evaluate size(), otherwise
                              > O(N).
                              > If splice() is never called, size() would always be O(1).
                              > And only with some versions of splice() are we in the dark as to the size()
                              > afterwards, not all.
                              >
                              > That gives near enough O(1) behaviour for both size() and splice() and only
                              > alternative calls to each is the worse case.[/color]

                              <nod> This is a popular alternative suggestion to this conundrum.
                              However it is not without cost. Each internal list function which
                              depends upon an O(1) list::size, must now take into account that
                              list::size may not be O(1). list::size itself now must contain a
                              conditional expression, and may now be too big to be inlined. Instead
                              of simply returning a data member it has to ask if the cache is valid
                              and then possibly recompute it. Branches can be expensive in heavily
                              pipelined architectures.

                              And then there is the storage for the cache validity flag. You might
                              sacrifice only a bit, reducing the max_size of the list, and
                              complicating the extraction of the size. Or you could allocate a word
                              for this purpose to increase computational efficiency but now the
                              sizeof(list) has increased by 33%, making container<list< T>>
                              significantly more expensive.

                              Imho, either way, the cost isn't worth it. The cost isn't much, but it
                              buys even less.

                              -Howard

                              Comment

                              Working...