Profiling, sum-comprehension vs reduce

  • cnb

    Profiling, sum-comprehension vs reduce

    This must be because of the implementation, right? Shouldn't reduce be
    faster, since it iterates over the list only once?
    Doesn't sum first construct the list and then sum it?

    -----------------------
    >>> ============================== RESTART ==============================
    >>>
    reduce with named function: 37.9864357062
    reduce with nested, named function: 39.4710288598
    reduce with lambda: 39.2463927678
    sum comprehension: 25.9530121845
    >>> ============================== RESTART ==============================
    >>>
    reduce with named function: 36.4529584067
    reduce with nested, named function: 37.6278529813
    reduce with lambda: 38.2629448715
    sum comprehension: 26.0197561422
    >>>


    from timeit import Timer

    def add(x, y):
        return x + y

    def rednamed(lst):
        return reduce(add, lst)

    def rednn(lst):
        def add2(x, y):
            return x + y
        return reduce(add2, lst)

    def redlambda(lst):
        return reduce(lambda x, y: x + y, lst)

    def com(lst):
        return sum(x for x in lst)

    s = xrange(101)

    t1 = Timer('rednamed(s)', 'from __main__ import rednamed, s')
    t2 = Timer('rednn(s)', 'from __main__ import rednn, s')
    t3 = Timer('redlambda(s)', 'from __main__ import redlambda, s')
    t4 = Timer('com(s)', 'from __main__ import com, s')

    print "reduce with named function: ", t1.timeit()
    print "reduce with nested, named function: ", t2.timeit()
    print "reduce with lambda: ", t3.timeit()
    print "sum comprehension: ", t4.timeit()


    ---------------------------------------
    Also, using range instead of xrange doesn't seem to incur a
    performance penalty:
    >>>
    reduce with named function: 36.7560729087
    reduce with nested, named function: 38.5393266463
    reduce with lambda: 38.3852953378
    sum comprehension: 27.9001007111
    >>>
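For comparison, the same benchmark can be sketched on modern Python, where reduce has moved to functools and xrange is gone. This is a rough re-run, not the original poster's exact script; the absolute numbers differ from the Python 2 ones above, but the ranking is typically the same.

```python
from functools import reduce
from timeit import Timer

def add(x, y):
    return x + y

def rednamed(lst):
    # reduce with a module-level named function
    return reduce(add, lst)

def redlambda(lst):
    # reduce with an inline lambda
    return reduce(lambda x, y: x + y, lst)

def com(lst):
    # sum over a generator expression, as in the original post
    return sum(x for x in lst)

s = range(101)

# All three compute the same total.
assert rednamed(s) == redlambda(s) == com(s) == 5050

# Time each variant; sum() is typically the fastest of the three.
for name, stmt in [('reduce, named function', 'rednamed(s)'),
                   ('reduce, lambda', 'redlambda(s)'),
                   ('sum comprehension', 'com(s)')]:
    t = Timer(stmt, globals=globals()).timeit(number=20000)
    print('%-22s %.3f s' % (name, t))
```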
  • Marc 'BlackJack' Rintsch

    #2
    Re: Profiling, sum-comprehension vs reduce

    On Sat, 13 Sep 2008 01:06:22 -0700, cnb wrote:

    > This must be because of the implementation, right? Shouldn't reduce be
    > faster, since it iterates over the list only once? Doesn't sum first
    > construct the list and then sum it?

    No it doesn't. Why should it?

    > Also, using range instead of xrange doesn't seem to incur a
    > performance penalty:

    (De)Allocating a list of length 100 isn't very slow. Try some million
    elements. And watch the memory consumption too.

    Ciao,
    Marc 'BlackJack' Rintsch


    • Harold Fellermann

      #3
      Re: Profiling, sum-comprehension vs reduce

      > Doesn't sum first construct the list and then sum it?
      >
      > def com(lst):
      >     return sum(x for x in lst)

      You construct a generator over an existing list in your code.
      Try sum([x for x in lst]) to see the effect of the additional list
      construction. And while you're at it, try the simple sum(lst).

      Cheers,

      - harold -
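Harold's three variants can be sketched like this (Python 3 syntax; the variable names are illustrative, not from the thread). Only the middle one builds a throwaway intermediate list.

```python
lst = list(range(101))

direct    = sum(lst)                 # iterates the existing list directly
via_list  = sum([x for x in lst])    # builds a second, temporary list first
via_genex = sum(x for x in lst)      # lazy: one element at a time, no copy

# All three produce the same result; they differ only in the
# intermediate objects created along the way.
assert direct == via_list == via_genex == 5050
print(direct)
```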


      • Bas

        #4
        Re: Profiling, sum-comprehension vs reduce

        On Sep 13, 10:06 am, cnb <circularf...@yahoo.se> wrote:

        > This must be because of the implementation, right? Shouldn't reduce be
        > faster, since it iterates over the list only once?
        > Doesn't sum first construct the list and then sum it?
        No, sum also iterates over the sequence just once and doesn't create
        a list. It is probably implemented similarly to

        def sum(sequence, start=0):
            total = start
            for item in iter(sequence):
                total += item
            return total

        but implemented in C for speed. Reduce is probably implemented much
        the same way, but with an arbitrary function instead of addition.
        Make sure that you understand the difference between a generator
        expression and a list comprehension, and that [f(x) for x in something]
        is (almost) equal to list(f(x) for x in something), so you can emulate
        a LC by applying the list constructor to the equivalent GE.

        HTH,
        Bas
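Bas's two points can be sketched as follows: a pure-Python emulation along the lines he describes (note it must return the accumulated total), and the near-equivalence of a list comprehension to list() applied to the matching generator expression. The names my_sum and f are illustrative.

```python
def my_sum(sequence, start=0):
    # Pure-Python sketch of sum(): one pass, no intermediate list.
    total = start
    for item in iter(sequence):
        total += item
    return total

# Matches the built-in on a simple range, and respects the start value.
assert my_sum(range(101)) == sum(range(101)) == 5050
assert my_sum([], 10) == 10

def f(x):
    return x * x

# A list comprehension is (almost) the list() of the equivalent
# generator expression.
something = range(5)
assert [f(x) for x in something] == list(f(x) for x in something)
```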


        • Steven D'Aprano

          #5
          Re: Profiling, sum-comprehension vs reduce

          On Sat, 13 Sep 2008 01:06:22 -0700, cnb wrote:

          > This must be because of the implementation, right? Shouldn't reduce be
          > faster, since it iterates over the list only once? Doesn't sum first
          > construct the list and then sum it?

          What makes you think that?

          Given the speed of sum(), it sure doesn't look like it's generating a
          full list before summing. Why would it?

          > reduce with named function: 37.9864357062
          > reduce with nested, named function: 39.4710288598
          > reduce with lambda: 39.2463927678
          > sum comprehension: 25.9530121845
          If you want to see reduce really shine, time it with a C-based function
          rather than one written in pure Python:

          >>> Timer('reduce(add, xrange(10000))',
          ... 'from operator import add').repeat(number=10000)
          [19.724750995635986, 19.410486936569214, 19.614511013031006]
          >>> Timer('reduce(add, xrange(10000))',
          ... 'def add(x, y): return x+y').repeat(number=10000)
          [45.210143089294434, 44.814558982849121, 46.906874895095825]


          You probably won't see much (if any) benefit for small lists, so make
          sure your test is on a significantly-sized input list.

          Of course, sum() is even faster than reduce:

          >>> Timer('sum(xrange(10000))').repeat(number=10000)
          [9.814924955368042, 8.7169640064239502, 9.5062401294708252]



          --
          Steven
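Steven's comparison can be restated for Python 3, where reduce lives in functools (the iteration counts here are illustrative, chosen just to keep the run short). The C-level operator.add lets reduce skip the Python-level call machinery on every step.

```python
from functools import reduce
from operator import add
from timeit import Timer

def py_add(x, y):
    # Pure-Python equivalent of operator.add
    return x + y

data = range(10000)

# Both spellings compute the same result, and agree with sum().
assert reduce(add, data) == reduce(py_add, data) == sum(data) == 49995000

# But reducing with the C-level function is noticeably faster:
t_c  = Timer('reduce(add, data)',    globals=globals()).timeit(number=200)
t_py = Timer('reduce(py_add, data)', globals=globals()).timeit(number=200)
print('operator.add: %.3f s   pure-Python add: %.3f s' % (t_c, t_py))
```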


          • Terry Reedy

            #6
            Re: Profiling, sum-comprehension vs reduce



            Steven D'Aprano wrote:

            > If you want to see reduce really shine, time it with a C-based function
            > rather than one written in pure Python:
            >
            > >>> Timer('reduce(add, xrange(10000))',
            > ... 'from operator import add').repeat(number=10000)
            > [19.724750995635986, 19.410486936569214, 19.614511013031006]
            > >>> Timer('reduce(add, xrange(10000))',
            > ... 'def add(x, y): return x+y').repeat(number=10000)
            > [45.210143089294434, 44.814558982849121, 46.906874895095825]
            > ...
            > Of course, sum() is even faster than reduce:
            >
            > >>> Timer('sum(xrange(10000))').repeat(number=10000)
            > [9.814924955368042, 8.7169640064239502, 9.5062401294708252]

            'Of course', because the irreducible difference between reduce(add, seq)
            and sum(seq) is that reduce has to call an add function while sum has
            the operation built in, in place of a call.


            • Hrvoje Niksic

              #7
              Re: Profiling, sum-comprehension vs reduce

              Terry Reedy <tjreedy@udel.edu> writes:

              > > Of course, sum() is even faster than reduce:
              > >
              > > >>> Timer('sum(xrange(10000))').repeat(number=10000)
              > > [9.814924955368042, 8.7169640064239502, 9.5062401294708252]
              >
              > 'Of course', because the irreducible difference between
              > reduce(add, seq) and sum(seq) is that reduce has to call an add
              > function while sum has the operation built in, in place of a call.
              Note that, despite appearances, it's not as built-in as one might
              wish. sum(seq) is still completely generic and works on all
              number-like objects (in fact on all objects that define an __add__
              operation except strings, which are explicitly forbidden). This means
              that it can't just go ahead and execute a C addition, it must properly
              call PyNumber_Add (the C API call equivalent to Python's "+"
              operator), which will then inspect the objects and invoke the
              appropriate implementation of addition. On the other hand,
              operator.add is defined to do exactly that -- call PyNumber_Add on its
              arguments and return the result.

              It's not entirely obvious why summing numbers using the same method,
              repeatedly calling PyNumber_Add, would be significantly slower for
              reduce than for sum. Admittedly reduce has to execute a Python-level
              function call which requires allocating and filling out a tuple, only
              to reach PyNumber_Add, which sum calls immediately. But
              builtin_reduce() is actually careful to optimize those repeated
              function calls, for example by reusing the same tuple across the loop.
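Hrvoje's point that sum() is completely generic can be illustrated directly: anything defining __add__ works (fractions, here), while strings are explicitly refused. A small sketch, not from the thread:

```python
from fractions import Fraction

# sum() dispatches through the generic "+" machinery, so it handles
# any number-like objects, not just machine ints/floats.
assert sum([Fraction(1, 2), Fraction(1, 3)]) == Fraction(5, 6)

# Strings, however, are explicitly forbidden as a footgun guard:
try:
    sum(['a', 'b'], '')
except TypeError:
    print("strings refused; use ''.join instead")
```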


              • Raymond Hettinger

                #8
                Re: Profiling, sum-comprehension vs reduce

                On Sep 13, 9:27 pm, Hrvoje Niksic <hnik...@xemacs.org> wrote:

                > Note that, despite appearances, it's not as built-in as one might
                > wish.  sum(seq) is still completely generic and works on all
                > number-like objects (in fact on all objects that define an __add__
                > operation except strings, which are explicitly forbidden).  This means
                > that it can't just go ahead and execute a C addition, it must properly
                > call PyNumber_Add (the C API call equivalent to Python's "+"
                > operator), which will then inspect the objects and invoke the
                > appropriate implementation of addition.
                The time machine strikes again! Try using Py2.6.
                The built-in sum() function is much smarter and faster.
                It does in fact use C addition.


                Raymond
