inline function call

  • Riko Wichmann

    inline function call

    hi everyone,

    I've been googling for some time, but can't find an answer - maybe because
    the answer is 'No!'.

    Can I call a function in Python inline, so that the Python byte compiler
    does not actually call the function, but sort of inserts it where the inline
    call is made? Thereby avoiding the function call overhead.

    Thanks and cheers,

    Riko
  • M1st0

    #2
    Re: inline function call

    I think it doesn't exist.
    But it should.

    You can also roll your own using some templating library.


    • Diez B. Roggisch

      #3
      Re: inline function call

      Riko Wichmann wrote:
      > Can I call a function in Python inline, so that the Python byte compiler
      > does not actually call the function, but sort of inserts it where the inline
      > call is made? Thereby avoiding the function call overhead.

      No. That is simply impossible in Python, as well as in Java, where functions
      are always virtual, meaning they are looked up at runtime. You'd never know
      _which_ code to insert out of all the different foo() methods that might be
      around.

      Do you have an actual use-case for that? I mean, do you have code that runs
      slow, but with inlined code embarrassingly faster?
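To illustrate the point about runtime lookup, here is a minimal sketch. The class and function names are made up for illustration, and it is written for Python 3 rather than the Python 2 used elsewhere in this thread. The call site inside call_foo() cannot know which foo() body it will run until an object actually arrives:

```python
class Base:
    def foo(self):
        return "Base.foo"


class Child(Base):
    def foo(self):
        return "Child.foo"


def call_foo(obj):
    # Which foo() runs depends on the object passed in at runtime,
    # so there is no single body the compiler could paste in here.
    return obj.foo()


results = [call_foo(Base()), call_foo(Child())]

# A method can even be replaced after call_foo() has been compiled.
Child.foo = lambda self: "patched foo"
results.append(call_foo(Child()))

print(results)
```

The same call site ends up running three different bodies, which is why a compile-time inliner has nothing safe to insert.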

      Regards,

      Diez


      • Riko Wichmann

        #4
        Re: inline function call

        > Do you have an actual use-case for that? I mean, do you have code that runs
        > slow, but with inlined code embarrassingly faster?

        Well, I guess it would not actually be embarrassingly faster. From
        trying various things and actually copying the function code into the
        doMC routine, I estimate I'd get about a 15-20% reduction in execution
        time. It ran very slowly in the beginning, but after applying some other
        'fastpython' techniques it's actually quite fast ....

        'inlining' is mostly a matter of curiosity now :)

        here is the code snippet:

        -----------------------------------------------------------------

        [... cut out some stuff here ....]


        # riskfunc(med, low, high):
        # risk function for costs: triangular distribution
        # implemented according to:

        def riskfunc(med, low, high):

            if med != 0.0:
                u = random()
                try:
                    if u <= (med-low)/(high-low):
                        r = low+sqrt(u*(high-low)*(med-low))
                    else:
                        r = high - sqrt((1.0-u)*(high-low)*(high-med))

                except ZeroDivisionError: # case high = low
                    r = med
            else:
                r = 0.0

            return r


        # doMC:
        # run the MC of the cost analysis
        #
        def doMC(Ntrial = 1):

            from math import sqrt

            start = time.time()
            print 'run MC with ', Ntrial, ' trials'

            # start with a defined seed for reproducibility

            total = 0.0

            for i in range(Ntrial):

                summe = 0.0
                for k in range(len(Gcost)):

                    x = riskfunc(Gcost[k], Gdown[k], Gup[k])
                    summe += x

                # store the value 'summe' for later usage
                # ..... more code here


            print "Summe : ", summe
            stop = time.time()
            print 'Computing time: ', stop-start



        #######################################################################
        #######################################################################

        if __name__ == '__main__':

            n = 100000
            doMC(n)
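
For comparison, a hedged sketch: the riskfunc above draws from a triangular distribution by inverting its CDF. Python versions newer than the one used in this thread (2.6 and later) ship random.triangular(low, high, mode) in the standard library, which samples the same family of distributions. The parameter values below are made up for illustration, the code is Python 3, and unlike riskfunc it has no special case that returns 0.0 when med is 0.

```python
import random

# Hypothetical cost entry: low = 1.0, most likely (med) = 2.0, high = 4.0
low, med, high = 1.0, 2.0, 4.0

rng = random.Random(1)  # fixed seed so the run is reproducible
samples = [rng.triangular(low, high, med) for _ in range(10000)]

mean = sum(samples) / len(samples)
# The theoretical mean of a triangular distribution is (low + mode + high) / 3.
expected_mean = (low + med + high) / 3.0

print("sample mean %.3f, theoretical mean %.3f" % (mean, expected_mean))
```

Note the argument order: random.triangular takes the mode last, whereas riskfunc takes med first.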


        • Rocco Moretti

          #5
          Re: inline function call

          Riko Wichmann wrote:
          > hi everyone,
          >
          > I've been googling for some time, but can't find an answer - maybe because
          > the answer is 'No!'.
          >
          > Can I call a function in Python inline, so that the Python byte compiler
          > does not actually call the function, but sort of inserts it where the inline
          > call is made? Thereby avoiding the function call overhead.

          The canonical answer is "you probably don't need to do that."

          If you're still set on inlining functions, take a look at bytecodehacks,
          which rewrites the bytecode executed by the CPython virtual machine.


          • Peter Hansen

            #6
            Re: inline function call

            Riko Wichmann wrote:
            > Can I call a function in Python inline, so that the Python byte compiler
            > does not actually call the function, but sort of inserts it where the inline
            > call is made? Thereby avoiding the function call overhead.

            I know a simple technique that should basically avoid the function
            call overhead that might be worrying you, but it's really not suitable
            to use except in case of a really serious bottleneck. How bad is the
            performance in the particular case that concerns you? What kinds of
            timing measurement have you got that show the problem?

            (The technique is basically a particular way of using a generator...
            should be obvious and easy to figure out if it's really the "function
            call overhead" that is your bottleneck.)
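
The post does not spell the technique out, so the following is only one possible reading, not necessarily what Peter had in mind: move the per-item work into a generator, so that each value is produced by resuming one suspended frame instead of making a fresh function call. All names here are hypothetical, and the code is Python 3.

```python
def scaled_by_call(values, factor):
    def scale(x):
        # Ordinary helper: one full function call per item.
        return x * factor

    out = []
    for v in values:
        out.append(scale(v))
    return out


def scaled_stream(values, factor):
    # Generator: the loop body lives in a single frame that is suspended
    # at each yield and resumed for the next item.
    for v in values:
        yield v * factor


def scaled_by_generator(values, factor):
    out = []
    for result in scaled_stream(values, factor):
        out.append(result)
    return out


print(scaled_by_call([1, 2, 3], 2))
print(scaled_by_generator([1, 2, 3], 2))
```

Both variants return the same list. Whether the generator form is actually faster for a given workload is something to measure rather than assume.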

            -Peter


            • Peter Hansen

              #7
              Re: inline function call

              Riko Wichmann wrote:
              > def riskfunc(med, low, high):
              >     if med != 0.0:
              >         u = random()
              >         try:
              >             if u <= (med-low)/(high-low):
              >                 r = low+sqrt(u*(high-low)*(med-low))
              >             else:
              >                 r = high - sqrt((1.0-u)*(high-low)*(high-med))
              >
              >         except ZeroDivisionError: # case high = low
              >             r = med
              >     else:
              >         r = 0.0
              >
              >     return r

              Since math.sqrt() is in C, the overhead of the sqrt() call is probably
              minor, but the lookup has to go to the global namespace, which is a
              tiny step beyond looking just at the locals. Using a default argument
              to get a local could make a small difference. That is, do this instead:

                  def riskfunc(med, low, high, sqrt=math.sqrt):

              Same thing with calling random(), which is doing a global lookup first
              to find the function.
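
A minimal sketch of the default-argument trick, using made-up function names and Python 3 syntax. The default value is evaluated once, when the def statement runs, so inside the function sqrt is an ordinary local variable:

```python
import math


def hypot_global(a, b):
    # `math` is found in the global namespace and `.sqrt` is an attribute
    # lookup, both repeated on every call.
    return math.sqrt(a * a + b * b)


def hypot_local(a, b, sqrt=math.sqrt):
    # `sqrt` was bound when the function was defined; here it is a local.
    return sqrt(a * a + b * b)


print(hypot_global(3.0, 4.0), hypot_local(3.0, 4.0))
```

One thing to keep in mind with this pattern: a caller who passes a third positional argument silently replaces sqrt, so it is best kept to internal hot-path functions.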

              By the way, you'll get better timing information by learning to use the
              timeit module. Among other things, depending on your platform and how
              long the entire loop takes to run, using time.time() could be giving you
              pretty coarse results.
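
A small sketch of what using timeit might look like (Python 3; the work() function is a made-up stand-in for the code being measured). timeit.repeat() runs the measurement several times, and the minimum is usually the least noisy figure:

```python
import timeit
from math import sqrt


def work():
    # Hypothetical stand-in for the inner loop being timed.
    return sum(sqrt(i) for i in range(1000))


# Three independent measurements of 200 calls each.
timings = timeit.repeat(work, repeat=3, number=200)
best = min(timings)

print("best of 3: %.6f s per 200 calls" % best)
```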

              You might also try precalculating high-low and storing it in a temporary
              variable to avoid the duplicate calculations.
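
A sketch of that last suggestion, assuming Python 3. The function name is made up, and the uniform random number is passed in as u (rather than drawn inside the function) only so that the result is easy to check:

```python
from math import sqrt


def triangular_from_u(u, med, low, high):
    """Map a uniform u in [0, 1] to a triangular sample, low <= med <= high."""
    span = high - low  # computed once instead of three times
    try:
        if u <= (med - low) / span:
            return low + sqrt(u * span * (med - low))
        return high - sqrt((1.0 - u) * span * (high - med))
    except ZeroDivisionError:  # case high == low
        return med


print(triangular_from_u(0.5, 2.0, 1.0, 3.0))
```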

              -Peter


              • Stuart D. Gathman

                #8
                Re: inline function call

                On Wed, 04 Jan 2006 13:18:32 +0100, Riko Wichmann wrote:

                > I've been googling for some time, but can't find an answer - maybe because
                > the answer is 'No!'.
                >
                > Can I call a function in Python inline, so that the Python byte compiler
                > does not actually call the function, but sort of inserts it where the inline
                > call is made? Thereby avoiding the function call overhead.

                In standard python, the answer is no. The reason is that all python
                functions are effectively "virtual", and you don't know *which* version to
                inline.

                HOWEVER, there is a slick product called Psyco, which gets around this
                by creating multiple versions of functions which contain inlined (or
                compiled) code. For instance, if foo(a,b) is often called with a and b
                of int type, then a special version of foo is compiled that is
                equivalent performance-wise to foo(int a,int b). Dynamically
                finding the correct version of foo at runtime is no slower than normal
                dynamic calls, so the result is a very fast foo function. The only
                tradeoff is that every specialized version of foo eats memory. Psyco
                provides controls allowing you to specialize only those functions that
                need it after profiling your application.

                --
                Stuart D. Gathman <stuart@bmsi.com>
                Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
                "Confutatis maledictis, flamis acribus addictis" - background song for
                a Microsoft sponsored "Where do you want to go from here?" commercial.


                • Riko Wichmann

                  #9
                  Re: inline function call

                  Hey guys,

                  thanks for all the quick replies! In addition to the tips Peter and
                  Stuart gave me above, I also followed some of the hints found under



                  That greatly improved performance from about 3 minutes initially (inner
                  loop about 2000, outer loop about 10000 runs - I think) down to a few
                  seconds. My question on the inline function call was triggered by the 3
                  minute run on a pretty small statistic (10000 events) Monte Carlo
                  sample. Now, I'm much more relaxed! :)

                  One of the biggest improvements in addition to using psyco was actually
                  being careful about avoiding global namespace lookup.

                  So, thanks again, I learned a great deal about efficient python coding
                  today! :)

                  Cheers,

                  Riko


                  • Peter Hansen

                    #10
                    Re: inline function call

                    Riko Wichmann wrote:
                    > That greatly improved performance from about 3 minutes initially (inner
                    > loop about 2000, outer loop about 10000 runs - I think) down to a few
                    > seconds. My question on the inline function call was triggered by the 3
                    > minute run on a pretty small statistic (10000 events) Monte Carlo
                    > sample. Now, I'm much more relaxed! :)
                    >
                    > One of the biggest improvements in addition to using psyco was actually
                    > being careful about avoiding global namespace lookup.

                    Riko, any chance you could post the final code and a bit more detail on
                    exactly how much Psyco contributed to the speedup? The former would be
                    educational for all of us, while I'm personally very curious about the
                    latter because my limited attempts to use Psyco in the past have
                    resulted in speedups on the order of only 20% or so. (I blame my
                    particular application, not Psyco per se, but I'd be happy to see a
                    real-world case where Psyco gave a much bigger boost.)

                    Thanks,
                    -Peter


                    • bearophileHUGS@lycos.com

                      #11
                      Re: inline function call

                      Peter Hansen wrote:
                      > but I'd be happy to see a real-world case where Psyco gave
                      > a much bigger boost.

                      Psyco can be very fast, but:
                      - the program has to be the right one;
                      - you have to use "low level" programming, programming more like in C,
                      avoiding most of the nice things Python has, like list generators,
                      etc.;
                      - you can try to use array.array (for floats and ints), I have found
                      that sometimes Psyco can use them in a very fast way.

                      This is an example of mine, it's not really a real-world case, it looks
                      like C, but it shows the difference, if you switch off Psyco you can
                      see that it goes much slower:


                      This Python version is MUCH slower, but it looks more like Python:

                      If you try Psyco with this version you can see that it's much slower
                      than the other one.

                      This is the summary page, the Python version is about 54 times slower
                      than the Psyco version:


                      More info:


                      Bye,
                      bearophile


                      • Riko Wichmann

                        #12
                        Re: inline function call

                        Hi Peter,

                        > Riko, any chance you could post the final code and a bit more detail on
                        > exactly how much Psyco contributed to the speedup? The former would be
                        > educational for all of us, while I'm personally very curious about the
                        > latter because my limited attempts to use Psyco in the past have
                        > resulted in speedups on the order of only 20% or so. (I blame my
                        > particular application, not Psyco per se, but I'd be happy to see a
                        > real-world case where Psyco gave a much bigger boost.)

                        the difference between running with and without psyco is about a factor
                        of 3 for my MC simulation. Without psyco the simulation runs for 62 sec,
                        with it for 19 sec (still using time instead of timeit, though! :) This
                        is for about 2300 and 10000 iterations of the inner and outer loop, respectively.

                        A factor of 3 I consider worthwhile, especially since it doesn't really
                        cost you much.

                        This is on a Dell Lat D600 running Linux (Ubuntu 5.10) with a 1.6 GHz
                        Pentium M and 512 MB of RAM and python2.4.

                        The final code snippet is attached. However, it is essentially
                        unchanged compared to the piece I posted earlier, which already had most
                        of the global namespace look-ups removed. Taking care of sqrt and random
                        as you suggested didn't improve much anymore. So it's probably not that
                        educational after all.

                        Cheers,

                        Riko

                        -----------------------------------------------------


                        # import some modules
                        import string
                        import time
                        from math import sqrt

                        # accelerate:
                        import psyco

                        # random number init
                        from random import random, seed
                        seed(1)



                        # riskfunc(med, low, high):
                        # risk function for costs: triangular distribution
                        # implemented according to:

                        def riskfunc(med, low, high):

                            if med != 0.0:
                                u = random()
                                try:
                                    if u <= (med-low)/(high-low):
                                        r = low+sqrt(u*(high-low)*(med-low))
                                    else:
                                        r = high - sqrt((1.0-u)*(high-low)*(high-med))

                                except ZeroDivisionError: # case high = low
                                    r = med
                            else:
                                r = 0.0

                            return r


                        # doMC:
                        # run the MC of the cost analysis
                        #
                        def doMC(Ntrial = 1):

                            start = time.time()
                            print 'run MC with ', Ntrial, ' trials'


                            # now do MC simulation and calculate sums

                            for i in range(Ntrial):

                                summe = 0.0
                                # do MC experiments for all cost entries
                                for k in range(len(Gcost)):
                                    x = riskfunc(Gcost[k], Gdown[k], Gup[k])
                                    summe += x

                                if i%(Ntrial/10) == 0:
                                    print i, 'MC experiment processed, Summe = %10.2f' % (summe)

                            stop = time.time()
                            print 'Computing time: ', stop-start



                        #######################################################################
                        #######################################################################

                        if __name__ == '__main__':

                            fname_base = 'XFEL_budget-book_Master-2006-01-02_cost'

                            readCosts(fname_base+'.csv')

                            psyco.full()

                            n = 10000
                            doMC(n)



                        • Steven D'Aprano

                          #13
                          Re: inline function call

                          On Wed, 04 Jan 2006 13:18:32 +0100, Riko Wichmann wrote:

                          > hi everyone,
                          >
                          > I've been googling for some time, but can't find an answer - maybe because
                          > the answer is 'No!'.
                          >
                          > Can I call a function in Python inline, so that the Python byte compiler
                          > does not actually call the function, but sort of inserts it where the inline
                          > call is made? Thereby avoiding the function call overhead.

                          The closest thing to that is the following:


                          # original version:

                          for i in xrange(100000):
                              myObject.something.foo() # three name space lookups every loop


                          # inline version:

                          foo = myObject.something.foo
                          for i in xrange(100000):
                              foo() # one name space lookup every loop



                          --
                          Steven.


                          • Peter Hansen

                            #14
                            Re: inline function call

                            Riko Wichmann wrote:
                            > the difference between running with and without psyco is about a factor
                            > of 3 for my MC simulation.

                            > A factor of 3 I consider worthwhile, especially since it doesn't really
                            > cost you much.

                            Definitely. "import psyco; psyco.full()" is pretty hard to argue
                            against. :-)

                            > The final code snippet is attached. However, it is essentially
                            > unchanged compared to the piece I posted earlier, which already had most
                            > of the global namespace look-ups removed. Taking care of sqrt and random
                            > as you suggested didn't improve much anymore. So it's probably not that
                            > educational after all.

                            Seeing what others have achieved is always educational to the ignorant,
                            so I learned something. ;-)

                            I suspect using psyco invalidates a number of the typical Python
                            optimizations, and localizing global namespace lookups with default
                            variable assignments is probably one of them.

                            Thanks for posting.
                            -Peter


                            • bearophileHUGS@lycos.com

                              #15
                              Re: inline function call

                              I haven't examined the code very closely, but generally I don't recommend
                              using exceptions inside tight loops that have to go fast.

                              Bye,
                              bearophile
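
As a rough illustration of the alternative, here is a sketch in Python 3 with made-up function names: the same loop written once with try/except and once with an explicit test. In CPython a try block that never raises costs very little; the expense comes when the exception is actually raised and caught, so how much this matters depends on how often the special case occurs.

```python
def ratios_with_try(pairs, default=0.0):
    out = []
    for num, den in pairs:
        try:
            out.append(num / den)
        except ZeroDivisionError:
            # Taken only when den == 0; raising and catching is the slow part.
            out.append(default)
    return out


def ratios_with_check(pairs, default=0.0):
    out = []
    for num, den in pairs:
        # Explicit test instead of relying on the exception.
        out.append(num / den if den != 0 else default)
    return out


pairs = [(1.0, 2.0), (3.0, 0.0), (9.0, 3.0)]
print(ratios_with_try(pairs))
print(ratios_with_check(pairs))
```

Both functions return the same values, so the choice between them can be settled by timing them on realistic data.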
