writing python extensions in assembly

  • inhahe

    writing python extensions in assembly

    Can anyone give me pointers/instructions/a template for writing a Python
    extension in assembly (or better, HLA)?


  • Diez B. Roggisch

    #2
    Re: writing python extensions in assembly

    inhahe wrote:
    > Can anyone give me pointers/instructions/a template for writing a Python
    > extension in assembly (or better, HLA)?

    You could write a C extension and embed assembly in it. See the docs for how
    to write one. If you know how to implement a shared library in assembly
    that follows the C calling convention (being an assembler guru, you surely
    know how that works), you could mimic a C extension.

    Diez
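A related route, swapped in here for illustration and not what the post above describes: any shared library that exports functions following the C calling convention, whether it was written in C or in assembly, can also be called from Python through the standard `ctypes` module without writing an extension module at all. The sketch below is a minimal example under stated assumptions: it assumes a POSIX-like system, and it uses the C runtime's `strlen` as a stand-in for a routine from a hypothetical assembly-built library. `native_strlen` is a name invented for this example.

```python
import ctypes
import ctypes.util

# Assumption: POSIX-like system. The C runtime stands in for a hypothetical
# shared library whose functions were written in assembly but follow the
# C calling convention. If find_library returns None, CDLL(None) falls back
# to the symbols already loaded into the current process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature so ctypes converts arguments and results correctly:
#   size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def native_strlen(data: bytes) -> int:
    """Call the native routine and return its result as a Python int."""
    return libc.strlen(data)


if __name__ == "__main__":
    print(native_strlen(b"python"))
```

The per-call overhead of `ctypes` is higher than that of a hand-written C extension, so this suits routines that do a lot of work per call.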


    • inhahe

      #3
      Re: writing python extensions in assembly

      Well, the problem is that I'm actually not an assembler guru, so I don't know
      how to implement a DLL in asm or use a C calling convention, although I'm
      sure those instructions are available on the web. I was just afraid of
      trying to learn that AND making Python-specific extensions at the same time.
      I thought of making a C extension with embedded asm, but that just seemed
      less than ideal. But if somebody thinks that's the Right Way to do it,
      that's good enough.

      "Diez B. Roggisch" <deets@nospam.web.de> wrote in message
      news:695jdlF30fp0gU1@mid.uni-berlin.de...
      > inhahe wrote:
      >> Can anyone give me pointers/instructions/a template for writing a Python
      >> extension in assembly (or better, HLA)?
      >
      > You could write a C-extension and embed assembly. See the docs for how to
      > write one. If you know how to implement a C-calling-convention-based shared
      > library in assembly (being an assembler guru you sure know how that
      > works), you could mimic a C-extension.
      >
      > Diez


      • Diez B. Roggisch

        #4
        Re: writing python extensions in assembly

        inhahe wrote:
        > Well the problem is that I'm actually not an assembler guru, so I don't know
        > how to implement a dll in asm or use a c calling convention, although I'm
        > sure those instructions are available on the web. I was just afraid of
        > trying to learn that AND making python-specific extensions at the same time.
        > I thought of making a c extension with embedded asm, but that just seemed
        > less than ideal. But if somebody thinks that's the Right Way to do it,
        > that's good enough..

        I think the right thing to do, if you are not that fluent in assembly, is
        not to do anything in it at all. What do you need it for?

        Diez


        • D'Arcy J.M. Cain

          #5
          Re: writing python extensions in assembly

          On Fri, 16 May 2008 11:21:39 -0400
          "inhahe" <inhahe@gmail.com> wrote:
          > You could be right, but here are my reasons.
          >
          > I need to make something that's very CPU-intensive and as fast as possible.
          > The faster, the better, and if it's not fast enough it won't even work.
          >
          > They say that the C++ optimizer can usually optimize better than a person
          > coding in assembler by hand can, but I just can't believe that, at least for
          > me, because when I code in assembler, I feel like I can see the best way to
          > do it and I just can't imagine AI would even be smart enough to do it that
          > way...

          Perhaps. Conventional wisdom says that you shouldn't optimize until
          you need to, though. That's one of the benefits of the way Python
          works. Here's how I would do it.

          1. Write the code (call it a prototype) in pure Python. Make sure that
          everything is modularized based on functionality. Try to get it split
          into nice, bite-sized chunks. Make sure that you have unit tests for
          everything that you write.

          2. Once the code is functioning, benchmark it and find the
          bottlenecks. Replace the problem methods with a C extension. Refactor
          (and check your unit tests again) if needed to break out the problem
          areas into as small a piece as possible.

          3. If it is still slow, embed some assembler where it is slowing down.

          One advantage of this is that you always know whether your optimizations
          are useful. You may be surprised to find that you hardly ever need to go
          beyond step 1, leaving you with the most portable and easily maintained
          code that you can have.
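The benchmarking in step 2 can be sketched with the standard library's `cProfile` and `pstats` modules. This is a minimal sketch, not anyone's actual project: `prototype`, `cheap_setup`, and `slow_sum_of_squares` are hypothetical stand-ins for real application code, and `profile_report` is a helper name invented for this example.

```python
import cProfile
import io
import pstats


def cheap_setup(n):
    """Hypothetical inexpensive step of the prototype."""
    return list(range(n))


def slow_sum_of_squares(n):
    """Hypothetical hot spot: a pure-Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def prototype(n):
    """Hypothetical top-level entry point of the pure-Python prototype."""
    cheap_setup(n)
    return slow_sum_of_squares(n)


def profile_report(func, *args, limit=10):
    """Run func under the profiler; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(limit)
    return result, stream.getvalue()


if __name__ == "__main__":
    result, report = profile_report(prototype, 200_000)
    print(result)
    print(report)
```

The functions that dominate the report are the candidates for a C extension; everything else can stay in Python.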
          > For portability, I'd simply write different asm routines for different
          > systems. How wide a variety of systems I'd support I don't know. As a bare
          > minimum, 32-bit x86, 64-bit x86, and one or more of their available forms of
          > SIMD.

          Even on the same processor you may have different assemblers depending
          on the OS.

          --
          D'Arcy J.M. Cain <darcy@druid.net> | Democracy is three wolves
          http://www.druid.net/darcy/ | and a sheep voting on
          +1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.


          • inhahe

            #6
            Re: writing python extensions in assembly

            I like to learn what I need, but I have done assembly before; I wrote a
            terminal program in assembly, for example, with ANSI and AVATAR support. I'm
            just not fluent in much other than the language itself, per se.

            Perhaps C would be as fast as my asm would, but C would not allow me to use
            SIMD, which seems like it would improve my speed a lot. I think my goals are
            pretty much what SIMD was made for.

            > I think the right thing to do if you are not as fluent in assembly is do
            > not do anything in it at all. What do you need it for?
            >
            > Diez


            • Diez B. Roggisch

              #7
              Re: writing python extensions in assembly

              inhahe wrote:
              > I like to learn what I need, but I have done assembly before, I wrote a
              > terminal program in assembly for example, with ansi and avatar support. I'm
              > just not fluent in much other than the language itself, per se.
              >
              > Perhaps C would be as fast as my asm would, but C would not allow me to use
              > SIMD, which seems like it would improve my speed a lot, I think my goals are
              > pretty much what SIMD was made for.

              That is not true. I've used the AltiVec extensions myself on OS X, from
              inside C.

              Besides, the parts of your program that are really *worth* optimizing
              are astonishingly few. Don't bother using assembler until you need to.

              Diez


              • inhahe

                #8
                Re: writing python extensions in assembly


                "D'Arcy J.M. Cain" <darcy@druid.net> wrote in message
                news:mailman.1232.1210952591.12834.python-list@python.org...

                > 2. Once the code is functioning, benchmark it and find the
                > bottlenecks. Replace the problem methods with a C extension. Refactor
                > (and check your unit tests again) if needed to break out the problem
                > areas into as small a piece as possible.

                There are probably only 2 or 3 basic algorithms that will need to have all
                that speed.

                > 3. If it is still slow, embed some assembler where it is slowing down.

                I won't know if the assembler is faster until I embed it, and if I'm going
                to do that I might as well use it.
                Although it's true I'd only have to embed it for one system to see (more or
                less).

                >> For portability, I'd simply write different asm routines for different
                >> systems. How wide a variety of systems I'd support I don't know. As a
                >> bare minimum, 32-bit x86, 64-bit x86, and one or more of their available
                >> forms of SIMD.
                >
                > Even on the same processor you may have different assemblers depending
                > on the OS.

                Yeah, I don't know much about that. I was figuring perhaps I could limit the
                assembler parts / methodology to something I could write generically
                enough, and if all else fails, write for the other OSes or only support
                Windows. Also, I think I should be using SIMD of some sort, and I'm not
                sure, but I highly doubt C++ compilers support SIMD.



                • Henrique Dante de Almeida

                  #9
                  Re: writing python extensions in assembly

                  > yeah I don't know much about that, I was figuring perhaps I could limit the
                  > assembler parts / methodology to something I could write generically
                  > enough.. and if all else fails write for the other OS's or only support
                  > windows. also I think I should be using SIMD of some sort, and I'm not
                  > sure but I highly doubt C++ compilers support SIMD.

                  You're wrong.

                  Maybe we could help you better if you told us what task you are
                  trying to achieve (or which algorithms you think need optimization).


                  • Mensanator

                    #10
                    Re: writing python extensions in assembly

                    On May 16, 11:24 am, "inhahe" <inh...@gmail.com> wrote:
                    > "D'Arcy J.M. Cain" <da...@druid.net> wrote in message
                    > news:mailman.1232.1210952591.12834.python-list@python.org...
                    >
                    >> 2. Once the code is functioning, benchmark it and find the
                    >> bottlenecks. Replace the problem methods with a C extension. Refactor
                    >> (and check your unit tests again) if needed to break out the problem
                    >> areas into as small a piece as possible.
                    >
                    > There's probably only 2 or 3 basic algorithms that will need to have all
                    > that speed.
                    >
                    >> 3. If it is still slow, embed some assembler where it is slowing down.
                    >
                    > I won't know if the assembler is faster until I embed it, and if I'm going
                    > to do that I might as well use it.
                    > Although it's true I'd only have to embed it for one system to see (more or
                    > less).
                    >
                    >>> For portability, I'd simply write different asm routines for different
                    >>> systems. How wide a variety of systems I'd support I don't know. As a
                    >>> bare minimum, 32-bit x86, 64-bit x86, and one or more of their available
                    >>> forms of SIMD.
                    >>
                    >> Even on the same processor you may have different assemblers depending
                    >> on the OS.
                    >
                    > yeah I don't know much about that, I was figuring perhaps I could limit the
                    > assembler parts / methodology to something I could write generically
                    > enough.. and if all else fails write for the other OS's or only support
                    > windows. also I think I should be using SIMD of some sort, and I'm not
                    > sure but I highly doubt C++ compilers support SIMD.

                    The Society for Inherited Metabolic Disorders?

                    Why wouldn't the compilers support it? It's part of the x86
                    architecture, isn't it?


                    • Dan Upton

                      #11
                      Re: writing python extensions in assembly

                      >> 3. If it is still slow, embed some assembler where it is slowing down.
                      >
                      > I won't know if the assembler is faster until I embed it, and if I'm going
                      > to do that I might as well use it.
                      > Although it's true I'd only have to embed it for one system to see (more or
                      > less).

                      Regardless of whether it's faster, I thought you indicated that what
                      really matters most is that it's fast enough.

                      That said, it's not true that you won't know if it's faster until you
                      embed it--that's what unit testing would be for. Write your loop(s)
                      in Python, C, ASM, <insert language here>, and run them on actual
                      inputs (or synthetic ones, if necessary, I suppose). That's how you'll be
                      able to tell whether it's even worth the effort to get the assembly
                      callable from Python.
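The idea of running competing implementations on the same inputs can be sketched with the standard `timeit` module. This is only an illustration under stated assumptions: `dot_loop` and `dot_builtin` are hypothetical candidates (both pure Python here; a C or assembly routine exposed to Python would slot in as another entry), and `compare` is a helper name invented for this example. It first checks that the candidates agree, then reports the best time per call.

```python
import math
import timeit


def dot_loop(xs, ys):
    """Hypothetical candidate 1: explicit loop."""
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total


def dot_builtin(xs, ys):
    """Hypothetical candidate 2: generator expression fed to sum()."""
    return sum(x * y for x, y in zip(xs, ys))


def compare(candidates, args, repeat=5, number=20):
    """Check that all candidates agree, then return seconds per call for each."""
    expected = candidates[0](*args)
    timings = {}
    for func in candidates:
        # A fast routine that returns the wrong answer is of no use.
        if not math.isclose(func(*args), expected, rel_tol=1e-9):
            raise ValueError(
                f"{func.__name__} disagrees with {candidates[0].__name__}"
            )
        runs = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
        # The minimum over repeats is the least noisy estimate.
        timings[func.__name__] = min(runs) / number
    return timings


if __name__ == "__main__":
    xs = [float(i) for i in range(10_000)]
    ys = [0.5] * 10_000
    for name, seconds in compare([dot_loop, dot_builtin], (xs, ys)).items():
        print(f"{name}: {seconds * 1e6:.1f} microseconds per call")
```

Timings depend on the machine and the Python version, so the numbers are only meaningful relative to each other on the same system.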

                      On Fri, May 16, 2008 at 1:27 PM, Mensanator <mensanator@aol.com> wrote:
                      > Why wouldn't the compilers support it? It's part of the x86
                      > architecture,
                      > isn't it?

                      Yeah, but I don't know if the compiler uses it by default, and my guess
                      is that whether it does depends on how the compiler back end goes about
                      optimizing the code, and on whether it sees data access/computation
                      patterns amenable to SIMD.


                      • sjdevnull@yahoo.com

                        #12
                        Re: writing python extensions in assembly

                        On May 16, 12:24 pm, "inhahe" <inh...@gmail.com> wrote:
                        > "D'Arcy J.M. Cain" <da...@druid.net> wrote in message
                        > news:mailman.1232.1210952591.12834.python-list@python.org...
                        >
                        >> 2. Once the code is functioning, benchmark it and find the
                        >> bottlenecks. Replace the problem methods with a C extension. Refactor
                        >> (and check your unit tests again) if needed to break out the problem
                        >> areas into as small a piece as possible.
                        >
                        > There's probably only 2 or 3 basic algorithms that will need to have all
                        > that speed.
                        >
                        >> 3. If it is still slow, embed some assembler where it is slowing down.
                        >
                        > I won't know if the assembler is faster until I embed it, and if I'm going
                        > to do that I might as well use it.

                        You won't know if the C is faster than the assembly until you write
                        it, and if you're going to do that you might as well use it...

                        If the C is fast enough, there's no point in wasting time writing the
                        assembly.

                        (Also FWIW C and C++ are different languages; you seem to conflate the
                        two a few times upthread).


                        • inhahe

                          #13
                          Re: writing python extensions in assembly


                          "Dan Upton" <upton@virginia.edu> wrote in message
                          news:mailman.1236.1210959884.12834.python-list@python.org...

                          > On Fri, May 16, 2008 at 1:27 PM, Mensanator <mensanator@aol.com> wrote:
                          >> Why wouldn't the compilers support it? It's part of the x86
                          >> architecture,
                          >> isn't it?
                          >
                          > Yeah, but I don't know if it uses it by default, and my guess is it
                          > depends on how the compiler back end goes about optimizing the code
                          > for whether it will see data access/computation patterns amenable to
                          > SIMD.

                          Perhaps you explicitly use them with some extended syntax or something?



                          • Dan Upton

                            #14
                            Re: writing python extensions in assembly

                            On Fri, May 16, 2008 at 2:08 PM, inhahe <inhahe@gmail.com> wrote:
                            > "Dan Upton" <upton@virginia.edu> wrote in message
                            > news:mailman.1236.1210959884.12834.python-list@python.org...
                            >
                            >> On Fri, May 16, 2008 at 1:27 PM, Mensanator <mensanator@aol.com> wrote:
                            >>> Why wouldn't the compilers support it? It's part of the x86
                            >>> architecture,
                            >>> isn't it?
                            >>
                            >> Yeah, but I don't know if it uses it by default, and my guess is it
                            >> depends on how the compiler back end goes about optimizing the code
                            >> for whether it will see data access/computation patterns amenable to
                            >> SIMD.
                            >
                            > perhaps you explicitly use them with some extended syntax or something?

                            Hey, I learned something today.



                            Also, from the gcc manpage: apparently 387 is the default when
                            compiling for 32-bit architectures, and using SSE instructions is the
                            default on x86-64 architectures, but you can pass -march=(some
                            architecture with SIMD instructions), -msse, -msse2, -msse3, or
                            -mfpmath=(one of 387, sse, or sse,387) to get the compiler to use
                            them.

                            As long as we're talking about compilers and such... anybody want to
                            chip in how this works in Python bytecode or what the bytecode
                            interpreter does? Okay, wait, before anybody says that's
                            implementation-dependent: does anybody want to chip in what the
                            CPython implementation does? (or any other implementation they're
                            familiar with, I guess)
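One way to look into the CPython side of this question is the standard `dis` module, which lists the bytecode that CPython compiled for a function. The sketch below is illustrative only: `add_arrays` and `opcode_names` are hypothetical names, and the exact opcodes vary between CPython versions. As far as I know, CPython's bytecode works on whole Python objects one instruction at a time, so any SIMD use would come from the C compiler that built the interpreter or from an extension module, not from the bytecode itself.

```python
import dis


def add_arrays(xs, ys):
    """Hypothetical element-wise addition written as a plain Python loop."""
    out = []
    for x, y in zip(xs, ys):
        out.append(x + y)
    return out


def opcode_names(func):
    """Return the names of the bytecode instructions CPython compiled for func."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


if __name__ == "__main__":
    # Human-readable listing; the opcode names depend on the CPython version.
    dis.dis(add_arrays)
    print(opcode_names(add_arrays))
```

Each `x + y` in the loop shows up as a single generic addition instruction applied to two Python objects, which is why tight numeric loops are usually moved into C or a library.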


                            • Ivan Illarionov

                              #15
                              Re: writing python extensions in assembly

                              On Fri, 16 May 2008 10:13:04 -0400, inhahe wrote:
                              > Can anyone give me pointers/instructions/a template for writing a Python
                              > extension in assembly (or better, HLA)?

                              Look up the pygame sources. They have some hot inline MMX stuff.
                              I experimented with this recently, and I must admit that it's extremely
                              hard to beat the C compiler. My first asm code was actually slower than C;
                              only after reading the Intel docs and figuring out what makes 'movq' and
                              'movntq' different was I able to write something that was several times
                              faster than C.

                              The D language's inline asm and its tools for making Python extensions
                              look very promising, although I haven't tried them yet.

                              -- Ivan
