interpreter vs. compiled

  • alex23

    #16
    Re: interpreter vs. compiled

    On Jul 29, 2:21 pm, castironpi <castiro...@gmail.com> wrote:
    On Jul 28, 5:58 pm, Fuzzyman <fuzzy...@gmail.com> wrote:
    Well - in IronPython user code gets compiled to in memory assemblies
    which can be JIT'ed.
    >
    I don't believe so.
    Uh, you're questioning someone who is not only co-author of a book on
    IronPython, but also a developer on one of the first IronPython-based
    commercial applications.

    I know authorship isn't always a guarantee of correctness, but what
    experience do you have with IronPython that makes you so unwilling to
    accept the opinion of someone with substantial knowledge of the
    subject?


    • castironpi

      #17
      Re: interpreter vs. compiled

      On Jul 29, 7:39 am, alex23 <wuwe...@gmail.com> wrote:
      On Jul 29, 2:21 pm, castironpi <castiro...@gmail.com> wrote:
      >
      On Jul 28, 5:58 pm, Fuzzyman <fuzzy...@gmail.com> wrote:
      Well - in IronPython user code gets compiled to in memory assemblies
      which can be JIT'ed.
      >
      I don't believe so.
      >
      Uh, you're questioning someone who is not only co-author of a book on
      IronPython, but also a developer on one of the first IronPython-based
      commercial applications.
      >
      I know authorship isn't always a guarantee of correctness, but what
      experience do you have with IronPython that makes you so unwilling to
      accept the opinion of someone with substantial knowledge of the
      subject?
      None, no experience, no authority, only the stated premises &
      classifications, which I am generally tending to misinterpret. I'm
      overstepping my bounds and trying to do it politely. (Some might call
      it learning, which yes, though uncustomary, *requires questioning
      authorities*, or reinventing.)

      Evidently, I have a "fundamental misunderstanding of the compilation
      process", which I'm trying to correct by stating what I believe. I'm
      trying to elaborate, and I'm being met with ever more detail.
      So, perhaps I'll learn something out of this. Until then...

      What I know I have is two conflicting, contradictory, inconsistent
      beliefs. Maybe I've spent too much time in Python to imagine how a
      dynamic language can compile.

      This is from 7/22/08, same author:
      I wouldn't say "can't". The current CPython VM does not compile
      code. It COULD. The C#/.NET VM does.
      Three big claims here that I breezed right over and didn't believe.
      It COULD.
      I'm evidently assuming that if it could, it would.
      The current CPython VM does not compile code.
      Therefore it couldn't, or the assumption is wrong. Tim says it is.
      And the glaring one--

      WHY NOT? Why doesn't CPython do it?

      From 7/18/08, own author:
      >>
      #define TOP() (stack_pointer[-1])
      #define BASIC_POP() (*--stack_pointer)

      ...(line 1159)...
      w = POP();
      v = TOP();
      if (PyInt_CheckExact(v) && PyInt_CheckExact(w)) {
      /* INLINE: int + int */
      register long a, b, i;
      a = PyInt_AS_LONG(v);
      b = PyInt_AS_LONG(w);
      i = a + b;
      <<

      I am imagining that every Python implementation has something like
      it. If IronPython, in particular, does not have the 'POP();
      TOP();' sequence, then it isn't running on a stack machine. Is the
      IronPython code open source, and can someone link to it? I'm not
      wading through it from scratch. What does it have instead? Does
      dynamic typing still work?
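
      (A toy sketch of the kind of loop I mean, in Python -- illustrative
      only, not CPython's actual ceval.c:)

          def run(bytecode, stack):
              # Minimal stack machine in the shape of the quoted fragment.
              for op, arg in bytecode:
                  if op == 'PUSH':
                      stack.append(arg)
                  elif op == 'BINARY_ADD':
                      w = stack.pop()     # w = POP();
                      v = stack[-1]       # v = TOP();
                      stack[-1] = v + w   # roughly SET_TOP(v + w)
              return stack

          print(run([('PUSH', 10), ('PUSH', 1), ('BINARY_ADD', None)], []))  # [11]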

      <closing hostile remark>
      If you're bluffing, bluff harder; I call. If you're not, I apologize;
      teach me something. If you can ask better, teach me that too.
      </hostile>


      • castironpi

        #18
        Re: interpreter vs. compiled

        On Jul 29, 1:46 am, Tim Roberts <t...@probo.com> wrote:
        castironpi <castiro...@gmail.com> wrote:
        >
        In CPython yes.  In IronPython yes:  the parts that are compiled into
        machine code are the interpreter, *not user's code*.
        >
        WRONG!  You are WRONG.  At "compile" time, the Python code is compiled to
        an intermediate language.  At "run" time, the intermediate language (which
        is still the user's code, just in another representation) is compiled into
        machine language.  It is the user's program, not the interpreter.
        >
        It's the exact same process that occurs in a C compiler.  Most C compilers
        translate the C program to an intermediate form before finally converting
        it to machine language.  The only difference is that, in a C compiler, both
        steps occur within the compiler.  In IronPython, the two steps are
        separated in time.  There is no other difference.
        >
        Without that
        step, the interpreter would be running on an interpreter, but that
        doesn't get the user's statement 'a= b+ 1' into registers-- it gets
        'push, push, add, pop' into registers.
        >
        You have a fundamental misunderstanding of the compilation process.
        --
        Tim Roberts, t...@probo.com
        Providenza & Boekelheide, Inc.
        In C, we have:

        int x, y;
        x= 10;
        y= x+ 1;

        It translates as, roughly:


        8000 .data
        7996 ffffffff #x
        7992 ffffffff #y
        7988 .end data
        7984 loadi reg0 7996
        7980 loadi reg1 7992
        7976 loadi reg2 10
        7972 loadi reg3 1
        7968 storv reg2 reg0
        7964 add reg0 reg1 reg2
        7960 storv reg3 reg1


        You are telling me that the same thing happens in IronPython. By the
        time the instruction pointer gets to 'x= 10', the next 7 instructions
        are the ones shown here compiled from C.

        CMIIW, but the CPython implementation -does- -not-. Instead, it has,

        push 10
        stor x
        push 1
        add
        stor y

        each of which amounts to, to give a rough figure, 5-10 machine
        instructions. push 10, for example, with instruction_pointer in reg0:

        loadi reg1 4 #add 4 to stack pointer (one word)
        add reg0 reg1 reg2
        load reg0 reg2
        loadi reg2 10 #load ten
        stor reg0 reg2 #store at top of stack

        And this is all not to mention (i) the extra comparisons in
        intobject.h,

        #define PyInt_CheckExac t(op) ((op)->ob_type == &PyInt_Type)

        (ii) the huge case statement just to evaluate add, OR (iii)

        a = PyInt_AS_LONG(v);
        b = PyInt_AS_LONG(w);

        because CPython -does- -not- -know- ahead of time which op it will be
        executing, or what addresses (remember __coerce__), it will be
        performing the op on. Does not know EVER, not until it gets there.

        My point is, CPython takes more than seven steps. My question is,
        does IronPython?
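
        (For what it's worth, CPython will show you its own translation; a
        quick check with the dis module -- the output sketched in the
        comments is approximate and from a 2.x-era interpreter:)

            import dis

            # Prints CPython's bytecode for the snippet above -- roughly:
            #   LOAD_CONST 10, STORE_NAME x, LOAD_NAME x, LOAD_CONST 1,
            #   BINARY_ADD, STORE_NAME y, ...
            dis.dis(compile("x = 10\ny = x + 1", "<example>", "exec"))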


        • Dino Viehland

          #19
          RE: interpreter vs. compiled

          IronPython doesn't have an interpreter loop and therefore has no POP / TOP / etc... Instead what IronPython has is a method called Int32Ops.Add which looks like:

          public static object Add(Int32 x, Int32 y) {
              long result = (long) x + y;
              if (Int32.MinValue <= result && result <= Int32.MaxValue) {
                  return Microsoft.Scripting.Runtime.RuntimeHelpers.Int32ToObject((Int32)(result));
              }
              return BigIntegerOps.Add((BigInteger)x, (BigInteger)y);
          }

          This is the implementation of int.__add__. Note that calling int.__add__ can actually return NotImplemented and that's handled by the method binder looking at the strong typing defined on Add's signature here - and then automatically generating the NotImplemented result when the arguments aren't ints. So that's why you don't see that here even though it's the full implementation of int.__add__.
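
          (CPython's own int.__add__ happens to behave the same way, which
          makes the NotImplemented signal easy to see from Python:)

              # int.__add__ only knows ints; anything else comes back as
              # NotImplemented so the runtime can try the other operand.
              print((1).__add__(2))    # 3
              print((1).__add__(2.5))  # NotImplemented -> runtime tries (2.5).__radd__(1)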

          Ok, next if you define a function like:

          def adder(a, b):
              return a + b

          this turns into a .NET method, which will get JITed, and which in C# would look something like:

          static object adder(object a, object b) {
              return $addSite.Invoke(a, b);
          }

          where $addSite is a dynamically updated call site.

          $addSite knows that it's performing addition and knows how to do nothing other than update the call site the 1st time it's invoked. $addSite is local to the function so if you define another function doing addition it'll have its own site instance.

          So the 1st thing the call site does is a call back into the IronPython runtime which starts looking at a & b to figure out what to do. Python defines that as try __add__, maybe try __radd__, handle coercion, etc... So we go looking through finding the __add__ method - if that can return NotImplemented then we find the __radd__ method, etc... In this case we're just adding two integers and we know that the implementation of Add() won't return NotImplemented - so there's no need to call __radd__. We know we don't have to worry about NotImplemented because the Add method doesn't have the .NET attribute indicating it can return NotImplemented.
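
          (That search can be sketched in Python itself -- simplified, since
          the real rules also give a subclass's reflected method priority:)

              def binary_add(a, b):
                  # Simplified sketch of the __add__/__radd__ protocol.
                  result = NotImplemented
                  if hasattr(type(a), '__add__'):
                      result = type(a).__add__(a, b)
                  if result is NotImplemented and hasattr(type(b), '__radd__'):
                      result = type(b).__radd__(b, a)
                  if result is NotImplemented:
                      raise TypeError("unsupported operand type(s) for +")
                  return result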

          At this point we need to do two things. We need to generate the test which is going to see if future arguments are applicable to what we just figured out, and then we need to generate the code which is actually going to handle this. That gets combined together into the new call site delegate and it'll look something like:

          static object CallSiteStub(CallSite site, object a, object b) {
              if (a != null && a.GetType() == typeof(int) && b != null && b.GetType() == typeof(int)) {
                  return IntOps.Add((int)a, (int)b);
              }
              return site.UpdateBindingAndInvoke(a, b);
          }

          That gets compiled down as a lightweight dynamic method which also gets JITed. The next time through, the call site's Invoke body will be this method and things will go really fast if we have ints again. Also notice this is looking an awful lot like the inlined/fast-path(?) code dealing with ints that you quoted. If everything was awesome (currently it's not for a couple of reasons) the JIT would even inline the IntOps.Add call and it'd probably be near identical. And everything would be running native on the CPU.
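
          (If it helps, here is a toy Python model of such a self-updating
          call site -- names and structure are illustrative, not the DLR's
          actual API:)

              class ToyCallSite(object):
                  def __init__(self):
                      self.invoke = self._update_and_invoke  # 1st call hits the binder

                  def _update_and_invoke(self, a, b):
                      ta, tb = type(a), type(b)
                      fast_path = lambda x, y: x + y  # stands in for the bound Add()
                      def stub(x, y):
                          if type(x) is ta and type(y) is tb:  # the generated type test
                              return fast_path(x, y)           # the cached fast path
                          return self._update_and_invoke(x, y) # fall back, respecialize
                      self.invoke = stub  # install the generated rule
                      return fast_path(a, b)

              site = ToyCallSite()
              print(site.invoke(2, 2))  # first call binds, then answers 4
              print(site.invoke(3, 4))  # later calls hit the int/int stub: 7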

          So that's how 2 + 2 works... Finally if it's a user type then we'd generate a more complicated test like (and getting more and more pseudo code to keep things simple):

          if (PythonOps.CheckTypeVersion(a, 42) && PythonOps.CheckTypeVersion(b, 42)) {
              return $callSite.Invoke(__cachedAddSlot__.__get__(a), b);
          }

          Here $callSite is another stub which will handle doing optimal dispatch to whatever __add__.__get__ will return. It could be a Python type, it could be a user defined function, it could be the Python built-in sum function, etc... so that's the reason for the extra dynamic dispatch.

          So in summary: everything is compiled to IL. At runtime we have lots of stubs all over the place which do the work to figure out the dynamic operation and then cache the result of that calculation.

          Also what I've just described is how IronPython 2.0 works. IronPython 1.0 is basically the same but mostly w/o the stubs and where we use stub methods they're much less sophisticated.

          Also, IronPython is open source - www.codeplex.com/IronPython


          • castironpi

            #20
            Re: interpreter vs. compiled

            I note that IronPython and Python's pickle.dumps do not return the
            same value. Perhaps this relates to the absence of interpreter loop.
            >>> p.dumps( { 'a': True, 'b': set( ) } )
            IPy: '(dp0\nVb\np1\nc__builtin__\nset\np3\n((lp4\ntp5\nRp2\nsVa\np6\nI01\ns.'
            CPy: "(dp0\nS'a'\np1\nI01\nsS'b'\np2\nc__builtin__\nset\np3\n((lp4\ntp5\nRp6\ns."

            You make me think of a more elaborate example.

            for k in range( 100 ):
                i= j()
                g= h+ i
                e= f+ g
                c= d+ e
                a= b+ c

            Here, j creates a new class dynamically, and returns an instance of
            it. Addition is defined on it but the return type from it varies.

            If I read you correctly, IPy can leave hundreds of different addition
            stubs lying around at the end of the for-loop, each of which only
            gets executed once or twice, each of which was compiled for the exact
            combination of types it was called for.

            I might construe this to be a degenerate case, and the majority of
            times, you'll reexecute stubs enough to outweigh the length of time
            the compilation step takes. If you still do the bounds checking, it
            takes extra instructions (C doesn't), but operation switch-case
            BINARY_ADD, (PyInt_CheckExact(v) && PyInt_CheckExact(w)), and POP and
            TOP, are all handled by the selection of stubs from $addSite.

            I read this from last April:
            >>The most interesting cases to me are the 5 tests where CPython is more than 3x faster than IronPython and the other 5 tests where IronPython is more than 3x faster than CPython. CPython's strongest performance is in dictionaries with integer and string keys, list slicing, small tuples and code that actually throws and catches exceptions. IronPython's strongest performance is in calling builtin functions, if/then/else blocks, calling python functions, deep recursion, and try/except blocks that don't actually catch an exception.
            <<< http://lists.ironpython.com/pipermai...il/004773.html

            It's interesting that CPython can make those gains still by using a
            stack implementation.

            I'll observe that IronPython has the additional dependency of the
            full .NET runtime. (It was my point 7/18 about incorporating the GNU
            libs, that to compile to machine-native, as a JIT does, you need the
            instruction set of the machine.) Whereas CPython can disregard
            them, having already been compiled for it.

            I think what I was looking for is that IronPython employs the .NET to
            compile to machine instructions, once it's known what the values of
            the variables are that are the operands. The trade-off is compilation
            time + type checks + stub look-up.

            What I want to know is, if __add__ performs an attribute look-up, is
            that optimized in any way, after the IP is already in compiled code?

            After all that, I don't feel so guilty about stepping on Tim's toes.


            • Dino Viehland

              #21
              RE: interpreter vs. compiled

              It looks like the pickle differences are due to two issues. First, IronPython doesn't have ASCII strings, so it serializes strings as Unicode. Second, there are dictionary ordering differences. If you just do:

              { 'a': True, 'b': set( ) }

              CPy prints: {'a': True, 'b': set([])}
              IPy prints: {'b': set([]), 'a': True}

              The important thing is that we interop - and indeed you can send either pickle string to either implementation and the correct results are deserialized (modulo getting Unicode strings).

              For your more elaborate example you're right that there could be a problem here. But the DLR actually recognizes this sort of pattern and optimizes for it. All of the additions in your code are what I've been calling serially monomorphic call sites. That is, they see the same types for a while, maybe even just once as in your example, and then they switch to a new type - never to return to the old one. When IronPython gives the DLR the code for the call site, the DLR can detect when the code only differs by constants - in this case type version checks. It will then re-write the code, turning the changing constants into variables. The next time through, when it sees the same code again, it'll re-use the existing compiled code with the new sets of constants.

              That's still slower than we were in 1.x so we'll need to push on this more in the future - for example producing a general rule instead of a type-specific rule. But for the time being, having the DLR automatically handle this has been working well enough for these situations.
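
              (A rough Python analogue of that rewrite -- the names here,
              like _version, are made up, and this only models the idea:
              closures produced by one factory share a single compiled code
              object, so only the closed-over "constant" differs:)

                  def make_rule(expected_version):
                      def rule(a, b):
                          # the type-version constant became a closed-over variable
                          return getattr(type(a), '_version', None) == expected_version
                      return rule

                  r42, r43 = make_rule(42), make_rule(43)
                  print(r42.__code__ is r43.__code__)  # True: one body, two constants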


              • Terry Reedy

                #22
                Re: interpreter vs. compiled



                castironpi wrote:
                >The current CPython VM does not compile code.
                CPython compiles Python code to bytecode for its CPython *hardware
                independent* VM using standard compiler methods and tools (lexer,
                parser, code generator, and optimizer). That VM (interpreter) is
                written in portable-as-possible C, with machine/OS #ifdefs added as
                needed.
                WHY NOT? Why doesn't CPython do it?
                1. Portability: The Microsoft C# JIT compiler runs under Windows .NET on
                x86/amd64 and maybe IA64 and what else? Just porting .NET to run on
                Linux on the same processors was/is a big task. Does MONO have a JIT also?

                There is a JIT for Python: Psyco. It originally only worked on x86. I
                am not sure what else. It originated as a PhD project, working with
                CPython, and was developed further as part of PyPy, but I do not know if
                there is any current progress.
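
                (For reference, Psyco's usual entry point was about this
                small, on supported x86 systems:)

                    import psyco   # x86-only specializing JIT for CPython
                    psyco.full()   # compile functions aggressively as they run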

                The Python VM runs on numerous platforms.

                2. Money: C#, its JIT, and IronPython were and are funded by MS.
                Getting JIT right is hard and tedious.

                CPython is mostly a volunteer project. It is also the Python development
                platform. So it has to be simple enough for volunteers to pick up on
                its innards and for experimentation to be possible. Give the PSF more
                resources and perhaps that could change.

                tjr



                • Tim Roberts

                  #23
                  Re: interpreter vs. compiled

                  castironpi <castironpi@gmail.com> wrote:
                  >
                  >In C, we have:
                  >
                  >int x, y;
                  >x= 10;
                  >y= x+ 1;
                  >
                  >It translates as, roughly:
                  >>
                  >8000 .data
                  >7996 ffffffff #x
                  >7992 ffffffff #y
                  >7988 .end data
                  >7984 loadi reg0 7996
                  >7980 loadi reg1 7992
                  >7976 loadi reg2 10
                  >7972 loadi reg3 1
                  >7968 storv reg2 reg0
                  >7964 add reg0 reg1 reg2
                  >7960 storv reg3 reg1
                  I don't recognize that assembly language. Is that another intermediate
                  language?
                  >You are telling me that the same thing happens in IronPython.
                  Yes, the same process happens.
                  >By the
                  >time the instruction pointer gets to 'x= 10', the next 7 instructions
                  >are the ones shown here compiled from C.
                  I most certainly did NOT say that, as you well know. Different C compilers
                  produce different instruction sequences for a given chunk of code. Indeed,
                  a single C compiler will produce different instruction sequences based on
                  the different command-line options. It's unreasonable to expect a Python
                  compiler to produce exactly the same code as a C compiler.

                  However, that does not disqualify the Python processor as a "compiler".
                  >CMIIW, but the CPython implementation -does- -not-.
                  And again, I never said that it did. CPython is an interpreter. The
                  user's code is never translated into machine language.
                  >My point is, CPython takes more than seven steps. My question is,
                  >does IronPython?
                  So, if compiler B isn't as good at optimization as compiler A, does that
                  mean in your mind that compiler B is not a "compiler"?
                  --
                  Tim Roberts, timr@probo.com
                  Providenza & Boekelheide, Inc.


                  • Bob Martin

                    #24
                    Re: interpreter vs. compiled

                    in 76135 20080731 090911 Dennis Lee Bieber <wlfraed@ix.netcom.com> wrote:
                    >On Thu, 31 Jul 2008 06:17:59 GMT, Tim Roberts <timr@probo.com> declaimed
                    >the following in comp.lang.python:
                    >
                    >
                    >And again, I never said that it did. CPython is an interpreter. the
                    >user's code is never translated into machine language.
                    >>
                    >Using that definition, the UCSD P-code Pascal and Java are also not
                    >"compilers" -- all three create files containing instructions for a
                    >non-hardware virtual machine.
                    >
                    >The only difference between Python, UCSD Pascal, and Java is that
                    >Python foregoes the explicit "compiler" pass.
                    >
                    >BASIC (classical microcomputer implementations -- like the one M$
                    >supplied for TRS-80s) is an interpreter -- the pre-scan of the source
                    >merely translated BASIC keywords into a byte index, not into opcodes for
                    >any virtual machine.
                    You are confusing languages with implementations, as I pointed out earlier.
                    Java is a language.
                    I have used at least 2 Java compilers, i.e. they compiled Java source to native
                    machine language.


                    • Duncan Booth

                      #25
                      Re: interpreter vs. compiled

                      Terry Reedy <tjreedy@udel.edu> wrote:
                      1. Portability: The Microsoft C# JIT compiler runs under Windows .NET
                      on x86/amd64 and maybe IA64 and what else? Just porting .NET to run
                      on Linux on the same processors was/is a big task. Does MONO have a
                      JIT also?
                      Technically there is no such thing as a Microsoft C# JIT compiler: the C#
                      compiler targets IL and the JIT compilers convert IL to the native machine,
                      but C# is just one of the frontend compilers you could use.

                      Microsoft do JIT compilers for .Net Compact Framework that target ARM,
                      MIPS, SHx and x86. The Mono JIT supports:

                      s390, s390x (32 and 64 bits) Linux
                      SPARC(32) Solaris, Linux
                      PowerPC Linux, Mac OSX
                      x86 Linux, FreeBSD, OpenBSD, NetBSD, Microsoft Windows, Solaris, OS X
                      x86-64: AMD64 and EM64T (64 bit) Linux, Solaris
                      IA64 Itanium2 (64 bit) Linux
                      ARM: little and big endian Linux (both the old and the new ABI)
                      Alpha Linux
                      MIPS Linux
                      HPPA Linux

                      (from http://www.mono-project.com/Supported_Platforms)


                      So I'd say .Net scores pretty highly on the portability stakes. (Although
                      of course code written for .Net might not do so well).

                      --
                      Duncan Booth http://kupuguy.blogspot.com


                      • Terry Reedy

                        #26
                        Re: interpreter vs. compiled



                        Duncan Booth wrote:
                        Terry Reedy <tjreedy@udel.edu> wrote:
                        >
                        >1. Portability: The Microsoft C# JIT compiler runs under Windows .NET
                        >on x86/amd64 and maybe IA64 and what else? Just porting .NET to run
                        >on Linux on the same processors was/is a big task. Does MONO have a
                        >JIT also?
                        >
                        Technically there is no such thing as a Microsoft C# JIT compiler: the C#
                        compiler targets IL and the JIT compilers convert IL to the native machine,
                        but C# is just one of the frontend compilers you could use.
                        >
                        Microsoft do JIT compilers for .Net Compact Framework that target ARM,
                        MIPS, SHx and x86. The Mono JIT supports: [platform list snipped]
                        >
                        So I'd say .Net scores pretty highly on the portability stakes. (Although
                        of course code written for .Net might not do so well).
                        Did you mean IL scores highly? In any case, scratch 1. portability as a
                        reason why CPython lacks a JIT. That leaves 2. $ in the multimillions.

                        More:
                        3. Design difference 1: I suspect that IL was designed with JIT
                        compilation in mind, whereas PyCode was certainly not.

                        4. Design difference 2: The first 'killer app' for Python was driving
                        compiled Fortran/C functions (early Numeric). If a 'Python' program
                        spends 95% of its time in compiled-to-machine-code extensions, reducing
                        the other 5% to nothing gains little. CPython *was* and has been
                        designed for this.

                        The process continues. The relatively new itertools module was designed
                        and tested in Python (see itertools in the Library reference). But the
                        delivered module is compiled C.

                        tjr


                        • castironpi

                          #27
                          Re: interpreter vs. compiled

                          On Jul 31, 1:17 am, Tim Roberts <t...@probo.com> wrote:
                          castironpi <castiro...@gmail.com> wrote:
                          >
                          In C, we have:
                          >
                          int x, y;
                          x= 10;
                          y= x+ 1;
                          >
                          It translates as, roughly:
                          >
                          8000 .data
                          7996 ffffffff #x
                          7992 ffffffff #y
                          7988 .end data
                          7984 loadi reg0 7996
                          7980 loadi reg1 7992
                          7976 loadi reg2 10
                          7972 loadi reg3 1
                          7968 storv reg2 reg0
                          7964 add reg0 reg1 reg2
                          7960 storv reg3 reg1
                          >
                          I don't recognize that assembly language.  Is that another intermediate
                          language?
                          I'm looking at a system word of 1's and 0's that gets executed on a
                          per-cycle basis in the processor. Could easily be that the designs
                          are tuned to JIT's these days and I'm out of date, what with
                          pipelining and lookahead branching and all, but no, it's what I
                          remember from system architecture class.
                          You are telling me that the same thing happens in IronPython.
                          >
                          Yes, the same process happens.
                          >
                          By the
                          time the instruction pointer gets to 'x= 10', the next 7 instructions
                          are the ones shown here compiled from C.
                          >
                          I most certainly did NOT say that, as you well know.  Different C compilers
                          produce different instruction sequences for a given chunk of code.  Indeed,
                          a single C compiler will produce different instruction sequences based on
                          the different command-line options.  It's unreasonable to expect a Python
                          compiler to produce exactly the same code as a C compiler.
                          >
                          However, that does not disqualify the Python processor as a "compiler".
                          >
                          CMIIW, but the CPython implementation -does- -not-.
                          >
                          And again, I never said that it did.  CPython is an interpreter.  The
                          user's code is never translated into machine language.
                          >
                          My point is, CPython takes more than seven steps.  My question is,
                          does IronPython?
                          >
                          So, if compiler B isn't as good at optimization as compiler A, does that
                          mean in your mind that compiler B is not a "compiler"?
                          --
                          Tim Roberts, t...@probo.com
                          Providenza & Boekelheide, Inc.
                          You can translate C code to machine code, for any given C code, for
                          any given machine.

                          You can translate Python code to machine code, for some Python code,
                          for any given machine.

                          Given the restrictions (or rather, freedoms) of Python, does there
                          exist code that necessarily cannot translate to machine code? In
                          other words, can you translate all Python code to machine code?
                          Similarly, I take it that the decision to make CPython a stack machine
                          + VM was a design decision, not a necessity, favoring internal
                          simplicity over the extra 5%.
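
                          (One concrete obstacle to translating *everything* ahead of
                          time, sketched as a hypothetical: source that does not exist
                          until run time. Any full translation of this has to ship
                          something that can still compile Python:)

                              src = "def f(n): return n + 1"
                              exec(src)     # the function is compiled at run time
                              print(f(41))  # 42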

                          The output of the program is determined from the input by the Python
                          specification, regardless of implementation, but the answer still
                          isn't necessarily yes. But I think the only counterexample that comes
                          to me is the case of a dynamic grammar, not merely dynamic data type,
                          so for Python maybe it is. And furthermore, I think I'm getting
                          confused about what exactly constitutes an interpreter: is it whether
                          there is a process that runs product instructions, or whether the product
                          instructions can run standalone? I would take 'compiler' to mean
                          something that outputs an .EXE executable binary file, and I don't
                          just mean bundling up the python.exe executable with a file. Python
                          needs to be present on the machine you run target code on, not so with
                          C binaries. Can .NET bring its targets to such a state, or are those
                          run-times required? Are they just DLL's, or are they, for lack of a
                          better word, driving?

                          (Of course, for futuristic abstract hardware designs, CPython may be
                          mostly native instructions, and the output of its processor on
                          function definition, can be stored and run directly, such as for a
                          stack machine architecture.)

                          But I don't want to restrict my question to a question of
                          optimization. Your compiler could output something like this:

                          read variables into registers
                          reorganize variable dictionaries
                          perform addition
                          do laundry
                          write variables into system memory
                          clean sink

                          and run when you write 'C:\>silly.exe', and so on. And still be
                          compiled, even if the live IronPython session, which you invoke with
                          'C:\>ironpy.exe silly.py', outputs the same 7 MIPS instructions.


                          • Paul Boddie

                            #28
                            Re: interpreter vs. compiled

                            On 1 Aug, 07:11, castironpi <castiro...@gmail.com> wrote:
                            >
                            Given the restrictions (or rather, freedoms) of Python, does there
                            exist code that necessarily cannot translate to machine code?  In
                            other words, can you translate all Python code to machine code?
                            Given that all valid Python code can be executed somehow and that
                            execution takes place as the processor performs instructions which "it
                            gets from somewhere", meaning that those instructions can belong
                            either to a general interpreter or to specific code generated for a
                            given user program (or a combination of these things), I think that
                            you have to refine your question. What you seem to be asking is this:
                            can you translate Python code to machine code which encodes the
                            behaviour of the user program in a way nearing the efficiency of code
                            generated from other programming languages? Rephrased, the question is
                            this: can Python code be efficiently represented using low-level
                            machine instructions?

                            I think you've already touched upon this when thinking about integer
                            operations. The apparently simple case of integer addition in Python
                            is not completely encoded by a few machine instructions. In other
                            words...

                            a + b # in Python

                            ...is not sufficiently represented by...

                            ldr r1, a
                            ldr r2, b
                            add r3, r1, r2

                            ...in some assembly language (and the resulting machine code), mostly
                            because the semantics of Python addition are more complicated. Of
                            course, you can generate code for those semantics, which would lead to
                            quite a few more machine instructions than those suggested above, but
                            then it might be interesting to bundle those instructions in some kind
                            of subroutine, and we could call this subroutine BINARY_ADD. At this
                            point, you'd almost be back at the stage where you're writing a
                            bytecode interpreter again.
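
                            (A two-line Python demonstration of why: the same BINARY_ADD
                            selects different behaviour from the operand types at run
                            time:)

                                for a, b in [(1, 2), ('x', 'y'), ([1], [2])]:
                                    print(a + b)  # int add, str concat, list concat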

                            Of course, it's worth considering something in between these
                            situations (the verbose expansion of the user program vs. a bytecode
                            interpreter which examines virtual instructions and jumps to
                            subroutines), and there are apparently a few techniques which make
                            virtual machines more efficient (so that the processor isn't jumping
                            around too much in the interpreter code, for example), and there are
                            also going to be techniques which permit the simplification of any
                            verbose machine code representation (most likely by not generating
                            code which is never going to be executed, due to various properties of
                            the program).
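
One such technique, sketched in Python with the same toy opcodes:
resolve each virtual instruction to its handler once, ahead of
execution, so the inner loop no longer examines opcodes at all (real
implementations do the equivalent in C, with threaded dispatch or
function pointers):

def do_add(stack):
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)

def do_pop(stack):
    stack.pop()

HANDLERS = {"BINARY_ADD": do_add, "POP_TOP": do_pop}

def predecode(instructions):
    # one-time translation: opcode names -> bound handlers
    return [HANDLERS[op] for op in instructions]

stack = [1, 2]
for step in predecode(["BINARY_ADD", "POP_TOP"]):
    step(stack)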

                            Obviously, CPython isn't oriented towards investigating these matters
                            in great depth, but that doesn't mean that other implementations can't
                            pursue other approaches.
                            Similarly, I take it that the decision to make CPython a stack machine
                            + VM was a design decision, not a necessity, favoring internal
                            simplicity over the extra 5%.
                            Probably: it simplifies code generation somewhat.

                            Paul


                            • Terry Reedy

                              #29
                              Re: interpreter vs. compiled

                              castironpi wrote:
                              Similarly, I take it that the decision to make CPython a stack machine
                              + VM was a design decision, not a necessity, favoring internal
                              simplicity over the extra 5%.
                              Years ago, someone once started a project to write a register-based
                              virtual machine for (C)Python. I suspect it was abandoned for some
                              combination of lack of time and preliminary results showing little
                              speedup for the increased complication. But I never saw any 'final
                              report'.
And furthermore, I think I'm getting
confused about what exactly constitutes an interpreter: is it whether
there is a process that runs the product instructions, or whether the
product instructions can run standalone? I would take 'compiler' to
mean something that outputs an .EXE executable binary file,
                              This is way too restrictive. Does *nix have no compilers? In any case,
the CPython compiler uses standard compiler components: lexer, parser,
                              syntax tree, code generator, and peephole optimizer. The result is a
                              binary file (.pyc for Python compiled) executable on a PyCode machine.
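
You can watch those components run from the prompt: compile() takes
source through the lexer, parser, and code generator in one call, and
dis shows the generated PyCode. (Session from CPython 2.x; the exact
output varies slightly by version.)

>>> import dis
>>> code = compile("a + b", "<example>", "eval")
>>> dis.dis(code)
  1           0 LOAD_NAME                0 (a)
              3 LOAD_NAME                1 (b)
              6 BINARY_ADD
              7 RETURN_VALUE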


                              • castironpi

                                #30
                                Re: interpreter vs. compiled

On Aug 1, 5:24 am, Paul Boddie <p...@boddie.org.uk> wrote:
On 1 Aug, 07:11, castironpi <castiro...@gmail.com> wrote:
                                >
                                >
                                >
                                Given the restrictions (or rather, freedoms) of Python, does there
                                exist code that necessarily cannot translate to machine code?  In
                                other words, can you translate all Python code to machine code?
                                >
                                Given that all valid Python code can be executed somehow and that
                                execution takes place as the processor performs instructions which "it
                                gets from somewhere", meaning that those instructions can belong
                                either to a general interpreter or to specific code generated for a
                                given user program (or a combination of these things), I think that
                                you have to refine your question. What you seem to be asking is this:
                                can you translate Python code to machine code which encodes the
                                behaviour of the user program in a way nearing the efficiency of code
                                generated from other programming languages? Rephrased, the question is
                                this: can Python code be efficiently represented using low-level
                                machine instructions?
                                >
                                I think you've already touched upon this when thinking about integer
                                operations. The apparently simple case of integer addition in Python
                                is not completely encoded by a few machine instructions. In other
                                words...
                                >
                                  a + b # in Python
                                >
                                ...is not sufficiently represented by...
                                >
                                  ldr r1, a
                                  ldr r2, b
                                  add r3, r1, r2
                                >
                                ...in some assembly language (and the resulting machine code), mostly
                                because the semantics of Python addition are more complicated.
No, it is not sufficiently represented: Python runs type checks before
the operation and an overflow check after it.

                                test safeinteger a
                                test safeinteger b
                                ldr r1, a
                                ldr r2, b
                                add r3, r1, r2
                                test not overflow

                                However, no implementation of Python can do better, given Python's
                                specification.
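
You can watch that overflow test fire in a stock CPython 2.x session
(values from a 32-bit build, where sys.maxint is 2147483647): instead
of wrapping, the result is quietly promoted to a long.

>>> import sys
>>> sys.maxint + 1
2147483648L
>>> type(sys.maxint + 1)
<type 'long'>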
                                Of
                                course, you can generate code for those semantics, which would lead to
                                quite a few more machine instructions than those suggested above, but
                                then it might be interesting to bundle those instructions in some kind
                                of subroutine, and we could call this subroutine BINARY_ADD. At this
                                point, you'd almost be back at the stage where you're writing a
                                bytecode interpreter again.
This isn't the bytecode interpreter returning; it's bounds checking,
which I hold is part and parcel of Python.

Another factor: in a given C context, a and b are known to be, and
always remain, integers.

                                int a, b;
                                ...
                                a + b

                                The C compilation process outputs:

                                ldr r1, a
                                ldr r2, b
                                add r3, r1, r2

                                and you are correct. However, for:

std::string a, b;   /* C++, say; C itself has no built-in string type */
a + b

performs a concatenation, which is not that simple. The point is, after
C compilation you actually have -ldr, ldr, add- lying around in a file
somewhere, where it can run as three consecutive instructions on the
processor. In Python the corresponding sequence is already baked into
the interpreter: the -test, test, ldr, ldr, add, test- sequence sits
somewhere inside Python.exe, specifically wherever the object code for
ceval.c ends up.

Incidentally, I find a 2-byte difference in a 16K binary between a
simple C program that merely executes a + b and one that executes
a - b. In this practical case the output is not a bare three-word
sequence (12 bytes, ldr-ldr-add vs. ldr-ldr-sub), though it's not
clear what entry points the OS requires.
                                Of course, it's worth considering something in between these
                                situations (the verbose expansion of the user program vs. a bytecode
                                interpreter which examines virtual instructions and jumps to
                                subroutines), and there are apparently a few techniques which make
                                virtual machines more efficient (so that the processor isn't jumping
                                around too much in the interpreter code, for example), and there are
                                also going to be techniques which permit the simplification of any
                                verbose machine code representation (most likely by not generating
                                code which is never going to be executed, due to various properties of
                                the program).
                                I think it's relevant and fair to consider -consecutive- language
                                instructions at this point. Working example:

                                int a, b, c, d;
                                ...
                                a + b
                                c + d

                                The C compilation process outputs:

                                ldr r1, a
                                ldr r2, b
                                add r3, r1, r2
                                ldr r1, c
                                ldr r2, d
                                add r3, r1, r2

                                Whereas there is no equivalent in CPython. The actual code that runs
                                on the processor (summary) is:

                                :loop
                                ...
                                :addition_sign
                                test safeinteger a
                                test safeinteger b
                                ldr r1, a
                                ldr r2, b
                                add r3, r1, r2
                                test not overflow
                                :goto loop

as opposed to two duplicate sections being output back-to-back, even
though they effectively -run- back-to-back. For a different type, say
list concatenation, the disassembly looks like:
>>> import dis
>>> def f():
...     [] + []
...
>>> f()
>>> dis.dis(f)
  2           0 BUILD_LIST               0
              3 BUILD_LIST               0
              6 BINARY_ADD
              7 POP_TOP
              8 LOAD_CONST               0 (None)
             11 RETURN_VALUE

                                the meaning of which I am not finding in ceval.c. Anyone?
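
My best reading of the 2.x sources so far: BINARY_ADD fast-paths exact
ints and exact strings, and everything else, lists included, falls
through to PyNumber_Add, which dispatches on the operand types.
Roughly, in Python (my names, not the real C):

import operator

def binary_add(v, w):
    # sketch of ceval.c's BINARY_ADD case
    if isinstance(v, int) and isinstance(w, int):
        return v + w              # INLINE int + int fast path
    if isinstance(v, str) and isinstance(w, str):
        return v + w              # string concatenation fast path
    return operator.add(v, w)     # PyNumber_Add: full type dispatch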

                                Regardless, the JIT compilation process allocates a new executable
                                block of memory:

                                :loop
                                ...
                                :addition_sign
output( 'ldr r1, a; ldr r2, b; add r3, r1, r2' )
                                :goto loop

                                which in this stage executes twice, yielding

                                ldr r1, a
                                ldr r2, b
                                add r3, r1, r2
                                ldr r1, c
                                ldr r2, d
                                add r3, r1, r2

somewhere in memory, same as C. Then it runs the block. It also has
to have already ascertained that 'a' and 'b' are necessarily integers
by the time it makes the output() call. I remain unclear on why JIT
compilation is not called compiling, and on why it doesn't output an
executable binary, unless the practical reasons (saving developer
time, keeping object files cross-platform), or mere terminology, are
the only difference.
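
To pin down "allocates a new executable block," here is a toy version
in Python, with generated Python source standing in for generated
machine code (nothing IronPython-specific; the names are mine):

def emit_specialized_add():
    # stand-in for reserving an executable block and emitting
    # ldr/ldr/add into it, once the operand types are known
    src = "def specialized(a, b):\n    return a + b\n"
    namespace = {}
    exec src in namespace        # build the new code object in memory
    return namespace["specialized"]

add = emit_specialized_add()      # generation happens once...
print add(2, 3)                   # ...then the emitted block just runs: 5
print add(5, 7)                   # 12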
