Doubt on compiler-processor relationship

  • ajaygargnsit
    New Member
    • Jul 2007
    • 18

    Doubt on compiler-processor relationship

    Let's say we are using the gcc compiler on a Pentium 4. Say there is another machine running the same gcc compiler, but with a different processor, say a Sun SPARC. Now, even though only the processor differs, different machine code will be generated for the same source code. (I assume this is the case; kindly correct me if I am wrong.)

    Now, how does the compiler know what machine code to generate? (I presume it has something to do with the compiler "adjusting" its parameters at boot time, but I am not sure. Please help.)

    Looking forward to a reply.

    Ajay Garg
  • JosAH
    Recognized Expert MVP
    • Mar 2007
    • 11453

    #2
    Originally posted by ajaygargnsit
    Let's say we are using the gcc compiler on a Pentium 4. Say there is another machine running the same gcc compiler, but with a different processor, say a Sun SPARC. Now, even though only the processor differs, different machine code will be generated for the same source code. (I assume this is the case; kindly correct me if I am wrong.)

    Now, how does the compiler know what machine code to generate? (I presume it has something to do with the compiler "adjusting" its parameters at boot time, but I am not sure. Please help.)

    Looking forward to a reply.

    Ajay Garg
    Gcc nicely separates the compilation frontend (lexical analysis, parsing,
    abstract code generation) from the backend. The backend 'knows' which
    processor it has to generate code for. Gcc can even do cross compilation,
    i.e. the backend of the compiler is simply replaced by another one.
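    As a small illustration of that frontend/backend split (assuming a typical Linux gcc installation; the SPARC toolchain name below is an example and may not be installed), each gcc binary can tell you which target its backend was built for:

    ```shell
    # Print the target triplet this gcc's backend generates code for.
    # The exact output depends on the installation (e.g. x86_64-linux-gnu).
    gcc -dumpmachine

    # A cross compiler is simply another gcc binary built with a different
    # backend; its name conventionally carries the target triplet as a
    # prefix, e.g. (assuming such a toolchain is installed):
    #   sparc-linux-gnu-gcc -dumpmachine
    ```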

    kind regards,

    Jos


    • ajaygargnsit
      New Member
      • Jul 2007
      • 18

      #3
      Gcc nicely separates the compilation frontend (lexical analysis, parsing,
      abstract code generation) from the backend. The backend 'knows' which
      processor it has to generate code for. Gcc can even do cross compilation,
      i.e. the backend of the compiler is simply replaced by another one.
      That means that compilation from source code to object code is a two-step process: one step on the frontend side, the other on the backend side, right?


      • ajaygargnsit
        New Member
        • Jul 2007
        • 18

        #4
        The compiler itself is a (compiled) binary. You can't run a gcc for a Sparc on
        a pentium, nor vice versa. See your other thread where I also explained things
        a bit more. The compiler doesn't 'know' anything about other processors. It's
        the code generating part of the compiler that knows all about one single processor.
        Well JosAH, I moved this here, so we can continue in the same thread ...

        OK, so that means that the installation CDs of Linux contain different gcc binaries, one for each possible processor?


        • JosAH
          Recognized Expert MVP
          • Mar 2007
          • 11453

          #5
          Originally posted by ajaygargnsit
          Well JosAH, I moved this here, so we can continue in the same thread ...

          OK, so that means that the installation CDs of Linux contain different gcc binaries, one for each possible processor?
          A binary executable is targeted at one specific processor only, so if you have
          Linux running on, say, an ARM and an Intel processor, you need binaries for
          each one of them.

          Note that you can have compiler backends running on processor P1 while
          they generate machine code for processor P2; they're the building blocks for
          cross compilation.
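          To see that a given binary is tied to one processor, you can inspect what the native compiler produced; a hedged sketch (tiny.c is a made-up file name, and the exact `file` output depends on your machine):

          ```shell
          # A trivial program, compiled with the native gcc.
          echo 'int main(void) { return 0; }' > tiny.c
          gcc tiny.c -o tiny

          # 'file' reports the architecture the binary was compiled for;
          # copying it to a different processor family would not work.
          file tiny
          ```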

          kind regards,

          Jos


          • ajaygargnsit
            New Member
            • Jul 2007
            • 18

            #6
            OK, I think I can now make sense of it. Taking my system configuration as an example (Fedora, gcc, Intel Pentium 4), the steps that happen are as follows:

            During installation, the initial booting stage checks the processor. Since mine is an Intel P4, it will install every binary (including gcc, of course) targeted at the P4.

            Now the gcc binaries include:

            a) The frontend binaries (built for the P4)
            b) The backend binaries (also built for the P4), but which are able to cross compile.

            So, all in all, one frontend binary (for the P4) and several backend binaries (for the P4 itself, SPARC, ...).

            Also, that means that all the compilation in the "usual" sense (resolving references, linking, etc.) happens in the frontend, while the backend only converts the code into strings of 0's and 1's.

            Kindly correct me if I am wrong at any point above.

            Thanks JosAH

            Ajay Garg


            • JosAH
              Recognized Expert MVP
              • Mar 2007
              • 11453

              #7
              Originally posted by ajaygargnsit
              OK, I think I can now make sense of it. Taking my system configuration as an example (Fedora, gcc, Intel Pentium 4), the steps that happen are as follows:

              During installation, the initial booting stage checks the processor. Since mine is an Intel P4, it will install every binary (including gcc, of course) targeted at the P4.

              Now the gcc binaries include:

              a) The frontend binaries (built for the P4)
              b) The backend binaries (also built for the P4), but which are able to cross compile.

              So, all in all, one frontend binary (for the P4) and several backend binaries (for the P4 itself, SPARC, ...).

              Also, that means that all the compilation in the "usual" sense (resolving references, linking, etc.) happens in the frontend, while the backend only converts the code into strings of 0's and 1's.

              Kindly correct me if I am wrong at any point above.

              Thanks JosAH

              Ajay Garg
              Actually, resolving references etc. is the work of the linker; the loader then
              substitutes physical addresses for the logical addresses. The compiler backend
              generates COFF or ELF files for that (Common Object File Format and Executable
              and Linkable Format).
              Those files contain the machine instructions for the processor the compiler
              was targeted at.
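              You can see both things in an object file on a typical Linux system; a small sketch (m.c is a made-up file name), using the binutils tools nm and readelf:

              ```shell
              # A module that calls a function the linker must resolve later.
              echo 'extern int puts(const char *s);
              int main(void) { return puts("hi"); }' > m.c
              gcc -c m.c -o m.o            # backend output: a relocatable ELF object

              nm m.o                       # 'U puts' marks an unresolved reference
              readelf -h m.o | head -n 12  # the ELF header's Machine field names the CPU
              ```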

              kind regards,

              Jos


              • ajaygargnsit
                New Member
                • Jul 2007
                • 18

                #8
                Ahh ... well, it would be nice if I could get the "complete" procedure of what happens, from the step of writing a C program to the step of getting it executed.

                I would be obliged for any help.

                (Wouldn't it be nice if we started one step at a time, and then discussed each step, taking into account the different cases possible at each step?)


                • JosAH
                  Recognized Expert MVP
                  • Mar 2007
                  • 11453

                  #9
                  Originally posted by ajaygargnsit
                  Ahh ... well, it would be nice if I could get the "complete" procedure of what happens, from the step of writing a C program to the step of getting it executed.

                  I would be obliged for any help.

                  (Wouldn't it be nice if we started one step at a time, and then discussed each step, taking into account the different cases possible at each step?)
                  I've got a better idea: read a book on compilers; Aho, Sethi and Ullman's
                  "Dragon Book" is legendary. For starters you might have a peek at my
                  little "Compilers" series in the Java Articles section.
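                  Until then, here is a first end-to-end sketch, assuming a glibc-based Linux system (prog.c is a made-up file name): compile and link, let the dynamic loader map in the shared libraries, then run.

                  ```shell
                  echo '#include <stdio.h>
                  int main(void) { puts("done"); return 0; }' > prog.c

                  gcc prog.c -o prog   # compile, assemble and link in one call
                  ldd ./prog           # shared libraries the loader maps in before main runs
                  ./prog               # prints: done
                  ```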

                  kind regards,

                  Jos


                  • weaknessforcats
                    Recognized Expert Expert
                    • Mar 2007
                    • 9214

                    #10
                    I always understood that these chips have a built-in x86-instruction-set-to-native-code translator.

                    Your C++ compiler generates x86 code, which is translated by the chip into native instructions at execution time.

                    Otherwise, you would need a binary for every chip under the sun, plus distribution, maintenance, etc.

                    I even read an article once where a Pentium designer wrote a compiler that generated native Pentium code instead of generic x86 code. Result: the program was 14x faster. His comment: do something nice for people and all they want to do is cook on a campfire.
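                    For what it's worth, gcc lets you opt into a specific processor model of the same family via its -march/-mtune target options (e.g. -march=pentium4 on an x86 gcc); a hedged sketch that just lists what the installed gcc supports, since the option set varies by build:

                    ```shell
                    # Show this gcc's target-specific options and their current
                    # settings; on x86 builds these include -march= and -mtune=,
                    # which pick the processor model to generate/tune code for.
                    gcc -Q --help=target | head -n 15
                    ```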


                    • JosAH
                      Recognized Expert MVP
                      • Mar 2007
                      • 11453

                      #11
                      Originally posted by weaknessforcats
                      I always understood that these chips have a built-in x86-instruction-set-to-native-code translator.

                      Your C++ compiler generates x86 code, which is translated by the chip into native instructions at execution time.

                      Otherwise, you would need a binary for every chip under the sun, plus distribution, maintenance, etc.
                      But we still do, no matter what clever tricks Intel puts in its Pentium processors.
                      Half-buried in the bare metal there's a RISC core that emulates the
                      CISC instruction set of the older Intel processors. Compilers still generate that
                      old instruction set. But, e.g., SPARCs or ARMs (both RISC processors) still need
                      their own binaries. For those Intel chips, the 'hidden' instructions can be
                      considered the microcode of the Pentium; not very clever IMHO, because it
                      takes millions of gates, with all the accompanying heat dissipation, to get
                      reasonable performance. Microcode technology is only a halfway solution:
                      100K-gate RISC processors can do (almost) the same thing at the same speed
                      using a much slower clock (less heat dissipation and power consumption).

                      kind regards,

                      Jos
