What is the difference between compiling and linking?

**arora** · Jul 14 '10, 08:00 AM

Compiler: creates machine-usable binary files from the source code.
Linker: there are now several binary files, so we need the linker to join them all together into something useful.

**Banfa** · Jul 14 '10, 08:27 AM

Not bad.

The compiler works on a single source file, compiling it into a single object file (containing machine-usable binary code). You may pass more than one source file to the compiler, but it treats each one individually.

The linker combines multiple object files plus any libraries into a single executable image, resolving all symbols (making sure that every reference to a symbol actually has a definition to refer to).

**donbock** · Jul 14 '10, 11:59 AM

The term "linker" is actually short for "link editor", although I haven't heard the longer term used in a while.

The compiler converts the source file into an object file. The object file consists of [at least] the assembler instructions that implement the source code, a list of references, and a list of definitions. More on those two lists later.

Consider a program consisting of two source files, let's call them file1.c and file2.c, respectively:

Code (file1.c):

```c
void sub1(void);
int main(void) {
   sub1();
   return 0;
}
```

Code (file2.c):

```c
void sub1(void);
static void sub2(void);
void sub1(void) {
   sub2();
}
static void sub2(void) {
}
```

Now you compile file1.c. The compiler replaces the function call to sub1 with whatever assembler instructions accomplish the same thing (let's call it JSR). The compiler knows there is a function called sub1, but it doesn't have any idea where it is. It uses a placeholder value (perhaps 0, but it could be anything) as the argument to the JSR. It adds an entry to the reference list that associates the location of this placeholder value with the symbol sub1.

Now you compile file2.c. The compiler again generates the corresponding assembler instructions. In the function call to sub2, the compiler knows the address of function sub2 relative to the start of file2, but that won't be the actual address if other modules precede this one. It again uses a placeholder as the argument to the JSR and adds an entry to the reference list that associates this placeholder with the relative address of sub2. The compiler also adds an entry to the definition list for the address (relative to the start of file2) of the entry point of sub1.

Now we link these two object files together; let's assume they're in the order file1.o followed by file2.o. The linker scans all of the object files to extract all of the reference lists into a single combined reference list; likewise for the definition lists. As it goes, it converts each definition address from being relative to the start of its object file to being relative to the start of the program. It can do this because it knows the size of the block of assembler code in each object module. Then it concatenates all of the blocks of assembler code. It then uses the reference list to replace each placeholder value with the proper address of the referenced symbol. The result is the executable image file.
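To make the linking step concrete, here is a minimal toy sketch in C. It is not how any real linker or object format works: the `ToyObject` layout, the opcode values `TOY_JSR` and `TOY_RET`, and the function names are all invented for illustration. Each array slot stands in for one machine word, and for simplicity the static function sub2 is looked up by name just like a global symbol. The sketch also concatenates the code first and resolves symbols afterwards, which has the same effect as the order described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Made-up opcodes; a real instruction set looks nothing like this. */
enum { TOY_JSR = 0xA0, TOY_RET = 0xB0, TOY_PLACEHOLDER = 0 };

/* One entry in a definition list or a reference list.
   offset is relative to the start of the owning object's code. */
typedef struct {
    const char *symbol;
    size_t offset;
} ToyEntry;

/* A toy object file: code, definitions, references. */
typedef struct {
    unsigned code[8];
    size_t size;                     /* words of code actually used */
    ToyEntry defs[4]; size_t ndefs;  /* symbols this object defines */
    ToyEntry refs[4]; size_t nrefs;  /* placeholders to be patched */
} ToyObject;

/* Search every definition list for symbol; on success store its
   program-relative address in *addr and return 1, else return 0. */
static int toy_resolve(const ToyObject *objs, const size_t *base, size_t n,
                       const char *symbol, size_t *addr)
{
    for (size_t i = 0; i < n; i++)
        for (size_t d = 0; d < objs[i].ndefs; d++)
            if (strcmp(objs[i].defs[d].symbol, symbol) == 0) {
                *addr = base[i] + objs[i].defs[d].offset;
                return 1;
            }
    return 0;
}

/* Link objs[0..n) in order into image. Returns the image size in
   words, or 0 on failure (no room, or an undefined symbol). */
size_t toy_link(const ToyObject *objs, size_t n, unsigned *image, size_t cap)
{
    size_t base[8];   /* where each object's code starts in the image */
    size_t total = 0;

    if (n > 8)
        return 0;

    /* Concatenate the code blocks, remembering each start offset. */
    for (size_t i = 0; i < n; i++) {
        if (objs[i].size > cap - total)
            return 0;
        base[i] = total;
        memcpy(image + total, objs[i].code, objs[i].size * sizeof image[0]);
        total += objs[i].size;
    }

    /* Replace every placeholder with the address of its symbol. */
    for (size_t i = 0; i < n; i++)
        for (size_t r = 0; r < objs[i].nrefs; r++) {
            size_t addr;
            assert(objs[i].refs[r].offset < objs[i].size);
            if (!toy_resolve(objs, base, n, objs[i].refs[r].symbol, &addr))
                return 0;   /* the toy version of "undefined reference" */
            image[base[i] + objs[i].refs[r].offset] = (unsigned)addr;
        }
    return total;
}

/* file1.o followed by file2.o, modelled on the two source files above. */
size_t toy_link_demo(unsigned *image, size_t cap)
{
    const ToyObject objs[2] = {
        { .code = { TOY_JSR, TOY_PLACEHOLDER, TOY_RET }, .size = 3,
          .defs = { { "main", 0 } }, .ndefs = 1,
          .refs = { { "sub1", 1 } }, .nrefs = 1 },
        { .code = { TOY_JSR, TOY_PLACEHOLDER, TOY_RET, TOY_RET }, .size = 4,
          .defs = { { "sub1", 0 }, { "sub2", 3 } }, .ndefs = 2,
          .refs = { { "sub2", 1 } }, .nrefs = 1 },
    };
    return toy_link(objs, 2, image, cap);
}
```

Calling `toy_link_demo` with a 16-word buffer yields a 7-word image. The placeholder in main (word 1) becomes 3, the program-relative address of sub1, and the placeholder in sub1 (word 4) becomes 6, the address of sub2.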

There is one further step after linking. Eventually you want to run your program. The loader finds a block of physical memory large enough to hold your program, copies the program into that block, and then scans the program to replace addresses relative to the start of the program with absolute addresses. It does this in a manner analogous to how the linker replaced placeholder values. Finally, the loader transfers control to the entry point of your program.
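The loader's address fix-up can be sketched the same way. This is again a toy model under stated assumptions, not a real loader: `toy_load`, the relocation list `relocs`, the numeric opcodes, and the load address 100 are all made up for illustration. Memory is an array of words, and the image is a 7-word program whose words 1 and 4 hold program-relative addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy a program-relative image into memory at load_base and rebase it.
   relocs[] lists the image offsets whose words hold program-relative
   addresses. Returns the absolute address of the entry point, or
   (size_t)-1 if the image does not fit. */
size_t toy_load(const unsigned *image, size_t size,
                const size_t *relocs, size_t nrelocs, size_t entry,
                unsigned *memory, size_t mem_size, size_t load_base)
{
    if (load_base > mem_size || size > mem_size - load_base)
        return (size_t)-1;

    /* Copy the program into the chosen block of memory. */
    memcpy(memory + load_base, image, size * sizeof image[0]);

    /* Turn each program-relative address into an absolute one. */
    for (size_t r = 0; r < nrelocs; r++) {
        assert(relocs[r] < size);
        memory[load_base + relocs[r]] += (unsigned)load_base;
    }

    /* A real loader would now jump here; the sketch just reports it. */
    return load_base + entry;
}

/* Load a 7-word toy image at the hypothetical address 100. */
size_t toy_load_demo(unsigned *memory, size_t mem_size)
{
    /* 0xA0 = made-up "JSR", 0xB0 = made-up "RET". */
    const unsigned image[7] = { 0xA0, 3, 0xB0, 0xA0, 6, 0xB0, 0xB0 };
    const size_t relocs[2] = { 1, 4 };   /* words holding addresses */
    return toy_load(image, 7, relocs, 2, 0, memory, mem_size, 100);
}
```

With a 128-word memory, `toy_load_demo` returns 100 as the entry point, and the two address words become 103 and 106.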

As a further complication, the entry point to your program is not actually main. It is code provided by the compiler vendor that initializes the runtime environment and then calls main. For details, look at crt0.
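The shape of that startup code can be sketched in plain C. This is only an illustration: real startup code is usually written in assembler and is platform specific, and `toy_start`, `toy_main`, and the `runtime_ready` flag are hypothetical names, not part of any real crt0.

```c
#include <assert.h>
#include <stddef.h>

/* Stands in for "stack, heap and standard I/O have been set up". */
static int runtime_ready = 0;

/* Stand-in for the program's own main. */
static int toy_main(int argc, char **argv)
{
    (void)argv;
    assert(runtime_ready);   /* main relies on the runtime being ready */
    return argc - 1;         /* report the number of arguments as status */
}

/* Stand-in for the real entry point. */
int toy_start(int argc, char **argv)
{
    runtime_ready = 1;                   /* 1. initialize the runtime */
    int status = toy_main(argc, argv);   /* 2. call main */
    return status;                       /* 3. real startup code would pass
                                               this status to exit() */
}
```

The point of the sketch is the ordering: initialization happens before main runs, and main's return value is handed back to the startup code rather than to the operating system directly.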

This is a simplified explanation. I left out a lot of details.
