optimizers are overrated

  • copx

    Optimizers are overrated

    I started learning ASM not long ago to improve my understanding of the
    hardware architecture and my ability to optimize C code. The results of my
    first experiment were surprising, to say the least. After reading the chapter
    on loops in my ASM book I wanted to test whether modern C compilers are
    actually as smart as commonly claimed. I chose a very simple loop: calling
    putchar 100 times. The first function (foo) uses a typical C-style loop to
    test the assumption that "the compiler will optimize that better than any
    human could". The second function (bar) is based on my newly gained knowledge;
    the loop is basically ASM written in C. Now I am certain that if I asked here
    which one is more efficient, all you guys would reply "the compiler will most
    likely generate the same code in both cases" (I have read such claims
    countless times here). Well, look at the ASM output below to see how wrong
    that assumption is.

    /* C Code */

    #include <stdio.h>

    void foo(void)
    {
        int i;

        for (i = 0; i < 100; i++) putchar('a');
    }


    void bar(void)
    {
        int i = 100;

        do {
            putchar('a');
        } while (--i);
    }

    As I said, damn simple. No nasty side effects, no access to global
    variables, etc. The optimizer has no excuses. It should generate optimal
    code in both cases. But see the result:


    /* GCC 4.3.0 on x86/Windows, -O2 */

    /* foo */
    L7:
    subl $12, %esp
    pushl $97
    call _putchar

    incl %ebx
    addl $16, %esp
    cmpl $100, %ebx
    jne L7


    /* bar */
    L2:
    subl $12, %esp
    pushl $97
    call _putchar
    addl $16, %esp

    decl %ebx
    jne L2


    Comment: See, even the most recent version of what is probably the most
    widely used compiler cannot correctly optimize a very simple loop! At least
    GCC understood the bar loop, so my "write C like ASM" optimization worked.

    At this point you might wonder what horrible things an average C compiler
    will do when GCC already fails so badly. Here is the gruesome result:


    /* lccwin32, optimize on */

    /* foo */
    _$4:
    pushl $97
    call _putchar
    popl %ecx

    incl %edi
    cmpl $100,%edi
    jl _$4


    /* bar */
    _$10:
    pushl $97
    call _putchar
    popl %ecx

    movl %edi,%eax
    decl %eax
    movl %eax,%edi
    or %eax,%eax
    jne _$10


    Comment: lcc is unable to optimize the loop, just like GCC, but it adds
    insult to injury by actually generating worse code for the ASM-style loop!
    So you cannot even optimize the loop yourself!


    /* MS Visual C++ 6 /O2 */

    For this compiler I had to replace the putchar call with a call to a custom
    my_putchar function; otherwise the compiler replaces the putchar calls with
    direct OS API stuff. While this is a good optimization, it is not the
    subject of this test and only makes the resulting asm harder to read, so I
    suppressed it.


    /* foo */

    jmp SHORT $L833
    $L834:
    mov eax, DWORD PTR _i$[ebp]
    add eax, 1
    mov DWORD PTR _i$[ebp], eax
    $L833:
    cmp DWORD PTR _i$[ebp], 100
    jge SHORT $L835

    push 97
    call _my_putchar
    add esp, 4

    jmp SHORT $L834
    $L835:


    /* bar */

    $L840:
    push 97
    call _my_putchar
    add esp, 4

    mov eax, DWORD PTR _i$[ebp]
    sub eax, 1
    mov DWORD PTR _i$[ebp], eax
    cmp DWORD PTR _i$[ebp], 0
    jne SHORT $L840


    Comment: Amazingly enough, this compiler has found yet another way to screw
    up. Would you have thought that each compiler generates different code for
    such a simple construct?
    I hope you agree that the compiler of the beast deserves the award "Worst of
    Show" for this mess. Are MS compilers still this bad?
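To compare only the loop code, the call target can be made opaque to the optimizer, as was done above with my_putchar for Visual C++. A minimal sketch, assuming a hypothetical stub named `sink` and a helper `count_calls` (neither is in the original post). In a real code-generation test the stub would live in a separate source file so the compiler cannot inline it:

```c
/* Hypothetical stand-in for putchar: counts calls instead of doing I/O.
   For a real codegen test, define it in another .c file so it stays opaque. */
static int sink_calls = 0;

void sink(int c)
{
    (void)c;
    sink_calls++;
}

/* Count-up loop, same shape as foo in the post. */
void foo(void)
{
    int i;

    for (i = 0; i < 100; i++) sink('a');
}

/* Count-down loop, same shape as bar in the post. */
void bar(void)
{
    int i = 100;

    do {
        sink('a');
    } while (--i);
}

/* Resets the counter, runs fn once, and returns how often it called sink. */
int count_calls(void (*fn)(void))
{
    sink_calls = 0;
    fn();
    return sink_calls;
}
```

Both functions call the stub exactly 100 times, so any difference in the generated assembly comes from the loop form alone.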




  • Ian Collins

    #2
    Re: optimizers are overrated

    copx wrote:
    > /* C Code */
    >
    > void foo(void)
    > {
    >     int i;
    >
    >     for (i = 0; i < 100; i++) putchar('a');
    > }
    >
    >
    > void bar(void)
    > {
    >     int i = 100;
    >
    >     do {
    >         putchar('a');
    >     } while (--i);
    > }
    >
    > As I said, damn simple. No nasty side effects, no access to global
    > variables, etc. The optimizer has no excuses. It should generate optimal
    > code in both cases. But see the result:
    >
    You didn't say what you expected to see. Did you optimise for speed or
    space, or use a default?

    The first compiler I tried partly unrolled both loops generating near
    identical code for both, which is what I'd expect for a default
    optimisation.

    The only difference was the starting condition and test.

    --
    Ian Collins.


    • copx

      #3
      Re: optimizers are overrated


      "Ian Collins" <ian-news@hotmail.com> wrote in message
      news:677bt8F2lskqqU5@mid.individual.net...
      >copx wrote:
      >>/* C Code */
      >>
      >>void foo(void)
      >>{
      >> int i;
      >>
      >> for (i = 0; i < 100; i++) putchar('a');
      >>}
      >>
      >>
      >>void bar(void)
      >>{
      >> int i = 100;
      >>
      >> do {
      >> putchar('a');
      >> } while (--i);
      >>}
      >>
      >>As I said, damn simple. No nasty side effects, no access to global
      >>variables, etc. The optimizer has no excuses. It should generate optimal
      >>code in both cases. But see the result:
      >>
      >You didn't say what you expected to see.

      I wasn't sure what to expect; that's why I tested it.

      >Did you optimise for speed or
      >space, or use a default?

      Read the post again; I specified the compiler flags used (e.g. -O2 for GCC).
      When there was a choice (lcc's UI doesn't offer one) I chose "optimize for
      speed".

      >The first compiler I tried partly unrolled both loops generating near
      >identical code for both, which is what I'd expect for a default
      >optimisation.

      And which compiler was that? Just curious.

      >The only difference was the starting condition and test.

      The test is the whole point of optimizing this loop on x86. You do not need
      a test (cmp instruction) in the loop if you decrement towards zero instead
      of incrementing towards 100. This saves one instruction in the body of the
      loop. The other obvious optimizations are using a register to hold the
      counter, and skipping the first check because it is known at compile time
      that the loop will always execute at least once. Not one compiler managed to
      do all that when fed the "the optimizer will understand" version (foo).

      Loop unrolling is a trickier optimization. You sacrifice code size for
      speed. Or let's say the hope of speed, because the increased code size might
      cause the code to end up being slower in the end.
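One caveat on the count-down rewrite: it is only a drop-in replacement when the trip count is known to be positive and the body never reads the counter. With the constant 100 both hold, but not in general. A minimal sketch, assuming hypothetical helpers `count_up`, `count_down`, and `trips` and a callback parameter that are not part of the original test:

```c
/* Hypothetical helpers for illustration; 'body' stands in for putchar('a'). */

/* Reference form: runs body n times, and zero times if n <= 0. */
void count_up(int n, void (*body)(void))
{
    int i;

    for (i = 0; i < n; i++) body();
}

/* Count-down form with the same trip count. The n > 0 guard is required:
   without it, n <= 0 would run the body when it should not run at all. */
void count_down(int n, void (*body)(void))
{
    int i = n;

    if (i > 0) {
        do {
            body();
        } while (--i);
    }
}

/* Small counting body so the trip count can be observed. */
static int ticks = 0;

static void tick(void)
{
    ticks++;
}

/* Runs one of the loop helpers with n and returns how often the body ran. */
int trips(void (*loop)(int, void (*)(void)), int n)
{
    ticks = 0;
    loop(n, tick);
    return ticks;
}
```

When n is a compile-time constant such as 100, a compiler can drop the guard entirely, which is the "skip the first check" optimization described above.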





      • Ian Collins

        #4
        Re: optimizers are overrated

        copx wrote:
        >"Ian Collins" <ian-news@hotmail.com> wrote in message
        >news:677bt8F2lskqqU5@mid.individual.net...
        >>copx wrote:
        >>>/* C Code */
        >>>
        >>>void foo(void)
        >>>{
        >>> int i;
        >>>
        >>> for (i = 0; i < 100; i++) putchar('a');
        >>>}
        >>>
        >>>
        >>>void bar(void)
        >>>{
        >>> int i = 100;
        >>>
        >>> do {
        >>> putchar('a');
        >>> } while (--i);
        >>>}
        >>>
        >>>As I said, damn simple. No nasty side effects, no access to global
        >>>variables, etc. The optimizer has no excuses. It should generate optimal
        >>>code in both cases. But see the result:
        >>>
        >>You didn't say what you expected to see.
        >
        >I wasn't sure what to expect, that's why I tested it.
        >
        >>Did you optimise for speed or
        >>space, or use a default?
        >
        >Read the post again, I specified the used compiler flags (e.g. -O2 for GCC).
        >When there was a choice (lcc's UI doesn't offer one) I chose "optimize for
        >speed".

        Not everyone knows what a specific compiler's flags do.

        >>The first compiler I tried partly unrolled both loops generating near
        >>identical code for both, which is what I'd expect for a default
        >>optimisation.
        >
        >And which compiler was that? Just curious.

        Sun c99.

        >>The only difference was the starting condition and test.
        >
        >The test is the whole point of optimizing this loop on x86. You do not need
        >a test (cmp instruction) in the loop if you decrement towards zero instead
        >of incrementing towards 100. This saves one instruction in the body of loop.

        You still have to test for 0, which may or may not be faster.

        >The other obvious optimizations are using a register to hold the counter,
        >and skipping the first check because it is known at compile time that the
        >loop will always execute at least once. Not one compiler managed to do all
        >that when feed with the "the optimizer will understand" version (foo).

        c99 appears to. It generated:

        movl $100,%ebx ;/ line : 18
        leaq __iob+128(%rip),%r12 ;/ line : 18
        .align 16
        .CG6.21:
        movl $97,%edi ;/ line : 18
        movq %r12,%rsi ;/ line : 18
        call putc ;/ line : 18
        movl $97,%edi ;/ line : 18
        movq %r12,%rsi ;/ line : 18
        call putc ;/ line : 18
        movl $97,%edi ;/ line : 18
        movq %r12,%rsi ;/ line : 18
        call putc ;/ line : 18
        movl $97,%edi ;/ line : 18
        movq %r12,%rsi ;/ line : 18
        call putc ;/ line : 18
        movl $97,%edi ;/ line : 18
        movq %r12,%rsi ;/ line : 18
        call putc ;/ line : 18
        addl $-5,%ebx ;/ line : 19
        .LU7.69:
        testl %ebx,%ebx ;/ line : 19
        jne .CG6.21 ;/ line : 19
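
Read back into C, the listing above corresponds roughly to the loop below: five calls per iteration and a counter stepped by 5, which works because 100 is a multiple of 5. This is a hand-written approximation for illustration, not compiler output, and `emit` is a hypothetical stand-in for the putc call that counts instead of printing.

```c
static int emitted = 0;

/* Hypothetical stand-in for putc('a', stdout): counts calls only. */
static void emit(int c)
{
    (void)c;
    emitted++;
}

/* Loop unrolled by five, mirroring the shape of the listing.
   Returns the number of emit calls made. */
int unrolled(void)
{
    int i = 100;

    emitted = 0;
    do {
        emit('a');
        emit('a');
        emit('a');
        emit('a');
        emit('a');
        i -= 5;           /* addl $-5,%ebx */
    } while (i != 0);     /* testl %ebx,%ebx ; jne */

    return emitted;
}
```

The loop runs 20 times instead of 100, so the counter update and branch are executed a fifth as often, at the cost of a larger loop body.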

        --
        Ian Collins.


        • robertwessel2@yahoo.com

          #5
          Re: optimizers are overrated

          On Apr 22, 6:38 pm, "copx" <c...@gazeta.pl> wrote:
          Optimizers are overrated
          >
          I started learning ASM not long ago to improve my understanding of the
          hardware architecture and my ability to optimize C code. The results of my
          first experiment were surprising to say at least. After reading the chapter
          on loops in my ASM book I wanted to test whether modern C compilers are
          actually as smart as commonly claimed. I chose a most simple loop: calling
          putchar 100 times. The first function (foo) uses a typical C style loop to
          test the assumption that "the compiler will optimize that better than any
          human could". The second function (bar) is based my newly gained knowledge,
          the loop is basically ASM written in C. Now I am certain if I asked here
          which one is more efficient all you guys would reply "the compiler will most
          likely generate the same code in both cases" (I have read such claims
          countless times here). Well, look at the ASM output below to see how wrong
          your assumption is.
          >
          /* C Code */
          >
          void foo(void)
          {
           int i;
          >
           for (i = 0; i < 100; i++) putchar('a');
          >
          }
          >
          void bar(void)
          {
           int i = 100;
          >
           do {
            putchar('a');
           } while (--i);
          >
          }
          ...(GCC and LccWin stuff snipped)...
          >
          /* MS Visual C++ 6 /O2 */
          >
          For this compiler I had to replace the putchar call with a call to a custom
          my_putchar function otherwise the compiler replaces the putchar calls with
          direct OS API stuff. While this is a good  optimization it is not the
          subject of this test, and only makes the resulting asm harder to read, so I
          supressed that.
          >
          /* foo */
          >
                  jmp     SHORT $L833
          $L834:
                  mov     eax, DWORD PTR _i$[ebp]
                  add     eax, 1
                  mov     DWORD PTR _i$[ebp], eax
          $L833:
                  cmp     DWORD PTR _i$[ebp], 100
                  jge     SHORT $L835
          >
                  push    97
                  call    _my_putchar
                  add     esp, 4
          >
                  jmp     SHORT $L834
          $L835:
          >
          /* bar */
          >
          $L840:
                  push    97
                  call    _my_putchar
                  add     esp, 4
          >
                  mov     eax, DWORD PTR _i$[ebp]
                  sub     eax, 1
                  mov     DWORD PTR _i$[ebp], eax
                  cmp     DWORD PTR _i$[ebp], 0
                  jne     SHORT $L840
          >
          Comment: Amazingly enough, this compiler has found yet another way to screw
          up. Would you have thought that each compiler generates different code for
          such a simple construct?
          I hope you agree that the compiler of the beast deserves the award "Worst of
          Show" for this mess. Are MS compilers still this bad?

          So just how much abuse do *you* think you deserve for testing, and
          complaining about, a compiler over a *decade* old? Especially when
          you can download a current version for free.

          But be that as it may, you've clearly not managed to run the compiler
          correctly, because my copy of VC6 generates the following when run
          with -O2 or -Ox. In fact, the output you included appears to be the
          default, non-optimized output for VC6.


          MSVC6 ("cl -Ox -c -Fa test48.c"):

          _foo PROC NEAR
          ; File test48.c
          ; Line 2
          push esi
          ; Line 6
          mov esi, 100 ; 00000064H
          $L90:
          push 97 ; 00000061H
          call _putchar
          add esp, 4
          dec esi
          jne SHORT $L90
          pop esi
          ; Line 10
          ret 0
          _foo ENDP
          _TEXT ENDS
          PUBLIC _bar
          _TEXT SEGMENT
          _bar PROC NEAR
          ; Line 14
          push esi
          ; Line 15
          mov esi, 100 ; 00000064H
          $L98:
          ; Line 18
          push 97 ; 00000061H
          call _putchar
          add esp, 4
          ; Line 19
          dec esi
          jne SHORT $L98
          pop esi
          ; Line 23
          ret 0
          _bar ENDP


          • A. Sinan Unur

            #6
            Re: optimizers are overrated

            "copx" <copx@gazeta.pl> wrote in news:fulste$336$1@inews.gazeta.pl:
            > Optimizers are overrated
            >
            > I started learning ASM not long ago to improve my understanding of the
            > hardware architecture and my ability to optimize C code. The results
            > of my first experiment were surprising to say at least. After reading
            > the chapter on loops in my ASM book I wanted to test whether modern C
            > compilers are actually as smart as commonly claimed. I chose a most
            > simple loop: calling putchar 100 times.

            And how many programs have you written which just print 100 a's?

            The argument for relying on the compiler has never been that they are
            perfect. Rather, in most cases, given a clean algorithm, you will get
            good enough results.

            Then, if there are performance sensitive areas of the code that are
            being executed too slowly given your criteria, you think hard about the
            algorithm and see if you can improve performance where it matters most
            through algorithmic changes.

            Maybe then it is time to figure out if there are any remaining
            bottlenecks that can be solved through hand tuning.

            Sure, you can hand-tune every single part of a large application, but
            the world will have moved on by then.

            An example of this is seen in the example you chose. Any optimizations
            you make in decrementing/incrementing loop counters will be dwarfed by
            the fact that you have to make 100 IO calls.

            Constructing the string once and making one puts call should improve
            things more than fiddling with the loop counter especially if this
            function is called repeatedly.

            Look at the tests below. It takes about 3 seconds to run your version
            (t1.c) 1000 times. Whereas the version with just a single IO call per
            invocation (t2.c) (albeit with a longer string) takes only 0.2 seconds
            to finish the 1000 calls.

            As another check, I changed this second version to randomize the
            contents of the string. Even including the calls to rand, 1000
            invocations still took about 0.2 seconds.

            E:\Test> cat t1.c
            #include <stdio.h>

            void foo(void) {
                int i;
                for (i = 0; i < 100; i++) {
                    putchar('a');
                }
                putchar('\n');
            }

            int main(void) {
                int i;
                for ( i = 0; i < 1000; ++i ) {
                    foo();
                }
                return 0;
            }


            E:\Test> cl /O2 /nologo t1.c
            t1.c

            TimeThis : Command Line : t1
            TimeThis : Start Time : Tue Apr 22 21:46:28 2008
            TimeThis : End Time : Tue Apr 22 21:46:31 2008
            TimeThis : Elapsed Time : 00:00:03.047

            E:\Test> cat t2.c
            #include <stdio.h>
            #include <string.h>

            void foo(void) {
                char x[101];
                memset( x, 'a', 100 );
                x[100] = 0;
                puts( x );
                return;
            }

            int main(void) {
                int i;
                for ( i = 0; i < 1000; ++i) {
                    foo();
                }
                return 0;
            }


            E:\Test> cl /O2 /nologo t2.c
            t2.c

            TimeThis : Command Line : t2
            TimeThis : Start Time : Tue Apr 22 21:50:20 2008
            TimeThis : End Time : Tue Apr 22 21:50:20 2008
            TimeThis : Elapsed Time : 00:00:00.187

            For control, here is what I get with a dummy function:

            E:\Test> cat t3.c
            void foo(void) { return; }

            int main(void) {
                int i = 0;
                for ( i = 0; i < 1000; ++i) {
                    foo();
                }
                return 0;
            }

            E:\Test> cl /O2 /nologo t3.c
            t3.c

            TimeThis : Command Line : t3
            TimeThis : Start Time : Tue Apr 22 21:52:39 2008
            TimeThis : End Time : Tue Apr 22 21:52:39 2008
            TimeThis : Elapsed Time : 00:00:00.125


            In between each invocation, I ran the following program to combat any
            kind of cache effects:

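            E:\Test> cat flushmem.c
            #include <stdio.h>
            #include <stdlib.h>
            #include <string.h>


            int main(void) {
                char *p;
                size_t bufsize = 1024;

                /* Allocate and touch ever larger buffers until malloc fails. */
                while ( (p = malloc( bufsize )) != NULL ) {
                    memset( p, 0xda, bufsize );
                    free( p );
                    bufsize *= 2;
                    /* size_t does not match %x, so print as unsigned long. */
                    printf("%lx\n", (unsigned long)(bufsize/1024));
                }

                return 0;
            }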
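
The elapsed times above include process start-up, as the empty t3 run shows. An in-process measurement is one way to separate the two. The following is a minimal sketch under several assumptions: the helper names are made up, the output goes to a temporary file from `tmpfile()` rather than the console (so terminal speed is not measured), and `clock()` reports CPU time at whatever resolution the platform offers. It does not reproduce the numbers in this post.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* 100 single-character writes plus newline, like foo in t1.c. */
static void per_char(FILE *out)
{
    int i;

    for (i = 0; i < 100; i++) putc('a', out);
    putc('\n', out);
}

/* One write of a prebuilt string, like foo in t2.c. */
static void one_call(FILE *out)
{
    char x[102];

    memset(x, 'a', 100);
    x[100] = '\n';
    x[101] = '\0';
    fputs(x, out);
}

/* Runs fn 'reps' times against a temporary file.
   Stores the CPU seconds used in *secs when secs is not NULL.
   Returns the number of bytes written, or -1 if no temp file is available. */
long run_timed(void (*fn)(FILE *), int reps, double *secs)
{
    FILE *out = tmpfile();
    clock_t start;
    long written;
    int r;

    if (out == NULL) return -1;

    start = clock();
    for (r = 0; r < reps; r++) fn(out);
    fflush(out);
    if (secs != NULL) *secs = (double)(clock() - start) / CLOCKS_PER_SEC;

    written = ftell(out);
    fclose(out);
    return written;
}
```

Calling `run_timed(per_char, 1000, &s)` and `run_timed(one_call, 1000, &s)` writes the same 101000 bytes both ways, so the two timings are comparable. Because stdio buffers file output, the gap may be much smaller than the console timings reported here.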

            --
            A. Sinan Unur <1usa@llenroc.ude.invalid>
            (remove .invalid and reverse each component for email address)


            • copx

              #7
              Re: optimizers are overrated


              "A. Sinan Unur" <1usa@llenroc.ude.invalid> wrote in message
              news:Xns9A88E10F459FCasu1cornelledu@127.0.0.1...
              >"copx" <copx@gazeta.pl> wrote in news:fulste$336$1@inews.gazeta.pl:
              >
              >>Optimizers are overrated
              >>
              >>I started learning ASM not long ago to improve my understanding of the
              >>hardware architecture and my ability to optimize C code. The results
              >>of my first experiment were surprising to say at least. After reading
              >>the chapter on loops in my ASM book I wanted to test whether modern C
              >>compilers are actually as smart as commonly claimed. I chose a most
              >>simple loop: calling putchar 100 times.
              >
              >And how many programs have you written which just print 100 a's?

              Come on! Obviously the point of this test was to figure out whether the
              compilers are smart enough to optimize a for (i = 0; i < x; i++) loop into an
              i = x; do {} while (--i) loop. That is a simple micro-optimization you can do
              (on x86 and any other platform where the dec/jump trick works) and exactly
              the kind of stuff most regulars here would claim "is done by the compiler
              anyway". I just proved that this common assumption is wrong.

              [snip stuff about IO]

              As I said the point of this was not printing characters. Maybe I should have
              made it clearer by calling a function with no specified purpose in the loop.



              • copx

                #8
                Re: optimizers are overrated


                "Ian Collins" <ian-news@hotmail.com> wrote in message
                news:677fp5F2lskqqU9@mid.individual.net...
                >>The test is the whole point of optimizing this loop on x86. You do not
                >>need a test (cmp instruction) in the loop if you decrement towards zero
                >>instead of incrementing towards 100. This saves one instruction in the
                >>body of loop.
                >
                >You still have to test for 0, which may or may not be faster.

                No you don't, and that's the point. Look at the output of GCC for the "bar"
                function for example, see any cmp? The dec instruction sets the necessary
                flags to terminate the loop if we reach zero. That's a feature of the x86
                instruction set (and others) a compiler could/should exploit to create more
                efficient loops.

                >>The other obvious optimizations are using a register to hold the counter,
                >>and skipping the first check because it is known at compile time that the
                >>loop will always execute at least once. Not one compiler managed to do
                >>all that when feed with the "the optimizer will understand" version (foo).
                >
                >c99 appears to, generated
                >
                >movl $100,%ebx ;/ line : 18
                >leaq __iob+128(%rip),%r12 ;/ line : 18
                >.align 16
                >.CG6.21:
                >movl $97,%edi ;/ line : 18
                >movq %r12,%rsi ;/ line : 18
                >call putc ;/ line : 18
                >movl $97,%edi ;/ line : 18
                >movq %r12,%rsi ;/ line : 18
                >call putc ;/ line : 18
                >movl $97,%edi ;/ line : 18
                >movq %r12,%rsi ;/ line : 18
                >call putc ;/ line : 18
                >movl $97,%edi ;/ line : 18
                >movq %r12,%rsi ;/ line : 18
                >call putc ;/ line : 18
                >movl $97,%edi ;/ line : 18
                >movq %r12,%rsi ;/ line : 18
                >call putc ;/ line : 18
                >addl $-5,%ebx ;/ line : 19
                >.LU7.69:
                >testl %ebx,%ebx ;/ line : 19
                >jne .CG6.21 ;/ line : 19

                Yet another way to translate this simple construct. Compilers certainly have
                personality!




                • A. Sinan Unur

                  #9
                  Re: optimizers are overrated

                  "copx" <copx@gazeta.pl> wrote in news:fum77a$qjc$1@inews.gazeta.pl:
                  >
                  >"A. Sinan Unur" <1usa@llenroc.ude.invalid> wrote in message
                  >news:Xns9A88E10F459FCasu1cornelledu@127.0.0.1...
                  >>"copx" <copx@gazeta.pl> wrote in news:fulste$336$1@inews.gazeta.pl:
                  >>
                  >>>Optimizers are overrated
                  >>>
                  >>>I started learning ASM not long ago to improve my understanding of
                  >>>the hardware architecture and my ability to optimize C code. The
                  >>>results of my first experiment were surprising to say at least.
                  >>>After reading the chapter on loops in my ASM book I wanted to test
                  >>>whether modern C compilers are actually as smart as commonly
                  >>>claimed. I chose a most simple loop: calling putchar 100 times.
                  >>>
                  >>And how many programs have you written which just print 100 a's?
                  >
                  >Come on! Obviously the point of this test was to figure out whether
                  >the compilers are smart enough to optimize

                  And the point of my post is that the programmer ought to be smart enough
                  to understand when the effort spent in beating the compiler is worth it.

                  >loop to a i = x do {}while (--i) loop. That is a simple
                  >micro-optimization you can do (on x86 and any other platform where the
                  >dec/jump trick works) and exactly the kind of stuff most regulars here
                  >would claim "is done by the compiler anyway". I just proved that this
                  >common assumption in wrong.

                  And I don't see why anyone ought to care even if you are correct.

                  Sinan
                  --
                  A. Sinan Unur <1usa@llenroc.ude.invalid>
                  (remove .invalid and reverse each component for email address)


                  • copx

                    #10
                    Re: optimizers are overrated


                    <robertwessel2@yahoo.com> wrote in message
                    news:104814e3-8638-4180-b2ff-e103d9824269@f36g2000hsa.googlegroups.com...
                    [snip]
                    >So just how much abuse do *you* think you deserve for testing, and
                    >complaining about, a compiler over a *decade* old?

                    Eh, none? Since when is testing old software offensive?
                    I did not try to claim that this reflects the current performance of the MS
                    compiler. In fact, I explicitly asked "Are MS compilers still this bad?"

                    >Especially when you can download a current version for free.

                    The current version is not simply better than the old one. AFAIK software
                    built with VS2008 won't run on older versions of Windows, or so I have been
                    told.

                    >But be that as it may, you've clearly not managed to run the compiler
                    >correctly

                    Maybe. I have never used the command-line version before and rarely use VC
                    anyway.

                    >because my copy of VC6 generates the following when run
                    >with -O2 or -Ox. In fact, the output you included appears to be the
                    >default, non-optimized output for VC6.

                    I compiled with /O2. So -O2 is the correct form? Strange, I could swear cl
                    /help suggested the /O2 form... But maybe I confused something there.

                    >MSVC6 ("cl -Ox -c -Fa test48.c"):
                    [snip]

                    If that is true, VC6 moves from the bottom to the top. The first compiler
                    which actually manages to properly optimize the for loop!




                    • Eric Sosman

                      #11
                      Re: optimizers are overrated

                      copx wrote:
                      >"A. Sinan Unur" <1usa@llenroc.ude.invalid> wrote in message
                      >news:Xns9A88E10F459FCasu1cornelledu@127.0.0.1...
                      >>"copx" <copx@gazeta.pl> wrote in news:fulste$336$1@inews.gazeta.pl:
                      >>
                      >>>Optimizers are overrated
                      >>>
                      >>>I started learning ASM not long ago to improve my understanding of the
                      >>>hardware architecture and my ability to optimize C code. The results
                      >>>of my first experiment were surprising to say at least. After reading
                      >>>the chapter on loops in my ASM book I wanted to test whether modern C
                      >>>compilers are actually as smart as commonly claimed. I chose a most
                      >>>simple loop: calling putchar 100 times.
                      >>And how many programs have you written which just print 100 a's?
                      >
                      >Come on! Obviously the point of this test was to figure out whether the
                      >compilers are smart enough to optimize a for (i = 0; i < x; i++) loop to a i
                      >= x do {}while (--i) loop. That is a simple micro-optimization you can do
                      >(on x86 and any other platform where the dec/jump trick works) and exactly
                      >the kind of stuff most regulars here would claim "is done by the compiler
                      >anyway". I just proved that this common assumption is wrong.

                      If you give your car a nice fresh coat of wax, do you
                      improve its fuel efficiency by reducing air drag or hurt
                      the efficiency by increasing weight?

                      Compiler writers do not stay up nights trying to figure
                      out how to optimize silly loops; they give their attention
                      to getting more "serious" programs to run well. They worry
                      about how to map many local variables to a few CPU registers;
                      you don't have enough variables to notice. They worry about
                      eliminating common sub-expressions; you don't have any and
                      again can't see any optimization. They worry about strength
                      reductions; your sample has no strength to be reduced. They
                      worry about cache lines, about prefetching, about filling the
                      various instruction pipelines, about branch prediction ...
                      and you are oblivious to all of these.

                      You'd probably call Bach overrated because he never wrote
                      any good kazoo concertos.
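
For readers who have not met the terms, here is a minimal sketch of what strength reduction and common-subexpression elimination amount to, written out by hand in C. The function names and the strided sum-of-squares task are made up for illustration; a real optimizer works on its own internal representation and may well choose a different form.

```c
#include <stddef.h>

/* Naive form: computes i * stride (a multiply) on every iteration
   and evaluates a[i * stride] twice. */
long sum_squares_naive(const int *a, size_t n, size_t stride)
{
    long sum = 0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += (long)a[i * stride] * a[i * stride];
    return sum;
}

/* Hand-optimized form of the same computation. */
long sum_squares_reduced(const int *a, size_t n, size_t stride)
{
    long sum = 0;
    size_t off = 0;          /* strength reduction: running offset
                                replaces the multiply i * stride */
    size_t i;

    for (i = 0; i < n; i++) {
        long v = a[off];     /* common subexpression: element read once */
        sum += v * v;
        off += stride;
    }
    return sum;
}
```

The putchar loop in this thread has no multiply to weaken and no repeated expression to share, which is the point being made above.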

                      --
                      Eric Sosman
                      esosman@ieee-dot-org.invalid


                      • copx

                        #12
                        Re: optimizers are overrated


                        "A. Sinan Unur" <1usa@llenroc.ude.invalid> wrote in message
                        news:Xns9A88E9FC7166Basu1cornelledu@127.0.0.1...
                        [snip]
                        >And the point of my post is that the programmer ought to be smart enough
                        >to understand when the effort spent in beating the compiler is worth it.

                        I wasn't trying to beat the compiler. I just measured its performance.

                        I disagree with your point that optimization should be limited to profiled
                        bottlenecks and choosing the right algorithm. In fact, I suspect that this
                        common belief (which really isn't news to me; you hear it from 90% of
                        all programmers these days) is responsible for software disasters like
                        Microsoft Vista. If you ignore efficiency completely while writing your
                        program, you will not be able to shave away all the wasted RAM and CPU
                        cycles at the end just by rewriting a single central algorithm after
                        profiling. Anyway, when to optimize is not the topic of this thread. In
                        the professional world, I guess the answer to that question is determined
                        by market forces anyway.

                        What I am trying to discuss here is what you can expect a C compiler to
                        optimize and what not. I posted my results to counter the misinformation
                        that has been spread here in the past (most of the time without bad
                        intent, I am sure).

                        [snip]




                        • robertwessel2@yahoo.com

                          #13
                          Re: optimizers are overrated

                          On Apr 22, 10:41 pm, "copx" <c...@gazeta.pl> wrote:
                          >I wasn't trying to beat the compiler. I just measured its performance.

                          Of course, you didn't do that either - just counting generated
                          instructions is hardly definitive on modern processors. For example,
                          dec/jne is faster on many processors than a single "loop" instruction.
                          Also, the current versions of MSVC generate "sub esi,1" rather than
                          "dec esi", since the former is faster on many CPUs.

                          Modern CPUs are complex enough that actually measuring performance is
                          the only way to tell if a particular optimization is successful.
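As a rough sketch of what such a measurement could look like for the two loops from the original post: the code below times repeated calls to foo and bar with the standard clock() function. The helpers time_calls and report are hypothetical names introduced here, not something from the thread, and the repetition count is left to the caller. Keep in mind that the cost of the putchar calls will very likely dwarf the one-instruction difference between the loops, so any gap that shows up may just be noise.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* The two loops from the original post. */
void foo(void)
{
    int i;

    for (i = 0; i < 100; i++) putchar('a');
}

void bar(void)
{
    int i = 100;

    do {
        putchar('a');
    } while (--i);
}

/* Hypothetical helper: seconds reported by clock() for calling fn() reps times. */
double time_calls(void (*fn)(void), long reps)
{
    clock_t start, end;
    long r;

    assert(fn != NULL && reps >= 0);

    start = clock();
    for (r = 0; r < reps; r++)
        fn();
    end = clock();

    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Hypothetical driver: prints timings to stderr so they do not get
   mixed into the stream of 'a' characters on stdout. */
void report(long reps)
{
    fprintf(stderr, "foo: %.3f s\n", time_calls(foo, reps));
    fprintf(stderr, "bar: %.3f s\n", time_calls(bar, reps));
}
```

To try it, call report() from main with a large repetition count, redirect stdout to a file or the null device, and run it several times; the numbers will differ between machines, compilers and runs.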


                          • copx

                            #14
                            Re: optimizers are overrated


                            "Eric Sosman" <esosman@ieee-dot-org.invalid> wrote in message
                            news:8PCdne2DTcUcNJPVnZ2dnUVZ_ommnZ2d@comcast.com...
                            [snip]
                            >If you give your car a nice fresh coat of wax, do you
                            >improve its fuel efficiency by reducing air drag or hurt
                            >the efficiency by increasing weight?
                            >
                            >Compiler writers do not stay up nights trying to figure
                            >out how to optimize silly loops; they give their attention
                            >to getting more "serious" programs to run well.
                            "Serious" programs probably contain many such "silly loops". And what
                            exactly is "silly" about loops based on incrementing/decrementing an integer
                            value counting towards a maximum/minimum? Have you written many "serious"
                            programs without them?
                            A loop like this executed, let's say, a million times means one million
                            wasted instructions. Of course, that won't matter most of the time; I am
                            not trying to argue with that.
                            >They worry about how to map many local variables to a few CPU registers;
                            >you don't have enough variables to notice. They worry about
                            >eliminating common sub-expressions; you don't have any and
                            >again can't see any optimization. They worry about strength
                            >reductions; your sample has no strength to be reduced. They
                            >worry about cache lines, about prefetching, about filling the
                            >various instruction pipelines, about branch prediction ...
                            >and you are oblivious to all of these.
                            Thanks for giving me ideas for what to test next!
                            >You'd probably call Bach overrated because he never wrote
                            >any good kazoo concertos.
                            Error: analogy mismatch




                            • copx

                              #15
                              Re: optimizers are overrated


                              <robertwessel2@yahoo.com> wrote in message
                              news:dea1a1cb-9269-45dd-b49f-23006423ff6e@z72g2000hsb.googlegroups.com...
                              >On Apr 22, 10:41 pm, "copx" <c...@gazeta.pl> wrote:
                              >>I wasn't trying to beat the compiler. I just measured its performance.
                              >
                              >Of course, you didn't do that either - just counting generated
                              >instructions is hardly definitive on modern processors.
                              Ok, good point. I will measure execution time next time, too.



