Benchmarks

  • Nick Keighley

    #16
    Re: Benchmarks

On 6 Nov, 15:53, s0s...@gmail.com wrote:
    The task: Write a program that reads a set of words from standard
    input and prints the number of distinct words.
    >
    I came across a website that listed a few programs to accomplish this
    task: http://unthought.net/c++/c_vs_c++.html (ignore all the language
    flaming :-), and thought that all of them did unnecessary operations,
    so I wrote my own. But for some reason, my version turned out slower
    than ALL of the versions on the website, even though it seems to
    perform fewer operations (yes, I benchmarked them on my own computer).
    >
    According to the website, the slowest version is:
    >
    #include <set>
    #include <string>
    #include <iostream>

    int main(int argc, char **argv)
    {
            // Declare and Initialize some variables
            std::string word;
            std::set<std::string> wordcount;
            // Read words and insert in rb-tree
            while (std::cin >> word) wordcount.insert(word);
            // Print the result
            std::cout << "Words: " << wordcount.size() << std::endl;
            return 0;
    }
    the above uses an rb tree (or equivalent) in the set class

    My version is about 12 times slower than that. It uses lower-level
    constructs. Here it is:
    [snip version using linear search]
    Any ideas as to what causes the big slowdown?

    this is a very important lesson. Print it out in big letters
    and post it on your wall.

    CAREFUL ALGORITHM SELECTION CAN CREAM MICRO-OPTIMISATION

    I may get this printed on a tee shirt

    --
    Nick Keighley


    • CBFalconer

      #17
      Re: Benchmarks

      Juha Nieminen wrote:
      s0suk3@gmail.com wrote:
      >
      >Any ideas as to what causes the big slowdown?
      >
      Why do you expect that searching a linked list could be even
      close to the speed of searching a balanced binary tree?
      Which is O(log N), as compared to O(N). However a hashtable is
      even faster, being O(1) for suitable organization.

      F'ups set to c.l.c. only.

      --
      [mail]: Chuck F (cbfalconer at maineline dot net)
      [page]: <http://cbfalconer.home.att.net>
      Try the download section.
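
      The hash-table approach Chuck mentions can be sketched against the same
      task as the std::set program quoted upthread. This is an illustrative
      sketch, not code from the thread; it uses C++11's std::unordered_set
      (a hash table with expected O(1) insert/lookup), and factors the work
      into a function so it can be tested on any stream:

      ```cpp
      #include <iostream>
      #include <sstream>
      #include <string>
      #include <unordered_set>

      // Count distinct whitespace-separated words on the stream.
      // std::unordered_set is a hash table: expected O(1) per insert,
      // versus O(log N) for the rb-tree-backed std::set version.
      std::size_t count_distinct(std::istream& in)
      {
          std::unordered_set<std::string> words;
          std::string word;
          while (in >> word)
              words.insert(word);
          return words.size();
      }

      int main()
      {
          std::istringstream in("the quick brown fox the lazy dog the");
          std::cout << "Words: " << count_distinct(in) << '\n';  // prints "Words: 6"
          return 0;
      }
      ```

      Feeding std::cin instead of the istringstream reproduces the original
      program's behaviour.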


      • CBFalconer

        #18
        Re: Benchmarks

        user923005 wrote:
        s0s...@gmail.com wrote:
        >
        >The task: Write a program that reads a set of words from
        >standard input and prints the number of distinct words.
        >
        [snip]
        >
        I am surprised chuck has not chimed in here.
        A hash table is *ideal* for this.
        I just did, a few minutes ago. Been having problems with the
        news-server.

        --
        [mail]: Chuck F (cbfalconer at maineline dot net)
        [page]: <http://cbfalconer.home.att.net>
        Try the download section.


        • Juha Nieminen

          #19
          Re: Benchmarks

          CBFalconer wrote:
          Juha Nieminen wrote:
          >s0suk3@gmail.com wrote:
          >>
          >>Any ideas as to what causes the big slowdown?
          >Why do you expect that searching a linked list could be even
          >close to the speed of searching a balanced binary tree?
          >
          Which is O(log N), as compared to O(N). However a hashtable is
          even faster, being O(1) for suitable organization.
          >
          F'ups set to c.l.c. only.
          Why? Because computational complexities only apply to C?


          • Phil Carmody

            #20
            Re: Benchmarks

            Juha Nieminen <nospam@thanks.invalid> writes:
            user923005 wrote:
            >The hash table solution is O(n).
            >
            You would have a hard time proving that.
            He shouldn't. It should be pretty clear. No hash table needs
            to examine each element more than once.

            Phil
            --
            We must respect the other fellow's religion, but only in the sense and to the
            extent that we respect his theory that his wife is beautiful and his children
            smart. -- Henry Louis Mencken (1880-1956), American editor and critic


            • Juha Nieminen

              #21
              Re: Benchmarks

              Nick Keighley wrote:
              CAREFUL ALGORITHM SELECTION CAN CREAM MICRO-OPTIMISATION
              That's so true. However, we shouldn't completely dismiss
              "micro-optimization" and "hacker-optimization" once we have come
              up with the fastest possible overall algorithm. Even though two
              implementations might both be e.g. O(n lg n) with a rather small
              constant factor, one of them might still be 10 times faster than
              the other if it performs clever low-level tricks inside the
              innermost loop (which might be run e.g. millions of times).

              Of course when dealing with hacker optimization, there's always
              a point where it's too detrimental to the readability and
              maintainability of the code in relation to the speed benefit it
              gives. If the code becomes significantly more obfuscated while
              giving a speed boost of 1%, it might not be worth the trouble.
              Personally I prefer to write 100 lines of clear, readable and
              maintainable code, even if it's 1% slower than 1000 lines of code
              worthy of the IOCCC.


              • Phil Carmody

                #22
                Re: Benchmarks

                Phil Carmody <thefatphil_demunged@yahoo.co.uk> writes:
                Juha Nieminen <nospam@thanks.invalid> writes:
                >user923005 wrote:
                >>The hash table solution is O(n).
                >>
                > You would have a hard time proving that.
                >
                He shouldn't. It should be pretty clear. No hash table needs
                to examine each element more than once.
                Oooops, suffering from Falconeritis - but you did over-snip rather.
                The whole task is indeed o(n^2), but omega(n).

                Phil
                --
                We must respect the other fellow's religion, but only in the sense and to the
                extent that we respect his theory that his wife is beautiful and his children
                smart. -- Henry Louis Mencken (1880-1956), American editor and critic


                • Hendrik Schober

                  #23
                  Re: Benchmarks

                  Juha Nieminen wrote:
                  [...]
                  Personally I prefer to write 100 lines of clear, readable and
                  maintainable code, even if it's 1% slower than 1000 lines of code
                  worthy of the IOCCC.
                  It's always easier to make a bug-free program fast than to
                  bug-fix a fast program. :)

                  Schobi


                  • Juha Nieminen

                    #24
                    Re: Benchmarks

                    user923005 wrote:
                    Cuckoo hashing has guaranteed O(1) lookup and delete and amortized
                    O(1) insert.
                    Wikipedia says: "Lookup requires inspection of just two locations in
                    the hash table, which takes constant time in the worst case".

                    I don't understand how that is possible. Suppose an element is
                    inserted into the table. Then later a new element displaces this first
                    element to a second location. Then later another new element, to be put
                    in this second location, displaces this first element to a third
                    location. How exactly do you find this first element anymore? It's not
                    in either of the two alternative locations defined by the two hashing
                    functions for that element.


                    • James Kanze

                      #25
                      Re: Benchmarks

                      On Nov 7, 10:25 am, Juha Nieminen <nos...@thanks.invalid> wrote:
                      Nick Keighley wrote:
                      CAREFUL ALGORITHM SELECTION CAN CREAM MICRO-OPTIMISATION
                      That's so true. However, we shouldn't completely dismiss
                      "micro-optimization" and "hacker-optimization" once we have
                      come up with the fastest possible overall algorithm.
                      Yes, but this should only be undertaken when necessary, under
                      the control of a profiler---what optimizes on one platform may
                      slow down on another. And don't forget that a good compiler
                      will do many of these optimizations for you. A long time ago, I
                      tried replacing "h = 127 * h + *p" (in a hashing algorithm) with
                      "h = (h << 7) - h + *p", knowing that my hardware didn't have
                      hardware multiply. The results were slower than the original;
                      the compiler also converted the multiplication into a shift and
                      subtract, and since it knew the final purpose in those
                      operations, was actually able to save one machine instruction
                      over an explicit shift and subtract.
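
                      The two variants Kanze compares can be written out. This is a
                      reconstruction for illustration (function names are mine; only
                      the multiplier 127 and the shift-and-subtract identity
                      127*h == (h << 7) - h come from his post):

                      ```cpp
                      #include <cassert>
                      #include <cstdint>
                      #include <string>

                      // Multiplicative hash as posted: h = 127 * h + *p.
                      // A decent compiler strength-reduces the multiply itself
                      // on hardware without a fast multiplier.
                      std::uint32_t hash_mul(const std::string& s)
                      {
                          std::uint32_t h = 0;
                          for (unsigned char c : s)
                              h = 127u * h + c;
                          return h;
                      }

                      // Hand-written shift-and-subtract: 127*h == (h << 7) - h.
                      std::uint32_t hash_shift(const std::string& s)
                      {
                          std::uint32_t h = 0;
                          for (unsigned char c : s)
                              h = (h << 7) - h + c;
                          return h;
                      }

                      int main()
                      {
                          // Both compute identical values (mod 2^32); the point of
                          // the anecdote is that the compiler's version was no
                          // slower, so the hand optimization bought nothing.
                          assert(hash_mul("benchmark") == hash_shift("benchmark"));
                          return 0;
                      }
                      ```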
                      Even though two implementations might both be e.g. O(n lg n)
                      with a rather small constant factor, one of them might still
                      be 10 times faster than the other if it performs clever
                      low-level tricks inside the innermost loop (which might be run
                      eg. millions of times).
                      I've never seen a factor of 10 here. Even a factor of 2 is
                      exceptional (and usually indicative of a very poor optimizer in
                      the compiler).

                      --
                      James Kanze (GABI Software) email: james.kanze@gmail.com
                      Conseils en informatique orientée objet/
                      Beratung in objektorientierter Datenverarbeitung
                      9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


                      • James Dow Allen

                        #26
                        Re: Benchmarks

                        On Nov 7, 4:51 am, c...@tiac.net (Richard Harter) wrote:
                        Just as a note, the common claim that hash table are O(1) per
                        access whereas search trees are O(log n) per access is
                        misleading, ...computing the hash code ...
                        is necessarily greater
                        than log2(n)....
                        hash table      - O(log n)
                        Note that, for sufficiently pedantic O(.)
                        estimates, many of the usual O(.) measures are wrong.
                        A single arithmetic op may be considered to require
                        O(log N) time since log N bits are needed to represent
                        an index on a set of size N.

                        Returning to OP's problem, I wonder: are Compact
                        Hash Trees in wide use? If there's interest
                        I'll post my implementation.

                        James Dow Allen


                        • Juha Nieminen

                          #27
                          Re: Benchmarks

                          James Kanze wrote:
                          >Even though two implementations might both be e.g. O(n lg n)
                          >with a rather small constant factor, one of them might still
                          >be 10 times faster than the other if it performs clever
                          >low-level tricks inside the innermost loop (which might be run
                          >eg. millions of times).
                          >
                          I've never seen a factor of 10 here. Even a factor of 2 is
                          exceptional (and usually indicative of a very poor optimizer in
                          the compiler).
                          There's just so much a compiler can do. There are cases where only the
                          programmer (rather than the compiler) can understand what's going on and
                          be able to optimize something.

                          For example, comparing two strings for inequality is O(1) if the
                          maximum size of the string is fixed (eg. you know that no string has
                          more than 30 characters), but still much slower than comparing two
                          integers. If you can somehow change the string comparison to an integer
                          comparison, and these comparisons are performed millions of times in the
                          innermost loop of your code, the program may speed up considerably. This
                          is a kind of optimization which is completely out of the realm of
                          compiler optimizations.
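
                          The string-to-integer transformation described above is
                          essentially string interning. A minimal sketch of the idea
                          (the class and names are mine, not from the thread): map each
                          distinct string to a small id once, then compare ids.

                          ```cpp
                          #include <cassert>
                          #include <string>
                          #include <unordered_map>

                          // Intern pool: assigns each distinct string a small integer id.
                          // After interning, equality tests in a hot loop are single
                          // integer compares instead of character-by-character compares.
                          class Interner {
                          public:
                              int intern(const std::string& s)
                              {
                                  auto it = ids_.find(s);
                                  if (it != ids_.end())
                                      return it->second;
                                  int id = static_cast<int>(ids_.size());
                                  ids_.emplace(s, id);
                                  return id;
                              }
                          private:
                              std::unordered_map<std::string, int> ids_;
                          };

                          int main()
                          {
                              Interner pool;
                              int a = pool.intern("benchmark");
                              int b = pool.intern("optimise");
                              int c = pool.intern("benchmark");
                              assert(a == c);  // same string, same id: one int compare
                              assert(a != b);
                              return 0;
                          }
                          ```

                          The interning cost is paid once per string; the payoff comes
                          when the same values are compared millions of times afterwards.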


                          • Jerry Coffin

                            #28
                            Re: Benchmarks

                            In article <79b10b2f-6abf-4440-9d04-f731e6b51bf2
                            @f40g2000pri.googlegroups.com>, james.kanze@gmail.com says...

                            [ ... ]
                            Even though two implementations might both be e.g. O(n lg n)
                            with a rather small constant factor, one of them might still
                            be 10 times faster than the other if it performs clever
                            low-level tricks inside the innermost loop (which might be run
                            eg. millions of times).
                            >
                            I've never seen a factor of 10 here. Even a factor of 2 is
                            exceptional (and usually indicative of a very poor optimizer in
                            the compiler).
                            I've seen a factor of 10, and even a _little_ bit more -- but it was
                            _extremely_ hardware specific, depending almost entirely on the cache
                            organization of the processor. As the data was originally placed in
                            memory, two crucial items ended up contending for the same cache lines.
                            Rearranging the data allowed those items to live in different cache
                            lines.

                            In this case, the CPU involved wasn't even made when the compiler was
                            written, so I have a hard time blaming the compiler for the poor
                            optimization. Unfortunately, I don't know of any newer compilers that
                            would do much better (and when they do, it's mostly by accident...)

                            --
                            Later,
                            Jerry.

                            The universe is a figment of its own imagination.


                            • Jerry Coffin

                              #29
                              Re: Benchmarks

                              In article <gf120o$jnq$2@hoshi.visyn.net>, spamtrap@gmx.de says...
                              Juha Nieminen wrote:
                              [...]
                              Personally I prefer to write 100 lines of clear, readable and
                              maintainable code, even if it's 1% slower than 1000 lines of code
                              worthy of the IOCCC.
                              >
                              It's always easier to make a bug-free program fast than to
                              bug-fix a fast program. :)
                              Oft repeated, but not really true. Sometimes a fast program still
                              contains a trivial bug, but I can remember one program in particular
                              that was bug-free as far as I know, but a horrible mess, to the point
                              that by _far_ the easiest way to improve it was write a set of
                              specifications for what it did, and start over. By the time I realized
                              that, however, I'd spent more time trying to optimize one small part
                              than it eventually took to rewrite the whole thing...

                              --
                              Later,
                              Jerry.

                              The universe is a figment of its own imagination.


                              • Richard Harter

                                #30
                                Re: Benchmarks

                                On Fri, 07 Nov 2008 09:41:08 GMT, Juha Nieminen
                                <nospam@thanks.invalid> wrote:
                                >user923005 wrote:
                                >Cuckoo hashing has guaranteed O(1) lookup and delete and amortized
                                >O(1) insert.
                                >
                                Wikipedia says: "Lookup requires inspection of just two locations in
                                >the hash table, which takes constant time in the worst case".
                                >
                                I don't understand how that is possible. Suppose an element is
                                >inserted into the table. Then later a new element displaces this first
                                >element to a second location. Then later another new element, to be put
                                >in this second location, displaces this first element to a third
                                >location. How exactly do you find this first element anymore? It's not
                                >in either of the two alternative locations defined by the two hashing
                                >functions for that element.
                                It doesn't work like that, though you had me perplexed for a
                                moment. The first element goes back to its original location,
                                displacing the second. Thus, suppose that h0, h1, etc are
                                locations and the first three elements are e0, e1, e2, and that
                                their address pairs are:

                                e0: h0,h1
                                e1: h0,h2
                                e2: h1,h3

                                Initially e0 is inserted in location h0. When e1 is inserted, e0
                                is moved to h1, and e1 is inserted in h0. The contents of the
                                table look like this:

                                h0: e1
                                h1: e0
                                h2: -
                                h3: -

                                Now e2 is inserted; it goes into location h1, displacing e0. In
                                turn e0 goes back to its alternate location h0, displacing e1,
                                which has to go to its alternate location, h2. The table now
                                looks like this:

                                h0: e0
                                h1: e2
                                h2: e1
                                h3: -
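
                                The displacement sequence above corresponds to the
                                standard cuckoo insertion loop. A toy sketch for
                                illustration (class name, table size, and the two hash
                                functions are mine, not from the thread; a real table
                                would rehash on failure):

                                ```cpp
                                #include <cassert>
                                #include <string>
                                #include <vector>

                                // Toy cuckoo hash set of strings. Each key has exactly two
                                // candidate slots, h1(k) and h2(k), so lookup inspects at
                                // most two locations: the worst-case O(1) claim.
                                class CuckooSet {
                                public:
                                    explicit CuckooSet(std::size_t n) : slots_(n) {}

                                    bool contains(const std::string& k) const
                                    {
                                        return slots_[h1(k)] == k || slots_[h2(k)] == k;
                                    }

                                    // Returns false on a displacement cycle
                                    // (a real implementation would rehash).
                                    bool insert(std::string k)
                                    {
                                        if (contains(k)) return true;
                                        std::size_t pos = h1(k);
                                        for (std::size_t i = 0; i < slots_.size(); ++i) {
                                            std::swap(k, slots_[pos]);   // place k, evict occupant
                                            if (k.empty()) return true;  // slot was free: done
                                            // Evicted key moves to its *other* candidate slot,
                                            // exactly as in the h0/h1/h2 walkthrough above.
                                            pos = (pos == h1(k)) ? h2(k) : h1(k);
                                        }
                                        return false;
                                    }

                                private:
                                    std::size_t h1(const std::string& k) const
                                    {
                                        return std::hash<std::string>{}(k) % slots_.size();
                                    }
                                    std::size_t h2(const std::string& k) const
                                    {
                                        return (std::hash<std::string>{}(k) / slots_.size() + 1)
                                               % slots_.size();
                                    }
                                    std::vector<std::string> slots_;  // "" marks an empty slot
                                };

                                int main()
                                {
                                    CuckooSet t(16);
                                    assert(t.insert("e0"));
                                    assert(t.insert("e1"));
                                    assert(t.insert("e2"));
                                    assert(t.contains("e0") && t.contains("e1") && t.contains("e2"));
                                    assert(!t.contains("e3"));
                                    return 0;
                                }
                                ```

                                Note how insert never loses an element: each eviction sends
                                the displaced key to its alternate slot, which is what makes
                                the two-probe lookup sound.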

                                Richard Harter, cri@tiac.net
                                http://home.tiac.net/~cri, http://www.varinoma.com
                                Save the Earth now!!
                                It's the only planet with chocolate.

