Expert Help needed with hash_multimap

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • JN

    Expert Help needed with hash_multimap

    Hello. Sorry about the length of this post. I am trying to implement a hash
    table for my chess program, and decided to try using the VC++ . NET
    (Dinkumware) version of hash_map.
    The role of the hashtable is to store HashEntry objects, which contain
    information about chess positions.
    A BitBoard is just an __int64, and this is used to determine the hash key.
    I have not shown the definition of class ChessPosition, but all you need to
    know is that it contains a number of BitBoards as members. The Member 'A' is
    used to create the key.

    All of this seems to work quite well (and fast!), but I have a few
    questions:

    1. I am still confused about the correct way to use hash_compare. The
    documentation suggests that you either make your own, or derive your own
    class from hash_compare and selectively override functions as required (eg
    the hash function). I have gone for the first option, and called it
    MyHashCompare . Can you please tell me if I have done anything wrong ?

    2. As I insert items, the hash_multimap just gets bigger and BIGGER, and
    BIGGER !! (it is eating about 3 megs/second on my system). Is there any way
    to put a limit on the size of the hash_multimap, say limiting it 256 MB ? If
    not, is it more appropriate to simply use a container with a fixed size. (A
    static size would be totally acceptable)

    3. When retrieving HashEntry's, another part of my program (the search()
    function) expects to get a pointer to a HashEntry, which is why I have
    really ugly code like:
    pH=&((*it).seco nd);
    Is there a better way of interfacing with the rest of the program, and if
    so, what function prototype would you recommend for LookupHash() ?


    // hash.h:

    #ifndef _HASH_H
    #define _HASH_H 1

    #include <hash_map>
    #include "movegen.h"

    using namespace std;

    size_t Hash(BitBoard B);

    enum HashEntryFlags
    {
    hashfVALUNKNOWN ,
    hashfEXACT,
    hashfALPHA,
    hashfBETA
    };

    class HashEntry
    {
    public:
    HashEntry()
    {
    Flags=hashfVALU NKNOWN;
    }
    ChessPosition P;
    int Depth;
    HashEntryFlags Flags;
    int Value;
    Move bestMove;
    };

    class MyHashCompare
    {
    public:
    enum
    {
    // parameters for hash table
    bucket_size = 4, // 0 < bucket_size
    min_buckets = 8 // min_buckets = 2 ^^ N, 0 < N
    };

    size_t operator()(cons t BitBoard& A) const
    {

    // hash function. Thanks to Thomas Wang for this
    // (http://www.concentric.net/~Ttwang/tech/inthash.htm)

    BitBoard B(A);
    B += ~(B << 32);
    B ^= (B >> 22);
    B += ~(B << 13);
    B ^= (B >> 8);
    B += (B << 3);
    B ^= (B >> 15);
    B += ~(B << 27);
    B ^= (B >> 31);
    return static_cast<siz e_t>(B);
    }

    bool operator()(cons t BitBoard A, const BitBoard B) const
    {
    // Used for determining order when keys are the same
    return (A<B);
    }
    };

    typedef hash_multimap<B itBoard,HashEnt ry,MyHashCompar e> HashTable;

    bool InsertHash(Hash Table& rT, const HashEntry& rH);
    const HashEntry* LookupHash(cons t ChessPosition& rP, const HashTable& rT);

    #endif


    // hash.cpp:

    #include "hash.h"

    bool InsertHash(Hash Table& rT, const HashEntry& rH)
    {
    rT.insert(HashT able::value_typ e((rH.P.A) /* BitBoard 'A' of Position used
    as key */,rH));
    return true;
    }

    const HashEntry* LookupHash(cons t ChessPosition& rP, const HashTable& rT)
    {

    // Define a pair of iterators, and get an equal range
    pair<HashTable: :const_iterator ,HashTable::con st_iterator> p=
    rT.equal_range( rP.A);

    // Return first entry with a matching Position
    const HashEntry* pH=NULL;
    for(HashTable:: const_iterator it = p.first; it != p.second; ++it)
    {

    if( ((*it).second.P ) == rP )
    {
    pH=&((*it).seco nd);
    break;
    }
    }
    return pH;
    }


  • Buster

    #2
    Re: Expert Help needed with hash_multimap

    JN wrote:
    [color=blue]
    > 3. When retrieving HashEntry's, another part of my program (the search()
    > function) expects to get a pointer to a HashEntry, which is why I have
    > really ugly code like:
    > pH=&((*it).seco nd);[/color]

    It's not that ugly. If you translate it to "ph = & it->second", it's
    crystal clear. Your LookupHash function provides a simple syntax for a
    special case, to supplement the existing more general syntax. It's
    appropriate for the little translation from iterator to pointer to be
    part of this function.
    [color=blue]
    > Is there a better way of interfacing with the rest of the program, and if
    > so, what function prototype would you recommend for LookupHash() ?[/color]

    It strikes me that what you have is equivalent to using a
    hash_[multi]set of HashEntry objects, with hash function

    class HashEntryCompar e
    {
    public:
    // ...
    size_t operator () (const HashEntry & entry) const
    { return compare_bitboar ds (entry.P.A); }

    size_t operator () (const HashEntry & a, const HashEntry & b) const
    { return compare_bitboar ds (a.P.A, b.P.A); }

    private:
    // ...
    MyHashCompare compare_bitboar ds;
    };

    (I've tried to imitate your syntax, because I haven't got the same
    hash_map implementation as you, or its documentation.)

    If that's the case your code would be simpler with sets. You might need
    fewer syntax-simplifying functions.

    Regards,
    Buster.

    Comment

    • John Harrison

      #3
      Re: Expert Help needed with hash_multimap

      >[color=blue]
      > 2. As I insert items, the hash_multimap just gets bigger and BIGGER, and
      > BIGGER !! (it is eating about 3 megs/second on my system). Is there any[/color]
      way[color=blue]
      > to put a limit on the size of the hash_multimap, say limiting it 256 MB ?[/color]
      If[color=blue]
      > not, is it more appropriate to simply use a container with a fixed size.[/color]
      (A[color=blue]
      > static size would be totally acceptable)[/color]

      I believe that real chess programs handle this by selectively discarding
      infrequently accessed parts of the hash table.

      Since efficiency is obviously going to be a concern, I think you are
      inevitably going to have to write your own custom data structure sooner or
      later. Hash tables are actually pretty easy to program, suggest you get a
      book on data structures.

      john


      Comment

      • P.J. Plauger

        #4
        Re: Expert Help needed with hash_multimap

        "JN" <jniemann@ipa.c om.au> wrote in message
        news:404527fe$1 _1@news.iprimus .com.au...
        [color=blue]
        > Hello. Sorry about the length of this post. I am trying to implement a[/color]
        hash[color=blue]
        > table for my chess program, and decided to try using the VC++ . NET
        > (Dinkumware) version of hash_map.
        > The role of the hashtable is to store HashEntry objects, which contain
        > information about chess positions.
        > A BitBoard is just an __int64, and this is used to determine the hash key.
        > I have not shown the definition of class ChessPosition, but all you need[/color]
        to[color=blue]
        > know is that it contains a number of BitBoards as members. The Member 'A'[/color]
        is[color=blue]
        > used to create the key.
        >
        > All of this seems to work quite well (and fast!), but I have a few
        > questions:
        >
        > 1. I am still confused about the correct way to use hash_compare. The
        > documentation suggests that you either make your own, or derive your own
        > class from hash_compare and selectively override functions as required (eg
        > the hash function). I have gone for the first option, and called it
        > MyHashCompare . Can you please tell me if I have done anything wrong ?[/color]

        Your compare function is:
        [color=blue]
        > return (A<B);[/color]

        which is quite likely a strict weak ordering, as required. Looks fine
        to me.
        [color=blue]
        > 2. As I insert items, the hash_multimap just gets bigger and BIGGER, and
        > BIGGER !! (it is eating about 3 megs/second on my system). Is there any[/color]
        way[color=blue]
        > to put a limit on the size of the hash_multimap, say limiting it 256 MB ?[/color]
        If[color=blue]
        > not, is it more appropriate to simply use a container with a fixed size.[/color]
        (A[color=blue]
        > static size would be totally acceptable)[/color]

        You can slow its growth by making bucket_size larger. Right now you
        have the default:
        [color=blue]
        > bucket_size = 4, // 0 < bucket_size[/color]

        But note that you'll trade off a smaller hash table against a
        slightly slower lookup and insert.
        [color=blue]
        > 3. When retrieving HashEntry's, another part of my program (the search()
        > function) expects to get a pointer to a HashEntry, which is why I have
        > really ugly code like:
        > pH=&((*it).seco nd);
        > Is there a better way of interfacing with the rest of the program, and if
        > so, what function prototype would you recommend for LookupHash() ?[/color]

        You can also write it as:

        pH = &it->second;

        P.J. Plauger
        Dinkumware, Ltd.



        Comment

        • JN

          #5
          Re: Expert Help needed with hash_multimap

          Thanks John,

          I certainly have access to many examples of HashTable code, but I wanted to
          try an STL-style implentation. However, I seem to already have wasted far
          too much time going down this path, so I think I will revert back to a
          traditional (C-Style) data structure. btw, when you referred to "real chess
          programs", I'm sure you meant it in the nicest possible way :) ..

          Regards,
          JN

          "John Harrison" <john_andronicu s@hotmail.com> wrote in message
          news:c23uu7$1ni oqm$1@ID-196037.news.uni-berlin.de...[color=blue][color=green]
          > >
          > > 2. As I insert items, the hash_multimap just gets bigger and BIGGER, and
          > > BIGGER !! (it is eating about 3 megs/second on my system). Is there any[/color]
          > way[color=green]
          > > to put a limit on the size of the hash_multimap, say limiting it 256 MB[/color][/color]
          ?[color=blue]
          > If[color=green]
          > > not, is it more appropriate to simply use a container with a fixed size.[/color]
          > (A[color=green]
          > > static size would be totally acceptable)[/color]
          >
          > I believe that real chess programs handle this by selectively discarding
          > infrequently accessed parts of the hash table.
          >
          > Since efficiency is obviously going to be a concern, I think you are
          > inevitably going to have to write your own custom data structure sooner or
          > later. Hash tables are actually pretty easy to program, suggest you get a
          > book on data structures.
          >
          > john
          >
          >[/color]


          Comment

          • JN

            #6
            Re: Expert Help needed with hash_multimap

            Thanks Buster.

            This is good.

            Cheers,

            "Buster" <noone@nowhere. com> wrote in message
            news:c23gsk$ogd $1@newsg3.svr.p ol.co.uk...[color=blue]
            > JN wrote:
            >[color=green]
            > > 3. When retrieving HashEntry's, another part of my program (the search()
            > > function) expects to get a pointer to a HashEntry, which is why I have
            > > really ugly code like:
            > > pH=&((*it).seco nd);[/color]
            >
            > It's not that ugly. If you translate it to "ph = & it->second", it's
            > crystal clear. Your LookupHash function provides a simple syntax for a
            > special case, to supplement the existing more general syntax. It's
            > appropriate for the little translation from iterator to pointer to be
            > part of this function.
            >[color=green]
            > > Is there a better way of interfacing with the rest of the program, and[/color][/color]
            if[color=blue][color=green]
            > > so, what function prototype would you recommend for LookupHash() ?[/color]
            >
            > It strikes me that what you have is equivalent to using a
            > hash_[multi]set of HashEntry objects, with hash function
            >
            > class HashEntryCompar e
            > {
            > public:
            > // ...
            > size_t operator () (const HashEntry & entry) const
            > { return compare_bitboar ds (entry.P.A); }
            >
            > size_t operator () (const HashEntry & a, const HashEntry & b) const
            > { return compare_bitboar ds (a.P.A, b.P.A); }
            >
            > private:
            > // ...
            > MyHashCompare compare_bitboar ds;
            > };
            >
            > (I've tried to imitate your syntax, because I haven't got the same
            > hash_map implementation as you, or its documentation.)
            >
            > If that's the case your code would be simpler with sets. You might need
            > fewer syntax-simplifying functions.
            >
            > Regards,
            > Buster.[/color]


            Comment

            • John Harrison

              #7
              Re: Expert Help needed with hash_multimap


              "JN" <jniemann@ipa.c om.au> wrote in message
              news:4046a7ef_1 @news.iprimus.c om.au...[color=blue]
              > Thanks John,
              >
              > I certainly have access to many examples of HashTable code, but I wanted[/color]
              to[color=blue]
              > try an STL-style implentation. However, I seem to already have wasted far
              > too much time going down this path, so I think I will revert back to a
              > traditional (C-Style) data structure. btw, when you referred to "real[/color]
              chess[color=blue]
              > programs", I'm sure you meant it in the nicest possible way :) ..
              >[/color]

              Yes of course. Maybe I was think of my own attempts to write a chess
              program. It ran on a batch system (i.e. no user interaction allowed) where I
              was limited to 6 submissions a day. Therefore I could only play six moves a
              day. It only ever played one game, which was unfinished due to the program
              crashing when pawn promotion was a possibility.

              I never wrote another chess playing program.

              A serious point, I sometimes doubt the benefits of OO for algorithmically
              complex situations like a chess playing engine. Because the things that OO
              is good at like separating interface and implementation and assigning
              responsibilitie s to well defined sections of code don't really address the
              important issues in getting complex algorithm right. The user interface of
              your chess program would be a completely different matter of course.

              With the STL hash table you'll undoubtedly get a good implementation of a
              general purpose hash table, but that doesn't mean that you can't do better
              with your application specific knowledge.

              John


              Comment

              • JN

                #8
                Re: Expert Help needed with hash_multimap

                > Yes of course. Maybe I was think of my own attempts to write a chess[color=blue]
                > program. It ran on a batch system (i.e. no user interaction allowed) where[/color]
                I[color=blue]
                > was limited to 6 submissions a day. Therefore I could only play six moves[/color]
                a[color=blue]
                > day. It only ever played one game, which was unfinished due to the program
                > crashing when pawn promotion was a possibility.[/color]

                LOL, you must have been a nervous wreck. The anticipation must have been
                unbearable.
                [color=blue]
                > A serious point, I sometimes doubt the benefits of OO for algorithmically
                > complex situations like a chess playing engine. Because the things that OO
                > is good at like separating interface and implementation and assigning
                > responsibilitie s to well defined sections of code don't really address the
                > important issues in getting complex algorithm right. The user interface of
                > your chess program would be a completely different matter of course.[/color]

                I agree with you. I was one of those people who programmed in C before
                learning C++, but I am trying to be a good C++ programmer.
                I am trying to avoid using all the things that Marshall Cline says are
                "Evil" eg arrays, macros, pointers etc, and adopt a modern "standard C++"
                style.
                However, every time I have tried to rewrite my code to use STL and/or OO, I
                have been suffering HUGE performance hits (like 1/10 of the original speed).

                As an example, My move generator function populates a fixed-length array of
                class Move. I thought it would be a nice idea to use a deque<Move> instead.
                That way, I could achieve some nice pre-ordering of moves by pushing
                "interestin g" moves (eg checks,captures etc) to the front, and "ordinary"
                moves to the back. (The move-ordering is fairly important, as it improves
                the efficiency of the minimax search algorithm.)
                After spending way too much time rewriting my GenerateMoves() function to
                use a deque instead of an array, I was horribly dissappointed to find that
                my move-generation performance benchmark (which calls GenerateMoves() 1
                million times) changed from 5 seconds to over 60 seconds !! Admittedly, the
                deque is using dynamic allocation/deallocation, which is probably where most
                of my CPU cycles were going.

                Needless to say, I reverted back to the plain old array again. It ended up
                being much cheaper to simply run a sort routine on the array at the end of
                the function to achieve the move-ordering. I know that in theory, STL
                containers should be very fast, but in my experience, I have always been
                underwhelmed by the speed. Maybe it is the type of apps I write, or maybe I
                just "don't get" modern OO paradigms. I don't know. I want to be a believer,
                but I am struggling to keep the faith ...

                I think you are right about using OO for algorithms. There is no question
                that OO is ideal for designing interfaces, and I certainly don't have a
                problem with that. If there are any STL gurus out there that want to comment
                on this, then I'd love to hear from you ...

                -Judd

                "John Harrison" <john_andronicu s@hotmail.com> wrote in message
                news:c26p69$1q5 c91$1@ID-196037.news.uni-berlin.de...[color=blue]
                >
                > "JN" <jniemann@ipa.c om.au> wrote in message
                > news:4046a7ef_1 @news.iprimus.c om.au...[color=green]
                > > Thanks John,
                > >
                > > I certainly have access to many examples of HashTable code, but I wanted[/color]
                > to[color=green]
                > > try an STL-style implentation. However, I seem to already have wasted[/color][/color]
                far[color=blue][color=green]
                > > too much time going down this path, so I think I will revert back to a
                > > traditional (C-Style) data structure. btw, when you referred to "real[/color]
                > chess[color=green]
                > > programs", I'm sure you meant it in the nicest possible way :) ..
                > >[/color]
                >
                > Yes of course. Maybe I was think of my own attempts to write a chess
                > program. It ran on a batch system (i.e. no user interaction allowed) where[/color]
                I[color=blue]
                > was limited to 6 submissions a day. Therefore I could only play six moves[/color]
                a[color=blue]
                > day. It only ever played one game, which was unfinished due to the program
                > crashing when pawn promotion was a possibility.
                >
                > I never wrote another chess playing program.
                >
                > A serious point, I sometimes doubt the benefits of OO for algorithmically
                > complex situations like a chess playing engine. Because the things that OO
                > is good at like separating interface and implementation and assigning
                > responsibilitie s to well defined sections of code don't really address the
                > important issues in getting complex algorithm right. The user interface of
                > your chess program would be a completely different matter of course.
                >
                > With the STL hash table you'll undoubtedly get a good implementation of a
                > general purpose hash table, but that doesn't mean that you can't do[/color]
                better[color=blue]
                > with your application specific knowledge.
                >
                > John
                >
                >[/color]


                Comment

                Working...