Are dictionaries the same as hashtables?

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • cnb

    Are dictionaries the same as hashtables?

    Are dictionaries the same as hashtables?
  • Martin Marcher

    #2
    Re: Are dictionaries the same as hashtables?

    On 2008-08-26 00:32:20, cnb wrote:
    Are dictionaries the same as hashtables?
    Yes, but there is nothing in there that does sane collision handling
    like making a list instead of simply overwriting.

    PS: your sig was *a bit* longer than you question. please don't do
    that...

    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v1.4.6 (GNU/Linux)

    iQIVAwUBSLO0KNj k1vit/YR7AQLAxhAAotF4 xEbquq1C+fnVU93 PhETiotFSw/ni
    r32njO5MPu2ecyY 4eOxGshzw6P2ez1 i9K3Yk8bpm89Frs ZeqnrdDhh8xFo7C KFjX
    DeoBVE3x6jo4Q9i TvNpF6tUHCAgyxV uaV/F+Hz9mNdaJ5o+xx G474429Pumn1N4W
    m5X4wk5BtIj1nV7 R4xh+jOFDXFShSo CZBwyVGB+MIZUsQ hAt4qN9MSuO3lhm NgtD
    Eo03CVpdqiKcxqS 77xnHIR8TtgahLy RAosMBenNCe5ELE 7CALkVU/v4oNI+4G3Ju
    n0mfkZA2iqWulcJ D3QAdYwImFlXrlL Yg6Y5OMvL/Tm4729htsvGBPZD 0DSrR0rQs
    tjgRIljhv3BwKox sk/UUx2glbUShjEfgC imIePKmuSh7jN+C uCVsgaRj5zYGiCV Q
    gintIQaiFheqjkp SfkRtLqBQBbLwzC S86Bzc++I0m+IEN xExvuE3AY4Lmt5d VCAT
    ox8TubGe50xgomW r9ms/szri9u5qYuUWzC2 ByjUVauuGAQxHwQ UQvjCBFM9TI6bc
    do+TMTY8qJhb/qsW3TlX5HagW/Cn+jw+xwP8b+xeM NXAEA/Q3dwSKFTx1s5NwG g6
    vvjwEdPooq3k2XE QFXZq12CUn0FYjh UAzwR/06JG6muAR0YEMHK GeqcuamuEKeU/
    2VBXpaqVkPY=
    =7TsX
    -----END PGP SIGNATURE-----

    Comment

    • cnb

      #3
      Re: Are dictionaries the same as hashtables?

      On Aug 26, 9:43 am, Martin Marcher <mar...@marcher .namewrote:
      On 2008-08-26 00:32:20, cnb wrote:
      >
      Are dictionaries the same as hashtables?
      >
      Yes, but there is nothing in there that does sane collision handling
      like making a list instead of simply overwriting.
      >
      PS: your sig was *a bit* longer than you question. please don't do
      that...
      >
       signature.asc
      < 1KViewDownload
      my what?

      Comment

      • John Machin

        #4
        Re: Are dictionaries the same as hashtables?

        On Aug 26, 5:43 pm, Martin Marcher <mar...@marcher .namewrote:
        On 2008-08-26 00:32:20, cnb wrote:
        >
        Are dictionaries the same as hashtables?
        >
        Yes, but there is nothing in there that does sane collision handling
        like making a list instead of simply overwriting.
        >
        Please clarify: (1) Nothing in where? In the Python dictionary
        implementation? (2) What do you mean by "sane collision handling"?
        What do you mean by "simply overwriting"?

        TIA,
        John

        Comment

        • Diez B. Roggisch

          #5
          Re: Are dictionaries the same as hashtables?

          Martin Marcher wrote:
          On 2008-08-26 00:32:20, cnb wrote:
          >Are dictionaries the same as hashtables?
          >
          Yes, but there is nothing in there that does sane collision handling
          like making a list instead of simply overwriting.
          The term "collision" is rather well defined when talking about associative
          arrays using hashing (which python's dicts are): it means that two distinct
          keys produce the same hash-value, leading to a bucket collision. And of
          course python deals with that sanely, and creates a list of objects inside
          the bucket which is traversed and comparison is used to determine which
          actual value to retrieve.

          Python does not have a "one key maps to a list of values"-semantics - which
          I consider the sane choice...

          However, you can have that using the defaultdict for example:

          listdict = defaultdict(lis t)

          listdict[key].append(value)

          Diez

          Comment

          • Ben Finney

            #6
            Re: Are dictionaries the same as hashtables?

            cnb <circularfunc@y ahoo.sewrites:
            Are dictionaries the same as hashtables?
            The 'dict' type in Python has certain behaviour, as specified in the
            language reference. In CPython they are implemented as hash tables,
            but I don't recall anything that specifies they *must* be implemented
            that way.

            So my answer would be "maybe, but don't count on it". Write your
            program to the specified behaviour of the language, not to the
            underlying implementation.

            --
            \ “With Lisp or Forth, a master programmer has unlimited power |
            `\ and expressiveness. With Python, even a regular guy can reach |
            _o__) for the stars.” —Raymond Hettinger |
            Ben Finney

            Comment

            • John Machin

              #7
              Re: Are dictionaries the same as hashtables?

              On Aug 26, 7:36 pm, "Diez B. Roggisch" <de...@nospam.w eb.dewrote:
              Martin Marcher wrote:
              On 2008-08-26 00:32:20, cnb wrote:
              Are dictionaries the same as hashtables?
              >
              Yes, but there is nothing in there that does sane collision handling
              like making a list instead of simply overwriting.
              >
              The term "collision" is rather well defined when talking about associative
              arrays using hashing (which python's dicts are): it means that two distinct
              keys produce the same hash-value, leading to a bucket collision. And of
              course python deals with that sanely, and creates a list of objects inside
              the bucket which is traversed and comparison is used to determine which
              actual value to retrieve.
              I see neither buckets nor lists in the CPython svn trunk version of
              dictobject.c and IIRC it's been using open addressing for a very long
              time.

              Comment

              • Diez B. Roggisch

                #8
                Re: Are dictionaries the same as hashtables?

                John Machin wrote:
                On Aug 26, 7:36 pm, "Diez B. Roggisch" <de...@nospam.w eb.dewrote:
                >Martin Marcher wrote:
                On 2008-08-26 00:32:20, cnb wrote:
                >Are dictionaries the same as hashtables?
                >>
                Yes, but there is nothing in there that does sane collision handling
                like making a list instead of simply overwriting.
                >>
                >The term "collision" is rather well defined when talking about
                >associative arrays using hashing (which python's dicts are): it means
                >that two distinct keys produce the same hash-value, leading to a bucket
                >collision. And of course python deals with that sanely, and creates a
                >list of objects inside the bucket which is traversed and comparison is
                >used to determine which actual value to retrieve.
                >
                I see neither buckets nor lists in the CPython svn trunk version of
                dictobject.c and IIRC it's been using open addressing for a very long
                time.
                I don't know the exact names of the involved structures - I named them
                liberally from my understanding of how associative arrays based on hashing
                are implemented. But the below code shows that hash-collisions can occur
                without corrupting data, while OTOH of course degenerating the lookup from
                usually O(1) to O(n). Try playing around with it, commenting out the proper
                hashing of the key will lead to great performance enhancements.

                Diez

                import pprint

                class StupidThing(obj ect):

                def __init__(self, key):
                self.key = key


                def __hash__(self):
                #return hash(self.key)
                return 1


                def __cmp__(self, other):
                return cmp(self.key, other.key)


                def __repr__(self):
                return "<%s>" % self.key


                d = {}

                for a in xrange(1000):
                d[StupidThing(str (a))] = a



                pprint.pprint(d )

                Comment

                • Martin

                  #9
                  Re: Are dictionaries the same as hashtables?

                  On Tue, Aug 26, 2008 at 9:52 AM, cnb <circularfunc@y ahoo.sewrote:
                  On Aug 26, 9:43 am, Martin Marcher <mar...@marcher .namewrote:
                  >On 2008-08-26 00:32:20, cnb wrote:
                  >>
                  Are dictionaries the same as hashtables?
                  >>
                  >Yes, but there is nothing in there that does sane collision handling
                  >like making a list instead of simply overwriting.
                  >>
                  >PS: your sig was *a bit* longer than you question. please don't do
                  >that...
                  >>
                  > signature.asc
                  >< 1KViewDownload
                  >
                  my what?
                  Hehe,

                  it was *my* sig... i was using some old box with a mut config that
                  still had the fortune script in it... sorry.

                  Anyway what I meant by collision handling was:

                  $ python
                  Python 2.5.2 (r252:60911, Jul 31 2008, 17:31:22)
                  [GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2
                  Type "help", "copyright" , "credits" or "license" for more information.
                  >>l = {}
                  >>l
                  {}
                  >>l["a"] = 12
                  >>l["b"] = 13
                  >>l
                  {'a': 12, 'b': 13}
                  >>l["a"] = "This is a collision"
                  >>l
                  {'a': 'This is a collision', 'b': 13}
                  >>>
                  As you see position "a" is simply overwritten. Probably "sane" doesn't
                  really apply because how could the python creators possibly know
                  whether I just want to overwrite the value or indeed want some form of
                  collision handling (talk about lists vs. trees with there
                  subforms...)...

                  If one would want that he/she/it could still subclass dict and do more magic.

                  /martin

                  --


                  You are not free to read this message,
                  by doing so, you have violated my licence
                  and are required to urinate publicly. Thank you.

                  Comment

                  • Cameron Laird

                    #10
                    Re: Are dictionaries the same as hashtables?

                    In article <6hi153Fliuu4U1 @mid.uni-berlin.de>,
                    Diez B. Roggisch <deets@nospam.w eb.dewrote:
                    >Martin Marcher wrote:
                    >
                    >On 2008-08-26 00:32:20, cnb wrote:
                    >>Are dictionaries the same as hashtables?

                    Comment

                    • Fredrik Lundh

                      #11
                      Re: Are dictionaries the same as hashtables?

                      Martin Marcher wrote:
                      >Are dictionaries the same as hashtables?
                      >
                      Yes, but there is nothing in there that does sane collision handling
                      like making a list instead of simply overwriting.
                      are you sure you know what "collision handling" means in this context?

                      </F>

                      Comment

                      • Fredrik Lundh

                        #12
                        Re: Are dictionaries the same as hashtables?

                        Diez B. Roggisch wrote:
                        I don't know the exact names of the involved structures - I named them
                        liberally from my understanding of how associative arrays based on hashing
                        are implemented. But the below code shows that hash-collisions can occur
                        without corrupting data


                        "Open addressing, or closed hashing, is a method of collision resolution
                        in hash tables. With this method a hash collision is resolved by
                        probing, or searching through alternate locations in the array (the
                        probe sequence) until either the target record is found, or an unused
                        array slot is found, which indicates that there is no such key in the
                        table."

                        </F>

                        Comment

                        • Diez B. Roggisch

                          #13
                          Re: Are dictionaries the same as hashtables?

                          Cameron Laird wrote:
                          In article <6hi153Fliuu4U1 @mid.uni-berlin.de>,
                          Diez B. Roggisch <deets@nospam.w eb.dewrote:
                          >>Martin Marcher wrote:
                          >>
                          >>On 2008-08-26 00:32:20, cnb wrote:
                          >>>Are dictionaries the same as hashtables?
                          .
                          .
                          .
                          >>Python does not have a "one key maps to a list of values"-semantics -
                          >>which I consider the sane choice...
                          >>
                          >>However, you can have that using the defaultdict for example:
                          >>
                          >>listdict = defaultdict(lis t)
                          >>
                          >>listdict[key].append(value)
                          >>
                          >>Diez
                          >
                          ? I'm lost. As I understand your terms, Python's dictionaries
                          map keys to objects, but you would prefer that Python's
                          dictionaries map keys only to lists of values. That *sounds*
                          like a complexificatio n, at best. Are you trying to make a
                          point about implementation aligning with semantics?
                          The OP seems to want that (or at least sees it as one of two viable design
                          choices), see his other answer in this thread.

                          I certainly *don't* agree with that, I merely pointed out that python comes
                          with means to easily create such a data-structure in the case it is needed.

                          Diez

                          Comment

                          • Chris Mellon

                            #14
                            Re: Are dictionaries the same as hashtables?

                            On Tue, Aug 26, 2008 at 11:12 AM, Fredrik Lundh <fredrik@python ware.comwrote:
                            Martin Marcher wrote:
                            >
                            >>Are dictionaries the same as hashtables?
                            >>
                            >Yes, but there is nothing in there that does sane collision handling
                            >like making a list instead of simply overwriting.
                            >
                            are you sure you know what "collision handling" means in this context?
                            I think the confusion comes from thinking that dictionaries are
                            (really) hash tables and thus that things like collision handling are
                            exposed to the user of the data structure. In this sense, no,
                            dictionaries are *not* hash tables. They are mappings of keys to
                            values, and they use hash tables as part of their implementation.

                            Comment

                            • Cameron Laird

                              #15
                              Re: Are dictionaries the same as hashtables?

                              In article <6hioi7Fmdo37U1 @mid.uni-berlin.de>,
                              Diez B. Roggisch <deets@nospam.w eb.dewrote:
                              >Cameron Laird wrote:
                              >
                              >In article <6hi153Fliuu4U1 @mid.uni-berlin.de>,
                              >Diez B. Roggisch <deets@nospam.w eb.dewrote:
                              >>>Martin Marcher wrote:
                              >>>
                              >>>On 2008-08-26 00:32:20, cnb wrote:
                              >>>>Are dictionaries the same as hashtables?
                              >.
                              >.
                              >.
                              >>>Python does not have a "one key maps to a list of values"-semantics -
                              >>>which I consider the sane choice...
                              >>>
                              >>>However, you can have that using the defaultdict for example:
                              >>>
                              >>>listdict = defaultdict(lis t)
                              >>>
                              >>>listdict[key].append(value)
                              >>>
                              >>>Diez
                              >>
                              >? I'm lost. As I understand your terms, Python's dictionaries
                              >map keys to objects, but you would prefer that Python's
                              >dictionaries map keys only to lists of values. That *sounds*
                              >like a complexificatio n, at best. Are you trying to make a
                              >point about implementation aligning with semantics?
                              >
                              >The OP seems to want that (or at least sees it as one of two viable design
                              >choices), see his other answer in this thread.
                              >
                              >I certainly *don't* agree with that, I merely pointed out that python comes
                              >with means to easily create such a data-structure in the case it is needed.
                              >
                              >Diez
                              Oh! Thanks for clearing *that* up; I certainly had a different
                              impression.

                              To the original poster then: please be aware that the values of
                              Python's dictionaries not only can be any first-class objects,
                              but it's quite common--quite common in *my* code, anyway--for
                              dictionaries to range over lists, tuples, functions, other
                              dictionaries, and more. Python needn't change to allow you to
                              write any of this.

                              Comment

                              Working...