Possibly dumb question about dicts and __hash__()

  • Joel Hedlund

    Possibly dumb question about dicts and __hash__()

    Hi!

    There's one thing about dictionaries and __hash__() methods that puzzles me. I
    have a class with several data members, one of which is 'name' (a str). I would
    like to store several of these objects in a dict for quick access
    ({name:object} style). Now, I was thinking that given a list of objects I might
    do something like

    d = {}
    for o in objects:
        d[o] = o

    and still be able to retrieve the data like so:

    d[name]

    if I just defined a __hash__ method like so:

    def __hash__(self):
        return self.name.__hash__()

    but this fails miserably. Feel free to laugh if you feel like it. I cooked up a
    little example with sample output below if you care to take the time.

    Code:
    ---------------------------------------------------------------
    class NamedThing(object):
        def __init__(self, name):
            self.name = name
        def __hash__(self):
            return self.name.__hash__()
        def __repr__(self):
            return '<foo>'
    name = 'moo'
    o = NamedThing(name)
    print "This output puzzles me:"
    d = {}
    d[o] = o
    d[name] = o
    print d
    print
    print "If I wrap all keys in hash() calls I'm fine:"
    d = {}
    d[hash(o)] = o
    d[hash(name)] = o
    print d
    print
    print "But how come the first method didn't work?"
    ---------------------------------------------------------------

    Output:
    ---------------------------------------------------------------
    This output puzzles me:
    {'moo': <foo>, <foo>: <foo>}

    If I wrap all keys in hash() calls I'm fine:
    {2038943316: <foo>}

    But how come the first method didn't work?
    ---------------------------------------------------------------

    I'd be grateful if anyone can shed a little light on this, or point me to some
    docs I might have missed.

    Also:
    Am I in fact abusing the __hash__() method? If so - what's the intended use of
    the __hash__() method?

    Is there a better way of implementing this?

    I realise I could just write

    d[o.name] = o

    but this problem seems to pop up every now and then and I'm curious if there's
    some neat syntactic trick that I could legally apply here.

    Thanks for your time!
    /Joel Hedlund
  • johnzenger@gmail.com

    #2
    Re: Possibly dumb question about dicts and __hash__()

    Use __repr__. Behold:

    >>> class NamedThing(object):
    ...     def __init__(self, name):
    ...         self.name = name
    ...     def __repr__(self):
    ...         return self.name
    ...
    >>> a = NamedThing("Delaware")
    >>> b = NamedThing("Hawaii")
    >>> d = {}
    >>> d[a] = 1
    >>> d[b] = 50
    >>> print d
    {Delaware: 1, Hawaii: 50}
    >>> d[a]
    1
    >>> d[b]
    50

    Although this is a bit illegal, because repr is not supposed to be used
    this way.
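
A quick check, written in Python 3 syntax and not taken from the original post, suggests that the lookups above succeed because `a` and `b` are the very objects that were stored as keys (default identity-based hashing and equality), and that `__repr__` plays no part in it. Looking up by the name string still fails:

```python
class NamedThing:
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


a = NamedThing("Delaware")
d = {a: 1}

print(d[a])             # works: same object, default identity-based hash/eq

try:
    d["Delaware"]       # a str key never equals the NamedThing key
    found_by_name = True
except KeyError:
    found_by_name = False
print(found_by_name)

# A second object with the same name is a different key as well.
print(NamedThing("Delaware") in d)
```

So the `__repr__` only changes how the dict prints, not how its keys are found.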


    • Joel Hedlund

      #3
      Re: Possibly dumb question about dicts and __hash__()

      Hi!

      Thanks for the quick response!
      > Although this is a bit illegal, because repr is not supposed to be used
      > this way.

      How illegal is it? If I document it and put it in an open-source project, will
      people throw tomatoes?

      /Joel


      • Bruno Desthuilliers

        #4
        Re: Possibly dumb question about dicts and __hash__()

        Joel Hedlund wrote:
        (snip)
        > How illegal is it? If I document it and put it in an open-source project,
        > will people throw tomatoes?

        Don't know, but they'll sure do if you insist on top-posting !-)


        • johnzenger@gmail.com

          #5
          Re: Possibly dumb question about dicts and __hash__()

          Actually, come to think of it, __str__ works just as well.

          >>> class NamedThing(object):
          ...     def __init__(self, name):
          ...         self.name = name
          ...     def __str__(self):
          ...         return self.name
          ...
          >>> d = {}
          >>> d[a] = 1
          >>> d[b] = 50
          >>> d
          {<__main__.NamedThing object at 0x00C528D0>: 1, <__main__.NamedThing object at 0x00C529B0>: 50}
          >>> d[a]
          1
          >>> d[b]
          50

          This is what you should use, instead of my first answer. From the docs
          for __repr__: "If at all possible, this should look like a valid Python
          expression that could be used to recreate an object with the same value
          (given an appropriate environment). If this is not possible, a string
          of the form "<...some useful description...>" should be returned. ...
          This is typically used for debugging, so it is important that the
          representation is information-rich and unambiguous."




          • Joel Hedlund

            #6
            Re: Possibly dumb question about dicts and __hash__()

            Beautiful!

            But how come my attempt didn't work? I've seen docs that explain how __hash__()
            methods are used to put objects in dict buckets:



            But if it's really hash(str(o)) that's used for dict keys, what good are
            __hash__() methods? Or am I reading the docs wrong?
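
One way to probe this question is to record which special methods a dict actually calls. The sketch below uses Python 3 syntax; the `Probe` class and its `calls` list are hypothetical names introduced only for this illustration. It suggests that a lookup calls `__hash__` on the key and then `__eq__` to confirm the match, and never calls `str()`:

```python
class Probe:
    """Hypothetical class that records which special methods get called."""

    def __init__(self, name):
        self.name = name
        self.calls = []

    def __hash__(self):
        self.calls.append("__hash__")
        return hash(self.name)

    def __eq__(self, other):
        self.calls.append("__eq__")
        return isinstance(other, Probe) and self.name == other.name

    def __str__(self):
        self.calls.append("__str__")
        return self.name


a = Probe("moo")
b = Probe("moo")      # a distinct object with an equal name

d = {a: 1}            # storing the key hashes it
found = d[b]          # lookup hashes b, then compares it with the stored key

print(found)
print(a.calls, b.calls)
```

Neither list should contain `"__str__"`: the hash picks the bucket, and equality decides whether the key in that bucket is the one being asked for.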

            /Joel


            • Peter Otten

              #7
              Re: Possibly dumb question about dicts and __hash__()

              Joel Hedlund wrote:
              > There's one thing about dictionaries and __hash__() methods that puzzles
              > me. I have a class with several data members, one of which is 'name' (a
              > str). I would like to store several of these objects in a dict for quick
              > access ({name:object} style). Now, I was thinking that given a list of
              > objects I might do something like
              >
              > d = {}
              > for o in objects:
              >     d[o] = o
              >
              > and still be able to retrieve the data like so:
              >
              > d[name]
              >
              > if I just defined a __hash__ method like so:
              >
              > def __hash__(self):
              >     return self.name.__hash__()

              Just the hash is not enough. You need to define equality, too:

              >>> class Named(object):
              ...     def __init__(self, name):
              ...         self.name = name
              ...     def __hash__(self):
              ...         return hash(self.name)
              ...     def __eq__(self, other):
              ...         try:
              ...             other_name = other.name
              ...         except AttributeError:
              ...             return self.name == other
              ...         return self.name == other_name
              ...     def __repr__(self):
              ...         return "Named(name=%r)" % self.name
              ...
              >>> items = [Named(n) for n in "alpha beta gamma".split()]
              >>> d = dict(zip(items, items))
              >>> d["alpha"]
              Named(name='alpha')

              Peter
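
For readers on Python 3, the same idea can be sketched as follows. This is an illustration under stated assumptions, not the poster's code: the class name `NamedThing` follows the thread, the sample names are made up, and the `isinstance` checks with a `NotImplemented` fallback are a substitution for the `try`/`except AttributeError` used above.

```python
class NamedThing:
    """Hashes and compares like its name, so a plain str can look it up."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        if isinstance(other, NamedThing):
            return self.name == other.name
        if isinstance(other, str):
            return self.name == other
        return NotImplemented   # let Python fall back for unrelated types

    def __repr__(self):
        return "NamedThing(name=%r)" % self.name


objects = [NamedThing(n) for n in ("alpha", "beta", "gamma")]

# Variant 1: objects as their own keys, retrievable by name.
by_object = {o: o for o in objects}
print(by_object["alpha"])   # NamedThing(name='alpha')

# Variant 2: the plain name-keyed dict from the original question.
by_name = {o.name: o for o in objects}
print(by_name["alpha"])     # NamedThing(name='alpha')
```

The second variant needs no special methods at all, which may be why `d[o.name] = o` is often the simpler choice.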


              • Joel Hedlund

                #8
                Re: Possibly dumb question about dicts and __hash__()

                Hi!
                > Just the hash is not enough. You need to define equality, too:

                Thanks a million for clearing that up.

                Cheers!
                /Joel
