Modules are hashable?!

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Leif K-Brooks

    Modules are hashable?!

    I was just playing around, and noticed that modules seem to be hashable.
    Can anyone explain that, especially given the fact that they're mutable?

    Python 2.3.3 (#1, May 7 2004, 10:31:40)
    [GCC 3.3.3 20040412 (Red Hat Linux 3.3.3-7)] on linux2
    Type "help", "copyright" , "credits" or "license" for more information.[color=blue][color=green][color=darkred]
    >>> import sys
    >>> hash(sys)[/color][/color][/color]
    -150589324[color=blue][color=green][color=darkred]
    >>> sys.x = 42
    >>> hash(sys)[/color][/color][/color]
    -150589324[color=blue][color=green][color=darkred]
    >>> foo = {sys:'bar'}
    >>> foo[sys][/color][/color][/color]
    'bar'
  • Miki Tebeka

    #2
    Re: Modules are hashable?!

    Hello Leif,
    [color=blue]
    > I was just playing around, and noticed that modules seem to be hashable.
    > Can anyone explain that, especially given the fact that they're mutable?[/color]


    HTH.
    --
    ------------------------------------------------------------------------
    Miki Tebeka <miki.tebeka@zo ran.com>

    The only difference between children and adults is the price of the toys

    Comment

    • Jason Lai

      #3
      Re: Modules are hashable?!

      Leif K-Brooks wrote:[color=blue]
      > I was just playing around, and noticed that modules seem to be hashable.
      > Can anyone explain that, especially given the fact that they're mutable?
      >[/color]

      Most objects in Python are hashable, and also mutable. If you define a
      class Foo and create x = Foo(), x will be hashable. But you can also do
      x.stuff = 3. So modules appear to be pretty much regular objects, which
      are usually hashed by ID -- which will be unique for any two objects.
      Try hash(sys) == id(sys) to see for yourself.

      Lists, tuples, dicts, strings, and a few other things are odd in that
      they're hashed/compared by their contents rather than ID. So you can
      have two different objects that are equivalent with regards to hashing
      or comparing.

      Due to the way hashing works, the hash value is not allowed to change
      while it's in a dict. IDs never change, so regular objects have an
      unchanging hash value even if you change their contents. For the special
      objects, changing the contents would change the hash value, and that's
      not allowed.

      I think that's how it works, anyway :P

      - Jason Lai

      Comment

      • Alex Martelli

        #4
        Re: Modules are hashable?!

        Leif K-Brooks <eurleif@ecritt ers.biz> wrote:
        [color=blue]
        > I was just playing around, and noticed that modules seem to be hashable.
        > Can anyone explain that, especially given the fact that they're mutable?[/color]

        Any object x is hashable if type(x) does not expose __eq__ nor __cmp__.
        In that case, the meaning of x==y for that object is 'x is y', that is,
        id(x)==id(y), so having hash(x) return id(x) is perfectly functional.
        Mutation is not a problem if it doesn't affect equality comparisons.


        Alex

        Comment

        • Alex Martelli

          #5
          Re: Modules are hashable?!

          Jason Lai <jmlai@uci.ed u> wrote:
          ...[color=blue]
          > Lists, tuples, dicts, strings, and a few other things are odd in that
          > they're hashed/compared by their contents rather than ID. So you can
          > have two different objects that are equivalent with regards to hashing
          > or comparing.[/color]

          Two _distinct_ objects that are not different, actually (this also
          applies to numbers -- their values get compared, not their identities).
          I think you have the right ideas, just trying to perfect your choice of
          words here -- distinct vs different. I think the best usage is:

          "A is different from B" -> "A != B"
          "A is distinct from B" -> "A is not B"
          [color=blue]
          > I think that's how it works, anyway :P[/color]

          Pretty much, yes.


          Alex

          Comment

          • Maurice LING

            #6
            Re: Modules are hashable?!

            Alex Martelli wrote:
            [color=blue]
            > Leif K-Brooks <eurleif@ecritt ers.biz> wrote:
            >
            >[color=green]
            >>I was just playing around, and noticed that modules seem to be hashable.
            >>Can anyone explain that, especially given the fact that they're mutable?[/color]
            >
            >
            > Any object x is hashable if type(x) does not expose __eq__ nor __cmp__.
            > In that case, the meaning of x==y for that object is 'x is y', that is,
            > id(x)==id(y), so having hash(x) return id(x) is perfectly functional.
            > Mutation is not a problem if it doesn't affect equality comparisons.
            >
            >
            > Alex[/color]

            The idea that I get from reading this thread is that objects that can be
            type compared (comparing the contents) are not hashable, and they are
            list, strings, tuples and dictionary. Is there any others that fall into
            this category? Is there any way to make them hashable?

            Hashable objects, on the other hand, are hashed based on say, the
            pointer address pointing the object or an identifier in the Python VM
            symbol table or something. It's like to say that when you hash a human,
            you get the name and not the physical dimensions of the person. The
            person can grow fat or slim down, but the name is still the same.

            Maurice

            Comment

            • Jason Lai

              #7
              Re: Modules are hashable?!

              Maurice LING wrote:[color=blue]
              >
              > The idea that I get from reading this thread is that objects that can be
              > type compared (comparing the contents) are not hashable, and they are
              > list, strings, tuples and dictionary. Is there any others that fall into
              > this category? Is there any way to make them hashable?[/color]

              Well, strings and tuples are immutable, so they provide a hash function
              (since it's safe to hash them by contents; the contents are pointers
              that don't change.) Anything that provides a hash function can be
              hashed. You could theoretically create a new list type that works
              exactly like a normal list, but hashes based on ID.
              [color=blue]
              >
              > Hashable objects, on the other hand, are hashed based on say, the
              > pointer address pointing the object or an identifier in the Python VM
              > symbol table or something. It's like to say that when you hash a human,
              > you get the name and not the physical dimensions of the person. The
              > person can grow fat or slim down, but the name is still the same.
              >
              > Maurice[/color]

              Heh, like giving every person a unique govt-assigned ID number. You
              could have two people who are exactly alike -- cloned atom-by-atom, or
              something -- but they wouldn't have the same ID. And the term "hashing"
              originates from chopping things up, so hashing humans wouldn't be a good
              idea :P

              - Jason Lai

              Comment

              • Sam Holden

                #8
                Re: Modules are hashable?!

                On Fri, 03 Sep 2004 04:05:40 GMT, Maurice LING <mauriceling@ac m.org> wrote:[color=blue]
                > Alex Martelli wrote:
                >[color=green]
                >> Leif K-Brooks <eurleif@ecritt ers.biz> wrote:
                >>
                >>[color=darkred]
                >>>I was just playing around, and noticed that modules seem to be hashable.
                >>>Can anyone explain that, especially given the fact that they're mutable?[/color]
                >>
                >>
                >> Any object x is hashable if type(x) does not expose __eq__ nor __cmp__.
                >> In that case, the meaning of x==y for that object is 'x is y', that is,
                >> id(x)==id(y), so having hash(x) return id(x) is perfectly functional.
                >> Mutation is not a problem if it doesn't affect equality comparisons.
                >>
                >>
                >> Alex[/color]
                >
                > The idea that I get from reading this thread is that objects that can be
                > type compared (comparing the contents) are not hashable, and they are
                > list, strings, tuples and dictionary. Is there any others that fall into
                > this category? Is there any way to make them hashable?[/color]

                Define the __hash__ method for the class.

                Strings for example, are hashable even though they comparisons are done
                with respect to the contents and not the id. Tuples are also:

                ; python
                Python 2.3.4 (#2, Aug 18 2004, 13:18:19)
                [GCC 3.3.4 (Debian 1:3.3.4-9)] on linux2
                Type "help", "copyright" , "credits" or "license" for more information.[color=blue][color=green][color=darkred]
                >>> t1 = (1,2,3)
                >>> t2 = (1,2,3)
                >>> id(t1)[/color][/color][/color]
                1075806412[color=blue][color=green][color=darkred]
                >>> id(t2)[/color][/color][/color]
                1075841140[color=blue][color=green][color=darkred]
                >>> hash(t1)[/color][/color][/color]
                -821448277[color=blue][color=green][color=darkred]
                >>> hash(t2)[/color][/color][/color]
                -821448277[color=blue][color=green][color=darkred]
                >>>[/color][/color][/color]
                ;

                --
                Sam Holden

                Comment

                • Maurice LING

                  #9
                  Re: Modules are hashable?!

                  is there actually a practical reason to hash modules? can I call a
                  module using the hash key?

                  maurice

                  Comment

                  • Alex Martelli

                    #10
                    Re: Modules are hashable?!

                    Maurice LING <mauriceling@ac m.org> wrote:
                    [color=blue]
                    > Alex Martelli wrote:
                    >[color=green]
                    > > Leif K-Brooks <eurleif@ecritt ers.biz> wrote:
                    > >
                    > >[color=darkred]
                    > >>I was just playing around, and noticed that modules seem to be hashable.
                    > >>Can anyone explain that, especially given the fact that they're mutable?[/color]
                    > >
                    > >
                    > > Any object x is hashable if type(x) does not expose __eq__ nor __cmp__.
                    > > In that case, the meaning of x==y for that object is 'x is y', that is,
                    > > id(x)==id(y), so having hash(x) return id(x) is perfectly functional.
                    > > Mutation is not a problem if it doesn't affect equality comparisons.
                    > >
                    > >
                    > > Alex[/color]
                    >
                    > The idea that I get from reading this thread is that objects that can be
                    > type compared (comparing the contents) are not hashable, and they are[/color]

                    Never said that! I said the reverse: if objects are compared by id
                    they're also hashable (in the same way). "All cats are mammals" does
                    not imply "all mammals are cats". Objects can perfectly well be
                    hashable, AND compared by contents at the same time -- that's where
                    immutability (of those contents which affect comparison and thus
                    hashing) is a practical necessity.

                    "Practical" , not theoretical: "def __hash__(self): return 23" will in
                    fact ensure correct semantics. Unfortunately, it will do so at the
                    expense of intolerable performance hits any time a number of objects of
                    this type are used as dictionary keys... if you know anything about
                    hashing you can see why this can't fail to be so.
                    [color=blue]
                    > list, strings, tuples and dictionary. Is there any others that fall into
                    > this category? Is there any way to make them hashable?[/color]

                    strings are hashable, and so are tuples if all their items are hashable.
                    That is because they define proper __hash__ methods (or rather the C API
                    equivalent, of course) that cooperate properly with their __cmp__
                    methods, basically ensuring that x==y implies hash(x)==hash(y ). But
                    that's hard to ensure, together with the fact that hash(x) must always
                    give the same result for a given x, for general mutable containers such
                    as lists and dicts.

                    [color=blue]
                    > Hashable objects, on the other hand, are hashed based on say, the
                    > pointer address pointing the object or an identifier in the Python VM
                    > symbol table or something. It's like to say that when you hash a human,
                    > you get the name and not the physical dimensions of the person. The
                    > person can grow fat or slim down, but the name is still the same.[/color]

                    As long as your equality-comparison only relies on immutable
                    characteristics you're fine. Unfortunately in most countries it IS
                    legal to change name at least in some circumstances (e.g. it used to
                    happen almost automatically to a woman when she married, in many
                    countries, some time ago...) so this wouldn't work in this case;-).


                    Alex

                    Comment

                    • Alex Martelli

                      #11
                      Re: Modules are hashable?!

                      Jason Lai <jmlai@uci.ed u> wrote:
                      [color=blue]
                      > Maurice LING wrote:[color=green]
                      > >
                      > > The idea that I get from reading this thread is that objects that can be
                      > > type compared (comparing the contents) are not hashable, and they are
                      > > list, strings, tuples and dictionary. Is there any others that fall into
                      > > this category? Is there any way to make them hashable?[/color]
                      >
                      > Well, strings and tuples are immutable, so they provide a hash function
                      > (since it's safe to hash them by contents; the contents are pointers
                      > that don't change.) Anything that provides a hash function can be[/color]

                      Yes, a tuple's "pointers" don't change, but they may refer to objects
                      that do, and in this case the tuple need not be hashable. E.g.:
                      [color=blue][color=green][color=darkred]
                      >>> a=tuple([ {} ])
                      >>> hash(a)[/color][/color][/color]
                      Traceback (most recent call last):
                      File "<stdin>", line 1, in ?
                      TypeError: dict objects are unhashable

                      Since the tuple has (as its only item) a dict, the tuple itself isn't
                      hashable -- the error message indicates the type of the item that fails
                      to be hashable.
                      [color=blue]
                      > hashed. You could theoretically create a new list type that works
                      > exactly like a normal list, but hashes based on ID.[/color]

                      No, if a==b and hash(a)!=hash(b ) all hell would break loose. Don't go
                      there.


                      Alex

                      Comment

                      • Alex Martelli

                        #12
                        Re: Modules are hashable?!

                        Maurice LING <mauriceling@ac m.org> wrote:
                        [color=blue]
                        > is there actually a practical reason to hash modules? can I call a
                        > module using the hash key?[/color]

                        You cannot call a module: a module does not have a __call__ method.
                        This has nothing to do with hashing.

                        A practical reason to hash modules would be to associate to each module
                        some value or group of values -- without sticking those values in the
                        module itself -- or keep track of a set of modules having some special
                        characteristic whereby you want to "logically group them together" as
                        the set of keys into a certain dict (or in 2.4 the elements of a certain
                        set, since set is now a builtin type).

                        Consider for example a module that starts:

                        import foo, fee, fie, fo, fum, bar, baz, bat
                        yet_untested_mo dules = dict.fromkeys([ fee, bar, baz ])

                        later on you might have code like

                        if somemodule in yet_untested_mo dules:
                        somemodule.perf orm_all_tests()
                        del yet_untested_mo dules[somemodule]

                        for "just in time testing" of a certain set of modules. OK, weird-ish,
                        but it's what I could come up in 20 seconds;-). To have modules as dict
                        keys or set members they must be hashable, and since there's no reason
                        to make them unhashable, why not?


                        Alex

                        Comment

                        Working...