Py2.3: Feedback on Sets

  • Terry Reedy

    #16
    Re: Py2.3: Feedback on Sets


    "Raymond Hettinger" <vze4rx4y@verizon.net> wrote in message
    news:3b__a.9694$u%2.7778@nwrdny02.gnilink.net...
    > "Istvan Albert"
    > > Then just by looking at the docs, it feels a little bit confusing to
    > > have discard() and remove() do essentially the same thing but only one
    > > of them raising an exception. Which one? I already forgot. I don't know
    > > which one I would prefer though.

    I agree that this is confusing -- like having both str.find and
    str.index. I would prefer one delete function with an optional param
    'silent' to switch its 'not there' response from the default (either
    True or False, according to what seems to be the more common usage) to
    the other choice. (I know, I should have read the draft more carefully
    and commented last fall -- but this seems like the sort of redundancy
    that Guido wants to remove in 3.0.)
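
For readers who, like the quoted poster, forget which method raises: a minimal sketch of the difference, together with the str.find/str.index analogy. It uses the built-in set type of later Python versions rather than sets.Set from 2.3, and the delete() function with its 'silent' parameter is purely hypothetical; it only illustrates the proposal above and is not part of any real API.

```python
# Built-in set is used here; the thread itself is about sets.Set (Python 2.3).
s = {1, 2, 3}

s.discard(99)          # missing element: silently ignored
try:
    s.remove(99)       # missing element: raises KeyError
except KeyError:
    remove_raised = True
else:
    remove_raised = False

# The str analogy: find() returns -1, index() raises ValueError.
found_at = "abc".find("z")
try:
    "abc".index("z")
except ValueError:
    index_raised = True
else:
    index_raised = False


def delete(target, elem, silent=False):
    """Hypothetical single delete function with a 'silent' switch."""
    if silent:
        target.discard(elem)   # no error when elem is absent
    else:
        target.remove(elem)    # KeyError when elem is absent
```

Which default 'silent' should take is exactly the open question in the post; False is chosen here only so the sketch has one.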

    Terry J. Reedy



    • Gerrit Holl

      #17
      Re: Py2.3: Feedback on Sets

      Raymond Hettinger wrote:
      > Subject: Py2.3: Feedback on Sets

      > * Do you care that sets can only contain hashable elements?

      This is the only disadvantage for me.
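
The hashability restriction asked about above can be seen directly. This is a minimal sketch using the built-in set and frozenset types of later Python versions (an assumption; the thread concerns sets.Set), with made-up sample values.

```python
# Lists are mutable and unhashable, so they cannot be set elements.
try:
    bad = {[1, 2], [3, 4]}
except TypeError:
    list_rejected = True
else:
    list_rejected = False

# Tuples and frozensets are hashable stand-ins for lists and sets.
pairs = {(1, 2), (3, 4), (1, 2)}                  # the duplicate collapses
nested = {frozenset([1, 2]), frozenset([2, 1])}   # the same set, given twice
```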

      For the rest, I am happy about it. I am already using it a lot
      in places where I used lists before, but where a Set is much
      better (no order, no duplicates, it really *is* a set).

      > User feedback is essential to determining the future direction
      > of sets (whether it will be implemented in C, change API,
      > and/or be given supporting language syntax).

      I really like them. I would also like to be able to do
      {elem for elem in set if foo(elem)} to construct a subset.
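
Later Python versions (2.7 and 3.x) accept exactly this syntax as a set comprehension. A small sketch, where foo() is a hypothetical predicate and the variable is named source rather than set so that it does not shadow the built-in type:

```python
def foo(elem):
    # Hypothetical predicate standing in for the foo() in the post.
    return elem % 2 == 0

source = {1, 2, 3, 4, 5, 6}

# Build the subset of elements for which foo() is true.
subset = {elem for elem in source if foo(elem)}
```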

      Gerrit.

      --
      255. If he sublet the man's yoke of oxen or steal the seed-corn,
      planting nothing in the field, he shall be convicted, and for each one
      hundred gan he shall pay sixty gur of corn.
      -- 1780 BC, Hammurabi, Code of Law
      --
      Asperger's Syndrome - a personal approach:

      These are times to get involved in politics yourself:
      The website of the Socialist Party (SP) in the Netherlands: information, news, agenda and publications.



      • Raymond Hettinger

        #18
        Re: Py2.3: Feedback on Sets

        "Russell E. Owen"
        > I don't rely on sets heavily (I do have a few implemented as
        > dictionaries with value=None) and am not yet ready to make my users
        > upgrade to Python 2.3.
        >
        > I suspect the upgrade issue will significantly slow the incorporation of
        > sets and the other new modules, but that over time they're likely to
        > become quite popular. I am certainly looking forward to using sets and
        > csv.
        >
        > I think it'd speed the adoption of new modules if they were explicitly
        > written to be compatible with one previous generation of Python (and
        > documented as such) so users could manually include them with their code
        > until the current generation of Python had a bit more time to be adopted.

        Wish granted!

        The sets module now will run under Py2.2.
        It should be available for download from CVS after 24 hours.
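
For code that has to run on more than one interpreter generation, one common idiom is a guarded import. The sketch below assumes that either a built-in set exists or a sets module is importable; it also shows the dictionary workaround from the quoted message, using made-up names as data.

```python
# Prefer the built-in set; fall back to the pure-Python sets module on
# interpreters that lack it (assumed to ship alongside the application).
try:
    set
except NameError:
    from sets import Set as set

# The workaround from the quoted message: the keys are the members and
# every value is None.
members = dict.fromkeys(["alice", "bob", "alice"])
as_set = set(members)   # iterating a dict yields its keys
```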


        Raymond Hettinger



        • Raymond Hettinger

          #19
          Re: Py2.3: Feedback on Sets

          "Gary Feldman"
          > >* Are the docs clear? Can you suggest improvements?
          >
          > I haven't used them yet, but since I'm working my way through
          > the docs in general, I thought I'd check them out and comment.

          All of the issues you found have been fixed (except for the discussion of
          what an iterable parameter means -- that will be addressed elsewhere).


          Raymond Hettinger



          • Raymond Hettinger

            #20
            Re: Py2.3: Feedback on Sets

            "John Smith"
            > Suggestion: How about adding Set.isProperSubset() and
            > Set.isProperSuperset()?

            We have them in operator form: a<b, a>b
            Spelling them out did not seem to add much value.
            This is doubly true because some people read it
            as s.isProperSubsetOf(t) and others read it as
            s.hasTheProperSubset(t).
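
A short sketch of the operator forms mentioned above, using the built-in set type of later Python versions (an assumption; the comparison operators behave the same way on sets.Set as far as this example goes):

```python
a = {1, 2}
b = {1, 2, 3}

proper_subset = a < b          # a is contained in b and a != b
proper_superset = b > a        # the same relation, read from b's side
self_is_not_proper = a < a     # a set is never a proper subset of itself
non_strict = a <= a            # <= allows equality, like issubset()
```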


            Raymond Hettinger

            > Thanks for this wonderful module. I've been working on data mining and
            > machine learning area using Python. Set operations are very important
            > to me.

            Great. You'll love it even more when I implement it in C.



            Raymond Hettinger



            • Christos TZOTZIOY Georgiou

              #21
              Re: Py2.3: Feedback on Sets

              On Tue, 12 Aug 2003 06:02:17 GMT, rumours say that "Raymond Hettinger"
              <vze4rx4y@verizon.net> might have written:

              [replying only to those that I have something substantial to say]

              >* Is the support for sets of sets necessary for your work
              > and, if so, then is the implementation sufficiently
              > powerful?

              I have used sets in:
              - Unix sysadm tasks (comparing usernames between passwd and shadow,
              finding common files in sync requests et al)
              - a hangman game (when the computer guesses words, to continuously
              restrict the possibilities based on the human input)
              - an image recognition program (comparing haar coefficients)

              These come to mind at the moment, but I have used them even in the
              python command line; and mostly I care about intersections.
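
The sysadmin use case above might look roughly like this. The account names are made up, and a real script would parse /etc/passwd and /etc/shadow instead of using literals; the built-in set type of later Python versions is assumed.

```python
# Hypothetical account names standing in for parsed passwd/shadow entries.
passwd_users = {"root", "daemon", "alice", "bob"}
shadow_users = {"root", "daemon", "alice", "carol"}

in_both = passwd_users & shadow_users       # intersection
only_passwd = passwd_users - shadow_users   # present in passwd, not in shadow
only_shadow = shadow_users - passwd_users   # present in shadow, not in passwd
mismatched = passwd_users ^ shadow_users    # symmetric difference
```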

              >* Does the performance meet your expectations?

              In the game and image recognition programs I could use more power;-)

              >* Are sets helpful in your daily work or does the need arise
              > only rarely?

              I use them often, it's a very helpful construct.

              >User feedback is essential to determining the future direction
              >of sets (whether it will be implemented in C, change API,
              >and/or be given supporting language syntax).

              Reimplementation in C sounds appropriate, and supporting language syntax
              would be nice.


              A quick thought, in the spirit of C implementation: there are cases
              where I would like to get the intersection of dicts (based on the keys),
              without having to create sets from the dict keys and then getting the
              relevant values. That is, given dicts a and b, I'd like:

              >>> a & b # imaginary

              to mean

              >>> dict([(x, a[x]) for x in sets.Set(a) & sets.Set(b)]) # real

              You may notice that a&b wouldn't be equivalent to b&a.
              Perhaps the speed difference would not be much; I'll grow a function in
              dictobject.c, run some benchmarks and come back with results for you.
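
In Python 3 the key views of dicts support set operations, so the idea can be sketched without building sets explicitly. The helper name intersect_by_keys is hypothetical, and the sample dicts are made up; the point is that the keys agree while the values depend on which dict comes first.

```python
def intersect_by_keys(a, b):
    """Return the items of `a` whose keys also appear in `b`.

    Hypothetical helper; it mirrors the imaginary 'a & b' from the post.
    """
    return {k: a[k] for k in a.keys() & b.keys()}

a = {"x": 1, "y": 2, "z": 3}
b = {"y": 20, "z": 30, "w": 40}

left = intersect_by_keys(a, b)    # values are taken from a
right = intersect_by_keys(b, a)   # values are taken from b
```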

              Another thought: it is unfortunate that an intersection *has* to be
              through continuous lookups (talking about the ordering of dict keys re
              their hash values, I'll have to delve into dictobject.c it seems), even
              taking into account the great speed of key lookups... although building
              the result dict should account for more processing cycles than the
              comparisons; and in some cases doing a dict.copy() and then removing the
              uncommon elements would be faster. Hm, food for thought, and no more
              than two hours to sleep now.
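
The two strategies weighed above, written out in plain Python as a sketch. Both function names are hypothetical. Which one is faster depends on how much the two dicts overlap, so it is worth measuring with timeit on real data rather than assuming.

```python
def intersect_by_lookup(a, b):
    # Build a new dict, testing each key of `a` against `b`.
    return {k: v for k, v in a.items() if k in b}

def intersect_by_copy(a, b):
    # Copy `a`, then delete the keys that `b` lacks.
    result = a.copy()
    for k in a:               # iterate the original, mutate only the copy
        if k not in b:
            del result[k]
    return result

a = {"x": 1, "y": 2, "z": 3}
b = {"y": 20, "z": 30, "w": 40}

by_lookup = intersect_by_lookup(a, b)
by_copy = intersect_by_copy(a, b)
```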

              Another slogan: Python keeps your mind awake (and c.l.py keeps your body
              away from bed :)
              --
              TZOTZIOY, I speak England very best,
              Microsoft Security Alert: the Matrix began as open source.


              • Christos TZOTZIOY Georgiou

                #22
                Re: Py2.3: Feedback on Sets - diffudict.txt (0/1)

                On Wed, 20 Aug 2003 06:10:19 +0300, rumours say that Christos "TZOTZIOY"
                Georgiou <tzot@sil-tec.gr> might have written:

                >A quick thought, in the spirit of C implementation: there are cases
                >where I would like to get the intersection of dicts (based on the keys),
                >without having to create sets from the dict keys and then getting the
                >relevant values. That is, given dicts a and b, I'd like:
                >
                >>>> a & b # imaginary
                >
                >to mean
                >
                >>>> dict([(x, a[x]) for x in sets.Set(a) & sets.Set(b)]) # real
                >
                >You may notice that a&b wouldn't be equivalent to b&a.
                >Perhaps the speed difference would not be much; I'll grow a function in
                >dictobject.c, run some benchmarks and come back with results for you.

                I implemented dict.intersect(), and it is *quite a bit* faster than the
                equivalent Python code.

                **********************************************************************

                Python 2.4a0 (#3, Aug 20 2003, 16:31:22)
                [GCC 3.2 (Mandrake Linux 9.0 3.2-1mdk)] on linux2
                Type "help", "copyright", "credits" or "license" for more information.
                >>> help(dict.intersect)
                Help on method_descriptor:

                intersect(...)
                    D.intersect(E) -> a subset of D having common keys with E

                >>> import sets
                >>> odds = dict(zip("abcdefghijklmn", range(1, 55, 2)))
                >>> evens= dict(zip("asdfghj", range(2, 55, 2)))
                >>>
                >>> odds
                {'a': 1, 'c': 5, 'b': 3, 'e': 9, 'd': 7, 'g': 13, 'f': 11, 'i': 17, 'h':
                15, 'k': 21, 'j': 19, 'm': 25, 'l': 23, 'n': 27}
                >>> evens
                {'a': 2, 'd': 6, 'g': 10, 'f': 8, 'h': 12, 'j': 14, 's': 4}
                >>>
                >>>
                >>> dict([(k, odds[k]) for k in sets.Set(odds) & sets.Set(evens)])
                {'a': 1, 'd': 7, 'g': 13, 'f': 11, 'h': 15, 'j': 19}
                >>> odds.intersect(evens)
                {'a': 1, 'h': 15, 'j': 19, 'd': 7, 'g': 13, 'f': 11}
                >>> dict([(k, evens[k]) for k in sets.Set(odds) & sets.Set(evens)])
                {'a': 2, 'd': 6, 'g': 10, 'f': 8, 'h': 12, 'j': 14}
                >>> evens.intersect(odds)
                {'a': 2, 'h': 12, 'j': 14, 'd': 6, 'g': 10, 'f': 8}
                >>>
                >>>
                >>> my_setup= 'import sets; odds=dict(zip("abcdefghijklmn", range(1, 55, 2))); evens=dict(zip("asdfghj", range(2, 55, 2)))'
                >>> from timeit import Timer
                >>>
                >>> Timer(stmt="odds.intersect(evens)", setup=my_setup).repeat()
                [1.3545670509338379, 1.3367550373077393, 1.3366960287094116]
                >>> Timer(stmt="evens.intersect(odds)", setup=my_setup).repeat()
                [1.321492075920105, 1.2869999408721924, 1.320341944694519]
                >>> Timer(stmt="dict([(k, odds[k]) for k in sets.Set(odds) & sets.Set(evens)])", setup=my_setup).repeat()
                [63.413245916366577, 63.526772975921631, 63.503224968910217]
                >>> Timer(stmt="dict([(k, evens[k]) for k in sets.Set(odds) & sets.Set(evens)])", setup=my_setup).repeat()
                [63.498296976089478, 63.49311900138855, 63.425426959991455]

                **********************************************************************

                A substantial difference, close to 50x on an Athlon XP 1700. Also note the
                difference in the key order of the results.
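
A rough way to repeat this kind of measurement on a stock interpreter is sketched below. dict.intersect() comes from the patch attached to this post and is not assumed to exist here, so the sketch only compares two pure-Python spellings on the same odds/evens data. The labels and the small repetition count are arbitrary choices, and no timings are claimed.

```python
from timeit import repeat

# Same sample data as in the session above.
setup = (
    'odds = dict(zip("abcdefghijklmn", range(1, 55, 2)));'
    'evens = dict(zip("asdfghj", range(2, 55, 2)))'
)

statements = {
    "keys view": "{k: odds[k] for k in odds.keys() & evens.keys()}",
    "membership": "{k: v for k, v in odds.items() if k in evens}",
}

timings = {}
for label, stmt in statements.items():
    # number is kept small so the sketch finishes quickly.
    timings[label] = min(repeat(stmt, setup=setup, repeat=3, number=10000))

# The result itself, for checking against the values shown in the session.
odds = dict(zip("abcdefghijklmn", range(1, 55, 2)))
evens = dict(zip("asdfghj", range(2, 55, 2)))
common = {k: v for k, v in odds.items() if k in evens}
```

On Python 3.7 and later, dicts keep insertion order, so the "membership" form returns keys in the order they appear in odds; the arbitrary ordering visible in the 2003 session no longer applies there.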

                I believe that dicts should grow such a method, perhaps with another
                name.

                Attached is the diff -u for dictobject.c compared to the one in last
                night's python-latest.tgz
                --
                TZOTZIOY, I speak England very best,
                Microsoft Security Alert: the Matrix began as open source.
