[perl-python] exercise: partition a list by equivalence

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • John Machin

    #16
    Re: exercise: partition a list by equivalence


    John Lenton wrote:[color=blue]
    > On Fri, Feb 18, 2005 at 04:52:46PM -0800, John Machin wrote:[color=green][color=darkred]
    > > > > needs "if rev[first] == rev[second]: continue" here
    > > >
    > > > an 'is' is enough, and better.[/color]
    > >
    > > Good point. You're redeeming yourself :-)[/color]
    >
    > this, together with you saying that it is hard to explain, makes me
    > think that you aren't comfortable thinking of lists as mutable
    > objects.[/color]

    How so? There is no connection between is/== and mutability. Let me
    amplify: The point about 'is' is a good one, and aids your redemption
    after your failure to have adequate guards caused your algorithm not to
    work.
    [color=blue]
    >
    >[color=green][color=darkred]
    > > > what is magic about it? is it really that horrible?[/color]
    > >
    > > Try explaining to the newbies over on the tutor list how despite[/color][/color]
    "res"[color=blue][color=green]
    > > only ever *explicitly* having little bits like [3, 4] appended to[/color][/color]
    it,[color=blue][color=green]
    > > it (or more properly the thing to which it refers) is actually
    > > festering and growing and being mutated under the surface until at[/color][/color]
    the[color=blue][color=green]
    > > finale it bursts out dripping slime just like the creature from the
    > > black lagoon ...[/color]
    >
    > understanding why that works, and why it is 'is' and not '==', are
    > both part of the same thing.[/color]

    What same thing is that?
    [color=blue]
    > Lists are mutable, and you can mutate
    > them, and they mutate. Unless you actually write code that uses the
    > fact you will forget it, and it will bite you. Of course, don't use[/color]
    it[color=blue]
    > just for the heck of it, but that creature you dismiss as a
    > slime-dripping mutation is actually quite useful.[/color]

    You are confusing mutability of lists (without which they would be
    called tuples!) with my point that the 'res' list was having its
    contents fiddled with implicitly through other "pointers".
    [color=blue]
    >
    > While I'm at being unpolite, do you really think this code was harder
    > to understand than the code posted by anton, using numarray?[/color]

    I only read it as far as the bit where it creating a matrix of size
    O(N**2) -- in my app N can be over a million so I lost interest
    rapidly.
    [color=blue]
    >
    > And, of course, if this code were for anything non-throw-awayable,
    > there would've been quite a bit of explaining going on between those
    > lines of code.
    >[/color]
    Not of course, but of necessity.
    [color=blue]
    > Ok, now back to being polite :)[/color]

    Welcome back.

    Comment

    • John Lenton

      #17
      Re: exercise: partition a list by equivalence

      On Fri, Feb 18, 2005 at 09:57:59PM -0800, John Machin wrote:[color=blue][color=green]
      > >
      > > this, together with you saying that it is hard to explain, makes me
      > > think that you aren't comfortable thinking of lists as mutable
      > > objects.[/color]
      >
      > How so? There is no connection between is/== and mutability. Let me
      > amplify: The point about 'is' is a good one, and aids your redemption
      > after your failure to have adequate guards caused your algorithm not to
      > work.[/color]

      hey, it worked for all the test cases provided by the customer! :) and
      then some; I hadn't thought of checking for cycles nor repetetitions.
      [color=blue]
      >[snip]
      >
      > You are confusing mutability of lists (without which they would be
      > called tuples!) with my point that the 'res' list was having its
      > contents fiddled with implicitly through other "pointers".[/color]

      umm... nope, see, well, hair-splitting and all that, there is this
      list that holds the partitions; the partitions are lists of
      elements. There is a reverse mapping that, for each element, holds the
      partition that element is in. So basically you go down the list of
      pairings, modifying the partitions as you go. I'm certain if I had
      commented the function appropriately, we wouldn't be having this
      discussion :)

      Let me see if I can remedy that:

      def merge(pairings) :
      """
      merge(pairings) takes a list of pairs, each pair indicates
      the sameness of the two indexes. Returns a partitioned list
      of same indexes.

      For example, if the input is
      merge( [ [1,2], [2,4], [5,6] ] );

      that means 1 and 2 are the same. 2 and 4 are the
      same. Therefore 1==2==4. The result returned is

      [[4,2,1],[6,5]];

      (ordering of the returned list and sublists are not
      specified.)
      """

      # the list of partitions, or partitioned list.
      # (each partition is a list of elements)
      res = []

      # reverse mapping (element -> partition it is in)
      rev = {}
      for pairing in pairings:
      first, second = pairing
      has_first = first in rev
      has_second = second in rev
      if has_first and has_second:
      # both elements already in a partition...
      if rev[first] is rev[second]:
      # both elements in same partition, nothing to do
      continue
      # joining the two partitions:
      # - grow one partition with the other one
      rev[first].extend(rev[second])
      # - clear the other one
      rev[second][:] = []
      # update reverse mapping
      rev[second] = rev[first]
      elif has_first:
      # partition already exists, just add an element to it
      rev[first].append(second)
      # update reverse mapping
      rev[second] = rev[first]
      elif has_second:
      # ditto
      rev[second].append(first)
      rev[first] = rev[second]
      else:
      # create partition from scratch
      if first == second:
      new = [first]
      else:
      new = [first, second]
      # add it to list of partitions
      res.append(new)
      # update reverse mapping
      rev[first] = rev[second] = new
      # remove empty partitions
      return filter(None, res)

      hrmph, I should hit the sack. Sorry if this is still ugly, I'm too
      tired to tell.

      --
      John Lenton (john@grulic.or g.ar) -- Random fortune:
      Todo bicho que camina va a parar cuando se canse.

      -----BEGIN PGP SIGNATURE-----
      Version: GnuPG v1.4.0 (GNU/Linux)

      iD8DBQFCFulQgPq u395ykGsRAn9kAJ 9bTmJYwo5L90AE3 mUiQ30MINy0hwCZ ARCA
      npF5v2EOF8fnMT8 vS5AGed4=
      =ELjA
      -----END PGP SIGNATURE-----

      Comment

      • Xah Lee

        #18
        [perl-python] exercise: partition a list by equivalence

        here's the answer to the partition by equivalence exercise.

        -------------------------------------------

        # Perl code

        sub merge($) {
        my @pairings = @{$_[0]};

        my @interm; # array of hashs

        # chop the first value of @pairings into @interm
        $interm[0]={$pairings[0][0]=>'x'}; ${interm[0]}{$pairings[0][1]}='x';
        shift @pairings;

        N1: for my $aPair (@pairings) {
        for my $aGroup (@interm) {
        if (exists ${$aGroup}{$aPa ir->[0]})
        {${$aGroup}{$aP air->[1]}='x'; next N1}
        if (exists ${$aGroup}{$aPa ir->[1]})
        {${$aGroup}{$aP air->[0]}='x'; next N1}
        }
        push @interm, {$aPair->[0]=>'x'}; ${interm[-1]}{$aPair->[1]}='x';
        }

        my @fin = shift @interm;

        N2: for my $group (@interm) {
        for my $newcoup (@fin) {
        foreach my $k (keys %$group) {
        if (exists ${$newcoup}{$k} ) {map { ${$newcoup}{$_} ='x'}
        (keys %$group); next N2;}
        }
        }
        push @fin, $group;
        }
        return map {[keys (%$_)]} @fin;
        }

        -----------------------------------------------
        # Here's a direct translation of the Perl code above into python:
        ©
        ©def merge(pairings) : # pairings is a list of couples. e.g.
        [(9,2),(7,6),...]
        ©
        © # interm is a list of groups. Each group is a list that hold
        © # equivalent numbers. interm stands for interim result. Each
        group
        © # is a dictionary. Keys are numbers, values are all dummy
        © # 'x'. Dictionary is used for ease of dealing with duplicates or
        © # checking existence.
        © interm=[];
        ©
        © # move first pair of pairings into interm as the first group
        © interm.append({ pairings[0][0]:'x', pairings[0][1]:'x'}) ; del
        pairings[0]
        ©
        © # go thru pairings. For each pair, check if it is in any group in
        © # interm. If any part of pair is in a group, then add the other
        © # part into that group. If no part of the pair is in any group,
        © # then add this pair into interm as a new group.
        © for aPair in pairings:
        © for aGroup in interm:
        © if (aGroup.has_key (aPair[0])): aGroup[aPair[1]]='x';
        break
        © if (aGroup.has_key (aPair[1])): aGroup[aPair[0]]='x';
        break
        © else: interm.append( {aPair[0]:'x'} );
        interm[-1][aPair[1]]='x'
        ©
        © # now make another pass of the groups in interm, because some
        pair
        © # that may connect two groups (i.e. with one element in one
        group,
        © # and second element in another group), yet the pair is simply
        © # consumed by a group.
        © # This pass will check if there are any element in any other
        © # group, if so, such two groups will be unioned. In this pass, we
        © # move things from interm into fin. fin==final.
        © fin=[]; fin.append(inte rm.pop(0))
        © for group in interm:
        © for newcoup in fin:
        © for k in group.keys():
        © if newcoup.has_key (k):
        © for kk in group.keys(): newcoup[kk]='x';
        © break
        © break
        © fin.append(grou p)
        ©
        © # now turn the dictionaries into lists for return value
        © result=[];
        © for group in fin: result.append(g roup.keys())
        © return result
        ©
        -------------------
        I wrote this (Perl) program in 2003-09, and now basically forgot all
        about the internals. The original Perl code does not have inline
        comments, nor public consumable documentation as this is my own
        project. In the process of translation and the publication and
        explanation on this page, i eventually have to re-acquaint the
        algorithm i used as i go thru the lines. I was thinking of a quick
        brainless translation word-for-word, but that turned out not possible
        as i run into problems.

        (While i'm learning Python, i run into frustrations with the Python
        Documentation. (although it has far more quality than Perl
        documentations) . The frustrations with documentations will be appended
        to this page later: How to write a tutorial )

        The translation problem i run into is this. In Perl, there's a GOTO
        construct where in a loop one can say "break XYZ" to jump to a
        arbitrary outer loop labeled XYZ. Python has "break" but does not
        provide a GOTO jump as in Perl. In the process, i have to actually
        figure out (for the first time for me) how loops with GOTO jumps can be
        translated to alternative structure. This turned out not to be too
        hard. For a GOTO jump to a far outer group, one can use multiple breaks
        at the end of each loop, possibly in addiction adding a "else"
        clause to the different levels of the loops. (Python language can have
        a "else" clause for "for" loops. It is executed when the loop
        completes. (as opposed to when a break inside jumped out))

        Here is a loop with GOTO, translated into Python without:

        N1: for my $aPair (@pairings) {
        for my $aGroup (@interm) {
        if (exists ${$aGroup}{$aPa ir->[0]})
        {${$aGroup}{$aP air->[1]}='x'; next N1}
        if (exists ${$aGroup}{$aPa ir->[1]})
        {${$aGroup}{$aP air->[0]}='x'; next N1}
        }
        push @interm, {$aPair->[0]=>'x'}; ${interm[-1]}{$aPair->[1]}='x';
        }
        ©-----------------------------------
        © for aPair in pairings:
        © for aGroup in interm:
        © if (aGroup.has_key (aPair[0])): aGroup[aPair[1]]='x';
        break
        © if (aGroup.has_key (aPair[1])): aGroup[aPair[0]]='x';
        break
        © else: interm.append( {aPair[0]:'x'} );
        interm[-1][aPair[1]]='x'
        ©

        Below is another such trans-structure, distinct from the above.

        N2: for my $group (@interm) {
        for my $newcoup (@fin) {
        foreach my $k (keys %$group) {
        if (exists ${$newcoup}{$k} ) {map { ${$newcoup}{$_} ='x'}
        (keys %$group); next N2;}
        }
        }
        push @fin, $group;
        }
        ©-----------------------------------
        © for group in interm:
        © for newcoup in fin:
        © for k in group.keys():
        © if newcoup.has_key (k):
        © for kk in group.keys(): newcoup[kk]='x';
        © break
        © break
        © fin.append(grou p)
        ©

        The Python version above has not been tested much, but i suppose it is
        fine since it is basically a copy of the Perl code. The Perl version is
        fairly reliable, as it went thru the gauntlet of special cases testing
        and i've used it over the year in the larger program... however no any
        formal proof or exhaustive machine testing has been done. Later i might
        write some program to auto-test them... but that'd be another day. The
        Python version can also use some clean up, or even rewrite as a program
        in the Python language proper. Addenda or Errata will be added on this
        page.

        PS all together there are some 3 or so solutions posted on the
        newsgroups. (one by private email) I will have to filter them out and
        study them.

        Any interesting or important Addenda or Errata will be emailed out
        later.
        In addition to being archived here:


        Xah
        xah@xahlee.org


        Comment

        • Reinhold Birkenfeld

          #19
          Re: [perl-python] exercise: partition a list by equivalence

          Xah Lee wrote:[color=blue]
          > here's the answer to the partition by equivalence exercise.[/color]

          Your Python solution is, as expected, wrong (try with
          [[10, 8], [7, 3], [1, 7], [5, 4], [2, 2], [3, 8], [7, 10], [2, 3], [6,
          10], [3, 2]]
          for example).

          The solution by John Lenton is wrong, too.

          The solution by Brian Beck delivers the correct result for most input,
          but for some input lists like
          [[3, 3], [8, 7], [3, 2], [8, 5], [5, 6], [6, 3], [10, 8], [8, 10], [4,
          10], [10, 2]]
          it chokes and returns the empty list.

          My solution (which may not be the fastest or most effective, but till
          now is the shortest <wink> and it works):

          def merge(pairings) :
          sets = {}
          for x1, x2 in pairings:
          newset = (sets.get(x1, frozenset([x1]))
          | sets.get(x2, frozenset([x2])))
          for i in newset:
          sets[i] = newset

          return[list(aset) for aset in set(sets.iterva lues())]


          Reinhold

          Comment

          • bearophileHUGS@lycos.com

            #20
            Re: exercise: partition a list by equivalence

            David Eppstein:[color=blue]
            > However it might be easier to treat the input pairs as the edges of a
            > graph and apply a depth-first-search based graph connected components
            > algorithm. || This would be O(n), though.[/color]

            Is the DFS the best one here (instead of BFS)?

            With the graph implementation that I've just shown here it becomes:

            .. import graph
            .. def merge(l):
            .. g = graph.Graph()
            .. g.addArcs(l)
            .. return g.connectedComp onents()
            .. print merge( [ [1,2], [2,4], [5,6] ] )

            I presume the complexity is O(n+a); n (the nodes) is proportional to
            the number of pairs, and
            a (the arcs) depends on the "intricacy" of the input pairs. For a
            complete graph, this can
            become slow (but there is a high probably that all the pairs are in the
            same connected component).
            Bye,
            Bearophile

            Comment

            • bearophileHUGS@lycos.com

              #21
              Re: exercise: partition a list by equivalence

              Bearophile:[color=blue]
              > I presume the complexity is O(n+a); n (the nodes)
              > is proportional to the number of pairs, and a
              > (the arcs) depends on the "intricacy" of the input pairs.[/color]

              Opps... n (the number of nodes) is the number of different numbers
              contained in the pairs :-]

              Bearophile

              Comment

              • David Eppstein

                #22
                Re: exercise: partition a list by equivalence

                In article <1108837453.883 740.309270@f14g 2000cwb.googleg roups.com>,
                bearophileHUGS@ lycos.com wrote:
                [color=blue]
                > David Eppstein:[color=green]
                > > However it might be easier to treat the input pairs as the edges of a
                > > graph and apply a depth-first-search based graph connected components
                > > algorithm. || This would be O(n), though.[/color]
                >
                > Is the DFS the best one here (instead of BFS)?[/color]

                I'm not sure it makes much difference.
                [color=blue]
                > With the graph implementation that I've just shown here it becomes:
                >
                > . import graph
                > . def merge(l):
                > . g = graph.Graph()
                > . g.addArcs(l)
                > . return g.connectedComp onents()
                > . print merge( [ [1,2], [2,4], [5,6] ] )
                >
                > I presume the complexity is O(n+a); n (the nodes) is proportional to
                > the number of pairs, and
                > a (the arcs) depends on the "intricacy" of the input pairs. For a
                > complete graph, this can
                > become slow (but there is a high probably that all the pairs are in the
                > same connected component).[/color]

                It's still linear in the input size, which is the best you could hope
                for.

                --
                David Eppstein
                Computer Science Dept., Univ. of California, Irvine

                Comment

                • Xah Lee

                  #23
                  Re: exercise: partition a list by equivalence

                  The GOTO statement from Perl has been messed up.

                  This block:
                  © for group in interm:
                  © for newcoup in fin:
                  © for k in group.keys():
                  © if newcoup.has_key (k):
                  © for kk in group.keys(): newcoup[kk]='x';
                  © break
                  © break
                  © fin.append(grou p)
                  ©

                  should be:
                  © goto=False
                  © for group in interm:
                  © for newcoup in fin:
                  © for k in group.keys():
                  © if newcoup.has_key (k):
                  © newcoup.update( group);
                  © goto=True
                  © break
                  © if (goto):
                  © goto=False
                  © break
                  © else:
                  © fin.append(grou p)

                  comlete code is at:


                  Xah
                  xah@xahlee.org


                  Comment

                  • Brian Beck

                    #24
                    Re: [perl-python] exercise: partition a list by equivalence

                    Reinhold Birkenfeld wrote:[color=blue]
                    > def merge(pairings) :
                    > sets = {}
                    > for x1, x2 in pairings:
                    > newset = (sets.get(x1, frozenset([x1]))
                    > | sets.get(x2, frozenset([x2])))
                    > for i in newset:
                    > sets[i] = newset
                    >
                    > return[list(aset) for aset in set(sets.iterva lues())][/color]

                    Looks good. I used it as inspiration for this new one, which takes less
                    time for large datasets, and especially for those where a good number of
                    merges are expected (at least that looks to be the pattern; with some
                    large datasets this takes less than a quarter of the time as the one
                    above). I'm sure it can be improved still -- yours is still faster in
                    many cases.

                    def merge2(pairings ):
                    elements = {}
                    sets = []
                    for x1, x2 in pairings:
                    i = [elements.get(x1 , -1), elements.get(x2 , -1)]
                    i.sort()
                    if i[1] == -1:
                    i[1] = len(sets)
                    sets.append(set ([x1, x2]))
                    elif i[0] == -1:
                    sets[i[1]] |= set([x1, x2])
                    elif i[0] == i[1]:
                    continue
                    else:
                    sets[i[1]] |= sets[i[0]]
                    sets[i[0]].clear()

                    for x in sets[i[1]]:
                    elements[x] = i[1]

                    return[list(s) for s in sets if s]

                    # Comparison
                    import profile
                    import random

                    # Play with these values
                    xMax, nPairs = 1000, 5000

                    l = [[random.randint( 0, xMax), random.randint( 0, xMax)] for i in
                    range(nPairs)]

                    print 'merge2:'
                    profile.run('me rge2(l)') # Mine

                    print 'merge:'
                    profile.run('me rge(l)') # Reinhold's

                    --
                    Brian Beck
                    Adventurer of the First Order

                    Comment

                    • John Machin

                      #25
                      Re: exercise: partition a list by equivalence


                      Reinhold Birkenfeld wrote:[color=blue]
                      > Xah Lee wrote:[color=green]
                      > > here's the answer to the partition by equivalence exercise.[/color]
                      >
                      > Your Python solution is, as expected, wrong (try with
                      > [[10, 8], [7, 3], [1, 7], [5, 4], [2, 2], [3, 8], [7, 10], [2, 3],[/color]
                      [6,[color=blue]
                      > 10], [3, 2]]
                      > for example).
                      >
                      > The solution by John Lenton is wrong, too.
                      >
                      > The solution by Brian Beck delivers the correct result for most[/color]
                      input,[color=blue]
                      > but for some input lists like
                      > [[3, 3], [8, 7], [3, 2], [8, 5], [5, 6], [6, 3], [10, 8], [8, 10],[/color]
                      [4,[color=blue]
                      > 10], [10, 2]]
                      > it chokes and returns the empty list.
                      >
                      > My solution (which may not be the fastest or most effective, but till
                      > now is the shortest <wink> and it works):
                      >
                      > def merge(pairings) :
                      > sets = {}
                      > for x1, x2 in pairings:
                      > newset = (sets.get(x1, frozenset([x1]))
                      > | sets.get(x2, frozenset([x2])))
                      > for i in newset:
                      > sets[i] = newset
                      >
                      > return[list(aset) for aset in set(sets.iterva lues())]
                      >
                      >
                      > Reinhold[/color]

                      FWIW, here's a brief UAT report:

                      Appears to work: Reinhold, David, Xah (v2)
                      Has bug(s): John L (v*), Brian (v*)
                      Incomprehensibl e: Xah (v*)

                      'Come back after lunch' award goes to Xah v2, which at a glance appears
                      to be O(N**4) -- dictionary.upda te() inside 3-deep nest of 'for'
                      statements.

                      Here's a small input that busts all non-working versions:

                      input: [[1, 2], [3, 4], [2, 3], [4, 5]]
                      merge_RB -> [[1, 2, 3, 4, 5]]
                      merge_DE -> [[1, 2, 3, 4, 5]]
                      merge_JL -> [[1, 2, 3, 4], [5]]
                      merge_JL2 -> [[1, 2, 3, 4], [5]]
                      merge_BB -> []
                      merge_XL -> [[1, 2, 3, 4, 5], [3, 4, 5]]
                      merge_XL2 -> [[1, 2, 3, 4, 5]]

                      Comment

                      • Xah Lee

                        #26
                        Re: exercise: partition a list by equivalence

                        an interesting problem so developed now is to write a function that
                        generate test cases for the purpose of testing performance. (just for
                        fun)

                        the design of this function could be interesting. We want to be able to
                        give parameters in this function so as to spit out all possible screw
                        test cases. First of all, a range of n (some millions) numbers. Then, a
                        fraction that specifies the percentage of these number are equivalent.
                        1 being all equivalent. 0 being all "distinct" (having only one
                        equivalent member (since the input comes in pairs)). Then we want to
                        have a parameter that says something like the sizes of the equivalence
                        groups. Suppose 50% of number are equal among themselves (i.e. have
                        more than one equivalent member). 1 would be all in one group. 0 would
                        mean all partitions have size 3 or 2. (there are more to it here...
                        since this is a distribution) ... Then, we need to turn this range of
                        integers into pairs. That's something more to think about.

                        So with this function at hand, we'll be able to say for sure which code
                        performs better (and under what type of input)

                        the Order notion in computing mathematics is fairly useless for finer
                        details.

                        PS it is not trivial to design this pair generating function. I don't
                        know if the above is on the right track, but essentially we want to
                        categorize the type of inputs according to the mathematical operational
                        performance aspect of partition by equivalence, and distill them into
                        parameters.

                        another func to write is one that canonicalize the output. Such as
                        sorting. So that all results can be compared simply by = in Python.

                        failing to design a elaborate pair_gen, we can just have pairs of
                        random numbers. But exactly what nature is such input... is more to
                        think about.

                        (in my original application, each number represent a computer file,
                        there are up to tens of thousands of files, and much less than 0.1% is
                        same as another, and for files that are same, each equivalent group
                        number no more than 10 or so.)

                        Xah
                        xah@xahlee.org



                        John Machin wrote:[color=blue]
                        > FWIW, here's a brief UAT report:
                        >
                        > Appears to work: Reinhold, David, Xah...
                        > Has bug(s): ...[/color]

                        Comment

                        • bearophileHUGS@lycos.com

                          #27
                          Re: exercise: partition a list by equivalence

                          John Machin>FWIW, here's a brief UAT report:

                          Here is something else for you.
                          Note: for more correct comparisons, for the following tests I've
                          disabled the use of Psyco in Graph (usually enabled, if present).
                          This looks faster than merge2 for this (quite sparse) random pairs
                          test:

                          .. import graph # http://www.fantascienza.net/animalia/graph.zip
                          .. def merge4(l):
                          .. g = graph.Graph()
                          .. for n1,n2 in l:
                          .. g.addArc(n1,n2)
                          .. return g.connectedComp onents()

                          Creating a better addArcs produces good results (addArcs method is
                          present in Graph, but addArcs2 isn't present):

                          .. def merge5(l):
                          .. g = graph.Graph()
                          .. g.addArcs2(l)
                          .. return g.connectedComp onents()


                          If you like to see the actual code, without using Graph, I've made a
                          very stripped down version of addArcs2 and connectedCompon ents:

                          .. from collections import deque
                          .. from itertools import chain
                          ..
                          .. def merge6(l):
                          .. # graph creation
                          .. gi = {}
                          .. go = {}
                          .. for n1,n2 in l:
                          .. if n1 not in go:
                          .. go[n1] = {}
                          .. gi[n1] = {}
                          .. if n2 not in go:
                          .. go[n2] = {}
                          .. gi[n2] = {}
                          .. go[n1][n2] = 0
                          .. gi[n2][n1] = 0
                          .. # Connected components:
                          .. result = []
                          .. visited = set()
                          .. for root in go.iterkeys():
                          .. if root not in visited:
                          .. component = [root]
                          .. visited.add(roo t)
                          .. Q = deque() # A queue
                          .. Q.append(root)
                          .. while Q:
                          .. n1 = Q.popleft()
                          .. for n2 in chain( go[n1].iterkeys(), gi[n1].iterkeys()
                          ):
                          .. if n2 not in visited:
                          .. visited.add(n2)
                          .. Q.append(n2)
                          .. component.appen d(n2)
                          .. result.append(c omponent)
                          .. return result


                          At the end of the tests I've added this to be sure the results are
                          always the same:

                          r2 = set( frozenset(e) for e in merge2(l) )
                          r4 = set( frozenset(e) for e in merge4(l) )
                          r5 = set( frozenset(e) for e in merge5(l) )
                          r6 = set( frozenset(e) for e in merge6(l) )
                          assert r2 == r4
                          assert r2 == r6

                          For xMax, nPairs = 1000, 5000:
                          merge2: 15750 function calls in 0.939 CPU seconds
                          merge4: 11029 function calls in 0.560 CPU seconds
                          merge5: 6030 function calls in 0.303 CPU seconds
                          merge6: 6011 function calls in 0.297 CPU seconds

                          For xMax, nPairs = 2000, 10000:
                          merge2: 31503 function calls in 2.505 CPU seconds
                          merge6: 12011 function calls in 0.639 CPU seconds

                          Timings using clock() (much more reliable than the profilers!), for
                          xMax, nPairs = 2000, 10000:
                          merge2: 1.222
                          merge6: 0.201

                          Probably merge6 can be improved, reducing memory used...

                          Bear hugs,
                          Bearophile

                          Comment

                          • John Machin

                            #28
                            Re: exercise: partition a list by equivalence - merge_JM.py (0/1)

                            On 19 Feb 2005 17:56:27 -0800, bearophileHUGS@ lycos.com wrote:
                            [color=blue]
                            >John Machin>FWIW, here's a brief UAT report:
                            >
                            >Here is something else for you.
                            >Note: for more correct comparisons, for the following tests I've
                            >disabled the use of Psyco in Graph (usually enabled, if present).
                            >This looks faster than merge2 for this (quite sparse) random pairs
                            >test:[/color]

                            [snip]
                            [color=blue]
                            >
                            >Timings using clock() (much more reliable than the profilers!), for
                            >xMax, nPairs = 2000, 10000:
                            >merge2: 1.222
                            >merge6: 0.201
                            >
                            >Probably merge6 can be improved, reducing memory used...[/color]

                            Perhaps I'm missing a posting of yours -- what are merge2 and merge4?
                            What is "this random pairs test"? What is xMax (nPairs isn't hard to
                            guess), and how are you generating your input data?

                            FWIW, my offering is attached.

                            Comment

                            • Xah Lee

                              #29
                              [perl-python] exercise: partition a list by equivalence

                              when i try to run the following program, Python complains about some
                              global name frozenset is not defined. Is set some new facility in
                              Python 2.4?

                              ©# from Reinhold Birkenfeld
                              ©def merge(pairings) :
                              © sets = {}
                              © for x1, x2 in pairings:
                              © newset = (sets.get(x1, frozenset([x1]))
                              © | sets.get(x2, frozenset([x2])))
                              © for i in newset:
                              © sets[i] = newset
                              ©
                              © return[list(aset) for aset in set(sets.iterva lues())]

                              it would be nice if the two working programs do not use some package.
                              This problem shouldn't need to.

                              Xah
                              xah@xahlee.org


                              Comment

                              • John Machin

                                #30
                                Re: exercise: partition a list by equivalence


                                Xah Lee wrote:[color=blue]
                                > when i try to run the following program, Python complains about some
                                > global name frozenset is not defined. Is set some new facility in
                                > Python 2.4?[/color]

                                The official home of the Python Programming Language



                                You must be running 2.3. If you persist in running 2.3, put the
                                following at the top of the file:

                                import sys
                                PY_VERSION = sys.version_inf o[:2]
                                if PY_VERSION == (2,3):
                                from sets import Set as set
                                from sets import ImmutableSet as frozenset

                                That will get Reinhold's gadget working under 2.3. The bearophile's
                                function uses collections.deq ue which is built-in to 2.4.
                                [color=blue]
                                > it would be nice if the two working programs do not use some package.
                                > This problem shouldn't need to.[/color]

                                Look at the newsgroup again. By my count there are apparently-working
                                versions from SIX people: Reinhold, bear.+, Brian Beck, David Eppstein,
                                yourself, and me. Only David's uses stuff that's not in the Python 2.4
                                off-the-shelf distribution.

                                Comment

                                Working...