[perl-python] exercise: partition a list by equivalence

**John Machin** · Jul 18 '05, 09:49 PM

Re: exercise: partition a list by equivalence

John Lenton wrote:[color=blue]
> On Fri, Feb 18, 2005 at 04:52:46PM -0800, John Machin wrote:[color=green][color=darkred]
> > > > needs "if rev[first] == rev[second]: continue" here
> > >
> > > an 'is' is enough, and better.[/color]
> >
> > Good point. You're redeeming yourself :-)[/color]
>
> this, together with you saying that it is hard to explain, makes me
> think that you aren't comfortable thinking of lists as mutable
> objects.[/color]

How so? There is no connection between is/== and mutability. Let me
amplify: The point about 'is' is a good one, and aids your redemption
after your failure to have adequate guards caused your algorithm not to
work.
[color=blue]
>
>[color=green][color=darkred]
> > > what is magic about it? is it really that horrible?[/color]
> >
> > Try explaining to the newbies over on the tutor list how despite[/color][/color]
"res"[color=blue][color=green]
> > only ever *explicitly* having little bits like [3, 4] appended to[/color][/color]
it,[color=blue][color=green]
> > it (or more properly the thing to which it refers) is actually
> > festering and growing and being mutated under the surface until at[/color][/color]
the[color=blue][color=green]
> > finale it bursts out dripping slime just like the creature from the
> > black lagoon ...[/color]
>
> understanding why that works, and why it is 'is' and not '==', are
> both part of the same thing.[/color]

What same thing is that?
[color=blue]
> Lists are mutable, and you can mutate
> them, and they mutate. Unless you actually write code that uses the
> fact you will forget it, and it will bite you. Of course, don't use[/color]
it[color=blue]
> just for the heck of it, but that creature you dismiss as a
> slime-dripping mutation is actually quite useful.[/color]

You are confusing mutability of lists (without which they would be
called tuples!) with my point that the 'res' list was having its
contents fiddled with implicitly through other "pointers".
[color=blue]
>
> While I'm at being unpolite, do you really think this code was harder
> to understand than the code posted by anton, using numarray?[/color]

I only read it as far as the bit where it creating a matrix of size
O(N**2) -- in my app N can be over a million so I lost interest
rapidly.
[color=blue]
>
> And, of course, if this code were for anything non-throw-awayable,
> there would've been quite a bit of explaining going on between those
> lines of code.
>[/color]
Not of course, but of necessity.
[color=blue]
> Ok, now back to being polite :)[/color]

Welcome back.

**John Lenton** · Jul 18 '05, 09:49 PM

Re: exercise: partition a list by equivalence

On Fri, Feb 18, 2005 at 09:57:59PM -0800, John Machin wrote:[color=blue][color=green]
> >
> > this, together with you saying that it is hard to explain, makes me
> > think that you aren't comfortable thinking of lists as mutable
> > objects.[/color]
>
> How so? There is no connection between is/== and mutability. Let me
> amplify: The point about 'is' is a good one, and aids your redemption
> after your failure to have adequate guards caused your algorithm not to
> work.[/color]

hey, it worked for all the test cases provided by the customer! :) and
then some; I hadn't thought of checking for cycles nor repetetitions.
[color=blue]
>[snip]
>
> You are confusing mutability of lists (without which they would be
> called tuples!) with my point that the 'res' list was having its
> contents fiddled with implicitly through other "pointers".[/color]

umm... nope, see, well, hair-splitting and all that, there is this
list that holds the partitions; the partitions are lists of
elements. There is a reverse mapping that, for each element, holds the
partition that element is in. So basically you go down the list of
pairings, modifying the partitions as you go. I'm certain if I had
commented the function appropriately, we wouldn't be having this
discussion :)

Let me see if I can remedy that:

def merge(pairings) :
"""
merge(pairings) takes a list of pairs, each pair indicates
the sameness of the two indexes. Returns a partitioned list
of same indexes.

For example, if the input is
merge( [ [1,2], [2,4], [5,6] ] );

that means 1 and 2 are the same. 2 and 4 are the
same. Therefore 1==2==4. The result returned is

[[4,2,1],[6,5]];

(ordering of the returned list and sublists are not
specified.)
"""

# the list of partitions, or partitioned list.
# (each partition is a list of elements)
res = []

# reverse mapping (element -> partition it is in)
rev = {}
for pairing in pairings:
first, second = pairing
has_first = first in rev
has_second = second in rev
if has_first and has_second:
# both elements already in a partition...
if rev[first] is rev[second]:
# both elements in same partition, nothing to do
continue
# joining the two partitions:
# - grow one partition with the other one
rev[first].extend(rev[second])
# - clear the other one
rev[second][:] = []
# update reverse mapping
rev[second] = rev[first]
elif has_first:
# partition already exists, just add an element to it
rev[first].append(second)
# update reverse mapping
rev[second] = rev[first]
elif has_second:
# ditto
rev[second].append(first)
rev[first] = rev[second]
else:
# create partition from scratch
if first == second:
new = [first]
else:
new = [first, second]
# add it to list of partitions
res.append(new)
# update reverse mapping
rev[first] = rev[second] = new
# remove empty partitions
return filter(None, res)

hrmph, I should hit the sack. Sorry if this is still ugly, I'm too
tired to tell.

--
John Lenton (john@grulic.or g.ar) -- Random fortune:
Todo bicho que camina va a parar cuando se canse.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (GNU/Linux)

iD8DBQFCFulQgPq u395ykGsRAn9kAJ 9bTmJYwo5L90AE3 mUiQ30MINy0hwCZ ARCA
npF5v2EOF8fnMT8 vS5AGed4=
=ELjA
-----END PGP SIGNATURE-----

**Xah Lee** · Jul 18 '05, 09:49 PM

[perl-python] exercise: partition a list by equivalence

here's the answer to the partition by equivalence exercise.

-------------------------------------------

# Perl code

sub merge($) {
my @pairings = @{$_[0]};

my @interm; # array of hashs

# chop the first value of @pairings into @interm
$interm[0]={$pairings[0][0]=>'x'}; ${interm[0]}{$pairings[0][1]}='x';
shift @pairings;

N1: for my $aPair (@pairings) {
for my $aGroup (@interm) {
if (exists ${$aGroup}{$aPa ir->[0]})
{${$aGroup}{$aP air->[1]}='x'; next N1}
if (exists ${$aGroup}{$aPa ir->[1]})
{${$aGroup}{$aP air->[0]}='x'; next N1}
}
push @interm, {$aPair->[0]=>'x'}; ${interm[-1]}{$aPair->[1]}='x';
}

my @fin = shift @interm;

N2: for my $group (@interm) {
for my $newcoup (@fin) {
foreach my $k (keys %$group) {
if (exists ${$newcoup}{$k} ) {map { ${$newcoup}{$_} ='x'}
(keys %$group); next N2;}
}
}
push @fin, $group;
}
return map {[keys (%$_)]} @fin;
}

-----------------------------------------------
# Here's a direct translation of the Perl code above into python:
©
©def merge(pairings) : # pairings is a list of couples. e.g.
[(9,2),(7,6),...]
©
© # interm is a list of groups. Each group is a list that hold
© # equivalent numbers. interm stands for interim result. Each
group
© # is a dictionary. Keys are numbers, values are all dummy
© # 'x'. Dictionary is used for ease of dealing with duplicates or
© # checking existence.
© interm=[];
©
© # move first pair of pairings into interm as the first group
© interm.append({ pairings[0][0]:'x', pairings[0][1]:'x'}) ; del
pairings[0]
©
© # go thru pairings. For each pair, check if it is in any group in
© # interm. If any part of pair is in a group, then add the other
© # part into that group. If no part of the pair is in any group,
© # then add this pair into interm as a new group.
© for aPair in pairings:
© for aGroup in interm:
© if (aGroup.has_key (aPair[0])): aGroup[aPair[1]]='x';
break
© if (aGroup.has_key (aPair[1])): aGroup[aPair[0]]='x';
break
© else: interm.append( {aPair[0]:'x'} );
interm[-1][aPair[1]]='x'
©
© # now make another pass of the groups in interm, because some
pair
© # that may connect two groups (i.e. with one element in one
group,
© # and second element in another group), yet the pair is simply
© # consumed by a group.
© # This pass will check if there are any element in any other
© # group, if so, such two groups will be unioned. In this pass, we
© # move things from interm into fin. fin==final.
© fin=[]; fin.append(inte rm.pop(0))
© for group in interm:
© for newcoup in fin:
© for k in group.keys():
© if newcoup.has_key (k):
© for kk in group.keys(): newcoup[kk]='x';
© break
© break
© fin.append(grou p)
©
© # now turn the dictionaries into lists for return value
© result=[];
© for group in fin: result.append(g roup.keys())
© return result
©
-------------------
I wrote this (Perl) program in 2003-09, and now basically forgot all
about the internals. The original Perl code does not have inline
comments, nor public consumable documentation as this is my own
project. In the process of translation and the publication and
explanation on this page, i eventually have to re-acquaint the
algorithm i used as i go thru the lines. I was thinking of a quick
brainless translation word-for-word, but that turned out not possible
as i run into problems.

(While i'm learning Python, i run into frustrations with the Python
Documentation. (although it has far more quality than Perl
documentations) . The frustrations with documentations will be appended
to this page later: How to write a tutorial )

The translation problem i run into is this. In Perl, there's a GOTO
construct where in a loop one can say "break XYZ" to jump to a
arbitrary outer loop labeled XYZ. Python has "break" but does not
provide a GOTO jump as in Perl. In the process, i have to actually
figure out (for the first time for me) how loops with GOTO jumps can be
translated to alternative structure. This turned out not to be too
hard. For a GOTO jump to a far outer group, one can use multiple breaks
at the end of each loop, possibly in addiction adding a "else"
clause to the different levels of the loops. (Python language can have
a "else" clause for "for" loops. It is executed when the loop
completes. (as opposed to when a break inside jumped out))

Here is a loop with GOTO, translated into Python without:

N1: for my $aPair (@pairings) {
for my $aGroup (@interm) {
if (exists ${$aGroup}{$aPa ir->[0]})
{${$aGroup}{$aP air->[1]}='x'; next N1}
if (exists ${$aGroup}{$aPa ir->[1]})
{${$aGroup}{$aP air->[0]}='x'; next N1}
}
push @interm, {$aPair->[0]=>'x'}; ${interm[-1]}{$aPair->[1]}='x';
}
©-----------------------------------
© for aPair in pairings:
© for aGroup in interm:
© if (aGroup.has_key (aPair[0])): aGroup[aPair[1]]='x';
break
© if (aGroup.has_key (aPair[1])): aGroup[aPair[0]]='x';
break
© else: interm.append( {aPair[0]:'x'} );
interm[-1][aPair[1]]='x'
©

Below is another such trans-structure, distinct from the above.

N2: for my $group (@interm) {
for my $newcoup (@fin) {
foreach my $k (keys %$group) {
if (exists ${$newcoup}{$k} ) {map { ${$newcoup}{$_} ='x'}
(keys %$group); next N2;}
}
}
push @fin, $group;
}
©-----------------------------------
© for group in interm:
© for newcoup in fin:
© for k in group.keys():
© if newcoup.has_key (k):
© for kk in group.keys(): newcoup[kk]='x';
© break
© break
© fin.append(grou p)
©

The Python version above has not been tested much, but i suppose it is
fine since it is basically a copy of the Perl code. The Perl version is
fairly reliable, as it went thru the gauntlet of special cases testing
and i've used it over the year in the larger program... however no any
formal proof or exhaustive machine testing has been done. Later i might
write some program to auto-test them... but that'd be another day. The
Python version can also use some clean up, or even rewrite as a program
in the Python language proper. Addenda or Errata will be added on this
page.

PS all together there are some 3 or so solutions posted on the
newsgroups. (one by private email) I will have to filter them out and
study them.

Any interesting or important Addenda or Errata will be emailed out
later.
In addition to being archived here:

File Not Found

http://xahlee.org/perl-python/partition_by_equiv.html

Xah
xah@xahlee.org

http://xahlee.org/PageTwo_dir/more.html

**Reinhold Birkenfeld** · Jul 18 '05, 09:49 PM

Re: [perl-python] exercise: partition a list by equivalence

Xah Lee wrote:[color=blue]
> here's the answer to the partition by equivalence exercise.[/color]

Your Python solution is, as expected, wrong (try with
[[10, 8], [7, 3], [1, 7], [5, 4], [2, 2], [3, 8], [7, 10], [2, 3], [6,
10], [3, 2]]
for example).

The solution by John Lenton is wrong, too.

The solution by Brian Beck delivers the correct result for most input,
but for some input lists like
[[3, 3], [8, 7], [3, 2], [8, 5], [5, 6], [6, 3], [10, 8], [8, 10], [4,
10], [10, 2]]
it chokes and returns the empty list.

My solution (which may not be the fastest or most effective, but till
now is the shortest <wink> and it works):

def merge(pairings) :
sets = {}
for x1, x2 in pairings:
newset = (sets.get(x1, frozenset([x1]))
| sets.get(x2, frozenset([x2])))
for i in newset:
sets[i] = newset

return[list(aset) for aset in set(sets.iterva lues())]

Reinhold

**bearophileHUGS@lycos.com** · Jul 18 '05, 09:49 PM

Re: exercise: partition a list by equivalence

David Eppstein:[color=blue]
> However it might be easier to treat the input pairs as the edges of a
> graph and apply a depth-first-search based graph connected components
> algorithm. || This would be O(n), though.[/color]

Is the DFS the best one here (instead of BFS)?

With the graph implementation that I've just shown here it becomes:

.. import graph
.. def merge(l):
.. g = graph.Graph()
.. g.addArcs(l)
.. return g.connectedComp onents()
.. print merge( [ [1,2], [2,4], [5,6] ] )

I presume the complexity is O(n+a); n (the nodes) is proportional to
the number of pairs, and
a (the arcs) depends on the "intricacy" of the input pairs. For a
complete graph, this can
become slow (but there is a high probably that all the pairs are in the
same connected component).
Bye,
Bearophile

**bearophileHUGS@lycos.com** · Jul 18 '05, 09:49 PM

Re: exercise: partition a list by equivalence

Bearophile:[color=blue]
> I presume the complexity is O(n+a); n (the nodes)
> is proportional to the number of pairs, and a
> (the arcs) depends on the "intricacy" of the input pairs.[/color]

Opps... n (the number of nodes) is the number of different numbers
contained in the pairs :-]

Bearophile

**David Eppstein** · Jul 18 '05, 09:49 PM

Re: exercise: partition a list by equivalence

In article <1108837453.883 740.309270@f14g 2000cwb.googleg roups.com>,
bearophileHUGS@ lycos.com wrote:
[color=blue]
> David Eppstein:[color=green]
> > However it might be easier to treat the input pairs as the edges of a
> > graph and apply a depth-first-search based graph connected components
> > algorithm. || This would be O(n), though.[/color]
>
> Is the DFS the best one here (instead of BFS)?[/color]

I'm not sure it makes much difference.
[color=blue]
> With the graph implementation that I've just shown here it becomes:
>
> . import graph
> . def merge(l):
> . g = graph.Graph()
> . g.addArcs(l)
> . return g.connectedComp onents()
> . print merge( [ [1,2], [2,4], [5,6] ] )
>
> I presume the complexity is O(n+a); n (the nodes) is proportional to
> the number of pairs, and
> a (the arcs) depends on the "intricacy" of the input pairs. For a
> complete graph, this can
> become slow (but there is a high probably that all the pairs are in the
> same connected component).[/color]

It's still linear in the input size, which is the best you could hope
for.

--
David Eppstein
Computer Science Dept., Univ. of California, Irvine

David Eppstein

http://www.ics.uci.edu/~eppstein/

**Xah Lee** · Jul 18 '05, 09:49 PM

Re: exercise: partition a list by equivalence

The GOTO statement from Perl has been messed up.

This block:
© for group in interm:
© for newcoup in fin:
© for k in group.keys():
© if newcoup.has_key (k):
© for kk in group.keys(): newcoup[kk]='x';
© break
© break
© fin.append(grou p)
©

should be:
© goto=False
© for group in interm:
© for newcoup in fin:
© for k in group.keys():
© if newcoup.has_key (k):
© newcoup.update( group);
© goto=True
© break
© if (goto):
© goto=False
© break
© else:
© fin.append(grou p)

comlete code is at:

File Not Found

http://xahlee.org/perl-python/partition_by_equiv.html

Xah
xah@xahlee.org

http://xahlee.org/PageTwo_dir/more.html

**Brian Beck** · Jul 18 '05, 09:49 PM

Re: [perl-python] exercise: partition a list by equivalence

Reinhold Birkenfeld wrote:[color=blue]
> def merge(pairings) :
> sets = {}
> for x1, x2 in pairings:
> newset = (sets.get(x1, frozenset([x1]))
> | sets.get(x2, frozenset([x2])))
> for i in newset:
> sets[i] = newset
>
> return[list(aset) for aset in set(sets.iterva lues())][/color]

Looks good. I used it as inspiration for this new one, which takes less
time for large datasets, and especially for those where a good number of
merges are expected (at least that looks to be the pattern; with some
large datasets this takes less than a quarter of the time as the one
above). I'm sure it can be improved still -- yours is still faster in
many cases.

def merge2(pairings ):
elements = {}
sets = []
for x1, x2 in pairings:
i = [elements.get(x1 , -1), elements.get(x2 , -1)]
i.sort()
if i[1] == -1:
i[1] = len(sets)
sets.append(set ([x1, x2]))
elif i[0] == -1:
sets[i[1]] |= set([x1, x2])
elif i[0] == i[1]:
continue
else:
sets[i[1]] |= sets[i[0]]
sets[i[0]].clear()

for x in sets[i[1]]:
elements[x] = i[1]

return[list(s) for s in sets if s]

# Comparison
import profile
import random

# Play with these values
xMax, nPairs = 1000, 5000

l = [[random.randint( 0, xMax), random.randint( 0, xMax)] for i in
range(nPairs)]

print 'merge2:'
profile.run('me rge2(l)') # Mine

print 'merge:'
profile.run('me rge(l)') # Reinhold's

--
Brian Beck
Adventurer of the First Order

**John Machin** · Jul 18 '05, 09:49 PM

Re: exercise: partition a list by equivalence

Reinhold Birkenfeld wrote:[color=blue]
> Xah Lee wrote:[color=green]
> > here's the answer to the partition by equivalence exercise.[/color]
>
> Your Python solution is, as expected, wrong (try with
> [[10, 8], [7, 3], [1, 7], [5, 4], [2, 2], [3, 8], [7, 10], [2, 3],[/color]
[6,[color=blue]
> 10], [3, 2]]
> for example).
>
> The solution by John Lenton is wrong, too.
>
> The solution by Brian Beck delivers the correct result for most[/color]
input,[color=blue]
> but for some input lists like
> [[3, 3], [8, 7], [3, 2], [8, 5], [5, 6], [6, 3], [10, 8], [8, 10],[/color]
[4,[color=blue]
> 10], [10, 2]]
> it chokes and returns the empty list.
>
> My solution (which may not be the fastest or most effective, but till
> now is the shortest <wink> and it works):
>
> def merge(pairings) :
> sets = {}
> for x1, x2 in pairings:
> newset = (sets.get(x1, frozenset([x1]))
> | sets.get(x2, frozenset([x2])))
> for i in newset:
> sets[i] = newset
>
> return[list(aset) for aset in set(sets.iterva lues())]
>
>
> Reinhold[/color]

FWIW, here's a brief UAT report:

Appears to work: Reinhold, David, Xah (v2)
Has bug(s): John L (v*), Brian (v*)
Incomprehensibl e: Xah (v*)

'Come back after lunch' award goes to Xah v2, which at a glance appears
to be O(N**4) -- dictionary.upda te() inside 3-deep nest of 'for'
statements.

Here's a small input that busts all non-working versions:

input: [[1, 2], [3, 4], [2, 3], [4, 5]]
merge_RB -> [[1, 2, 3, 4, 5]]
merge_DE -> [[1, 2, 3, 4, 5]]
merge_JL -> [[1, 2, 3, 4], [5]]
merge_JL2 -> [[1, 2, 3, 4], [5]]
merge_BB -> []
merge_XL -> [[1, 2, 3, 4, 5], [3, 4, 5]]
merge_XL2 -> [[1, 2, 3, 4, 5]]

**Xah Lee** · Jul 18 '05, 09:50 PM

Re: exercise: partition a list by equivalence

an interesting problem so developed now is to write a function that
generate test cases for the purpose of testing performance. (just for
fun)

the design of this function could be interesting. We want to be able to
give parameters in this function so as to spit out all possible screw
test cases. First of all, a range of n (some millions) numbers. Then, a
fraction that specifies the percentage of these number are equivalent.
1 being all equivalent. 0 being all "distinct" (having only one
equivalent member (since the input comes in pairs)). Then we want to
have a parameter that says something like the sizes of the equivalence
groups. Suppose 50% of number are equal among themselves (i.e. have
more than one equivalent member). 1 would be all in one group. 0 would
mean all partitions have size 3 or 2. (there are more to it here...
since this is a distribution) ... Then, we need to turn this range of
integers into pairs. That's something more to think about.

So with this function at hand, we'll be able to say for sure which code
performs better (and under what type of input)

the Order notion in computing mathematics is fairly useless for finer
details.

PS it is not trivial to design this pair generating function. I don't
know if the above is on the right track, but essentially we want to
categorize the type of inputs according to the mathematical operational
performance aspect of partition by equivalence, and distill them into
parameters.

another func to write is one that canonicalize the output. Such as
sorting. So that all results can be compared simply by = in Python.

failing to design a elaborate pair_gen, we can just have pairs of
random numbers. But exactly what nature is such input... is more to
think about.

(in my original application, each number represent a computer file,
there are up to tens of thousands of files, and much less than 0.1% is
same as another, and for files that are same, each equivalent group
number no more than 10 or so.)

Xah
xah@xahlee.org

http://xahlee.org/PageTwo_dir/more.html

John Machin wrote:[color=blue]
> FWIW, here's a brief UAT report:
>
> Appears to work: Reinhold, David, Xah...
> Has bug(s): ...[/color]

**bearophileHUGS@lycos.com** · Jul 18 '05, 09:50 PM

Re: exercise: partition a list by equivalence

John Machin>FWIW, here's a brief UAT report:

Here is something else for you.
Note: for more correct comparisons, for the following tests I've
disabled the use of Psyco in Graph (usually enabled, if present).
This looks faster than merge2 for this (quite sparse) random pairs
test:

.. import graph # http://www.fantascienza.net/animalia/graph.zip
.. def merge4(l):
.. g = graph.Graph()
.. for n1,n2 in l:
.. g.addArc(n1,n2)
.. return g.connectedComp onents()

Creating a better addArcs produces good results (addArcs method is
present in Graph, but addArcs2 isn't present):

.. def merge5(l):
.. g = graph.Graph()
.. g.addArcs2(l)
.. return g.connectedComp onents()

If you like to see the actual code, without using Graph, I've made a
very stripped down version of addArcs2 and connectedCompon ents:

.. from collections import deque
.. from itertools import chain
..
.. def merge6(l):
.. # graph creation
.. gi = {}
.. go = {}
.. for n1,n2 in l:
.. if n1 not in go:
.. go[n1] = {}
.. gi[n1] = {}
.. if n2 not in go:
.. go[n2] = {}
.. gi[n2] = {}
.. go[n1][n2] = 0
.. gi[n2][n1] = 0
.. # Connected components:
.. result = []
.. visited = set()
.. for root in go.iterkeys():
.. if root not in visited:
.. component = [root]
.. visited.add(roo t)
.. Q = deque() # A queue
.. Q.append(root)
.. while Q:
.. n1 = Q.popleft()
.. for n2 in chain( go[n1].iterkeys(), gi[n1].iterkeys()
):
.. if n2 not in visited:
.. visited.add(n2)
.. Q.append(n2)
.. component.appen d(n2)
.. result.append(c omponent)
.. return result

At the end of the tests I've added this to be sure the results are
always the same:

r2 = set( frozenset(e) for e in merge2(l) )
r4 = set( frozenset(e) for e in merge4(l) )
r5 = set( frozenset(e) for e in merge5(l) )
r6 = set( frozenset(e) for e in merge6(l) )
assert r2 == r4
assert r2 == r6

For xMax, nPairs = 1000, 5000:
merge2: 15750 function calls in 0.939 CPU seconds
merge4: 11029 function calls in 0.560 CPU seconds
merge5: 6030 function calls in 0.303 CPU seconds
merge6: 6011 function calls in 0.297 CPU seconds

For xMax, nPairs = 2000, 10000:
merge2: 31503 function calls in 2.505 CPU seconds
merge6: 12011 function calls in 0.639 CPU seconds

Timings using clock() (much more reliable than the profilers!), for
xMax, nPairs = 2000, 10000:
merge2: 1.222
merge6: 0.201

Probably merge6 can be improved, reducing memory used...

Bear hugs,
Bearophile

**John Machin** · Jul 18 '05, 09:50 PM

Re: exercise: partition a list by equivalence - merge_JM.py (0/1)

On 19 Feb 2005 17:56:27 -0800, bearophileHUGS@ lycos.com wrote:
[color=blue]
>John Machin>FWIW, here's a brief UAT report:
>
>Here is something else for you.
>Note: for more correct comparisons, for the following tests I've
>disabled the use of Psyco in Graph (usually enabled, if present).
>This looks faster than merge2 for this (quite sparse) random pairs
>test:[/color]

[snip]
[color=blue]
>
>Timings using clock() (much more reliable than the profilers!), for
>xMax, nPairs = 2000, 10000:
>merge2: 1.222
>merge6: 0.201
>
>Probably merge6 can be improved, reducing memory used...[/color]

Perhaps I'm missing a posting of yours -- what are merge2 and merge4?
What is "this random pairs test"? What is xMax (nPairs isn't hard to
guess), and how are you generating your input data?

FWIW, my offering is attached.

**Xah Lee** · Jul 18 '05, 09:50 PM

[perl-python] exercise: partition a list by equivalence

when i try to run the following program, Python complains about some
global name frozenset is not defined. Is set some new facility in
Python 2.4?

©# from Reinhold Birkenfeld
©def merge(pairings) :
© sets = {}
© for x1, x2 in pairings:
© newset = (sets.get(x1, frozenset([x1]))
© | sets.get(x2, frozenset([x2])))
© for i in newset:
© sets[i] = newset
©
© return[list(aset) for aset in set(sets.iterva lues())]

it would be nice if the two working programs do not use some package.
This problem shouldn't need to.

Xah
xah@xahlee.org

http://xahlee.org/PageTwo_dir/more.html

**John Machin** · Jul 18 '05, 09:50 PM

Re: exercise: partition a list by equivalence

Xah Lee wrote:[color=blue]
> when i try to run the following program, Python complains about some
> global name frozenset is not defined. Is set some new facility in
> Python 2.4?[/color]

Welcome to Python.org

http://www.python.org/doc/2.3/whatsnew/

The official home of the Python Programming Language

Welcome to Python.org

http://www.python.org/doc/2.4/whatsnew/whatsnew24.html

The official home of the Python Programming Language

You must be running 2.3. If you persist in running 2.3, put the
following at the top of the file:

import sys
PY_VERSION = sys.version_inf o[:2]
if PY_VERSION == (2,3):
from sets import Set as set
from sets import ImmutableSet as frozenset

That will get Reinhold's gadget working under 2.3. The bearophile's
function uses collections.deq ue which is built-in to 2.4.
[color=blue]
> it would be nice if the two working programs do not use some package.
> This problem shouldn't need to.[/color]

Look at the newsgroup again. By my count there are apparently-working
versions from SIX people: Reinhold, bear.+, Brian Beck, David Eppstein,
yourself, and me. Only David's uses stuff that's not in the Python 2.4
off-the-shelf distribution.

[perl-python] exercise: partition a list by equivalence

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment