Ordering python sets

**Steven D'Aprano** · Oct 25 '08, 09:25 AM

Re: Ordering python sets

On Sat, 25 Oct 2008 08:58:18 +0000, Lie Ryan wrote:

<anotherrandomm using>
Since python is dynamic language, I think it should be possible to do
something like this:
>
a = list([1, 2, 3, 4, 5], implementation = 'linkedlist')
b = dict({'a': 'A'}, implementation = 'binarytree')
c = dict({'a': 'A'}, implementation = 'binarytree')

Oh I hope not. I think you have mistaken "dynamic" for "chaotic".

When I see a dict, I want to know that any two dicts work the same way. I
don't want to have to search the entire project's source code to find out
if it is a dict implemented as a hash table with O(1) lookups, or a dict
implemented as a binary tree with O(log N) lookups, or a dict implemented
as a linear array with O(N) lookups.

If I wanted that sort of nightmare, I can already do it by shadowing the
builtin:

dict = binarytree
D = dict({'a': 'A'}) # make a binary tree

There is no possible good that come from this suggestion. The beauty of
Python is that the built-in data structures (list, dict, set) are
powerful enough for 99% of uses[1], and for the other 1%, you can easily
and explicitly use something else.

But *explicitly* is the point. There's never any time where you can do
this:

type(mydict) is dict

and not know exactly what performance characteristics mydict will have.
(Unless you shadow dict or type, or otherwise do something that breaks
the rules.) You never need to ask, "Okay, it's a dict. What sort of dict?"

If you want a binary tree, ask for a binary tree.

[1] Your mileage may vary.

--
Steven

**Lie Ryan** · Oct 25 '08, 09:55 PM

Re: Ordering python sets

On Sat, 25 Oct 2008 09:21:05 +0000, Steven D'Aprano wrote:

On Sat, 25 Oct 2008 08:58:18 +0000, Lie Ryan wrote:
>

><anotherrandom musing>
>Since python is dynamic language, I think it should be possible to do
>something like this:
>>
>a = list([1, 2, 3, 4, 5], implementation = 'linkedlist') b = dict({'a':
>'A'}, implementation = 'binarytree') c = dict({'a': 'A'},
>implementati on = 'binarytree')

>
Oh I hope not. I think you have mistaken "dynamic" for "chaotic".
>
When I see a dict, I want to know that any two dicts work the same way.

Oh no, the two dict implementation would work _exactly_ the same from the
outside, they are transparently interchangeable . Only the performance
characteristic differs because of the different implementation. Actually
I got this idea from a book about algorithm and data structure, that book
said that an abstract data type (e.g. dict, set, list) have several
competing implementations or data structures (e.g. binary tree dict,
hashed dict, array dict). A data's implementation and interface can be
separated so that we can switch the data structure's implementation
without changing the rest of the code. The book is Algorithm Design
Manual by Skiena.

hint: read the PS below

I don't want to have to search the entire project's source code to find
out if it is a dict implemented as a hash table with O(1) lookups, or a
dict implemented as a binary tree with O(log N) lookups, or a dict
implemented as a linear array with O(N) lookups.

No, you'd only need to look at the dict's creation point (or actually
much better at projects docs, but not everyone create good docs). The
alternative you mentioned below, by shadowing built-in, is both a hack
and is much more likely to be missed.

If I wanted that sort of nightmare, I can already do it by shadowing the
builtin:
>
dict = binarytree
D = dict({'a': 'A'}) # make a binary tree

I DON'T want THAT sort of nightmare you mentioned...
And it'd be impossible to have two dictionary that have two different
implementations .

There is no possible good that come from this suggestion. The beauty of
Python is that the built-in data structures (list, dict, set) are
powerful enough for 99% of uses[1], and for the other 1%, you can easily
and explicitly use something else.

Oh really? As far as I know, python's list is extremely bad if you're
inserting data at the beginning of the list (e.g. lst.insert(0) requires
the whole array be "copied" one address away). This is because python's
list uses array data structure, making addressing (e.g. a[2]) fast, but
insertion slow. If, on the other hand, it is implemented using binary
tree, it'd make insertion O(log n) but addressing is a bit tricky on it.

The keyword is "tradeoffs" .

But *explicitly* is the point. There's never any time where you can do
this:

Yes, true, explicitly IS the point. How more explicit can you be than:
dict({foo: bar}, implementation = 'binarytree')

type(mydict) is dict

If my memory serves right, binary tree dict and hashed table dict is both
a dict right? (Duck Typing)
Only their implementation differs. Implementation is... well,
"implementa tion detail".

and not know exactly what performance characteristics mydict will have.

Oh... why do I need to know what the performance characteristic of mydict
is? Unless I know what I'm doing.

(Unless you shadow dict or type, or otherwise do something that breaks
the rules.) You never need to ask, "Okay, it's a dict. What sort of
dict?"

Okay, it's a dict. What sort of dict? Who the hell cares? I don't need to
know, they all looks and behave the same (Duck Typing)... at least until
I profile them (since profiling is a deep black magic by itself, it
cannot be used to discredit switching implementations ).

Sometimes we need a data type to use a specific data structure that have
some specific performance characteristic, because we know we'll be doing
a specific operation a lot more than other operations.

If you actually need to know what the implementation is currently being
used, you could implement a dict.implementa tion property.

If you want a binary tree, ask for a binary tree.

Yeah, you ask for binary tree EXPLICITLY:
bintreedict = dict({a:b}, implementation = 'binarytree')

this:
regularhasheddi ct = dict({a:b})

would have a reasonable default.

PS: I do admit I have used the wrong terms in the last post. I used the
term "data structure", instead it should be "abstract data type", "data
structure" is a synonym for "implementation ". In this post, I hope I've
corrected all of the usage.

**Terry Reedy** · Oct 25 '08, 10:25 PM

Re: Ordering python sets

Lie Ryan wrote:

>
<anotherrandomm using>
Since python is dynamic language, I think it should be possible to do
something like this:
>
a = list([1, 2, 3, 4, 5], implementation = 'linkedlist')

For this to work, the abstract list would have to know about all
implementations of the abstraction.

b = dict({'a': 'A'}, implementation = 'binarytree')
c = dict({'a': 'A'}, implementation = 'binarytree')
>
i.e. basically since a data structure can have different implementations ,
and different implementations have different performance characteristics ,
it should be possible to dynamically change the implementation used.
>
In the far future, the data structure and its implementation could be
abstracted even further:
>
a = list() # ordered list
b = set() # unordered list
c = dict() # unordered dictionary
d = sorteddict() # ordered dictionary
>
Each of the implementations would share a common subset of methods and
possibly a few implementation dependent method that could only work on
certain implementations (or is extremely slow except in the correct
implementation) .
</anotherrandommu sing>

The future is 3.0, at least in part, with Abstract Base Classes.
There are 16 in the collectons module.
"In addition to containers, the collections module provides some ABCs
(abstract base classes) that can be used to test whether a class
provides a particular interface, for example, is it hashable or a
mapping, and some of them can also be used as mixin classes."

The ABCs for numbers are in the numbers module.

tjr

**Lie Ryan** · Oct 25 '08, 10:35 PM

Re: Ordering python sets

On Sat, 25 Oct 2008 18:20:46 -0400, Terry Reedy wrote:

Lie Ryan wrote:
>
>

><anotherrandom musing>
>Since python is dynamic language, I think it should be possible to do
>something like this:
>>
>a = list([1, 2, 3, 4, 5], implementation = 'linkedlist')

>
For this to work, the abstract list would have to know about all
implementations of the abstraction.

/the exact syntax isn't really important/
/abstract type and implementation separation is the important point/

Actually, if I want to force it, that syntax could work using the same
magic used by event-based syst

**Lie Ryan** · Oct 25 '08, 10:45 PM

Re: Ordering python sets

On Sat, 25 Oct 2008 18:20:46 -0400, Terry Reedy wrote:

Lie Ryan wrote:
>
>

><anotherrandom musing>
>Since python is dynamic language, I think it should be possible to do
>something like this:
>>
>a = list([1, 2, 3, 4, 5], implementation = 'linkedlist')

>
For this to work, the abstract list would have to know about all
implementations of the abstraction.

# Sorry the last message is truncated because of an "accident"

/the exact syntax isn't really important/
/abstract type and implementation separation is the important point/

Actually, if I want to force it, that syntax could work using the same
magic used by event-based systems: registration. Although I agree it
might be a bit cumbersome to do registration for something like this, but
as I've said before, exact syntax is not really important.

**Steven D'Aprano** · Oct 26 '08, 12:55 AM

Re: Ordering python sets

On Sat, 25 Oct 2008 21:53:10 +0000, Lie Ryan wrote:

On Sat, 25 Oct 2008 09:21:05 +0000, Steven D'Aprano wrote:
>

>On Sat, 25 Oct 2008 08:58:18 +0000, Lie Ryan wrote:
>>

>><anotherrando mmusing>
>>Since python is dynamic language, I think it should be possible to do
>>something like this:
>>>
>>a = list([1, 2, 3, 4, 5], implementation = 'linkedlist') b =
>>dict({'a': 'A'}, implementation = 'binarytree') c = dict({'a': 'A'},
>>implementatio n = 'binarytree')

>>
>Oh I hope not. I think you have mistaken "dynamic" for "chaotic".
>>
>When I see a dict, I want to know that any two dicts work the same way.

>
Oh no, the two dict implementation would work _exactly_ the same from
the outside, they are transparently interchangeable . Only the
performance characteristic differs because of the different
implementation.

Exactly. That was my point.

[...]

>I don't want to have to search the entire project's source code to find
>out if it is a dict implemented as a hash table with O(1) lookups, or a
>dict implemented as a binary tree with O(log N) lookups, or a dict
>implemented as a linear array with O(N) lookups.

>
No, you'd only need to look at the dict's creation point (or actually
much better at projects docs, but not everyone create good docs).

And how do you find an arbitrary object's creation point without
searching the project's source code?

>If I wanted that sort of nightmare, I can already do it by shadowing
>the builtin:
>>
>dict = binarytree
>D = dict({'a': 'A'}) # make a binary tree

>
I DON'T want THAT sort of nightmare you mentioned... And it'd be
impossible to have two dictionary that have two different
implementations .

Nonsense.

dict = binarytree
D1 = dict({'a': 'A'}) # make a binary tree "dict"
dict = __builtin__.dic t
D2 = dict({'a': 'A'}) # make a standard dict
dict = someothertype
D3 = dict({'a': 'A'})

I'm not suggesting this is a good idea. This is a terrible idea. But it
is not much worse than your idea:

D1 = dict({'a': 'A'}, implementation= 'binarytree')
D2 = dict({'a': 'A'}, implementation= 'dict')
D3 = dict({'a': 'A'}, implementation= 'someothertype' )

>There is no possible good that come from this suggestion. The beauty of
>Python is that the built-in data structures (list, dict, set) are
>powerful enough for 99% of uses[1], and for the other 1%, you can
>easily and explicitly use something else.

>
Oh really? As far as I know, python's list is extremely bad if you're
inserting data at the beginning of the list

And how often do you do that?

And when you do, use a deque. Just call it a deque.

[...]

>But *explicitly* is the point. There's never any time where you can do
>this:

>
Yes, true, explicitly IS the point. How more explicit can you be than:
dict({foo: bar}, implementation = 'binarytree')
>

>type(mydict) is dict

You miss the point. With your plan, you can do this:

D1 = dict({foo: bar}, implementation = 'binarytree')
D2 = dict({foo: bar}, implementation = 'dict')
type(D1) is type(D2)

and yet D1 and D2 have UTTERLY different performance characteristics . So
now you need to add ANOTHER test to distinguish dicts-which-are-dicts
from dicts-which-are-binary-trees:

D1.implementati on != D2.implementati on

And why? So you can avoid calling a goose a goose, and call it a duck
instead.

If my memory serves right, binary tree dict and hashed table dict is
both a dict right? (Duck Typing)
Only their implementation differs. Implementation is... well,
"implementa tion detail".

Duck typing refers to *interface*, not implementation. I have no problem
with you using a type with the same interface as a dict. That's what duck
typing is all about. Just don't call it a dict!

>and not know exactly what performance characteristics mydict will have.

>
Oh... why do I need to know what the performance characteristic of
mydict is? Unless I know what I'm doing.

Back to Basics

http://www.joelonsoftware.com/articles/fog0000000319.html

We spend a lot of time on this site talking about exciting Big Picture Stuff like .NET versus Java, XML strategy, Lock-In, competitive strategy, software design, architecture, and so forth. All thi…

Because when you do this:

mydict[key] = 1

it's important whether each dict lookup is O(1), O(log N) or O(N). For a
dict with one million items, that means that an implementation based on a
binary tree does O(20) times more processing than a dict, and an
implementation based on linear searching does O(1000000) times more
processing.

If you think implementation details don't matter, try this:

s1 = 'c'*(10**6)

versus

s2 = ''
for i in xrange(10**6):
s2 = 'c' + s2 # defeat optimizer

>(Unless you shadow dict or type, or otherwise do something that breaks
>the rules.) You never need to ask, "Okay, it's a dict. What sort of
>dict?"

>
Okay, it's a dict. What sort of dict? Who the hell cares?

If you don't care, then why are you specifying the implementation type?

mydict = dict({'foo': 'bar'}, implementation= "surprise me!")

You can't have it both ways. If you care, then you know enough to want a
hash table based dict (the standard) or a binary tree or something else.
So go ahead and use whatever data structure you want. Just don't call it
a dict.

But if you don't care, then just use the Python standard data types.

I don't need
to know, they all looks and behave the same (Duck Typing)...

Until you try a simple operation like len(mydict) and it takes three
minutes because the implementation you choose is really inefficient.

Sometimes we need a data type to use a specific data structure that have
some specific performance characteristic, because we know we'll be doing
a specific operation a lot more than other operations.

I'm not arguing that you should never use any other data structure. Go
right ahead and use them all you like.

>If you want a binary tree, ask for a binary tree.

>
Yeah, you ask for binary tree EXPLICITLY: bintreedict = dict({a:b},
implementation = 'binarytree')

Or you could do it the right way:

D1 = binarytree(data )
D2 = dict(data)
type(D1), type(D2)
=returns binarytree, dict

versus:

D1 = dict(data, implementation= 'binarytree')
D2 = dict(data)
type(D1), type(D2)
=returns dict, dict

--
Steven

**Terry Reedy** · Oct 26 '08, 01:55 AM

Re: Ordering python sets

Lie Ryan wrote:

On Sat, 25 Oct 2008 18:20:46 -0400, Terry Reedy wrote:

>>a = list([1, 2, 3, 4, 5], implementation = 'linkedlist')

>For this to work, the abstract list would have to know about all
>implementation s of the abstraction.

>
/the exact syntax isn't really important/
/abstract type and implementation separation is the important point/
>
Actually, if I want to force it, that syntax could work using the same
magic used by event-based systems: registration.

ABCs have registration method. The builtin ABCs have appropriate
builtin classes preregistered.

>>import collections as co
>>mu = co.MutableSeque nce
>>issubclass(li st, mu)

True

I believe user classes that inherit from an ABC are also registered, and
other can be registered explicitly.

>Although I agree it
might be a bit cumbersome to do registration for something like this, but
as I've said before, exact syntax is not really important.

Then why do you object to current
mylist = linkedlist(data )
and request the harder to write and implement
mylist = list(data, implementation = 'linkedlist')
?

tjr

**Lie Ryan** · Oct 26 '08, 12:25 PM

Re: Ordering python sets

On Sun, 26 Oct 2008 00:53:18 +0000, Steven D'Aprano wrote:
[...]

And how do you find an arbitrary object's creation point without
searching the project's source code?

How is it better using the current way?
Asking the .implementation field isn't much harder than asking the type
(), and is much more obvious that we're asking the "implementation " being
used.

>>If I wanted that sort of nightmare, I can already do it by shadowing
>>the builtin:
>>>
>>dict = binarytree
>>D = dict({'a': 'A'}) # make a binary tree

>>
>I DON'T want THAT sort of nightmare you mentioned... And it'd be
>impossible to have two dictionary that have two different
>implementation s.

>
Nonsense.
>

<snip>

There is two good arguments for using a plain string to specify the
implementation used: 1) Plain-text Configuration, 2) argument passing.

1) The current way requires you have a map of the configuration text to a
function (i.e. imps = {'linkedlist': linkedlist}), than assign the
function to _a global name_ (i.e. lista = imps[configuration[confkey]] --
namespace pollution), then instantiate an object using that (ls = lista
([...]))). The implementation way, only requires ls = list([...],
implementation = configuration[confkey])

2) Easily said, plain text passing is safer and easier than passing
function object.

>>There is no possible good that come from this suggestion. The beauty
>>of Python is that the built-in data structures (list, dict, set) are
>>powerful enough for 99% of uses[1], and for the other 1%, you can
>>easily and explicitly use something else.

>>
>Oh really? As far as I know, python's list is extremely bad if you're
>inserting data at the beginning of the list

>
And how often do you do that?

It's not even a deque yet, a regular queue is slow using list in python,
and any insertion sort requires you to insert elements into arbitrary
position. This makes using array for it a O(n**2). On the other hand,
using binary tree is a O(log n).

[...]

>>But *explicitly* is the point. There's never any time where you can do
>>this:

>>
>Yes, true, explicitly IS the point. How more explicit can you be than:
>dict({foo: bar}, implementation = 'binarytree')
>>

>>type(mydict ) is dict

>
>
You miss the point. With your plan, you can do this:
>
D1 = dict({foo: bar}, implementation = 'binarytree') D2 = dict({foo:
bar}, implementation = 'dict') type(D1) is type(D2)
>
and yet D1 and D2 have UTTERLY different performance characteristics .

It does not matter that it have different performance characteristic.

So
now you need to add ANOTHER test to distinguish dicts-which-are-dicts
from dicts-which-are-binary-trees:

How often would you need to care about the specific implementation used?
Compare with how often you want to know how to work with the object. Most
of the time, you only need to know how to work with it (i.e. knowing its
interface), only _sometimes_ when it DOES matter speed-wise, you'd need
to know and affect its implementation.

D1.implementati on != D2.implementati on
>
And why? So you can avoid calling a goose a goose, and call it a duck
instead.

I think we have an utterly different view here.
I wished that list() becomes an abstract type, instead of a data
structure. (binarytreelist and arraylist is some property of a list)

While you insisted that list() is a data structure, and abstract type is
some obscureness somewhere unknown.

>If my memory serves right, binary tree dict and hashed table dict is
>both a dict right? (Duck Typing)
>Only their implementation differs. Implementation is... well,
>"implementatio n detail".

>
Duck typing refers to *interface*, not implementation.

Well said, you've just supported my view.

I have no problem
with you using a type with the same interface as a dict. That's what
duck typing is all about. Just don't call it a dict!

why do you think binarytreedict is "less dict" than hasheddict?

>>and not know exactly what performance characteristics mydict will
>>have.

>>
>Oh... why do I need to know what the performance characteristic of
>mydict is? Unless I know what I'm doing.

>
http://www.joelonsoftware.com/articl...000000319.html

Sometimes it is good to have control of everything. Sometimes (most of
the time) I know I can dump some of the details to think on more
important matters. To know which parts need more thinking by doing
profiling.

Because when you do this:
>
mydict[key] = 1
>
it's important whether each dict lookup is O(1), O(log N) or O(N). For a
dict with one million items, that means that an implementation based on
a binary tree does O(20) times more processing than a dict, and an
implementation based on linear searching does O(1000000) times more
processing.

Well, I don't need to know how the dict works if I'm only using it to
store twenty items right? (I do know how python's dict/set/list/etc is
implemented, but unless that particular dict is central to my program,
I'm not going to create my programs around how dict is implemented, I'm
going to create it on how my mind works).

I don't care how print statement works, unless I'm trying to do something
fancy with print. I don't care that print takes a series of electrical
signal, then convert it into a series of bytes and pass it into the
terminal stdout then the terminal will figure out how to draw the string
using the appropriate font at the appropriate place and pass itself to a
window manager which would compose the desktop into an image then pass
itself to the graphic driver which would convert the image to electrical
signal to be sent to the graphic card which would do some fancy tricks
before sending it to the monitor, when what I need is only to show the
user a simple welcome message.

The reason we use high-level language is that we know we don't need to
care about some details. Only when the detail matters then we take care
of it. If I do care about all the details, I'd use C, no, no, lower, I'd
use assembly, no, lower..., I'd design my own CPU, no even lower, I'd
design my own non-von-neumann architecture. I don't think I can change
the rules of physics though, but if I can... <some more pseudo random
bytes>

>I don't need
>to know, they all looks and behave the same (Duck Typing)...

>
Until you try a simple operation like len(mydict) and it takes three
minutes because the implementation you choose is really inefficient.

.... then I know I have chosen the wrong data structure, and I know I need
to change it.

>>If you want a binary tree, ask for a binary tree.

>>
>Yeah, you ask for binary tree EXPLICITLY: bintreedict = dict({a:b},
>implementati on = 'binarytree')

>
Or you could do it the right way:

<snip>

For example, if I have a certain unknown object. What is easier?

>>type(URO) # Unidentified Round Object

=returns (*&%$^%*

or

>>type(URO)

=returns diningplate

only after three thousand years of continuous and laborious researching,
at last I found out that (*&%$^%* is an alien dining plate, and it does
have all the features and functionality as earth dining plate, although
due to it being made of CClKO32C32CH42U it is much more resistant to
acidity but would melt when you put carrot on it.

Only when I want to eat carrot, would I care that it is an alien dining
plate. And only when I want to eat something extremely acidic, I'd chose
the alien dining plate over earth dining plate.

**Lie Ryan** · Oct 26 '08, 12:35 PM

Re: Ordering python sets

On Sat, 25 Oct 2008 21:50:36 -0400, Terry Reedy wrote:

Lie Ryan wrote:

>On Sat, 25 Oct 2008 18:20:46 -0400, Terry Reedy wrote:

Then why do you object to current
mylist = linkedlist(data )
and request the harder to write and implement
mylist = list(data, implementation = 'linkedlist')

I don't wholly object it. I think it's fine but can be improved.

I tend to be more inclined on this side though:

mylist = list.linkedlist (data) # or list<othersymbo l>linkedlist(da ta)
as syntax sugar for
mylist = list(data, implementation = 'linkedlist')

this kinds of syntax magnify the fact that linkedlist is a choice of
implementation for list abstract type. The current syntax makes no
indication whatsoever that linkedlist is anything related to list.

**bearophileHUGS@lycos.com** · Oct 27 '08, 05:05 PM

Re: Ordering python sets

Glenn Linderman:

how does one create a key that corresponds to ascending integer followed by descending character string?

(Others may have already answered you because Google groups is very
slow.)

>>seq = [(10, "abb"), (5, "zul"), (5, "hal"), (2, "of")]
>>sorted(seq, key=lambda (n,s): (-n, s), reverse=True)

[(2, 'of'), (5, 'zul'), (5, 'hal'), (10, 'abb')]

A little harder question is how to create a key that corresponds to
ascending string followed by descending string?

To do that you can sort the data two times, relying on the stable
nature of the Python sort.

Another (silly?) solution may be to invent a "negate" function for
strings too:

>>strneg = lambda txt: txt.translate(i table)
>>itable = "".join(chr (i) for i in xrange(255, -1, -1))
>>pairs = [('al', 'boo'), ('zul', 'ao'), ('al', 'den'), ('zul', 'o')]
>>sorted(pair s, key=lambda (s1,s2): (s1, strneg(s2)))

[('al', 'den'), ('al', 'boo'), ('zul', 'o'), ('zul', 'ao')]

Bye,
bearophile

**Peter Otten** · Oct 27 '08, 05:35 PM

Re: Ordering python sets

bearophileHUGS@ lycos.com wrote:

Glenn Linderman:
>

>how does one create a key that corresponds to ascending integer followed
>by descending character string?

>
(Others may have already answered you because Google groups is very
slow.)
>

>>>seq = [(10, "abb"), (5, "zul"), (5, "hal"), (2, "of")]
>>>sorted(seq , key=lambda (n,s): (-n, s), reverse=True)

[(2, 'of'), (5, 'zul'), (5, 'hal'), (10, 'abb')]
>
A little harder question is how to create a key that corresponds to
ascending string followed by descending string?
>
To do that you can sort the data two times, relying on the stable
nature of the Python sort.
>
Another (silly?) solution may be to invent a "negate" function for
strings too:
>

>>>strneg = lambda txt: txt.translate(i table)
>>>itable = "".join(chr (i) for i in xrange(255, -1, -1))
>>>pairs = [('al', 'boo'), ('zul', 'ao'), ('al', 'den'), ('zul', 'o')]
>>>sorted(pairs , key=lambda (s1,s2): (s1, strneg(s2)))

[('al', 'den'), ('al', 'boo'), ('zul', 'o'), ('zul', 'ao')]

Here's a class that can negate arbitrary values:

>>class Neg(object):

.... def __init__(self, value):
.... self.value = value
.... def __cmp__(self, other):
.... return -cmp(self.value, other.value)
....

>>pairs = [('al', 'boo'), ('zul', 'ao'), ('al', 'den'), ('zul', 'o')]
>>sorted(pair s, key=lambda (s, t): (s, Neg(t)))

[('al', 'den'), ('al', 'boo'), ('zul', 'o'), ('zul', 'ao')]

**bearophileHUGS@lycos.com** · Oct 27 '08, 08:25 PM

Re: Ordering python sets

Lie Ryan:

>Oh no, the two dict implementation would work _exactly_ the same from the outside, they are transparently interchangeable . Only the performance characteristic differs because of the different implementation. <

I don't agree with the general idea. If the operations done by your
data structure have different computational complexity, then they are
fit for different usages. When you program you must know what
computational complexity has each of the operations of your data
structyre, otherwise there's no way to know the complexity of your
whole program, so instead of programming you are just become a mage
that tries magical spells and hopes for the better. So I don't accept
so much different data structures to have the same name. That's why
I'll never appreciate the Python list type to be named list instead of
array, despite it supports more or less all the functions you expect
from an abstract list type.

Said that, for a high-level language like Python I can see another
possible solution. To have a type that contains several kinds of data
structures, for example a "dict" that contains a hash implementation,
a red-black tree, etc. Such data structure can use a small part of the
computational power given to it to collect statistics usages of each
object (or each variable, that may contain in succession several
ojects of the same type). Such data structure can then at run time
adapt dynamically, chosing to use the implementation fitter for the
specific usage of each object or variable (the programmer can give
hints of course, or sometimes even coerce the usage a specific
implementation) . (such statistics can also be saved on disk to be used
for the successive run of the program, to help them work well from the
start too, and not just after a while). If this capability works well
in practice, then it can solve the problem you were talking about, I
think.

I presume data structures in future high-level languages will be quite
more able to adapt themselves to their usages. Today it's the second
time I talk about future programming languages :-)

Bye,
bearophile

**greg** · Oct 28 '08, 02:45 AM

Re: Ordering python sets

On approximately 10/27/2008 10:27 AM, came the following characters from
the keyboard of Peter Otten:

Here's a class that can negate arbitrary values
>
... def __init__(self, value):
... self.value = value
... def __cmp__(self, other):
... return -cmp(self.value, other.value)

Unfortunately, the __cmp__ method has been eliminated from
3.0 as well, so this won't work as-is. You would need to
override __lt__, __eq__, etc.

--
Greg

**Lie Ryan** · Nov 1 '08, 09:45 AM

Re: Ordering python sets

On Mon, 27 Oct 2008 13:18:43 -0700, bearophileHUGS wrote:

So I don't accept so much different data structures to have the
same name

You need to adjust the current mindset slightly (but in an important way
to understand the "why" behind this idea). The current notion is: list
and dict is a data structure. With this idea, list and dict is an
abstract type, not a data structure. array, linked list, binary tree, red-
black tree, hashed are data structure. We create a data structure by
passing the data structure's identifier string to a factory function
provided by the abstract type.

There are two possible syntax sugar:
a = list.bintree([...])
b = list([...]) # also for backward compat, have reasonable default

In short, the data structures doesn't share the same name, the abstract
data-type does. The suggestion is to change the place where we define the
what data structure to use.

Said that, for a high-level language like Python I can see another
possible solution. To have a type that contains several kinds of data
structures, for example a "dict" that contains a hash implementation, a
red-black tree, etc. Such data structure can use a small part of the
computational power given to it to collect statistics usages of each
object (or each variable, that may contain in succession several ojects
of the same type). Such data structure can then at run time adapt
dynamically, chosing to use the implementation fitter for the specific
usage of each object or variable (the programmer can give hints of
course, or sometimes even coerce the usage a specific implementation) .
(such statistics can also be saved on disk to be used for the successive
run of the program, to help them work well from the start too, and not
just after a while). If this capability works well in practice, then it
can solve the problem you were talking about, I think.
>
I presume data structures in future high-level languages will be quite
more able to adapt themselves to their usages. Today it's the second
time I talk about future programming languages :-)

Although you said you disagree with the general idea, you actually take
the idea two steps further, I take that as an implicit agreement to
several parts of the idea.

Ordering python sets

Ordering python sets

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment