change of random state when pyc created??

**Peter Otten** · May 9 '07, 07:35 AM

Re: change of random state when pyc created??

Alan Isaac wrote:

This may seem very strange, but it is true.
If I delete a .pyc file, my program executes with a different state!

Can someone explain this to me?

There is nothing wrong with the random module -- you get the same numbers on
every run. When there is no pyc-file Python uses some RAM to create it and
therefore your GridPlayer instances are located in different memory
locations and get different hash values. This in turn affects the order in
which they occur when you iterate over the GridPlayer.players_played set.

Here is a minimal example:

import test # sic

class T:
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return "T(name=%r)" % self.name

if __name__ == "__main__":
    print set(T(i) for i in range(4))

$ python2.5 test.py
set([T(name=2), T(name=1), T(name=0), T(name=3)])
$ python2.5 test.py
set([T(name=3), T(name=1), T(name=0), T(name=2)])
$ python2.5 test.py
set([T(name=3), T(name=1), T(name=0), T(name=2)])
$ rm test.pyc
$ python2.5 test.py
set([T(name=2), T(name=1), T(name=0), T(name=3)])

Peter
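For readers on Python 3, the experiment above can be restated roughly as follows. This is a sketch that assumes CPython, where the default hash of an instance is derived from its identity (its memory address). No expected output is shown, because the hash values and the resulting order may differ from one run to the next.

```python
class T:
    """Peter's example class, written for Python 3."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "T(name=%r)" % self.name


items = [T(i) for i in range(4)]

# No __hash__ is defined, so each instance is hashed by identity.
# In CPython that hash is derived from the memory address, which may change between runs.
for item in items:
    print(item, hash(item))

# Iteration order follows the hash values, not the order of insertion.
print(set(items))
```

Comparing the printed hash values across a few runs shows whether the set order tracks them on a given system.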

**Alan Isaac** · May 9 '07, 01:45 PM

Re: change of random state when pyc created??

"Peter Otten" <__peter__@web.de> wrote in message
news:f1rt61$kfg$03$1@news.t-online.com...

Alan Isaac wrote:
There is nothing wrong with the random module -- you get the same numbers on
every run. When there is no pyc-file Python uses some RAM to create it and
therefore your GridPlayer instances are located in different memory
locations and get different hash values. This in turn affects the order in
which they occur when you iterate over the GridPlayer.players_played set.

Thanks!!
This also explains Steven's results.

If I sort the set before iterating over it,
the "anomaly" disappears.

This means that currently the use of sets
(and, I assume, dictionaries) as iterators
compromises replicability. Is that a fair
statement?

For me (and apparently for a few others)
this was a very subtle problem. Is there
a warning anywhere in the docs? Should
there be?

Thanks again!!

Alan Isaac
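The sorting workaround mentioned above might look like the sketch below in Python 3. `Player` is a hypothetical stand-in for the GridPlayer class, which is not shown in the thread, and `pick_order` and the `name` attribute are likewise assumptions made for illustration.

```python
import random
from operator import attrgetter


class Player:
    """Hypothetical stand-in for GridPlayer (the real class is not shown in the thread)."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "Player(%r)" % self.name


def pick_order(players, seed):
    """Return the players in a seeded pseudo-random order that does not depend on hashes."""
    rng = random.Random(seed)
    # Sort on a stable attribute first, so the starting order is the same on every run.
    ordered = sorted(players, key=attrgetter("name"))
    rng.shuffle(ordered)
    return ordered


players = {Player(n) for n in ("ann", "bob", "cy", "dee")}
print(pick_order(players, seed=42))
```

The point of the design is that the set is never iterated directly: the only order the program sees comes from the explicit sort key and the seeded generator.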

**Diez B. Roggisch** · May 9 '07, 01:55 PM

Re: change of random state when pyc created??

Alan Isaac wrote:

>
"Peter Otten" <__peter__@web.de> wrote in message
news:f1rt61$kfg$03$1@news.t-online.com...

>Alan Isaac wrote:
>There is nothing wrong with the random module -- you get the same numbers
>on every run. When there is no pyc-file Python uses some RAM to create it
>and therefore your GridPlayer instances are located in different memory
>locations and get different hash values. This in turn affects the order
>in which they occur when you iterate over the GridPlayer.players_played
>set.

>
Thanks!!
This also explains Steven's results.
>
If I sort the set before iterating over it,
the "anomaly" disappears.
>
This means that currently the use of sets
(and, I assume, dictionaries) as iterators
compromises replicability. Is that a fair
statement?

Yes.

For me (and apparently for a few others)
this was a very subtle problem. Is there
a warning anywhere in the docs? Should
there be?

Not really, but that depends on what you know about the concept of sets and
maps as collections of course.

The contract for sets and dicts doesn't imply any order whatsoever. Which is
essentially the reason why

set(xrange(10))[0]

doesn't exist, and why there have been quite a few calls for an ordered
dictionary in the standard library.

Diez
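The indexing point can be checked directly. The sketch below uses Python 3, where `range` replaces `xrange`; the error message text is not reproduced because it varies between versions.

```python
numbers = set(range(10))

try:
    numbers[0]  # sets record no positions, so there is nothing to index
except TypeError as exc:
    print("TypeError:", exc)

# When a position is needed, impose an order explicitly first.
first = sorted(numbers)[0]
print(first)  # 0
```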

**Alan G Isaac** · May 9 '07, 03:55 PM

Re: change of random state when pyc created??

Diez B. Roggisch wrote:

Not really, but that depends on what you know about the concept of sets and
maps as collections of course.
>
The contract for sets and dicts doesn't imply any order whatsoever. Which is
essentially the reason why
>
set(xrange(10))[0]
>
doesn't exist, and why there have been quite a few calls for an ordered
dictionary in the standard library.

It seems to me that you are missing the point,
but maybe I am missing your point.

The question of whether a set or dict guarantees
some order seems quite different from the question
of whether rerunning an **unchanged program** yields the
**unchanged results**. The latter question is the question
of replicability.

Again I point out that some sophisticated users
(among which I am not numbering myself) did not
see into the source of this "anomaly". This
suggests that an explicit warning is warranted.

Cheers,
Alan Isaac

PS I know ordered dicts are under discussion;
what about ordered sets?
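On the ordered-set question, later Python versions offer a common workaround rather than a dedicated type: `dict` keeps insertion order as a language guarantee from Python 3.7 on, so its keys can stand in for an ordered set. The helper name `ordered_unique` below is made up for this sketch.

```python
def ordered_unique(items):
    """Return the distinct items in first-seen order (a stand-in for an ordered set)."""
    # dict keys are unique and, from Python 3.7 on, keep their insertion order.
    return list(dict.fromkeys(items))


print(ordered_unique(["b", "a", "b", "c", "a"]))  # ['b', 'a', 'c']
```

This only works for hashable items, and the order it preserves is insertion order, which is itself replicable only if the items are inserted in a replicable sequence.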

**Robert Kern** · May 9 '07, 07:55 PM

Re: change of random state when pyc created??

Alan G Isaac wrote:

Diez B. Roggisch wrote:

>Not really, but that depends on what you know about the concept of sets and
>maps as collections of course.
>>
>The contract for sets and dicts doesn't imply any order whatsoever. Which is
>essentially the reason why
>>
>set(xrange(10))[0]
>>
>doesn't exist, and why there have been quite a few calls for an ordered
>dictionary in the standard library.

>
It seems to me that you are missing the point,
but maybe I am missing your point.
>
The question of whether a set or dict guarantees
some order seems quite different from the question
of whether rerunning an **unchanged program** yields the
**unchanged results**. The latter question is the question
of replicability.
>
Again I point out that some sophisticated users
(among which I am not numbering myself) did not
see into the source of this "anomaly". This
suggests that an explicit warning is warranted.

http://docs.python.org/lib/typesmapping.html

"""
Keys and values are listed in an arbitrary order which is non-random, varies
across Python implementations, and depends on the dictionary's history of
insertions and deletions.
"""

The sets documentation is a bit less explicit, though.

http://docs.python.org/lib/types-set.html

"""
Like other collections, sets support x in set, len(set), and for x in set. Being
an unordered collection, sets do not record element position or order of insertion.
"""

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

**Alan G Isaac** · May 9 '07, 08:45 PM

Re: change of random state when pyc created??

Robert Kern wrote:

http://docs.python.org/lib/typesmapping.html

"""
Keys and values are listed in an arbitrary order which is non-random, varies
across Python implementations, and depends on the dictionary's history of
insertions and deletions.
"""

Even this does not tell me that if I use a specified implementation
that my results can vary from run to run. That is, it still does
not communicate that rerunning an *unchanged* program with an
*unchanged* implementation can produce a change in results.

Alan Isaac

**Chris Mellon** · May 9 '07, 08:55 PM

Re: change of random state when pyc created??

On 5/9/07, Alan G Isaac <aisaac@american.edu> wrote:

Robert Kern wrote:

http://docs.python.org/lib/typesmapping.html

"""
Keys and values are listed in an arbitrary order which is non-random, varies
across Python implementations, and depends on the dictionary's history of
insertions and deletions.
"""

>
>
Even this does not tell me that if I use a specified implementation
that my results can vary from run to run. That is, it still does
not communicate that rerunning an *unchanged* program with an
*unchanged* implementation can produce a change in results.
>

Well, now you know. I'm not sure why you expect any given program to
be idempotent unless you take specific measures to ensure that anyway.

**Robert Kern** · May 9 '07, 09:05 PM

Re: change of random state when pyc created??

Alan G Isaac wrote:

Robert Kern wrote:

>http://docs.python.org/lib/typesmapping.html
>"""
>Keys and values are listed in an arbitrary order which is non-random, varies
>across Python implementations, and depends on the dictionary's history of
>insertions and deletions.
>"""

>
Even this does not tell me that if I use a specified implementation
that my results can vary from run to run. That is, it still does
not communicate that rerunning an *unchanged* program with an
*unchanged* implementation can produce a change in results.

The last clause does tell me that.

--
Robert Kern

**Carsten Haese** · May 9 '07, 09:25 PM

Re: change of random state when pyc created??

On Wed, 2007-05-09 at 15:35 -0500, Alan G Isaac wrote:

Robert Kern wrote:

http://docs.python.org/lib/typesmapping.html

"""
Keys and values are listed in an arbitrary order which is non-random, varies
across Python implementations, and depends on the dictionary's history of
insertions and deletions.
"""

>
>
Even this does not tell me that if I use a specified implementation
that my results can vary from run to run. That is, it still does
not communicate that rerunning an *unchanged* program with an
*unchanged* implementation can produce a change in results.

It doesn't say that rerunning the program won't produce a change in
results. It doesn't say that the order depends *only* on those factors
in a deterministic and reproducible manner.

The documentation shouldn't be expected to list every little thing that
might change the order of keys in a dictionary. The documentation does
say explicitly what *is* guaranteed: Order of keys is preserved as long
as no intervening modifications happen to the dictionary. Tearing down
the interpreter, starting it back up, and rebuilding the dictionary from
scratch is very definitely an intervening modification.
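The guarantee described here, stable order as long as the dictionary is not modified in between, can be illustrated with a short Python 3 sketch. The dictionary contents are arbitrary sample data.

```python
d = {"spam": 1, "eggs": 2, "ham": 3}

# Two passes with no modification in between see the keys in the same order.
first_pass = list(d)
second_pass = list(d)
print(first_pass == second_pass)  # True

# keys() and values() correspond position by position.
print(list(zip(d.keys(), d.values())) == list(d.items()))  # True
```

Nothing in this check says what the order will be in a fresh interpreter, which is the distinction the post draws.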

Regards,

--
Carsten Haese

InformixDB

http://informixdb.sourceforge.net

**Alan Isaac** · May 10 '07, 01:35 AM

Re: change of random state when pyc created??

>Robert Kern wrote:

>>http://docs.python.org/lib/typesmapping.html
>>"""
>>Keys and values are listed in an arbitrary order which is non-random, varies
>>across Python implementations, and depends on the dictionary's history of
>>insertions and deletions.
>>"""

Alan G Isaac wrote:

>Even this does not tell me that if I use a specified implementation
>that my results can vary from run to run. That is, it still does
>not communicate that rerunning an *unchanged* program with an
>*unchanged* implementation can produce a change in results.

"Robert Kern" <robert.kern@gmail.com> wrote in message
news:mailman.7488.1178744519.32031.python-list@python.org...

The last clause does tell me that.

1. About your reading of the current language:
I believe you, of course, but can you tell me **how** it tells you that?
To be concrete, let us suppose parallel language were added to
the description of sets. What about that language should allow
me to anticipate Peter's example (in this thread)?

2. About possibly changing the docs:
You are much more sophisticated than ordinary users.
Did this thread not demonstrate that even sophisticated users
do not see into this "implication" immediately? Replicability
of results is a huge deal in some circles. I think the docs
for sets and dicts should include a red flag: do not use
these as iterators if you want replicable results.
(Side note to Carsten: this does not require listing "every little thing".)

Cheers,
Alan Isaac

**Robert Kern** · May 10 '07, 02:25 AM

Re: change of random state when pyc created??

Alan Isaac wrote:

>>Robert Kern wrote:
>>>http://docs.python.org/lib/typesmapping.html
>>>"""
>>>Keys and values are listed in an arbitrary order which is non-random, varies
>>>across Python implementations, and depends on the dictionary's history of
>>>insertions and deletions.
>>>"""

>

>Alan G Isaac wrote:

>>Even this does not tell me that if I use a specified implementation
>>that my results can vary from run to run. That is, it still does
>>not communicate that rerunning an *unchanged* program with an
>>*unchanged* implementation can produce a change in results.

>
"Robert Kern" <robert.kern@gmail.com> wrote in message
news:mailman.7488.1178744519.32031.python-list@python.org...

>The last clause does tell me that.

>
1. About your reading of the current language:
I believe you, of course, but can you tell me **how** it tells you that?
To be concrete, let us suppose parallel language were added to
the description of sets. What about that language should allow
me to anticipate Peter's example (in this thread)?

Actually, the root cause of Peter's specific example is the fact that the
default implementation of __hash__() and __eq__() rely on identity comparisons.
Two separate invocations of the same script give different objects by identity
and thus the "history of insertions and deletions" is different.
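One way to act on this diagnosis is to give the class a hash and an equality test that depend on its value rather than its identity. The sketch below is a Python 3 variant of Peter's class under that assumption. With integer names the hash no longer depends on memory addresses, although the language still promises no particular iteration order. With string names, per-process hash randomization in Python 3 would bring run-to-run variation back unless `PYTHONHASHSEED` is fixed.

```python
class T:
    """Variant of Peter's class whose hash and equality depend on the value, not the identity."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, T):
            return NotImplemented
        return self.name == other.name

    def __hash__(self):
        # The hash of an int does not depend on where the object lives in memory.
        return hash(self.name)

    def __repr__(self):
        return "T(name=%r)" % self.name


print(hash(T(2)) == hash(T(2)))  # True
print(len({T(i) for i in range(4)} | {T(i) for i in range(4)}))  # 4
```

Sorting before iterating remains the more direct fix when a specific order matters.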

2. About possibly changing the docs:
You are much more sophisticated than ordinary users.
Did this thread not demonstrate that even sophisticated users
do not see into this "implication" immediately?

Well, if you had a small test case that demonstrated the problem, we would have.
Your example was large, complicated, and involved other semi-deterministic red
herrings (the PRNG). It's quite easy to see the problem with Peter's example.

Replicability
of results is a huge deal in some circles. I think the docs
for sets and dicts should include a red flag: do not use
these as iterators if you want replicable results.
(Side note to Carsten: this does not require listing "every little thing".)

They do. They say very explicitly that they are not ordered and that the
sequence of iteration should not be relied upon. The red flags are there.

But I'm not going to stop you from writing up something that's even more explicit.

--
Robert Kern

**Carsten Haese** · May 10 '07, 02:25 AM

Re: change of random state when pyc created??

On Thu, 2007-05-10 at 01:25 +0000, Alan Isaac wrote:

Did this thread not demonstrate that even sophisticated users
do not see into this "implication" immediately?

Knowing that maps don't have reproducible ordering is one thing.
Realizing that that's the cause of the problem that's arbitrarily and
wrongly attributed to the 'random' module, in a piece of code that's not
posted to the public, and presumably not trimmed down to the shortest
possible example of the problem, is quite another.

I'll venture the guess that most Python programmers with a modicum of
experience will, when asked point blank if it's safe to rely on a
dictionary to be iterated in a particular order, answer no.

Replicability
of results is a huge deal in some circles.

Every software engineer wants their results to be replicable. Software
engineers also know that they can only expect their results to be
replicable if they use deterministic functions. You wouldn't expect
time.time() to return the same result just because you're running the
same code, would you?

I think the docs
for sets and dicts should include a red flag: do not use
these as iterators if you want replicable results.

It does, at least for dicts: "Keys and values are listed in an arbitrary
order." If this wording is not present for sets, something to this
effect should be added.

Regards,

--
Carsten Haese

InformixDB

http://informixdb.sourceforge.net

**Steven D'Aprano** · May 10 '07, 02:45 AM

Re: change of random state when pyc created??

On Wed, 09 May 2007 16:01:02 -0500, Robert Kern wrote:

Alan G Isaac wrote:

>Robert Kern wrote:

>>http://docs.python.org/lib/typesmapping.html
>>"""
>>Keys and values are listed in an arbitrary order which is non-random, varies
>>across Python implementations, and depends on the dictionary's history of
>>insertions and deletions.
>>"""

>>
>Even this does not tell me that if I use a specified implementation
>that my results can vary from run to run. That is, it still does
>not communicate that rerunning an *unchanged* program with an
>*unchanged* implementation can produce a change in results.

>
The last clause does tell me that.

Actually it doesn't. If you run a program twice, with the same inputs,
and no other source of randomness (or at most have pseudo-randomness
starting with the same seed), then the dictionary will have the same
history of insertions and deletions from run to run.

Go back to Peter Otten's diagnosis of the issue:

"... your GridPlayer instances are located in different memory locations
and get different hash values. This in turn affects the order in which
they occur when you iterate over the GridPlayer.players_played set."

There is nothing in there about the dictionary having a different history
of insertions and deletions. It is having the same insertions and
deletions each run, but the items being inserted are located at different
memory locations, and _that_ changes their hash value and hence the order
they occur in when you iterate over the set.

That's quite a subtle thread to follow, and with all respect Robert, it's
easy to say it is obvious in hindsight, but I didn't notice you solving
the problem in the first place. Maybe you would have, if you had tried...
and maybe you would have scratched your head too. Who can tell?

As Carsten Haese says in another post:

"The documentation shouldn't be expected to list every little thing that
might change the order of keys in a dictionary. The documentation does say
explicitly what *is* guaranteed: Order of keys is preserved as long as no
intervening modifications happen to the dictionary. Tearing down the
interpreter, starting it back up, and rebuilding the dictionary from
scratch is very definitely an intervening modification."

That's all very true, but nevertheless it is a significant gotcha. It is
natural to expect two runs of any program to give the same result if there
are (1) no random numbers involved; (2) the same input data; (3) and no
permanent storage from run to run. One doesn't normally expect the output
of a well-written, bug-free program to depend on the memory location of
objects. And that's the gotcha -- with dicts and sets, they can.
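A related effect can be reproduced on demand in Python 3, though the mechanism differs from the memory-address case in this thread: since Python 3.3 the hashes of `str` objects are salted per process unless the `PYTHONHASHSEED` environment variable is set. The sketch below runs the same one-line program in fresh interpreters. The child program and the helper name `run_child` are made up for illustration, and `capture_output` needs Python 3.7 or later.

```python
import os
import subprocess
import sys

# The unchanged "program": print the iteration order of a small set of strings.
CHILD = "print(list({'north', 'south', 'east', 'west'}))"


def run_child(hash_seed):
    """Run CHILD in a fresh interpreter with the given PYTHONHASHSEED and return its output."""
    env = dict(os.environ, PYTHONHASHSEED=str(hash_seed))
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Same seed, same order; a different seed may or may not give a different order.
print(run_child(1))
print(run_child(1))
print(run_child(2))
```

Pinning the seed makes the order repeatable for a given interpreter build, but sorting before iterating is the remedy that does not rely on the environment.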

--
Steven.

**Steven D'Aprano** · May 10 '07, 02:55 AM

Re: change of random state when pyc created??

On Wed, 09 May 2007 21:18:25 -0500, Robert Kern wrote:

Actually, the root cause of Peter's specific example is the fact that the
default implementation of __hash__() and __eq__() rely on identity comparisons.
Two separate invocations of the same script give different objects by identity
and thus the "history of insertions and deletions" is different.

The history is the same. The objects inserted are the same (by equality).
The memory address those objects are located at is different.

Would you expect that "hello world".find("w") should depend on the address
of the string "w"? No, of course not. Programming in a high level language
like Python, we hope to never need to think about memory addresses. And
that's the gotcha.

--
Steven.

**Alan Isaac** · May 10 '07, 02:55 AM

Re: change of random state when pyc created??

"Robert Kern" <robert.kern@gmail.com> wrote in message
news:mailman.7497.1178763539.32031.python-list@python.org...

Actually, the root cause of Peter's specific example is the fact that the
default implementation of __hash__() and __eq__() rely on identity comparisons.
Two separate invocations of the same script give different objects by identity
and thus the "history of insertions and deletions" is different.

OK. Thank you.
Alan
