Python 3000, zip, *args and iterators

**Terry Reedy** · Jul 18 '05, 07:22 PM

Re: Python 3000, zip, *args and iterators

"Steven Bethard" <steven.bethard@gmail.com> wrote in message
news:3yGzd.289258$HA.962@attbi_s01...
> So, as I understand it, in Python 3000, zip will basically be replaced
> with izip, meaning that instead of returning a list, it will return an
> iterator.

I think it worth repeating that Python 3 is as yet something of a
pipedream, as indicated by the joke name Python 3000 (that also being in
part a satire on Windows 2000, and the like). So, while Guido has said he
would like to make Python iterator-oriented in the way that it used to be
list-oriented, nothing is set in stone, certainly not the details.

Guido has also said that he would like there to be funding to pay him to
spend a year on its development. He wants to take that long so there will
be adequate discussion, thought, and testing so he can 'get it right', at
least in the sense of having everything work well together.

Terry J. Reedy

**Steven Bethard** · Jul 18 '05, 07:22 PM

Re: Python 3000, zip, *args and iterators

Terry Reedy wrote:
> "Steven Bethard" <steven.bethard@gmail.com> wrote in message
> news:3yGzd.289258$HA.962@attbi_s01...
>
>> So, as I understand it, in Python 3000, zip will basically be replaced
>> with izip, meaning that instead of returning a list, it will return an
>> iterator.
>
> I think it worth repeating that Python 3 is as yet something of a
> pipedream, as indicated by the joke name Python 3000 (that also being in
> part a satire on Windows 2000, and the like).

True, true. And worth repeating.

> So, while Guido has said he
> would like to make Python iterator-oriented in the way that it used to be
> list-oriented, nothing is set in stone, certainly not the details.

Right, though my understanding of PEP 3000[1] is that though "Python
3000" may never exist, the PEP is there as a road-map of where Python
as a language would like to go. I guess the point of my question is to
find out if this kind of nice interaction of *args and iterators is
something that's in the road-map. If it is, then maybe there are parts
of it that could be implemented in a way that's backwards compatible,
even if the full system wouldn't be available for some time. (Perhaps
something along the lines of "from __future__ import iter_args".)

Steve

[1] http://www.python.org/peps/pep-3000.html

**Terry Reedy** · Jul 18 '05, 07:22 PM

Re: Python 3000, zip, *args and iterators

"Steven Bethard" <steven.bethard@gmail.com> wrote in message
news:O5Jzd.566495$wV.471519@attbi_s54...
> Terry Reedy wrote:
>> I think it worth repeating that Python 3 is as yet something of a
>> pipedream, as indicated by the joke name Python 3000

> Right, though my understanding of PEP 3000[1] is that though "Python
> 3000" may never exist, the PEP is there as a road-map of where Python as
> a language would like to go.

A major backwards compatibility break will not happen without a major
number change to Py3. And I expect it to happen -- the 'as yet' was
intentional. In fact, here is my New Year's prediction (with subjective
certainty > .5):

a. The PyPy project will succeed.
b. Python3 (actually, the reference implementation thereof) will be written
in Python3 (perhaps with 'draft' in Py2).
c. We will see it within 5 years.

We will see if I am any better than the tabloid 'psychics'.

> I guess the point of my question is to find out if this kind of nice
> interaction of *args and iterators is something that's in the road-map.
> If it is, then maybe there are parts of it that could be implemented in a
> way that's backwards compatible, even if the full system wouldn't be
> available for some time. (Perhaps something along the lines of "from
> __future__ import iter_args".)

You can certainly share your concerns with the PEP author. I believe that
there is also a PyWiki page that you can directly add to.

Terry J. Reedy

**Steven Bethard** · Jul 18 '05, 07:22 PM

Re: Python 3000, zip, *args and iterators

Terry Reedy wrote:
> "Steven Bethard" <steven.bethard@gmail.com> wrote in message
> news:O5Jzd.566495$wV.471519@attbi_s54...
>
>> I guess the point of my question is to find out if this kind of nice
>> interaction of *args and iterators is something that's in the road-map.
>> If it is, then maybe there are parts of it that could be implemented in a
>> way that's backwards compatible, even if the full system wouldn't be
>> available for some time. (Perhaps something along the lines of "from
>> __future__ import iter_args".)
>
> You can certainly share your concerns with the PEP author. I believe that
> there is also a PyWiki page that you can directly add to.

Yeah, I found the wiki page too[1]. Does anyone know if it's okay to
add things to this page? I had avoided doing so since it gives as its
description "This page lists features that GvR has mentioned as goals
for Python 3.0" which sounds like it's not intended for commentary by
the general Python community.

Maybe I should start a Python3.0Wishlist page?

Steve

[1] http://www.python.org/moin/Python3_2e0

P.S. I thought about posting to python-dev where GvR might hear directly
about this kind of thing, but it seems a little premature since most
predictions put Python 3.0 at least 3-5 years from now.

**Raymond Hettinger** · Jul 18 '05, 07:23 PM

Re: Python 3000, zip, *args and iterators

[Steven Bethard]
> What I would prefer is something like:
>
> >>> zip(*g(4))
> <iterator object at ...>
> >>> x, y, z = zip(*g(4))
> >>> x, y, z
> (<iterator object at ...>, <iterator object at ...>,
>  <iterator object at ...>)
. . .
> So I guess my real question is, should I expect Python 3000 to play
> nicely with *args and iterators? Are there reasons (besides backwards
> incompatibility) that parsing *args this way would be bad?
. . .
> In fact, with the help of the folks from this list, I did:
> http://aspn.activestate.com/ASPN/Coo.../Recipe/302325

* The answer to the first question is Yes. The point of Python 3000 is
building on what was learned and writing a simpler, cleaner language
without the encumbrance of backwards compatibility.

* However, IMHO, the proposed behavior doesn't qualify as "playing
nicely".

* Your excellent recipe provides a good basis for discussion and it
highlights some of the issues around the proposed behavior:

1. The current implementation's behavior is easy to learn, easy to
explain, and does what most folks expect (not folks who are pushing the
iterator and *arg protocols to the outer limits). In contrast, the
proposed recipe is somewhat complex and its implications are not
immediately obvious. The itertools.tee() component is of extra concern
because it invisibly introduces memory intensive characteristics into
an otherwise lightweight, low-overhead function.

2. It is instructive to look at Guido's reactions to other *args
proposals. His receptivity to a,b,*c=it wanes whenever someone then
requests support for a,*b,c=it. Likewise, he considers zip(*args) as a
transpose function to be an abuse of the *arg protocol. IOW,
supporting "odd" usages does not bode well for a proposal.

3. The recipe discussion and newsgroup posting present only toy
examples -- real use cases have not yet emerged. If some do emerge, I
suspect that each problem will have a better solution (using existing
tools) than the one being proposed. If so, then adopting the proposal
will have the negative effect of leading folks away from the correct
solution.
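
The memory concern about itertools.tee() in point 1 can be seen in a small Python 3 sketch. The names counting_source and produced are made up for this illustration and are not taken from the recipe; the only assumption is that one branch of the tee is consumed well ahead of the other.

```python
from itertools import tee

def counting_source(n, log):
    """Yield 0..n-1, appending each value to `log` as it is produced."""
    for i in range(n):
        log.append(i)
        yield i

produced = []  # hypothetical log of everything read from the source
a, b = tee(counting_source(1000, produced))

# Drain only the first branch. tee must remember every item so that
# the second branch can still see it later.
first_branch = list(a)
assert len(produced) == 1000   # the whole source has been read

# The second branch is now served from tee's internal buffer.
assert next(b) == 0
assert len(produced) == 1000   # no further reads from the source
```

When both branches advance together the buffer stays small; the cost appears only when one branch gets far ahead of the other.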

Raymond Hettinger

"Not everything that can be done, should be done."

**Alex Martelli** · Jul 18 '05, 07:23 PM

Re: Python 3000, zip, *args and iterators

Raymond Hettinger <python@rcn.com> wrote:
...
> "Not everything that can be done, should be done."

Or, to quote Scripture...:

"'Everything is permissible for me' -- but not everything is beneficial"
(1 Cor 6:12)...

Alex

**Steve Holden** · Jul 18 '05, 07:23 PM

Re: Python 3000, zip, *args and iterators

Raymond Hettinger wrote:
[...]
>
> "Not everything that can be done, should be done."
>

... and not everything that should be done, can be done.

regards
Steve
--
Steve Holden http://www.holdenweb.com/
Python Web Programming http://pydish.holdenweb.com/
Holden Web LLC +1 703 861 4237 +1 800 494 3119

**Steven Bethard** · Jul 18 '05, 07:23 PM

Re: Python 3000, zip, *args and iterators

Raymond Hettinger wrote:
> [Steven Bethard]
>
>> What I would prefer is something like:
>>
>> >>> zip(*g(4))
>> <iterator object at ...>
>> >>> x, y, z = zip(*g(4))
>> >>> x, y, z
>> (<iterator object at ...>, <iterator object at ...>,
>>  <iterator object at ...>)
>
> 2. It is instructive to look at Guido's reactions to other *args
> proposals. His receptivity to a,b,*c=it wanes whenever someone then
> requests support for a,*b,c=it.

Yeah, I've seen his responses to those kinds of suggestions. I don't
think what I'm suggesting (at least in terms of *args) is quite as
extreme though -- I'm still only talking about *args in function
definitions. I'm just suggesting that in a function with a *args in the
def, the args variable be an iterator instead of a tuple. (This doesn't
entirely solve my zip problem of course, but it's the only *args change
I was suggesting.)

> Likewise, he considers zip(*args) as a
> transpose function to be an abuse of the *arg protocol.

Ahh, I didn't know that. Is there another (preferred) way to do this?

> 3. The recipe discussion and newsgroup posting present only toy
> examples -- real use cases have not yet emerged.

Ok, I'll try to give you one of my use cases. It's a little
complicated, so sorry if my explanation goes on for a bit here.

Basically, I'm converting one file format to another. The files can be
quite large, so it's important to use iterators wherever possible. My
conversion function is a generator that generates a (label,
feature_dict) pair for each line in the input file.

Now, two possible things can happen at this point (depending on
parameters from the user):

CASE 1: I output the (label, feature_dict) pairs as is, with code
something like:

    for label, feature_dict in generator:
        write_instance(label, feature_dict)

This is, of course, the simple case.

CASE 2: I need to apply a windowing function to the iterables so that
each line includes not only its feature_dict's values, but also the
values of some of the surrounding feature_dicts. Note that I only want
to window the feature_dicts, not the labels. This gives me code
something like:

    labels, feature_dicts = starzip(generator)
    for label, feature_window in izip(labels, window(feature_dicts)):
        write_instance(label, combine_dicts(feature_window))

Note that I can't write the code like:

    for label, feature_dict in generator:
        feature_dict = combine_dicts(window(feature_dict))  # WRONG!
        write_instance(label, feature_dict)

because window produces an iterable from an *iterable* of feature_dicts,
not from a single feature_dict. So basically what I've done here is to
"transpose" (to use your word) the iterators, apply my function, and
then transpose the iterators back.
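
For readers who want to run the pattern, here is a rough Python 3 sketch of the CASE 2 pipeline. The bodies of starzip, window and combine_dicts below are hypothetical stand-ins written for this illustration, not the code from the recipe: starzip is assumed to be built on itertools.tee, window is simplified to a trailing window (current item plus the previous one) instead of a window over both neighbours, and combine_dicts tags each key with its offset.

```python
from collections import deque
from itertools import tee

def starzip(tuples, n=2):
    """Split an iterable of n-tuples into n lazy iterators (tee-based sketch)."""
    def column(it, i):
        return (item[i] for item in it)
    return tuple(column(it, i) for i, it in enumerate(tee(tuples, n)))

def window(iterable, size=2):
    """Yield, for every item, a tuple of the last `size` items seen so far."""
    recent = deque(maxlen=size)
    for item in iterable:
        recent.append(item)
        yield tuple(recent)

def combine_dicts(dicts):
    """Merge a window of dicts; keys are tagged with their offset (0 = current)."""
    combined = {}
    for offset, d in enumerate(reversed(dicts)):
        for key, value in d.items():
            combined[f"{key}@{-offset}"] = value
    return combined

def generator():
    # Stand-in for the conversion generator of (label, feature_dict) pairs.
    yield "A", {"w": "the"}
    yield "B", {"w": "cat"}
    yield "A", {"w": "sat"}

labels, feature_dicts = starzip(generator())
rows = [(label, combine_dicts(feature_window))
        for label, feature_window in zip(labels, window(feature_dicts))]
```

Here rows holds one (label, combined_dict) pair per input line, and only the feature_dicts stream has been windowed; the labels pass through untouched.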

Hopefully this gives a little better justification for starzip? If you
have a cleaner way to do this kind of thing, I'd welcome any suggestions
of course.

If zip(*) is discouraged as a transpose function, maybe I should be
lobbying for adding a transpose function instead? (For now, of course,
it would go into itertools, but when iterators become the standard in
Python 3.0, maybe it could be moved into the builtins...)

Thanks for your comments!

Steve

**Raymond Hettinger** · Jul 18 '05, 07:37 PM

Re: Python 3000, zip, *args and iterators

[Steven Bethard]
> I'm just suggesting that in a function with a
> *args in the def, the args variable be an iterator instead of
> a tuple.

So people would lose the useful abilities to check len(args) or extract
an argument with args[1]?

Besides, if a function really wants an iterator, then its signature
should accept one directly -- no need for the star operator.
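
A short Python 3 sketch of the two signatures being contrasted here. The function names total_star and total_iter are invented for the example; the point is only that *args always arrives as a tuple, while a plain parameter can take a generator and consume it lazily.

```python
def total_star(*args):
    # args is always a tuple here, so len() and indexing are available.
    assert isinstance(args, tuple)
    return len(args), sum(args)

def total_iter(numbers):
    # Accepts any iterable, including a generator, and walks it once.
    count = 0
    total = 0
    for n in numbers:
        count += 1
        total += n
    return count, total

squares = (i * i for i in range(4))   # yields 0, 1, 4, 9
assert total_iter(squares) == (4, 14)

# Unpacking with * builds the whole tuple before the call is made,
# so any laziness in the generator is lost at the call site.
assert total_star(*(i * i for i in range(4))) == (4, 14)
```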

> > Likewise, he considers zip(*args) as a
> > transpose function to be an abuse of the *arg protocol.
>
> Ahh, I didn't know that. Is there another (preferred) way to do this?

I prefer the abusive approach ;-) however, the Right Way (tm) is
probably nested list comps or just plain for-loops. And, if you have
Numeric, there is an obvious preferred approach.
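
As a concrete comparison, here is a small Python 3 sketch of transposing with a nested list comprehension next to the zip(*...) idiom. The sample matrix is made up, and the comprehension assumes every row has the same length.

```python
matrix = [[1, 2, 3],
          [4, 5, 6]]

# Nested list comprehension: the outer loop walks the columns,
# the inner loop collects that column from every row.
transposed = [[row[col] for row in matrix] for col in range(len(matrix[0]))]
assert transposed == [[1, 4], [2, 5], [3, 6]]

# The zip(*...) idiom gives the same result (as tuples, hence the list() call).
assert [list(column) for column in zip(*matrix)] == transposed
```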

> So basically what I've done here is to
> "transpose" (to use your word) the iterators, apply my function, and
> then transpose the iterators back.

If you follow the data movements, you'll find that iterators provide no
advantage here. To execute transpose(map(f, transpose(iterator))), the
whole iterator necessarily has to be read into memory so that the first
function application will have all of its arguments present -- using
the star operator only obscures that fact.

Realizing that the input has to be in memory anyway, you might as
well take advantage of the code simplification offered by indexing:

>>> def twistedmap(f, iterable):
...     data = list(iterable)
...     rows = range(len(data))
...     for col in xrange(len(data[0])):
...         args = [data[row][col] for row in rows]
...         yield f(*args)
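
For anyone trying this on Python 3, a runnable rendering of the same idea might look like the sketch below. It swaps xrange for range and adds a guard for empty input, which is an addition for the example and not part of the listing above; the sample rows are made up.

```python
def twistedmap(f, iterable):
    """Apply f to each column of a row-oriented iterable."""
    data = list(iterable)      # the whole input is read up front
    if not data:               # added guard: nothing to transpose
        return
    for col in range(len(data[0])):
        # Gather one value per row for this column and pass them to f.
        yield f(*[row[col] for row in data])

rows = [(1, 2, 3), (10, 20, 30)]
sums = list(twistedmap(lambda a, b: a + b, rows))
assert sums == [11, 22, 33]
```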

Raymond Hettinger

**Steven Bethard** · Jul 18 '05, 07:37 PM

Re: Python 3000, zip, *args and iterators

Raymond Hettinger wrote:
> [Steven Bethard]
>> I'm just suggesting that in a function with a
>> *args in the def, the args variable be an iterator instead of
>> a tuple.
>
> So people would lose the useful abilities to check len(args) or extract
> an argument with args[1]?

No more than you lose these abilities with any other iterators:

    def f(x, y, *args):
        args = list(args)  # or tuple(args)
        if len(args) == 3:
            print args[0], args[1], args[2]

True, if you do want to check argument counts, this is an extra step of
work. I personally find that most of my functions with *args parameters
look like:

    def f(x, y, *args):
        do_something1(x)
        do_something2(y)
        for arg in args:
            do_something3(arg)

where having *args be an iterable would not be a problem.

>> So basically what I've done here is to
>> "transpose" (to use your word) the iterators, apply my function, and
>> then transpose the iterators back.
>
> If you follow the data movements, you'll find that iterators provide no
> advantage here. To execute transpose(map(f, transpose(iterator))), the
> whole iterator necessarily has to be read into memory so that the first
> function application will have all of its arguments present -- using
> the star operator only obscures that fact.

I'm not sure I follow you here. Looking at my code:

    labels, feature_dicts = starzip(generator)
    for label, feature_window in izip(labels, window(feature_dicts)):
        write_instance(label, combine_dicts(feature_window))

A few points:

(1) starzip uses itertools.tee, so it is not going to read the entire
contents of the generator in at once as long as the two parallel
iterators do not get out of sync.

(2) window does not exhaust the iterator passed to it; instead, it uses
the items of that iterator to generate a new iterator in sync with the
original, so izip(labels, window(feature_ dicts)) will keep the labels
and feature_dicts iterators in sync.

(3) the for loop just iterates over the izip iterator, so it should be
consuming (label, feature_window) pairs in sync.

I assume you disagree with one of these points or you wouldn't say that
"iterators provide no advantage here". Could you explain what doesn't
work here?
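
One way to check claims (1) and (3) empirically is to instrument the source and watch how far it runs ahead of the consuming loop. The Python 3 sketch below does that for a bare tee-plus-zip pipeline; source, pulled and max_lead are names invented for the illustration, and no window function is involved, so it says nothing about how much lookahead a real window implementation would need.

```python
from itertools import tee

pulled = []  # records every item taken from the underlying source

def source(n):
    for i in range(n):
        pulled.append(i)
        yield (f"label{i}", {"value": i})

labels_it, dicts_it = tee(source(5))
labels = (pair[0] for pair in labels_it)
dicts = (pair[1] for pair in dicts_it)

max_lead = 0
for step, (label, d) in enumerate(zip(labels, dicts), start=1):
    # How many items has the source produced beyond what the loop has used?
    max_lead = max(max_lead, len(pulled) - step)

assert max_lead == 0       # the source never ran ahead of the loop
assert len(pulled) == 5    # and it was read exactly once, item by item
```

With the two branches consumed in lockstep, the source is read one item at a time; a window that looks ahead by k items would be expected to raise the lead to about k, not to the length of the input.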

Steve
