Pre-PEP: reverse iteration methods

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Raymond Hettinger

    Pre-PEP: reverse iteration methods

    Here is a discussion draft of a potential PEP.
    The ideas grew out of the discussion on pep-284.
    Comments are invited. Dart throwing is optional.


    Raymond Hettinger

    -------------------------------------------------------------

    PEP: 323
    Title: Add Reverse Iteration Methods
    Version: $Revision: 1.1 $
    Last-Modified: $Date: 2003/03/11 04:49:44 $
    Author: Raymond Hettinger <python at rcn.com>
    Status: Draft
    Type: Standards Track
    Content-Type: text/x-rst
    Created: 23-Sep-2003
    Python-Version: 2.4
    Post-History: 23-Sep-2003


    Abstract
    ========

    This proposal is to extend the API of several sequence types
    to include methods for iterating over the sequence in reverse.


    Motivation
    ==========

    For indexable objects, current methods for reverse iteration are
    error prone, unnatural, and not especially readable::

    for i in xrange(n-1, -1, -1):
    print seqn[i]

    One other current approach involves reversing a list before iterating
    over it. That technique wastes computer cycles, memory, and lines of
    code. Also, it only works with lists (strings, for example, do not define
    a reverse method)::

    rseqn = list(seqn)
    rseqn.reverse()
    for elem in rseqn:
    print elem

    Reverse iteration is much less common than forward iteration, but it
    does arise regularly in practice.


    Proposal
    ========

    Add a method called iter_backwards( ) to sequence objects that can benefit
    from it. The above examples then simplify to::

    for i in xrange(n).iter_ backwards():
    print seqn[i]

    for elem in seqn.iter_backw ards():
    print elem

    The new protocol would be applied to lists, strings, xranges objects,
    and possibly other sequence objects as well (depending on use cases
    and implementation issues). It would not apply to unordered collections
    like dicts and sets.

    No language syntax changes are needed.


    Open Issues
    ===========

    * Should tuples be included? In the past they have been denied some list
    like behaviors such as count() and index().

    * Should file objects be included? Implementing reverse iteration may not
    be easy though it would be useful on occasion.

    * Should enumerate() be included? It would only provide reverse iteration
    whenever the underlying sequence supported it.


    Copyright
    =========

    This document has been placed in the public domain.



  • Andrew Dalke

    #2
    Re: Pre-PEP: reverse iteration methods

    Raymond Hettinger:[color=blue]
    > Proposal
    > ========
    >
    > Add a method called iter_backwards( ) to sequence objects that can benefit
    > from it. The above examples then simplify to::
    >
    > for i in xrange(n).iter_ backwards():
    > print seqn[i]
    >
    > for elem in seqn.iter_backw ards():
    > print elem[/color]

    What about letting 'iter_backwards ' be a builtin which
    looks for __riter__ and if it doesn't exist, get the length
    and do old-fashioned offset indexing?

    Something like this untested code

    def iter_backwards( obj):
    try:
    riter = getattr(obj, "__riter__" )
    except AttributeError:
    n = len(obj)
    while n != 0:
    n = n -1
    yield obj[n]
    else:
    for term in riter():
    yield term

    which would be used like

    for i in iter_backwards( xrange(n)):
    print seqn[i]

    for elem in iter_backwards( seqn):
    print elem

    Andrew
    dalke@dalkescie ntific.com


    Comment

    • John Roth

      #3
      Re: Pre-PEP: reverse iteration methods

      Overall impression: I like it.
      More comments interspersed.

      John Roth

      "Raymond Hettinger" <vze4rx4y@veriz on.net> wrote in message
      news:HA5cb.185$ kD3.168@nwrdny0 3.gnilink.net.. .[color=blue]
      > Here is a discussion draft of a potential PEP.
      > The ideas grew out of the discussion on pep-284.
      > Comments are invited. Dart throwing is optional.
      >
      >
      > Raymond Hettinger
      >
      > -------------------------------------------------------------
      >
      > PEP: 323
      > Title: Add Reverse Iteration Methods
      > Version: $Revision: 1.1 $
      > Last-Modified: $Date: 2003/03/11 04:49:44 $
      > Author: Raymond Hettinger <python at rcn.com>
      > Status: Draft
      > Type: Standards Track
      > Content-Type: text/x-rst
      > Created: 23-Sep-2003
      > Python-Version: 2.4
      > Post-History: 23-Sep-2003
      >
      >
      > Abstract
      > ========
      >
      > This proposal is to extend the API of several sequence types
      > to include methods for iterating over the sequence in reverse.
      >
      >
      > Motivation
      > ==========
      >
      > For indexable objects, current methods for reverse iteration are
      > error prone, unnatural, and not especially readable::
      >
      > for i in xrange(n-1, -1, -1):
      > print seqn[i]
      >
      > One other current approach involves reversing a list before iterating
      > over it. That technique wastes computer cycles, memory, and lines of
      > code.[/color]

      It also has the "returns None instead of an object" wart.
      [color=blue]
      > Also, it only works with lists (strings, for example, do not define
      > a reverse method)::
      >
      > rseqn = list(seqn)
      > rseqn.reverse()
      > for elem in rseqn:
      > print elem
      >
      > Reverse iteration is much less common than forward iteration, but it
      > does arise regularly in practice.[/color]

      Absolutely.
      [color=blue]
      >
      > Proposal
      > ========
      >
      > Add a method called iter_backwards( ) to sequence objects that can benefit
      > from it. The above examples then simplify to::
      >
      > for i in xrange(n).iter_ backwards():
      > print seqn[i]
      >
      > for elem in seqn.iter_backw ards():
      > print elem
      >
      > The new protocol would be applied to lists, strings, xranges objects,
      > and possibly other sequence objects as well (depending on use cases
      > and implementation issues). It would not apply to unordered collections
      > like dicts and sets.[/color]

      I presume that the result of iter_backwards( ) is an iterator
      object with the desired behavior. In other words, it's not
      the original object that has methods to handle the iteration
      protocol.
      [color=blue]
      >
      > No language syntax changes are needed.
      >
      >
      > Open Issues
      > ===========
      >
      > * Should tuples be included? In the past they have been denied some list
      > like behaviors such as count() and index().[/color]

      I'd say yes, but I don't know the reason why tuples are missing methods.
      [color=blue]
      >
      > * Should file objects be included? Implementing reverse iteration may not
      > be easy though it would be useful on occasion.[/color]

      Are we talking text files or binary files here? Offhand, I'm not even
      sure there is a binary file iterator, although if there is reverse iteration
      shouldn't be difficult. Reverse iteration for text files simply means
      implementing
      the reverse scan in C, since the standard library doesn't support it. For
      small enough files, it may be easier to simply internally use getlines() and
      then use iter_reverse() on the list.
      [color=blue]
      >
      > * Should enumerate() be included? It would only provide reverse iteration
      > whenever the underlying sequence supported it.[/color]

      I don't see why not.

      In general, I support the notion that a concept should be implemented
      pervasively, unless there's a clear reason not to do it in a special case.
      Special cases are just that - if they really are special, you trip over them
      once and then remember them. What gets messy is something that's
      partially implemented, where there is no clear rule to determine when
      it is and when it isn't.

      John Roth[color=blue]
      >
      >
      > Copyright
      > =========
      >
      > This document has been placed in the public domain.
      >
      >
      >[/color]


      Comment

      • Terry Reedy

        #4
        Re: Pre-PEP: reverse iteration methods


        "Raymond Hettinger" <vze4rx4y@veriz on.net> wrote in message
        news:HA5cb.185$ kD3.168@nwrdny0 3.gnilink.net.. .[color=blue]
        > for i in xrange(n).iter_ backwards():
        > print seqn[i]
        >
        > for elem in seqn.iter_backw ards():
        > print elem[/color]

        I would prefer a much shorter name, such as riter for reverse/right
        iter. This goes along with current nomenclature such as rfind, etc.

        Terry J. Reedy



        Comment

        • John Roth

          #5
          Re: Pre-PEP: reverse iteration methods


          "Andrew Dalke" <adalke@mindspr ing.com> wrote in message
          news:hJ6cb.1252 $RW4.107@newsre ad4.news.pas.ea rthlink.net...[color=blue]
          > Raymond Hettinger:[color=green]
          > > Proposal
          > > ========
          > >
          > > Add a method called iter_backwards( ) to sequence objects that can[/color][/color]
          benefit[color=blue][color=green]
          > > from it. The above examples then simplify to::
          > >
          > > for i in xrange(n).iter_ backwards():
          > > print seqn[i]
          > >
          > > for elem in seqn.iter_backw ards():
          > > print elem[/color]
          >
          > What about letting 'iter_backwards ' be a builtin which
          > looks for __riter__ and if it doesn't exist, get the length
          > and do old-fashioned offset indexing?[/color]

          There are objects that support iteration that
          don't support len(), such as file objects.
          This has got the advantage, of course, that it would
          automatically work on all objects that support
          an sequence-like protocol, though.

          Frankly, I prefer the notion of a method.
          Maybe make it a method of object that
          automatically works on all new style
          objects that support a sequence-like
          protocol?

          John Roth



          Comment

          • Andrew Dalke

            #6
            Re: Pre-PEP: reverse iteration methods

            John Roth:[color=blue]
            > There are objects that support iteration that
            > don't support len(), such as file objects.[/color]

            Sure. Then it'll given an exception. The implementation
            should turn around and raise the proper exception there
            instead of a "len doesn't exist" one.

            There's also a problem that len works on a dict but
            __getitem__(int ) will only work if the dict stores 0->n-1
            as keys.
            [color=blue]
            > This has got the advantage, of course, that it would
            > automatically work on all objects that support
            > an sequence-like protocol, though.[/color]

            Yup.
            [color=blue]
            > Frankly, I prefer the notion of a method.[/color]

            While I don't. There are many, many ways to
            iterate through a list. (What about all evens
            followed by all odds?) Right now they are all
            done by using a function which creates an iterator
            across a container. If it's a method then you've
            said that reverse_iter is special enough that it
            must be a method, which seems strange for something
            which is so infrequently used.

            Andrew
            dalke@dalkescie ntific.com


            Comment

            • Sean Ross

              #7
              Re: Pre-PEP: reverse iteration methods

              "Raymond Hettinger" <vze4rx4y@veriz on.net> wrote in message
              news:HA5cb.185$ kD3.168@nwrdny0 3.gnilink.net.. .[color=blue]
              > Here is a discussion draft of a potential PEP.[/color]
              [snip][color=blue]
              > Proposal
              > ========
              >
              > Add a method called iter_backwards( ) to sequence objects that can benefit
              > from it. The above examples then simplify to::
              >
              > for i in xrange(n).iter_ backwards():
              > print seqn[i]
              >
              > for elem in seqn.iter_backw ards():
              > print elem
              >[/color]

              Hi.
              This will mostly just be some alternate naming suggestions:

              How about just ".backwards ", where backwards acts like a read-only property
              that returns a generator for reverse iteration?

              for i in xrange(n).backw ards:
              print seqn[i]

              for elem in seqn.backwards:
              print elem

              It's not as explicit as iter_backwards( ), but it's shorter, cleaner, and
              appears
              obvious (to me, anyway). Or, perhaps, 'ibackwards', or 'ireverse' ; or
              ibackwards() or ireverse(). (being reminiscent of imap() and izip())
              '.reverse' would be good, but, of course, it's already taken...

              If one could be made, a fast** general purpose "reverse(iterab le)" (or
              ireverse(iterab le)) function would be my preference (at the very least, it
              would
              avoid having to add iter_backwards( ) to several type definitions).

              # e.g.
              for i in reverse(xrange( n)):
              print seqn[i]

              for index, item in reverse(enumera te(seqn)):
              print "index:%s item:%s"%(index , item)

              # that sort of thing...





              ** Yep, fast. Something where xrange(n-1, -1, -1) is not exorbitantly
              faster than reverse(xrange( n)). Similarly, for sequence types, you'd
              want reverse(list) to be atleast somewhere near as quick as:

              # python-faq, entry 4.6
              list.reverse()
              try:
              for x in list:
              "do something with x"
              finally:
              list.reverse()


              Comment

              • Raymond Hettinger

                #8
                Re: Pre-PEP: reverse iteration methods

                [Raymond Hettinger][color=blue][color=green]
                > > Reverse iteration is much less common than forward iteration, but it
                > > does arise regularly in practice.[/color][/color]

                [John Roth][color=blue]
                > Absolutely.[/color]

                Because the question will arise, I did a quick use case analysis from
                the standard library. I'll attach the following to the PEP. Feel free
                to comment or add other use cases.


                Raymond Hettinger

                --------------------------------------------------------------------------


                Use Case Analysis
                =============== ==

                Here are some instances of reverse iteration taken from the standard
                library and comments on why reverse iteration was necessary:

                * atexit.exit_han dlers() uses::
                while _exithandlers:
                func, targs, kargs = _exithandlers.p op()
                . . .
                The application dictates the need to run exit handlers in the
                reverse order they were built. The "while alist: alist.pop()"
                form is readable and clean; however, it would be slightly faster
                and clearer with:
                for func, target, kargs in _exithandlers.i ter_backwards() :
                . . .
                del _exithandlers

                * difflib.get_clo se_matches() uses::
                result.sort() # Retain only the best n.
                result = result[-n:] # Move best-scorer to head of list.
                result.reverse( ) # Strip scores.
                return [x for score, x in result]
                The need for reverse iteration arises from a requirement to return
                a portion of a sort in an order opposite of the sort criterion. The
                list comprehension is incidental (the third step of a Schwartzian
                transform). This particular use case can met with extended slicing,
                but the code is somewhat unattractive and hard to visually verify::
                result.sort()
                return result[:-n-1:-1]

                * heapq.heapify() uses "for i in xrange(n//2 - 1, -1, -1)" because
                higher-level orderings are more easily formed from pairs of
                lower-level orderings. A forward version of this algorithm is
                possible; however, that would complicate the rest of the heap code
                which iterates over the underlying list in the opposite direction.

                * mhlib.test() uses::
                testfolders.rev erse();
                for t in testfolders:
                do('mh.deletefo lder(%s)' % `t`)
                The need for reverse iteration arises because the tail of the underlying
                list is altered during iteration.

                * platform._dist_ try_harder() uses "for n in range(len(verfi les)-1, -1, -1)"
                because the loop deletes selected elements from "verfiles" but needs to
                leave the rest of the list intact for further iteration. This use
                case could be more efficiently met with itertools.ifilt er but the
                lambda condition and functional style would render it less readable.

                * random.shuffle uses "for i in xrange(len(x)-1, 0, -1)" because the
                algorithm is most easily understood as randomly selecting elements
                from an ever diminishing pool. In fact, the algorithm can be run in
                a forward direction but is less intuitive and rarely presented that
                way in literature.

                * rfc822.Message. __delitem__() uses:
                list.reverse()
                for i in list:
                del self.headers[i]
                The need for reverse iteration arises because the tail of the underlying
                list is altered during iteration.






                Comment

                • Andrew Dalke

                  #9
                  Re: Pre-PEP: reverse iteration methods

                  Sean Ross:[color=blue]
                  > This will mostly just be some alternate naming suggestions:
                  >
                  > How about just ".backwards ", where backwards acts like a read-only[/color]
                  property[color=blue]
                  > that returns a generator for reverse iteration?[/color]

                  -1 from me. 'backwards' looks to me like something which changes
                  the list ordering in place, like what 'reverse' does. Or something
                  which returns a new list but in reverse order. It's an adjective,
                  when it should be a verb.

                  ibackwords would work, but so would riter or reviter or
                  variations thereof.

                  Andrew
                  dalke@dalkescie ntific.com


                  Comment

                  • Stephen Horne

                    #10
                    Re: Pre-PEP: reverse iteration methods

                    On Wed, 24 Sep 2003 00:30:31 GMT, "Raymond Hettinger"
                    <vze4rx4y@veriz on.net> wrote:
                    [color=blue]
                    >Proposal
                    >========
                    >
                    >Add a method called iter_backwards( ) to sequence objects that can benefit
                    >from it. The above examples then simplify to::
                    >
                    > for i in xrange(n).iter_ backwards():
                    > print seqn[i]
                    >
                    > for elem in seqn.iter_backw ards():
                    > print elem[/color]

                    That is a pretty long name. Can we use the convention from itertools
                    where an 'i' prefix is sufficient to suggest an iterator, giving
                    'ibackwards' or 'ireverse' or similar.

                    Second, I'm not quite ready to drop my property idea ;-)

                    An advantage is that the object returned by the property can sensibly
                    support more than just iteration - e.g. slicing (as touched on in the
                    PEP284 thread). So for instance...
                    [color=blue][color=green][color=darkred]
                    >>> x = range(10)
                    >>> list(x.backward )[/color][/color][/color]
                    [9, 8, 7, 6, 5, 4, 3, 2, 1, 0][color=blue][color=green][color=darkred]
                    >>> list(x.backward [::2])[/color][/color][/color]
                    [8, 6, 4, 2, 0]
                    [color=blue]
                    >* Should file objects be included? Implementing reverse iteration may not
                    > be easy though it would be useful on occasion.[/color]

                    IMO no - doing this essentially needs the whole file to be read into
                    memory, in which case you may as well read the whole file into a list
                    and then iterate the list backwards.
                    [color=blue]
                    >* Should enumerate() be included? It would only provide reverse iteration
                    > whenever the underlying sequence supported it.[/color]

                    Why not...

                    for i, j in enumerate (listname.iter_ backwards ()) :

                    in other words, as enumerate can handle the already-reversed
                    sequence/iteration, I don't see the point.


                    --
                    Steve Horne

                    steve at ninereeds dot fsnet dot co dot uk

                    Comment

                    • Stephen Horne

                      #11
                      Re: Pre-PEP: reverse iteration methods

                      On Wed, 24 Sep 2003 04:44:21 GMT, "Andrew Dalke"
                      <adalke@mindspr ing.com> wrote:
                      [color=blue]
                      >-1 from me. 'backwards' looks to me like something which changes
                      >the list ordering in place, like what 'reverse' does. Or something
                      >which returns a new list but in reverse order. It's an adjective,
                      >when it should be a verb.[/color]

                      I disagree with this. IMO something that modifies a value in place
                      should be named using a verb (such as reverse, or sort, or update...).
                      Something that returns a value without modifying its parameters/object
                      should be named to describe what it returns - normally a noun or
                      adjective (such as globals, len, isinstance).

                      So if we had a sorting function that returns a sorted version of the
                      parameter without modifying its parameter, I'd want it to be called
                      'sorted' as opposed to the current in-place 'sort'.

                      Of course there are non-modifying functions that are named as verbs
                      (map, filter and reduce being particularly obvious), but that does
                      seems slightly wrong to me - though for the ones I've listed there is
                      basically an overriding convention (they are well known names).


                      --
                      Steve Horne

                      steve at ninereeds dot fsnet dot co dot uk

                      Comment

                      • Stephen Horne

                        #12
                        Re: Pre-PEP: reverse iteration methods

                        On Wed, 24 Sep 2003 04:44:21 GMT, "Andrew Dalke"
                        <adalke@mindspr ing.com> wrote:
                        [color=blue]
                        >-1 from me. 'backwards' looks to me like something which changes
                        >the list ordering in place, like what 'reverse' does. Or something
                        >which returns a new list but in reverse order. It's an adjective,
                        >when it should be a verb.[/color]

                        I disagree with this. IMO something that modifies a value in place
                        should be named using a verb (such as reverse, or sort, or update...).
                        Something that returns a value without modifying its parameters/object
                        should be named to describe what it returns - normally a noun or
                        adjective (such as globals, len, isinstance).

                        So if we had a sorting function that returns a sorted version of the
                        parameter without modifying its parameter, I'd want it to be called
                        'sorted' as opposed to the current in-place 'sort'.

                        Of course there are non-modifying functions that are named as verbs
                        (map, filter and reduce being particularly obvious), but that does
                        seems slightly wrong to me - though for the ones I've listed there is
                        basically an overriding convention (they are well known names).


                        --
                        Steve Horne

                        steve at ninereeds dot fsnet dot co dot uk

                        Comment

                        • Stephen Horne

                          #13
                          Re: Pre-PEP: reverse iteration methods

                          On Wed, 24 Sep 2003 02:13:48 GMT, "Andrew Dalke"
                          <adalke@mindspr ing.com> wrote:
                          [color=blue][color=green]
                          >> Frankly, I prefer the notion of a method.[/color]
                          >
                          >While I don't.[/color]

                          To me, the reason to use a method (or property) is simply that most
                          types cannot be efficiently 'backwardised'. For instance, iterators in
                          general would have to be buffered into a list an then the resulting
                          list iterated backwards. That could be useful, but the overhead is
                          sufficient that I think explicitly writing 'list (x).backward ()'
                          rather than 'x.backward ()' would be a good thing.

                          Having a method (or property) explicitly associates it with the
                          object/class being handled.


                          --
                          Steve Horne

                          steve at ninereeds dot fsnet dot co dot uk

                          Comment

                          • Stephen Horne

                            #14
                            Re: Pre-PEP: reverse iteration methods

                            On Wed, 24 Sep 2003 02:13:48 GMT, "Andrew Dalke"
                            <adalke@mindspr ing.com> wrote:
                            [color=blue][color=green]
                            >> Frankly, I prefer the notion of a method.[/color]
                            >
                            >While I don't.[/color]

                            To me, the reason to use a method (or property) is simply that most
                            types cannot be efficiently 'backwardised'. For instance, iterators in
                            general would have to be buffered into a list an then the resulting
                            list iterated backwards. That could be useful, but the overhead is
                            sufficient that I think explicitly writing 'list (x).backward ()'
                            rather than 'x.backward ()' would be a good thing.

                            Having a method (or property) explicitly associates it with the
                            object/class being handled.


                            --
                            Steve Horne

                            steve at ninereeds dot fsnet dot co dot uk

                            Comment

                            • Hannu Kankaanpää

                              #15
                              Re: Pre-PEP: reverse iteration methods

                              "Andrew Dalke" <adalke@mindspr ing.com> wrote in message news:<w57cb.128 9$RW4.1142@news read4.news.pas. earthlink.net>. ..[color=blue]
                              > Sure. Then it'll given an exception. The implementation
                              > should turn around and raise the proper exception there
                              > instead of a "len doesn't exist" one.
                              >
                              > There's also a problem that len works on a dict but
                              > __getitem__(int ) will only work if the dict stores 0->n-1
                              > as keys.[/color]


                              How about using a temporary sequence if __riter__ isn't defined?
                              It's slow, but would work with all non-infinite iterators that
                              don't have strange side effects (vast majority).


                              def riter(obj):
                              try:
                              rit = getattr(obj, "__riter__" )
                              except AttributeError:
                              # assuming list has __riter__, the following could be:
                              # for x in riter(list(obj) ):
                              # yield x
                              seq = list(obj)
                              n = len(seq)
                              while n != 0:
                              n -= 1
                              yield seq[n]
                              else:
                              for term in rit():
                              yield term

                              # reverse iteration of file
                              f = file('riter.py' )
                              try:
                              for line in riter(f):
                              print line,
                              finally:
                              f.close()

                              # reverse iteration of a dictionary
                              for x in riter({'foo':1, 'bar':2}):
                              print x,


                              I changed it to use a shorter name as suggested by Reedy, to be similar
                              with rfind etc. I also prefer a function instead of a method, as the function
                              can provide (slow) default behaviour for iterators that don't explicitely
                              support reverse iteration.

                              Comment

                              Working...