negative indices for sequence types

  • dan

    negative indices for sequence types

    I was recently surprised, and quite shocked in fact, to find that
    Python treats negative indices into sequence types as if they were
    mod(length-of-sequence), at least up to -len(seq).

    This fact is *deeply* buried in the docs, and is not at all intuitive.
    One of the big advantages of a high-level language such as Python is
    the ability to provide run-time bounds checking on array-type
    constructs. To achieve this I will now have to subclass my objects
    and add it myself, which seems silly and will add significant
    overhead. If you want this behavior, how hard is it to say a = b[x %
    len(b)] ??

    Can anyone explain why this anomaly exists, and why it should continue
    to exist?
  • Martin v. Löwis

    #2
    Re: negative indices for sequence types

    danbmil99@yahoo.com (dan) writes:

    > This fact is *deeply* buried in the docs, and is not at all intuitive.

    I find it highly intuitive and very convenient.

    > If you want this behavior, how hard is it to say a = b[x %
    > len(b)] ??

    *This* I would call un-intuitive. It is also much slower.

    To get the last element, you currently write b[-1]. If that was not
    available, you would have to write b[len(b)-1], which is still
    significantly slower. Also, you might not have a variable name, so try
    rewriting foo()[-1].
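To make the foo()[-1] point concrete, here is a minimal sketch. The function foo and the values it returns are assumptions made up for illustration; nothing here measures speed.

```python
def foo():
    """Hypothetical function that builds and returns a new list on each call."""
    return [10, 20, 30]


# With a negative index, the last element comes straight from the expression.
last = foo()[-1]

# Without it, the result needs a name so that len() can be applied to it.
tmp = foo()
last_explicit = tmp[len(tmp) - 1]
```

Both spellings fetch the same element; the second just needs the extra temporary.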

    Regards,
    Martin

    • Peter Otten

      #3
      Re: negative indices for sequence types

      dan wrote:
      > I was recently surprised, and quite shocked in fact, to find that
      > Python treats negative indices into sequence types as if they were
      > mod(length-of-sequence), at least up to -len(seq).
      >
      > This fact is *deeply* buried in the docs, and is not at all intuitive.
      > One of the big advantages of a high-level language such as Python is
      > the ability to provide run-time bounds checking on array-type
      > constructs. To achieve this I will now have to subclass my objects
      > and add it myself, which seems silly and will add significant
      > overhead. If you want this behavior, how hard is it to say a = b[x %
      > len(b)] ??
      >
      > Can anyone explain why this anomaly exists, and why it should continue
      > to exist?

      After you have recovered from the shock, you probably will admit that
      (1) the most common "out of bounds" case is caught:

      >>> l = list("abc")
      >>> l[3]
      Traceback (most recent call last):
        File "<stdin>", line 1, in ?
      IndexError: list index out of range

      and
      (2) that accessing elements from the end of the list is something you will
      soon appreciate:

      >>> l[-1]
      'c'
      >>> l[-2:]
      ['b', 'c']

      I think that more code enjoys the beauty of accessing the end of a list than
      suffers from uncaught <0 index errors. See the possibilities rather than
      the danger :-)

      Peter

      • Bengt Richter

        #4
        Re: negative indices for sequence types

        On 7 Sep 2003 11:26:28 -0700, danbmil99@yahoo.com (dan) wrote:
        >I was recently surprised, and quite shocked in fact, to find that
        >Python treats negative indices into sequence types as if they were
        >mod(length-of-sequence), at least up to -len(seq).
        >
        >This fact is *deeply* buried in the docs, and is not at all intuitive.
        > One of the big advantages of a high-level language such as Python is
        >the ability to provide run-time bounds checking on array-type
        >constructs. To achieve this I will now have to subclass my objects
        >and add it myself, which seems silly and will add significant
        >overhead. If you want this behavior, how hard is it to say a = b[x %
        >len(b)] ??

        That isn't really the exact behavior. E.g.,

        >>> range(5)
        [0, 1, 2, 3, 4]
        >>> range(5)[-4]
        1
        >>> range(5)[-5]
        0
        >>> range(5)[-6]
        Traceback (most recent call last):
          File "<stdin>", line 1, in ?
        IndexError: list index out of range

        >>> range(5)[4]
        4
        >>> range(5)[5]
        Traceback (most recent call last):
          File "<stdin>", line 1, in ?
        IndexError: list index out of range

        >Can anyone explain why this anomaly exists, and why it should continue
        >to exist?
        It has apparently proven more useful to have it so than not, though I sympathize
        with your frustration for your use case.

        Perhaps a .no_negative_indexing attribute or something could be added to the C implementation,
        so that you could specify your desired checking without a performance hit.

        Meanwhile, maybe an assert i>=0 in the index-supplier side of the contract might work too?
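Both ideas, the subclass mentioned in the original post and the assert on the index-supplier side, might look roughly like the sketch below. StrictList and element_at are hypothetical names invented here; only reads are covered, and no claim is made about the overhead.

```python
class StrictList(list):
    """Hypothetical list subclass that rejects negative integer indices on reads."""

    def __getitem__(self, index):
        # Only integer indices are checked; slices pass through unchanged.
        if isinstance(index, int) and index < 0:
            raise IndexError("negative index not allowed: %d" % index)
        return super().__getitem__(index)


def element_at(seq, i):
    """Caller-side check, in the spirit of the suggested assert."""
    assert i >= 0, "index must be non-negative"
    return seq[i]
```

Usage would be along the lines of StrictList("abc")[0], which returns 'a', while StrictList("abc")[-1] raises IndexError. Assignment and deletion would need the same treatment in __setitem__ and __delitem__.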

        Regards,
        Bengt Richter

        • Terry Reedy

          #5
          Re: negative indices for sequence types

          "dan" <danbmil99@yahoo.com> wrote in message
          news:fbf8d8f2.0309071026.4c44b985@posting.google.com...
          > I was recently surprised, and quite shocked in fact, to find that
          > Python treats negative indices into sequence types as if they were
          > mod(length-of-sequence), at least up to -len(seq).

          No, it adds len(seq). Changing + to % would be slower and more
          obscure.

          > This fact is *deeply* buried in the docs,

          No more so than everything else in chapter subsections. From the Ref
          Man table of contents I went directly to the most obvious place 5.3.2
          Subscriptions, and found
          '''
          If the primary is a sequence, the expression (list) must evaluate to a
          plain integer. If this value is negative, the length of the sequence
          is added to it (so that, e.g., x[-1] selects the last item of x.) The
          resulting value must be a nonnegative integer less than the number of
          items in the sequence, and the subscription selects the item whose
          index is that value (counting from zero).
          '''
          Translated to Python, letting idex be the result of the index expression:

          if not isinstance(idex, (int, long)): raise TypeError()
          if idex < 0: idex += seqlen
          if idex < 0 or idex >= seqlen: raise IndexError()
          <get seq[idex]>
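The same translation can be written as a small runnable function for current Python, where there is no separate long type. This is only a sketch of the quoted rule: normalize_index is a hypothetical name, and the real interpreter is more permissive about which objects it accepts as integers.

```python
def normalize_index(idex, seqlen):
    """Hypothetical helper: apply the quoted rule for a sequence of length seqlen."""
    if not isinstance(idex, int):
        raise TypeError("sequence index must be an integer")
    if idex < 0:
        idex += seqlen  # the length is added exactly once; no modulo is involved
    if idex < 0 or idex >= seqlen:
        raise IndexError("sequence index out of range")
    return idex
```

For a sequence of length 5, an index of -1 becomes 4 and -5 becomes 0, while -6 and 5 both raise IndexError.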

          > and is not at all intuitive.

          Phrases like 'third from the end' are idiomatic English ;-)

          > One of the big advantages of a high-level language such as Python is
          > the ability to provide run-time bounds checking on array-type
          > constructs. To achieve this I will now have to subclass my objects
          > and add it myself, which seems silly and will add significant
          > overhead. If you want this behavior, how hard is it to say a = b[x %
          > len(b)] ??

          Again, your innovation of using '%' obscures rather than clarifies.

          > Can anyone explain why this anomaly exists, and why it should continue
          > to exist?

          Being able to abbreviate seq[len(seq)-1] as seq[-1] is quite handy and
          faster executing, especially if seq is calculated from an
          expression. Same for -2, etc. (And, of course, a change now would
          break a noticeable fraction of existing programs.)

          Terry J. Reedy


          • Erik Max Francis

            #6
            Re: negative indices for sequence types

            dan wrote:

            > I was recently surprised, and quite shocked in fact, to find that
            > Python treats negative indices into sequence types as if they were
            > mod(length-of-sequence), at least up to -len(seq).

            That is not the behavior of negative indices. Negative indices mean
            index from the end of the sequence. So -1 means the _last_ element in
            the list, -2 means the second to last element in the list, and so on.
            -n (for n = len(seq)) is the first element in the list.

            > This fact is *deeply* buried in the docs, and is not at all intuitive.

            It's mentioned prominently (and early) in all the tutorials and books on
            Python I've read, and it's a very common and convenient convention, so
            I'm not sure how far you could have gotten through learning Python and
            never been exposed to it.

            > One of the big advantages of a high-level language such as Python is
            > the ability to provide run-time bounds checking on array-type
            > constructs. To achieve this I will now have to subclass my objects
            > and add it myself, which seems silly and will add significant
            > overhead. If you want this behavior, how hard is it to say a = b[x %
            > len(b)] ??

            That's simply not true. Negative indices have similar bounds
            requirements. If you have a sequence of length n, then indices 0
            through (n - 1) map to the elements of the sequence in order from left
            to right, and indices -1 through -n map to the elements in order from
            right to left. Indices greater than n - 1 or less than -n generate
            IndexErrors. Bounds checking is always done, whether on positive or
            negative indices.
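The mapping described above can be checked with a short sketch. The sample sequence and the helper is_valid_index are assumptions chosen for illustration.

```python
seq = list("abcd")
n = len(seq)

# Indices 0 through n - 1 map to the elements from left to right.
forward = [seq[i] for i in range(n)]

# Indices -1 through -n map to the elements from right to left.
backward = [seq[-k] for k in range(1, n + 1)]


def is_valid_index(i, length):
    """Hypothetical helper: True when indexing with i would not raise IndexError."""
    return -length <= i < length


# The first index past either end is rejected.
out_of_range = []
for i in (n, -n - 1):
    try:
        seq[i]
    except IndexError:
        out_of_range.append(i)
```

Here forward holds the elements in order, backward holds them reversed, and both n and -n - 1 end up in out_of_range.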

            --
            Erik Max Francis && max@alcyone.com && http://www.alcyone.com/max/
            __ San Jose, CA, USA && 37 20 N 121 53 W && &tSftDotIotE
            / \ Then you give me that Judas Kiss / Could you hurt me more than this
            \__/ Lamya

            • bigdog

              #7
              Re: negative indices for sequence types

              danbmil99@yahoo.com (dan) wrote in message news:<fbf8d8f2.0309080911.1cf73016@posting.google.com>...
              > As is often the case, I think this comes down to documentation. While
              > the behavior is mentioned early in the tutorial, I found it difficult
              > to find it in the reference -- but whatever, we can chalk this up to
              > RTFM on my part.
              >
              > My explanation of the behavior is correct however. list[a] always
              > equals list[a % len(list)]. A negative number mod N = its absolute
              > value subtracted from N:
              >
              > a % n == n - abs(a) # where -n <= a <= 0
              >
              > However if I want to count from the end of the list, I would of course
              > write
              > list[len(list)-a]. I wasn't really considering that the purpose of
              > this feature was to count from the end of a list, which I admit could
              > come in handy.
              >
              > Thanks for the responses.
              >
              > Fernando Perez <fperez528@yahoo.com> wrote in message news:<bjh94c$j9p$1@peabody.colorado.edu>...
              > > dan wrote:
              > >
              > > > I was recently surprised, and quite shocked in fact, to find that
              > > > Python treats negative indices into sequence types as if they were
              > > > mod(length-of-sequence), at least up to -len(seq).
              > > >
              > > > This fact is *deeply* buried in the docs, and is not at all intuitive.
              > >
              > > Very deeply indeed: section 3.1.4 of the beginner's tutorial:
              > >
              > > http://www.python.org/doc/current/tu...00000000000000
              > >
              > > Of all places, this is the section on lists:
              > >
              > > >>> a = ['spam', 'eggs', 100, 1234]
              > >
              > > [... snip ...]
              > >
              > > >>> a[-2]
              > > 100
              > > >>> a[1:-1]
              > > ['eggs', 100]
              > >
              > > > Can anyone explain why this anomaly exists, and why it should continue
              > > > to exist?
              > >
              > > Because this 'anomaly' is incredibly useful in many contexts, as many others
              > > have already pointed out. Rest assured that it will continue to exist,
              > > probably for as long as the language is around. Better get to like it :)
              > >
              > > Cheers,
              > >
              > > f.

              Heck, I like it simply because I can read lines from files and easily
              chop off the newline.

              myStr = f.readline()[0:-1]

              That alone is worth its weight in gold to me, never mind all the other
              things it makes easy.

              • Lukasz Pankowski

                #8
                Re: negative indices for sequence types

                msurel@comshare.com (bigdog) writes:

                > myStr = f.readline()[0:-1]

                This may eat the last character in the file (if the last line does not
                end with a newline, which happens), but this will not:

                myStr = f.readline().rstrip('\n')

                but it is 6 characters longer :)
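The difference shows up as soon as the last line has no trailing newline. The sketch below uses io.StringIO as a stand-in for a real file; the contents are made up for illustration.

```python
import io

# Hypothetical file contents whose final line lacks a trailing newline.
f = io.StringIO("first\nlast")

sliced = []
stripped = []
for line in f:
    sliced.append(line[0:-1])            # always drops the final character
    stripped.append(line.rstrip("\n"))   # drops only trailing newline characters
```

With these contents the slice turns the final line into 'las', while rstrip leaves it as 'last'.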

                --

                =*= Lukasz Pankowski =*=

                • Jacek Generowicz

                  #9
                  Re: negative indices for sequence types

                  danbmil99@yahoo.com (dan) hypothesizes:

                  > My explanation of the behavior is correct however. list[a] always
                  > equals list[a % len(list)]. A negative number mod N = its absolute
                  > value subtracted from N:

                  Proof by counterexample:

                  Python 2.2.2 (#1, Feb 8 2003, 12:11:31)
                  [GCC 3.2] on linux2
                  Type "help", "copyright", "credits" or "license" for more information.
                  >>> s = '0123'
                  >>> s[-20 % len(s)]
                  '0'
                  >>> s[-20]
                  Traceback (most recent call last):
                    File "<stdin>", line 1, in ?
                  IndexError: string index out of range


                  Your explanation of the behaviour is incorrect.

                  QED.
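The same counterexample can be widened to show where the two forms agree and where they part ways. This is a sketch only: agrees_with_modulo is a hypothetical helper, and it assumes a non-empty sequence.

```python
def agrees_with_modulo(seq, i):
    """Hypothetical check: does seq[i] behave like seq[i % len(seq)]?"""
    try:
        direct = seq[i]
    except IndexError:
        # Direct indexing raised, while the modulo form always lands in range.
        return False
    return direct == seq[i % len(seq)]


s = "0123"
results = {i: agrees_with_modulo(s, i) for i in (-20, -5, -4, -1, 0, 3, 4)}
```

The two forms agree only for indices from -len(s) through len(s) - 1; outside that range plain indexing raises IndexError, whereas the modulo form silently wraps around.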
