"Collapsing" a list into a list of changes

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Alan McIntyre

    "Collapsing" a list into a list of changes

    Hi all,

    I have a list of items that has contiguous repetitions of values, but
    the number and location of the repetitions is not important, so I just
    need to strip them out. For example, if my original list is
    [0,0,1,1,1,2,2,3 ,3,3,2,2,2,4,4, 4,5], I want to end up with [0,1,2,3,2,4,5].

    Here is the way I'm doing this now:

    def straightforward _collapse(myLis t):
    collapsed = [myList[0]]
    for n in myList[1:]:
    if n != collapsed[-1]:
    collapsed.appen d(n)

    return collapsed

    Is there an elegant way to do this, or should I just stick with the code
    above?

    Thanks,
    Alan McIntyre

  • Steven Bethard

    #2
    Re: "Collapsin g" a list into a list of changes

    Alan McIntyre wrote:[color=blue]
    > Hi all,
    >
    > I have a list of items that has contiguous repetitions of values, but
    > the number and location of the repetitions is not important, so I just
    > need to strip them out. For example, if my original list is
    > [0,0,1,1,1,2,2,3 ,3,3,2,2,2,4,4, 4,5], I want to end up with [0,1,2,3,2,4,5].
    >
    > Here is the way I'm doing this now:
    >
    > def straightforward _collapse(myLis t):
    > collapsed = [myList[0]]
    > for n in myList[1:]:
    > if n != collapsed[-1]:
    > collapsed.appen d(n)
    >
    > return collapsed
    >
    > Is there an elegant way to do this, or should I just stick with the code
    > above?[/color]

    Well, this does about the same thing, but using enumerate and a list
    comprehension:

    py> lst = [0,0,1,1,1,2,2,3 ,3,3,2,2,2,4,4, 4,5]
    py> [item for i, item in enumerate(lst) if i == 0 or item != lst[i-1]]
    [0, 1, 2, 3, 2, 4, 5]

    Similar code that doesn't check 'if i == 0' each time through:

    py> itr = enumerate(lst)
    py> itr.next()
    (0, 0)
    py> [lst[0]] + [item for i, item in itr if item != lst[i-1]]
    [0, 1, 2, 3, 2, 4, 5]

    I don't know if either of these is really more elegant though...

    Steve

    Comment

    • Jeremy Bowers

      #3
      Re: "Collapsin g" a list into a list of changes

      On Fri, 04 Feb 2005 12:43:37 -0500, Alan McIntyre wrote:[color=blue]
      > def straightforward _collapse(myLis t):
      > collapsed = [myList[0]]
      > for n in myList[1:]:
      > if n != collapsed[-1]:
      > collapsed.appen d(n)
      >
      > return collapsed
      >
      > Is there an elegant way to do this, or should I just stick with the code
      > above?[/color]

      I think that's pretty elegant; I read it and immediately understood what
      you were doing. There may be some performance tweaks you could make if you
      were doing this to large lists, and my instincts say to re-write it as an
      iterator if you use it a lot like:

      for item in collapse(yourLi st):

      but other than that which may not even apply, "straightforwar d" is
      generally a *good* thing, don't you think? :-)

      Comment

      • Jack Diederich

        #4
        Re: "Collapsin g" a list into a list of changes

        On Fri, Feb 04, 2005 at 12:43:37PM -0500, Alan McIntyre wrote:[color=blue]
        > Hi all,
        >
        > I have a list of items that has contiguous repetitions of values, but
        > the number and location of the repetitions is not important, so I just
        > need to strip them out. For example, if my original list is
        > [0,0,1,1,1,2,2,3 ,3,3,2,2,2,4,4, 4,5], I want to end up with [0,1,2,3,2,4,5].
        >
        > Here is the way I'm doing this now:
        >
        > def straightforward _collapse(myLis t):
        > collapsed = [myList[0]]
        > for n in myList[1:]:
        > if n != collapsed[-1]:
        > collapsed.appen d(n)
        >
        > return collapsed
        >
        > Is there an elegant way to do this, or should I just stick with the code
        > above?
        >[/color]
        If you are using python2.4,
        [color=blue][color=green][color=darkred]
        >>> import itertools as it
        >>> [x[0] for (x) in it.groupby([0,0,1,1,1,2,2,3 ,3,3,2,2,2,4,4, 4,5])][/color][/color][/color]
        [0, 1, 2, 3, 2, 4, 5][color=blue][color=green][color=darkred]
        >>>[/color][/color][/color]

        Since this is 2.4 you could also return a generator expression.
        [color=blue][color=green][color=darkred]
        >>> def iter_collapse(m yList):[/color][/color][/color]
        .... return (x[0] for (x) in it.groupby([0,0,1,1,1,2,2,3 ,3,3,2,2,2,4,4, 4,5]))
        ....[color=blue][color=green][color=darkred]
        >>> i = iter_collapse([0,0,1,1,1,2,2,3 ,3,3,2,2,2,4,4, 4,5])
        >>> i[/color][/color][/color]
        <generator object at 0xb7df6b2c>[color=blue][color=green][color=darkred]
        >>> list(i)[/color][/color][/color]
        [0, 1, 2, 3, 2, 4, 5][color=blue][color=green][color=darkred]
        >>>[/color][/color][/color]


        -Jack

        Comment

        • Alan McIntyre

          #5
          Re: &quot;Collapsin g&quot; a list into a list of changes

          Your first example is along the lines of what I was thinking when I said
          "elegant." :) I was looking for something that I could drop into one
          or two lines of code; I may not do that if I'm writing code that will
          have to be maintained, but it's still nice to know how to do it.

          Thanks :)
          Alan

          Steven Bethard wrote:[color=blue]
          > Well, this does about the same thing, but using enumerate and a list
          > comprehension:
          >
          > py> lst = [0,0,1,1,1,2,2,3 ,3,3,2,2,2,4,4, 4,5]
          > py> [item for i, item in enumerate(lst) if i == 0 or item != lst[i-1]]
          > [0, 1, 2, 3, 2, 4, 5]
          >
          > Similar code that doesn't check 'if i == 0' each time through:
          >
          > py> itr = enumerate(lst)
          > py> itr.next()
          > (0, 0)
          > py> [lst[0]] + [item for i, item in itr if item != lst[i-1]]
          > [0, 1, 2, 3, 2, 4, 5]
          >
          > I don't know if either of these is really more elegant though...
          >
          > Steve[/color]

          Comment

          • Steven Bethard

            #6
            Re: &quot;Collapsin g&quot; a list into a list of changes

            Alan McIntyre wrote:[color=blue]
            > Hi all,
            >
            > I have a list of items that has contiguous repetitions of values, but
            > the number and location of the repetitions is not important, so I just
            > need to strip them out. For example, if my original list is
            > [0,0,1,1,1,2,2,3 ,3,3,2,2,2,4,4, 4,5], I want to end up with [0,1,2,3,2,4,5].
            >
            > Here is the way I'm doing this now:
            >
            > def straightforward _collapse(myLis t):
            > collapsed = [myList[0]]
            > for n in myList[1:]:
            > if n != collapsed[-1]:
            > collapsed.appen d(n)
            >
            > return collapsed
            >
            > Is there an elegant way to do this, or should I just stick with the code
            > above?[/color]

            Here's a solution that works for iterables other than lists:

            py> def collapse(iterab le):
            .... enumeration = enumerate(itera ble)
            .... _, lastitem = enumeration.nex t()
            .... yield lastitem
            .... for i, item in enumeration:
            .... if item != lastitem:
            .... yield item
            .... lastitem = item
            ....
            py> lst = [0,0,1,1,1,2,2,3 ,3,3,2,2,2,4,4, 4,5]
            py> list(collapse(l st))
            [0, 1, 2, 3, 2, 4, 5]

            Again, I'm still not sure I'd call this more elegant...

            STeVe

            Comment

            • Alan McIntyre

              #7
              Re: &quot;Collapsin g&quot; a list into a list of changes

              I think you're right; sometimes I'm susceptible to the "I can do that in
              one line of code" temptation. :)

              Since this current bit of code will probably end up in something that's
              going to be maintained, I will probably stick with the straightforward
              method just to be nice to the maintainer (especially if it's me!).

              Jeremy Bowers wrote:[color=blue]
              > I think that's pretty elegant; I read it and immediately understood what
              > you were doing. There may be some performance tweaks you could make if you
              > were doing this to large lists, and my instincts say to re-write it as an
              > iterator if you use it a lot like:
              >
              > for item in collapse(yourLi st):
              >
              > but other than that which may not even apply, "straightforwar d" is
              > generally a *good* thing, don't you think? :-)
              >[/color]

              Comment

              • Alan McIntyre

                #8
                Re: &quot;Collapsin g&quot; a list into a list of changes

                Jack,

                I'm not using 2.4 yet; still back in 2.3x. :) Thanks for the examples,
                though - they are clear enough that I will probably use them when I upgrade.

                Thanks,
                Alan

                Jack Diederich wrote:[color=blue]
                > If you are using python2.4,
                >
                >[color=green][color=darkred]
                >>>>import itertools as it
                >>>>[x[0] for (x) in it.groupby([0,0,1,1,1,2,2,3 ,3,3,2,2,2,4,4, 4,5])][/color][/color]
                >
                > [0, 1, 2, 3, 2, 4, 5]
                >
                >
                > Since this is 2.4 you could also return a generator expression.
                >
                >[color=green][color=darkred]
                >>>>def iter_collapse(m yList):[/color][/color]
                >
                > ... return (x[0] for (x) in it.groupby([0,0,1,1,1,2,2,3 ,3,3,2,2,2,4,4, 4,5]))
                > ...
                >[color=green][color=darkred]
                >>>>i = iter_collapse([0,0,1,1,1,2,2,3 ,3,3,2,2,2,4,4, 4,5])
                >>>>i[/color][/color]
                >
                > <generator object at 0xb7df6b2c>
                >[color=green][color=darkred]
                >>>>list(i)[/color][/color]
                >
                > [0, 1, 2, 3, 2, 4, 5]
                >
                >
                >
                > -Jack
                >[/color]

                Comment

                • Mike C. Fletcher

                  #9
                  Re: &quot;Collapsin g&quot; a list into a list of changes

                  Alan McIntyre wrote:
                  ....
                  [color=blue]
                  > I have a list of items that has contiguous repetitions of values, but
                  > the number and location of the repetitions is not important, so I just
                  > need to strip them out. For example, if my original list is
                  > [0,0,1,1,1,2,2,3 ,3,3,2,2,2,4,4, 4,5], I want to end up with
                  > [0,1,2,3,2,4,5].[/color]

                  ....
                  [color=blue]
                  > Is there an elegant way to do this, or should I just stick with the
                  > code above?[/color]
                  [color=blue][color=green][color=darkred]
                  >>> def changes( dataset ):[/color][/color][/color]
                  .... last = None
                  .... for value in dataset:
                  .... if value != last:
                  .... yield value
                  .... last = value
                  ....[color=blue][color=green][color=darkred]
                  >>> print list(changes(da ta ))[/color][/color][/color]

                  which is quite readable/elegant IMO.

                  Have fun,
                  Mike

                  _______________ _______________ _______________ ___
                  Mike C. Fletcher
                  Designer, VR Plumber, Coder


                  PyCon is coming...

                  Comment

                  • Steven Bethard

                    #10
                    Re: &quot;Collapsin g&quot; a list into a list of changes

                    Mike C. Fletcher wrote:[color=blue]
                    > Alan McIntyre wrote:
                    > ...
                    >[color=green]
                    >> I have a list of items that has contiguous repetitions of values, but
                    >> the number and location of the repetitions is not important, so I just
                    >> need to strip them out. For example, if my original list is
                    >> [0,0,1,1,1,2,2,3 ,3,3,2,2,2,4,4, 4,5], I want to end up with
                    >> [0,1,2,3,2,4,5].[/color]
                    >
                    >
                    > ...
                    >[color=green]
                    >> Is there an elegant way to do this, or should I just stick with the
                    >> code above?[/color]
                    >
                    >[color=green][color=darkred]
                    > >>> def changes( dataset ):[/color][/color]
                    > ... last = None
                    > ... for value in dataset:
                    > ... if value != last:
                    > ... yield value
                    > ... last = value
                    > ... >>> print list(changes(da ta ))
                    >
                    > which is quite readable/elegant IMO.[/color]

                    But fails if the list starts with None:

                    py> lst = [None,0,0,1,1,1, 2,2,3,3,3,2,2,2 ,4,4,4,5]
                    py> def changes(dataset ):
                    .... last = None
                    .... for value in dataset:
                    .... if value != last:
                    .... yield value
                    .... last = value
                    ....
                    py> list(changes(ls t))
                    [0, 1, 2, 3, 2, 4, 5]

                    A minor modification that does appear to work:

                    py> def changes(dataset ):
                    .... last = object()
                    .... for value in dataset:
                    .... if value != last:
                    .... yield value
                    .... last = value
                    ....
                    py> list(changes(ls t))
                    [None, 0, 1, 2, 3, 2, 4, 5]

                    STeVe

                    Comment

                    • John J. Lee

                      #11
                      Re: &quot;Collapsin g&quot; a list into a list of changes

                      Steven Bethard <steven.bethard @gmail.com> writes:
                      [color=blue]
                      > Mike C. Fletcher wrote:[/color]
                      [...][color=blue][color=green][color=darkred]
                      > > >>> def changes( dataset ):[/color]
                      > > ... last = None
                      > > ... for value in dataset:
                      > > ... if value != last:
                      > > ... yield value
                      > > ... last = value
                      > > ... >>> print list(changes(da ta ))
                      > > which is quite readable/elegant IMO.[/color]
                      >
                      > But fails if the list starts with None:
                      >
                      > py> lst = [None,0,0,1,1,1, 2,2,3,3,3,2,2,2 ,4,4,4,5]
                      > py> def changes(dataset ):
                      > ... last = None
                      > ... for value in dataset:
                      > ... if value != last:
                      > ... yield value
                      > ... last = value
                      > ...
                      > py> list(changes(ls t))
                      > [0, 1, 2, 3, 2, 4, 5]
                      >
                      > A minor modification that does appear to work:
                      >
                      > py> def changes(dataset ):
                      > ... last = object()
                      > ... for value in dataset:
                      > ... if value != last:
                      > ... yield value
                      > ... last = value
                      > ...
                      > py> list(changes(ls t))
                      > [None, 0, 1, 2, 3, 2, 4, 5][/color]

                      Unless the first object in the list has a weird __cmp__ (does
                      happen...). OK, weird __cmp__s are nasty anyway, but still, why
                      compound it through cleverness when you can write a really plodding
                      function that *always* does what it says on the tin?

                      clever-is-evil-ly y'rs,


                      John

                      Comment

                      • Nick Coghlan

                        #12
                        Re: &quot;Collapsin g&quot; a list into a list of changes

                        John J. Lee wrote:[color=blue]
                        > Steven Bethard <steven.bethard @gmail.com> writes:
                        >
                        >[color=green]
                        >>Mike C. Fletcher wrote:[/color]
                        >
                        > [...]
                        >[color=green][color=darkred]
                        >>> >>> def changes( dataset ):
                        >>>... last = None
                        >>>... for value in dataset:
                        >>>... if value != last:
                        >>>... yield value
                        >>>... last = value
                        >>>... >>> print list(changes(da ta ))
                        >>>which is quite readable/elegant IMO.[/color]
                        >>
                        >>But fails if the list starts with None:
                        >>
                        >>py> lst = [None,0,0,1,1,1, 2,2,3,3,3,2,2,2 ,4,4,4,5]
                        >>py> def changes(dataset ):
                        >>... last = None
                        >>... for value in dataset:
                        >>... if value != last:
                        >>... yield value
                        >>... last = value
                        >>...
                        >>py> list(changes(ls t))
                        >>[0, 1, 2, 3, 2, 4, 5]
                        >>
                        >>A minor modification that does appear to work:
                        >>
                        >>py> def changes(dataset ):
                        >>... last = object()
                        >>... for value in dataset:
                        >>... if value != last:
                        >>... yield value
                        >>... last = value
                        >>...
                        >>py> list(changes(ls t))
                        >>[None, 0, 1, 2, 3, 2, 4, 5][/color]
                        >
                        >
                        > Unless the first object in the list has a weird __cmp__ (does
                        > happen...). OK, weird __cmp__s are nasty anyway, but still, why
                        > compound it through cleverness when you can write a really plodding
                        > function that *always* does what it says on the tin?[/color]

                        In that case:

                        Py> def collapsed(itera ble):
                        .... itr = iter(iterable)
                        .... last = itr.next()
                        .... yield last
                        .... for value in itr:
                        .... if value != last:
                        .... yield value
                        .... last = value
                        ....
                        Py> from Tkinter import _flatten #**
                        Py> lst = list(_flatten([[i] * 5 for i in range(10)]))
                        Py> lst
                        [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5
                        , 5, 5, 5, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9]
                        Py> list(collapsed( lst))
                        [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

                        ** Hmm, this function really is quite handy when generating test data. Maybe in
                        Python 2.5 the line should be spelt "from itertools import flatten"

                        Cheers,
                        Nick.

                        --
                        Nick Coghlan | ncoghlan@email. com | Brisbane, Australia
                        ---------------------------------------------------------------

                        Comment

                        • Nick Coghlan

                          #13
                          Re: &quot;Collapsin g&quot; a list into a list of changes

                          Jack Diederich wrote:[color=blue]
                          > Since this is 2.4 you could also return a generator expression.
                          >
                          >[color=green][color=darkred]
                          >>>>def iter_collapse(m yList):[/color][/color]
                          >
                          > ... return (x[0] for (x) in it.groupby([0,0,1,1,1,2,2,3 ,3,3,2,2,2,4,4, 4,5]))
                          > ...[/color]

                          But why write a function that returns a generator expression, when you could
                          just turn the function itself into a generator?

                          Py>def iter_collapse(m yList):
                          .... for x in it.groupby(myLi st):
                          .... yield x[0]

                          Cheers,
                          Nick.

                          --
                          Nick Coghlan | ncoghlan@email. com | Brisbane, Australia
                          ---------------------------------------------------------------

                          Comment

                          • Alex Martelli

                            #14
                            Re: &quot;Collapsin g&quot; a list into a list of changes

                            Steven Bethard <steven.bethard @gmail.com> wrote:
                            [color=blue]
                            > Here's a solution that works for iterables other than lists:
                            >
                            > py> def collapse(iterab le):
                            > ... enumeration = enumerate(itera ble)
                            > ... _, lastitem = enumeration.nex t()
                            > ... yield lastitem
                            > ... for i, item in enumeration:
                            > ... if item != lastitem:
                            > ... yield item
                            > ... lastitem = item
                            > ...
                            > py> lst = [0,0,1,1,1,2,2,3 ,3,3,2,2,2,4,4, 4,5]
                            > py> list(collapse(l st))
                            > [0, 1, 2, 3, 2, 4, 5]
                            >
                            > Again, I'm still not sure I'd call this more elegant...[/color]

                            Hmmmm, what role does the enumeration play here? I don't see how you're
                            using it, at all. Why not just:

                            def collapse(iterab le):
                            it = iter(iterable)
                            lastitem = it.next()
                            yield lastitem
                            for item in it:
                            if item != lastitem:
                            yield item
                            lastitem = item

                            that's basically just the same as your code but without the strangeness
                            of making an enumerate for the purpose of ignoring what the enumerate
                            adds to an ordinary iterator.


                            Alex

                            Comment

                            • Tony

                              #15
                              Re: &quot;Collapsin g&quot; a list into a list of changes


                              Alan McIntyre wrote:[color=blue]
                              > Hi all,
                              >
                              > I have a list of items that has contiguous repetitions of values, but[/color]
                              [color=blue]
                              > the number and location of the repetitions is not important, so I[/color]
                              just[color=blue]
                              > need to strip them out. For example, if my original list is
                              > [0,0,1,1,1,2,2,3 ,3,3,2,2,2,4,4, 4,5], I want to end up with[/color]
                              [0,1,2,3,2,4,5].[color=blue]
                              >
                              > Here is the way I'm doing this now:
                              >
                              > def straightforward _collapse(myLis t):
                              > collapsed = [myList[0]]
                              > for n in myList[1:]:
                              > if n != collapsed[-1]:
                              > collapsed.appen d(n)
                              >
                              > return collapsed
                              >
                              > Is there an elegant way to do this, or should I just stick with the[/color]
                              code[color=blue]
                              > above?
                              >
                              > Thanks,
                              > Alan McIntyre
                              > http://www.esrgtech.com[/color]

                              Here is a version using dictionary properties, ie no duplication of
                              keys.

                              def condense(l):
                              d={}
                              for item in l:
                              d[item]=1
                              l=d.keys()
                              return l

                              Cheers
                              Tony

                              Comment

                              Working...