opposite of zip()?

  • igor.tatarinov@gmail.com

    opposite of zip()?

    Given a bunch of arrays, if I want to create tuples, there is
    zip(arrays). What if I want to do the opposite: break a tuple up and
    append the values to given arrays:
    map(append, arrays, tupl)
    except there is no unbound append() (List.append() does not exist,
    right?).

    Without append(), I am forced to write a (slow) explicit loop:
    for (a, v) in zip(arrays, tupl):
        a.append(v)

    I assume using an index variable instead wouldn't be much faster.

    Is there a better solution?

    Thanks,
    igor
  • Paddy

    #2
    Re: opposite of zip()?

    On Dec 15, 5:47 am, igor.tatari...@gmail.com wrote:
    [snip]

    I can't quite get what you require from your explanation. Do you have
    sample input & output?

    Maybe this:
    Someone blogged about Python not having an unzip function to go with zip(). unzip is straight-forward to calculate because: >>> t1 = (0,1,2,...

    Will help.

    - Paddy.


    • Gary Herron

      #3
      Re: opposite of zip()?

      igor.tatarinov@gmail.com wrote:
      > Given a bunch of arrays, if I want to create tuples, there is
      > zip(arrays). What if I want to do the opposite: break a tuple up and
      > append the values to given arrays:
      >     map(append, arrays, tupl)
      > except there is no unbound append() (List.append() does not exist,
      > right?).

      But it *does* exist, and it's named list.append, and it works as you wanted.
      >>> list.append
      <method 'append' of 'list' objects>
      >>> a = [[],[]]
      >>> map(list.append, a, (1,2))
      [None, None]
      >>> a
      [[1], [2]]
      >>> map(list.append, a, (3,4))
      [None, None]
      >>> a
      [[1, 3], [2, 4]]
      >>> map(list.append, a, (30,40))
      [None, None]
      >>> a
      [[1, 3, 30], [2, 4, 40]]
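
A note for Python 3, where map() returns a lazy iterator rather than a list: the session above would leave `a` unchanged until the iterator is consumed. A minimal sketch, assuming Python 3 and using illustrative variable names:

```python
# Python 3 sketch: map() is lazy, so the appends only happen once the
# iterator is consumed. Variable names are illustrative.
a = [[], []]

lazy = map(list.append, a, (1, 2))
assert a == [[], []]          # nothing has been appended yet

list(lazy)                    # consuming the iterator runs the appends
assert a == [[1], [2]]

# A plain loop avoids building a throwaway list of None values.
for sub, value in zip(a, (3, 4)):
    sub.append(value)

print(a)
```

In Python 3 the explicit loop is usually the clearer choice, since map() is being used here only for its side effects.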


      Gary Herron



      • Steven D'Aprano

        #4
        Re: opposite of zip()?

        On Fri, 14 Dec 2007 21:47:06 -0800, igor.tatarinov wrote:
        > Given a bunch of arrays, if I want to create tuples, there is
        > zip(arrays). What if I want to do the opposite: break a tuple up and
        > append the values to given arrays:
        >     map(append, arrays, tupl)
        > except there is no unbound append() (List.append() does not exist,
        > right?).

        Don't guess, test.
        >>> list.append  # Does this exist?
        <method 'append' of 'list' objects>


        Apparently it does. Here's how *not* to use it to do what you want:
        >>> arrays = [[1, 2, 3, 4], [101, 102, 103, 104]]
        >>> tupl = tuple("ab")
        >>> map(lambda alist, x: alist.append(x), arrays, tupl)
        [None, None]
        >>> arrays
        [[1, 2, 3, 4, 'a'], [101, 102, 103, 104, 'b']]

        It works, but is confusing and hard to understand, and the lambda
        probably makes it slow. Don't do it that way.


        > Without append(), I am forced to write a (slow) explicit loop:
        >     for (a, v) in zip(arrays, tupl):
        >         a.append(v)

        Are you sure it's slow? Compared to what?


        For the record, here's the explicit loop:
        >>> arrays = [[1, 2, 3, 4], [101, 102, 103, 104]]
        >>> tupl = tuple("ab")
        >>> zip(arrays, tupl)
        [([1, 2, 3, 4], 'a'), ([101, 102, 103, 104], 'b')]
        >>> for (a, v) in zip(arrays, tupl):
        ...     a.append(v)
        ...
        >>> arrays
        [[1, 2, 3, 4, 'a'], [101, 102, 103, 104, 'b']]


        I think you're making it too complicated. Why use zip()?

        >>> arrays = [[1, 2, 3, 4], [101, 102, 103, 104]]
        >>> tupl = tuple("ab")
        >>> for i, alist in enumerate(arrays):
        ...     alist.append(tupl[i])
        ...
        >>> arrays
        [[1, 2, 3, 4, 'a'], [101, 102, 103, 104, 'b']]




        --
        Steven


        • Steven D'Aprano

          #5
          Re: opposite of zip()?

          On Sat, 15 Dec 2007 06:46:44 +0000, Steven D'Aprano wrote:
          > Here's how *not* to use it to do what you want:
          >
          > >>> arrays = [[1, 2, 3, 4], [101, 102, 103, 104]]
          > >>> tupl = tuple("ab")
          > >>> map(lambda alist, x: alist.append(x), arrays, tupl)
          > [None, None]
          > >>> arrays
          > [[1, 2, 3, 4, 'a'], [101, 102, 103, 104, 'b']]
          >
          > It works, but is confusing and hard to understand, and the lambda
          > probably makes it slow. Don't do it that way.

          As Gary Herron points out, you don't need to use lambda:

          map(list.append, arrays, tupl)

          will work. I still maintain that this is the wrong way to do it: taking
          the lambda out makes the map() based solution marginally faster than the
          explicit loop, but I don't believe that the gain in speed is worth the
          loss in readability.

          (e.g. on my PC, for an array of 900000 sub-lists, the map() version takes
          0.4 seconds versus 0.5 seconds for the explicit loop. For smaller arrays,
          the results are similar.)
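
A comparison like the one described above could be reproduced roughly as follows. This is only a sketch, assuming Python 3 and the standard timeit module; the function names and the size `n` are made up for illustration, and the timings will differ by machine and interpreter version.

```python
import timeit

def with_map(arrays, tupl):
    # list() forces the lazy Python 3 map so the appends actually run
    list(map(list.append, arrays, tupl))

def with_loop(arrays, tupl):
    for a, v in zip(arrays, tupl):
        a.append(v)

n = 100_000                      # hypothetical number of sub-lists
tupl = tuple(range(n))

for func in (with_map, with_loop):
    arrays = [[] for _ in range(n)]          # fresh sub-lists per variant
    seconds = timeit.timeit(lambda: func(arrays, tupl), number=5)
    print(func.__name__, round(seconds, 3))
```

Each variant gets its own fresh `arrays`, so neither run is penalised by lists that the other has already grown.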



          --
          Steven.


          • igor.tatarinov@gmail.com

            #6
            Re: opposite of zip()?

            Hi folks,

            Thanks for all the help. I tried running the various options, and
            here is what I found:


            from array import array
            from time import time

            def f1(recs, cols):
                for r in recs:
                    for i,v in enumerate(r):
                        cols[i].append(v)

            def f2(recs, cols):
                for r in recs:
                    for v,c in zip(r, cols):
                        c.append(v)

            def f3(recs, cols):
                for r in recs:
                    map(list.append, cols, r)

            def f4(recs):
                return zip(*recs)

            records = [ tuple(range(10)) for i in xrange(1000000) ]

            columns = tuple([] for i in xrange(10))
            t = time()
            f1(records, columns)
            print 'f1: ', time()-t

            columns = tuple([] for i in xrange(10))
            t = time()
            f2(records, columns)
            print 'f2: ', time()-t

            columns = tuple([] for i in xrange(10))
            t = time()
            f3(records, columns)
            print 'f3: ', time()-t

            t = time()
            columns = f4(records)
            print 'f4: ', time()-t

            f1: 5.10132408142
            f2: 5.06787180901
            f3: 4.04700708389
            f4: 19.13633203506

            So there is some benefit in using map(list.append). f4 is very clever
            and cool but it doesn't seem to scale.

            Incidentally, it took me a while to figure out why the following
            initialization doesn't work:
                columns = ([],)*10
            Apparently you end up with 10 references to the same list.
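
The initialization problem can be demonstrated in a few lines. A minimal sketch; the name `shared` is made up for illustration, and the code behaves the same in Python 2 and 3:

```python
# ([],) * 10 repeats a reference to ONE list object ten times.
shared = ([],) * 10
shared[0].append(1)
assert all(col is shared[0] for col in shared)   # every slot is the same object
assert shared[9] == [1]                          # so the append shows up everywhere

# A generator expression (as used in the benchmark) evaluates [] once per
# iteration, producing ten distinct lists.
columns = tuple([] for _ in range(10))
columns[0].append(1)
assert columns[1] == []
```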

            Finally, in my case the output columns are integer arrays (to save
            memory). I can still use array.append but it's a little slower so the
            difference between f1-f3 gets even smaller. f4 is not an option with
            arrays.
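
For the integer-array case, a rough sketch of what the column setup might look like, assuming Python 3, the standard array module with typecode 'i', and made-up sample records (the real data is not shown in the thread):

```python
from array import array

# Made-up sample data: four records of three integers each.
records = [tuple(range(3)) for _ in range(4)]

# One typed array per column; 'i' is the signed-int typecode.
columns = tuple(array('i') for _ in range(3))

for rec in records:
    for col, value in zip(columns, rec):
        col.append(value)

print(columns[1].tolist())
```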


            • Gary Herron

              #7
              Re: opposite of zip()?

              igor.tatarinov@gmail.com wrote:
              [snip]
              > Incidentally, it took me a while to figure out why the following
              > initialization doesn't work:
              >     columns = ([],)*10
              > apparently you end up with 10 copies of the same list.

              Yes. A well-known gotcha in Python and a FAQ.


              • rasmus

                #8
                Re: opposite of zip()?

                On Dec 15, 4:45 am, Gary Herron <gher...@islandtraining.com> wrote:
                [snip]

                If you want another answer: the opposite of zip(*lists) is
                zip(*list_of_tuples).

                That is (for a list of equal-length tuples):
                lists == zip(*zip(*lists))

                I don't know about its speed though compared to the other suggestions.
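
The round trip can be checked directly. A minimal sketch, assuming Python 3 (where zip() returns an iterator, hence the list() calls) and sample data chosen for illustration:

```python
lists = [(1, 2, 3), (4, 5, 6)]

zipped = list(zip(*lists))        # transpose rows into columns
unzipped = list(zip(*zipped))     # transposing again restores the rows

assert zipped == [(1, 4), (2, 5), (3, 6)]
assert unzipped == lists
```

The equality holds here because the inputs are tuples of equal length; zip() produces tuples and truncates to the shortest input, so lists of lists or ragged inputs would not compare equal after the round trip.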

                Matt


                • greg

                  #9
                  Re: opposite of zip()?

                  igor.tatarinov@gmail.com wrote:
                  >     map(append, arrays, tupl)
                  > except there is no unbound append() (List.append() does not exist,
                  > right?).

                  Er, no, but list.append does:

                  >>> list.append
                  <method 'append' of 'list' objects>

                  so you should be able to do

                  map(list.append, arrays, tupl)

                  provided you know that all the elements of 'arrays' are
                  actual lists.

                  --
                  Greg


                  • Rich Harkins

                    #10
                    Re: opposite of zip()?

                    igor.tatarinov@gmail.com wrote:
                    > Given a bunch of arrays, if I want to create tuples, there is
                    > zip(arrays). What if I want to do the opposite: break a tuple up and
                    > append the values to given arrays:
                    >     map(append, arrays, tupl)
                    > except there is no unbound append() (List.append() does not exist,
                    > right?).

                    list.append does exist (try the lower-case flavor).

                    > Without append(), I am forced to write a (slow) explicit loop:
                    >     for (a, v) in zip(arrays, tupl):
                    >         a.append(v)

                    Except that isn't technically the opposite of zip. The opposite would
                    be a tuple of single-dimensional tuples:

                    def unzip(zipped):
                        """
                        Given a sequence of same-sized sequences, produce a tuple of tuples
                        that represent each index within the zipped object.

                        Example:
                        >>> zipped = zip((1, 2, 3), (4, 5, 6))
                        >>> zipped
                        [(1, 4), (2, 5), (3, 6)]
                        >>> unzip(zipped)
                        ((1, 2, 3), (4, 5, 6))
                        """
                        if len(zipped) < 1:
                            raise ValueError('At least one item is required for unzip.')
                        indices = range(len(zipped[0]))
                        return tuple(tuple(pair[index] for pair in zipped)
                                     for index in indices)

                    This is probably not the most efficient hunk of code for this but this
                    would seem to be the correct behavior for the opposite of zip and it
                    should scale well.

                    Modifying the above with list.extend would produce a variant closer to
                    what I think you're asking for:

                    def unzip_extend(dests, zipped):
                        """
                        Appends the unzip versions of zipped into dests. This avoids an
                        unnecessary allocation.

                        Example:
                        >>> zipped = zip((1, 2, 3), (4, 5, 6))
                        >>> zipped
                        [(1, 4), (2, 5), (3, 6)]
                        >>> dests = [[], []]
                        >>> unzip_extend(dests, zipped)
                        >>> dests
                        [[1, 2, 3], [4, 5, 6]]
                        """
                        if len(zipped) < 1:
                            raise ValueError('At least one item is required for unzip.')
                        for index in range(len(zipped[0])):
                            dests[index].extend(pair[index] for pair in zipped)

                    This should perform pretty well, as extend with a generator expression
                    is pretty fast. Not that it's truly meaningful, but here's timeit on my
                    2GHz laptop:

                    bash-3.1$ python -m timeit -s 'import unzip; zipped=zip(range(1024), range(1024))' 'unzip.unzip_extend([[], []], zipped)'
                    1000 loops, best of 3: 510 usec per loop

                    By comparison, here's the unzip() version above:

                    bash-3.1$ python -m timeit -s 'import unzip; zipped=zip(range(1024), range(1024))' 'unzip.unzip(zipped)'
                    1000 loops, best of 3: 504 usec per loop

                    Rich


                    • Matt Nordhoff

                      #11
                      Re: opposite of zip()?

                      Rich Harkins wrote:
                      [snip]

                      As Paddy wrote, zip is its own unzip:

                      >>> zipped = zip((1, 2, 3), (4, 5, 6))
                      >>> zipped
                      [(1, 4), (2, 5), (3, 6)]
                      >>> unzipped = zip(*zipped)
                      >>> unzipped
                      [(1, 2, 3), (4, 5, 6)]

                      Neat and completely confusing, huh? :-)

                      <http://paddy3118.blogspot.com/2007/02/unzip-un-needed-in-python.html>
                      --


                      • Rich Harkins

                        #12
                        Re: opposite of zip()?

                        Matt Nordhoff wrote:
                        [snip]
                        > As Paddy wrote, zip is its own unzip:
                        >
                        > >>> zipped = zip((1, 2, 3), (4, 5, 6))
                        > >>> zipped
                        > [(1, 4), (2, 5), (3, 6)]
                        > >>> unzipped = zip(*zipped)
                        > >>> unzipped
                        > [(1, 2, 3), (4, 5, 6)]
                        >
                        > Neat and completely confusing, huh? :-)
                        >
                        > <http://paddy3118.blogspot.com/2007/02/unzip-un-needed-in-python.html>

                        I hadn't thought about zip() being symmetrical like that. Very cool...

                        Rich
