Nested generator caveat

This topic is closed.
  • Dieter Maurer

    Nested generator caveat

    I met the following surprising behaviour
    >>> def gen0():
    ...     for i in range(3):
    ...         def gen1():
    ...             yield i
    ...         yield i, gen1()
    ...
    >>> for i,g in gen0(): print i, g.next()
    ...
    0 0
    1 1
    2 2
    >>> for i,g in list(gen0()): print i, g.next()
    ...
    0 2
    1 2
    2 2


    If this is not a bug, it is at least quite confusing.


    The apparent reason is that the free variables in
    nested generator definitions are not bound (to a value) at invocation
    time but only at access time.


    Almost surely, the same applies to all locally defined functions
    with free variables.
    This would mean that locally defined functions with free
    variables are very risky in generators.


    --
    Dieter
  • Raymond Hettinger

    #2
    Re: Nested generator caveat

    On Jul 3, 9:20 pm, "Dieter Maurer" <die...@handshake.de> wrote:
    > The apparent reason is that the free variables in
    > nested generator definitions are not bound (to a value) at invocation
    > time but only at access time.

    <Yawn> That's what it is supposed to do. Welcome to a dynamic
    language.


    Raymond


    • Mark Wooding

      #3
      Re: Nested generator caveat

      Dieter Maurer <dieter@handshake.de> wrote:
      > I met the following surprising behaviour

      [code moved until later...]

      > The apparent reason is that the free variables in nested generator
      > definitions are not bound (to a value) at invocation time but only at
      > access time.

      No. This is about the difference between binding and assignment.
      Unfortunately, Python doesn't have explicit syntax for doing the former.

      Here's what's actually going on in your generator.
      >>> def gen0():
      ...     for i in range(3):
      ...         def gen1():
      ...             yield i
      ...         yield i, gen1()

      The function gen0 contains a yield statement; it's therefore a
      generator. It contains an assignment to a variable i (in this case,
      it's implicit in the `for' loop). So, on entry to the code, a fresh
      location is allocated, and the variable i is bound to it.

      The function gen1 contains a yield statement too, so it's also a
      generator. It contains a free reference to a variable i, so it shares
      the binding in the outer scope.

      Here's the important part: the for loop works by assigning to the
      location named by i each time through. It doesn't rebind i to a fresh
      location. So each time you kick gen1, it produces the current value of
      i at that time. So...
      >>> for i,g in gen0(): print i, g.next()
      0 0
      1 1
      2 2

      Here, the for loop in gen0 is suspended each iteration while we do some
      printing. So the variable i (in gen0) still matches the value yielded
      by gen0.

      But...
      >>> for i,g in list(gen0()): print i, g.next()
      0 2
      1 2
      2 2

      Here, gen0 has finished all of its iterations before we start kicking
      any of the returned generators. So the value of i in gen0 is 2 (the
      last element of range(3)).
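
For readers on Python 3, the two cases above can be reproduced with the sketch below. It is an adaptation, not the code from the original post: `g.next()` becomes `next(g)`, and the results are collected into two lists (the names `lazy` and `eager` are made up here) instead of being printed inside the loop.

```python
def gen0():
    for i in range(3):
        def gen1():
            # i is a free variable: it is read from gen0's scope
            # only when gen1's body actually runs.
            yield i
        yield i, gen1()

# Case 1: advance each inner generator while gen0 is still
# suspended at its yield, so gen0's i matches the yielded value.
lazy = [(i, next(g)) for i, g in gen0()]

# Case 2: list() runs gen0 to completion first, so gen0's i
# already holds its final value when the inner generators run.
eager = [(i, next(g)) for i, g in list(gen0())]

print(lazy)   # [(0, 0), (1, 1), (2, 2)]
print(eager)  # [(0, 2), (1, 2), (2, 2)]
```
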
      > Almost surely, the same applies to all locally defined functions
      > with free variables.
      > This would mean that locally defined functions with free
      > variables are very risky in generators.

      It means that you must be careful about the difference between binding
      and assignment when dealing with closures of whatever kind.

      Here's an example without involving nested generators.

      def gen():
          for i in xrange(3):
              yield lambda: i
      for f in gen(): print f()
      for f in list(gen()): print f()
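
A Python 3 rendering of this lambda example, as a rough sketch, makes the shared binding visible. It assumes CPython's `__closure__` attribute, which exposes the cells a function closes over. The names `in_step`, `afterwards`, `fs` and `same_cell` are illustrative only.

```python
def gen():
    for i in range(3):
        yield lambda: i

# Called while gen is suspended: each lambda sees the current i.
in_step = [f() for f in gen()]

# Called after gen has finished: every lambda sees the last i.
afterwards = [f() for f in list(gen())]

# All lambdas from one gen() call close over the very same cell,
# which is why assignment by the for loop is visible to all of them.
fs = list(gen())
same_cell = fs[0].__closure__[0] is fs[2].__closure__[0]

print(in_step)     # [0, 1, 2]
print(afterwards)  # [2, 2, 2]
print(same_cell)   # True
```
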

      To fix the problem, you need to arrange for something to actually rebind
      a variable around your inner generator on each iteration through. Since
      Python doesn't have any mechanism for controlling variable binding other
      than defining functions, that's what you'll have to do.

      def genfix():
          for i in xrange(3):
              def bind(i):
                  def geninner():
                      yield i
                  return geninner()
              yield i, bind(i)

      shows the general pattern, but since a generator has the syntactic form
      of a function anyway, we can fold the two together.

      def genfix2():
          for i in xrange(3):
              def geninner(i):
                  yield i
              yield i, geninner(i)
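
As a quick check, here is a Python 3 sketch of the folded version (`range` instead of `xrange`, `next(g)` instead of `g.next()`). It relies on the fact that calling a generator function binds its arguments immediately, even though the body does not run until the first `next()`. The name `fixed` is just for illustration.

```python
def genfix2():
    for i in range(3):
        def geninner(i):
            # This i is geninner's own parameter, bound afresh
            # on every call, not the loop variable of genfix2.
            yield i
        yield i, geninner(i)

# Exhaust the outer generator first, then run the inner ones:
# the case that went wrong with the original gen0.
fixed = [(i, next(g)) for i, g in list(genfix2())]

print(fixed)  # [(0, 0), (1, 1), (2, 2)]
```

Since `geninner` no longer has any free variables, it could also be defined once outside the loop; it is kept inside here only to stay close to the code above.
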

      Yes, this is cumbersome. A `let' statement would help a lot. Or
      macros. ;-)

      -- [mdw]


      • marek.rocki@wp.pl

        #4
        Re: Nested generator caveat

        Excellent explanation by Mark Wooding. I would only like to add that
        the standard pythonic idiom in such cases seems to be the (ab)use of a
        default argument to the function, because these get evaluated at the
        definition time:
        def gen0():
            for i in range(3):
                def gen1(i = i):
                    yield i
                yield i, gen1()
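
The same idiom applied to the lambda variant, sketched for Python 3. The function name `gen_lambdas` and the variables below are made up for this illustration. One caveat worth keeping in mind: the captured value is only a default, so a caller who passes an argument overrides it.

```python
def gen_lambdas():
    for i in range(3):
        # The default value is evaluated once, when the lambda
        # is created, so each lambda keeps its own snapshot of i.
        yield lambda i=i: i

fs = list(gen_lambdas())
snapshots = [f() for f in fs]

# The snapshot is an ordinary default argument and can be overridden.
overridden = fs[0](99)

print(snapshots)   # [0, 1, 2]
print(overridden)  # 99
```
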
