AssertionError in pickle's memoize function

  • Michael Hohn

    AssertionError in pickle's memoize function

    Hi,

    under python 2.2, the pickle/unpickle sequence incorrectly restores
    a larger data structure I have.

    Under Python 2.3, these structures now give an explicit exception from
    Pickler.memoize():
    assert id(obj) not in self.memo

    I'm shrinking the offending data structure down to find the problem
    and provide an easily reproducible example,
    but maybe someone on the list could tell me under what general
    conditions this assertion is expected to fail.

    Thanks,
    Michael
  • Tim Peters

    #2
    Re: AssertionError in pickle's memoize function

    [Michael Hohn]
    > under python 2.2, the pickle/unpickle sequence incorrectly restores
    > a larger data structure I have.
    >
    > Under Python 2.3, these structures now give an explicit exception from
    > Pickler.memoize():
    > assert id(obj) not in self.memo
    >
    > I'm shrinking the offending data structure down to find the problem
    > and provide an easily reproducible example,
    > but maybe someone on the list could tell me under what general
    > conditions this assertion is expected to fail.

    Assertions are never expected to fail, so "something impossible
    happened" when they do fail.

    See whether your Python has Lib/pickletools.py. There's an enormous
    amount of info about pickles in that (for example, it will tell you
    what "memo" means).

    May help to try cPickle instead of pickle. Since they're distinct
    implementations, they have different bugs. cPickle can be much faster
    than pickle, but it's a lot easier to understand pickle.py.
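As a small illustration of what the memo does, here is a sketch that assumes a current Python 3 interpreter rather than the 2.2/2.3 releases discussed in this thread. It pickles a list that holds the same inner list twice and uses pickletools to show that the second occurrence is written as a memo fetch instead of a second copy. The variable names are made up for the example.

```python
import pickle
import pickletools

shared = [1, 2]
data = [shared, shared]          # the same list object appears twice

payload = pickle.dumps(data, protocol=0)

# Names of the opcodes in the pickle stream, in order.
opcodes = [op.name for op, arg, pos in pickletools.genops(payload)]

# Opcodes that store into or fetch from the memo, across protocols.
MEMO_STORE = {"PUT", "BINPUT", "LONG_BINPUT", "MEMOIZE"}
MEMO_FETCH = {"GET", "BINGET", "LONG_BINGET"}
stores = [name for name in opcodes if name in MEMO_STORE]
fetches = [name for name in opcodes if name in MEMO_FETCH]

# Unpickling replays the memo, so object identity is preserved.
restored = pickle.loads(payload)

pickletools.dis(payload)         # annotated disassembly of the stream
print("stores:", stores)
print("fetches:", fetches)
print("identity preserved:", restored[0] is restored[1])
```

The assertion in memoize guards the "store" side: each object is supposed to be stored in the memo at most once, and every later occurrence should be a fetch.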


    • Michael Hohn

      #3
      Re: AssertionError in pickle's memoize function

      Tim Peters <tim.peters@gmail.com> writes:

      > [Michael Hohn]
      > > under python 2.2, the pickle/unpickle sequence incorrectly restores
      > > a larger data structure I have.
      > >
      > > Under Python 2.3, these structures now give an explicit exception from
      > > Pickler.memoize():
      > > assert id(obj) not in self.memo
      > >
      > > I'm shrinking the offending data structure down to find the problem
      > > and provide an easily reproducible example,
      > > but maybe someone on the list could tell me under what general
      > > conditions this assertion is expected to fail.
      >
      > Assertions are never expected to fail, so "something impossible
      > happened" when they do fail.
      >
      > See whether your Python has Lib/pickletools.py. There's an enormous
      > amount of info about pickles in that (for example, it will tell you
      > what "memo" means).
      >
      > May help to try cPickle instead of pickle. Since they're distinct
      > implementations, they have different bugs. cPickle can be much faster
      > than pickle, but it's a lot easier to understand pickle.py.

      Here is a code sample that shows the problem I ran into:

      test.py:
      ===================================
      import pickle

      class aList(list):
          def __init__(self, arg):
              # w/o this call, pickle works...
              list.__init__(self, arg)
              pass

      A = aList([1,2])
      B = aList([A, 3])

      the_data = {'a': A, 'b': B}
      A._stored_by = the_data

      pickle.dumps([the_data, B]) # ok
      pickle.dumps([B, the_data]) # fails

      ===================================


      Outputs under:

      Python 2.3 (#1, Sep 13 2003, 00:49:11)
      [GCC 3.3 20030304 (Apple Computer, Inc. build 1495)] on darwin


      9 scarlet::~:0> python test.py
      Traceback (most recent call last):
        File "test.py", line 16, in ?
          pickle.dumps([B, the_data]) # fails
        File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/pickle.py", line 1386, in dumps
          Pickler(file, protocol, bin).dump(obj)
        File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/pickle.py", line 231, in dump
          self.save(obj)
        File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/pickle.py", line 293, in save
          f(self, obj) # Call unbound method with explicit self
        File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/pickle.py", line 614, in save_list
          self._batch_appends(iter(obj))
        File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/pickle.py", line 629, in _batch_appends
          save(x)
        File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/pickle.py", line 338, in save
          self.save_reduce(obj=obj, *rv)
        File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/pickle.py", line 419, in save_reduce
          self.memoize(obj)
        File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/pickle.py", line 251, in memoize
          assert id(obj) not in self.memo
      AssertionError

      with the same problem under python on linux:

      Python 2.3 (#1, Jul 31 2003, 14:19:24)
      [GCC 2.96 20000731 (Red Hat Linux 7.3 2.96-113)] on linux2


      Traceback (most recent call last):
        File "<stdin>", line 1, in ?
        File "/usr/tmp/python-286703ll", line 1, in ?
          pickle.dumps([B, the_data]) # fails
        File "/usr/local_cci/Python-2.3/lib/python2.3/pickle.py", line 1386, in dumps
          Pickler(file, protocol, bin).dump(obj)
        File "/usr/local_cci/Python-2.3/lib/python2.3/pickle.py", line 231, in dump
          self.save(obj)
        File "/usr/local_cci/Python-2.3/lib/python2.3/pickle.py", line 293, in save
          f(self, obj) # Call unbound method with explicit self
        File "/usr/local_cci/Python-2.3/lib/python2.3/pickle.py", line 614, in save_list
          self._batch_appends(iter(obj))
        File "/usr/local_cci/Python-2.3/lib/python2.3/pickle.py", line 629, in _batch_appends
          save(x)
        File "/usr/local_cci/Python-2.3/lib/python2.3/pickle.py", line 338, in save
          self.save_reduce(obj=obj, *rv)
        File "/usr/local_cci/Python-2.3/lib/python2.3/pickle.py", line 419, in save_reduce
          self.memoize(obj)
        File "/usr/local_cci/Python-2.3/lib/python2.3/pickle.py", line 251, in memoize
          assert id(obj) not in self.memo
      AssertionError


      • Dima Dorfman

        #4
        Pickle breakage with reduce of recursive structures (was: AssertionError in pickle's memoize function)

        [Followups to python-dev, please.]

        [Michael Hohn]
        > > > under python 2.2, the pickle/unpickle sequence incorrectly restores
        > > > a larger data structure I have.
        > > >
        > > > Under Python 2.3, these structures now give an explicit exception from
        > > > Pickler.memoize():
        > > > assert id(obj) not in self.memo
        > > >

        [Tim Peters]
        > > Assertions are never expected to fail, so "something impossible
        > > happened" when they do fail.

        [Michael Hohn]
        > Here is a code sample that shows the problem I ran into:

        Summary for the OP: This is a bug in Python. Using cPickle won't help,
        but if you don't subclass builtin container types other than list,
        dict, and tuple, using pickle protocol 2 should work. The rest of this
        message is for python-dev.


        The simplest breaking case is:

            t = type('t', (list,), {})
            obj = t()
            obj.append(obj)
            pickle.dumps(obj)
            [infinite recursion]

        The subclass causes save_reduce to be used instead of save_list. For
        proto < 2, copy_reg._reduce_ex returns (_reconstructor, list(a)), and
        the args--list(a)--cycle back through obj. Initially it looks like this
        should be okay, but the args are saved before obj is memoized, and obj
        can't be memoized until REDUCE can be executed with the args--and
        there's the cycle. It is even more obviously impossible from the
        unpickler's perspective because it has to call _reconstructor([obj]) to
        create obj!
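The difference between the two reduce paths can be inspected without pickling anything. The sketch below assumes a current Python 3 interpreter, where the module is named copyreg and the exact tuple layout may differ from 2.3. SubList is a hypothetical stand-in for the subclass t above.

```python
class SubList(list):
    """Hypothetical list subclass, standing in for t (or the OP's aList)."""

obj = SubList()
obj.append(obj)                      # obj contains itself

# Protocols 0 and 1: the reduce value is (_reconstructor, (cls, base, state)),
# where state is a plain-list copy of the items. That copy holds obj itself,
# so the args cycle back through the object being reduced.
func_old, args_old = obj.__reduce_ex__(1)[:2]
copied_items = args_old[2]

# Protocol 2: the args are just (cls,). The items travel separately as an
# iterator in the fourth slot (listitems) and are appended only after the
# new object has been created and memoized.
rv_new = obj.__reduce_ex__(2)
func_new, args_new = rv_new[0], rv_new[1]
listitems = rv_new[3]

print(func_old.__name__, args_old[:2], copied_items[0] is obj)
print(func_new.__name__, args_new)
```

With protocol 2 the object never appears in its own constructor args, which is why the advice to the OP is to use protocol 2.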

        There are two separate problems:

        1. Any __reduce__ implementation that returns args that cycle back
           through the object it tried to reduce hasn't done its job. As
           described above, _reduce_ex is one such implementation. reduce_2
           avoids this by using the listitems and dictitems parameters. Since
           that's a pickler-side feature it can be used in _reduce_ex too. The
           basetype(obj) hook (documented in PEP 307) would remain for
           immutable bases; it doesn't work for containers, but user-defined
           containers already have to implement their own reduce functions.
           POC patch: http://www.trit.org/~dima/home/reduce_ex.diff

           At least the set and deque types also have this problem.

        2. The pickle implementations don't detect reduction cycles. Pickling
           an instance of this obviously broken class causes an infinite
           recursion:

               class evil(object):
                   def __reduce__(self):
                       return evil, (self,)

           It's easy to detect this case. POC patch for the pickle module:

        BTW, the failed assert the OP is seeing happens when the cycle goes
        through another object:

            t = type('t', (list,), {})
            obj = t()
            d = {'obj': obj}
            obj.append(d)
            pickle.dumps(obj)
            [AssertionError]

        cPickle has the same problem, but it lacks the assert, so it writes
        garbage instead:

            new = cPickle.loads(cPickle.dumps(obj))
            new[0]['obj'] is new -> False # wrong
            obj[0]['obj'] is obj -> True # right

        This makes the reduction cycle check (#2 above) more than just cosmetic
        since if cPickle had that assert (it should) it would've been a crash.
        Right now it's garbage output instead, which is arguably worse.
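One way to picture such a check, as a rough sketch only and not the patch itself: a stand-alone helper that asks whether an object is reachable from the args of its own reduce value. It assumes a current Python 3 interpreter and looks only inside built-in containers. A real check would live inside the pickler and track objects whose reduction is still in progress. The helper name and the classes are hypothetical.

```python
def reduce_args_reach(obj, protocol=1):
    """Return True if obj is reachable from the args of its own reduce value.

    Only descends into built-in containers; this is an illustration,
    not a complete reduction-cycle detector.
    """
    args = obj.__reduce_ex__(protocol)[1]
    pending = list(args)
    seen = set()
    while pending:
        item = pending.pop()
        if item is obj:
            return True
        if id(item) in seen:
            continue
        seen.add(id(item))
        if isinstance(item, dict):
            pending.extend(item.keys())
            pending.extend(item.values())
        elif isinstance(item, (list, tuple, set, frozenset)):
            pending.extend(item)
    return False


class Evil(object):
    """Same shape as the broken class above."""
    def __reduce__(self):
        return type(self), (self,)


class SubList(list):
    """Hypothetical stand-in for t."""


direct = SubList()
direct.append(direct)                # cycle straight back to the object

indirect = SubList()
indirect.append({'obj': indirect})   # cycle through another object

print(reduce_args_reach(Evil()))
print(reduce_args_reach(direct, 1), reduce_args_reach(indirect, 1))
print(reduce_args_reach(direct, 2), reduce_args_reach(indirect, 2))
```

Under protocol 1 both list cases are flagged, because the copied item list is part of the args. Under protocol 2 neither is, because the args hold only the class.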

        Formally complete versions of the above patches will be on SF tomorrow
        unless someone suggests better alternatives.

        Dima.
