Re: Memory error while saving dictionary of size 65000X50 using pickle

  • Nagu

    Re: Memory error while saving dictionary of size 65000X50 using pickle

    I didn't have the problem with dumping as a string. When I try to
    save this object to a file, a memory error pops up.

    I am sorry for the confusion about the dictionary's size. What I meant by
    65000X50 is that it has 65000 keys and each key maps to a list of 50
    tuples.

    I was able to save a dictionary object with 65000 keys and a list of
    15-tuple values per key to a file. But I could not do the same when
    there is a list of 25-tuple values for each of the 65000 keys.

    Your example works just fine on my side.

    Thank you,
    Nagu
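
    A rough sizing aside, as a sketch only: assume, as in Martin's
    reproduction below, that each value is a list of 50 one-element tuples
    (the real tuples may well be larger). The dictionary then holds
    65000 x 50 = 3,250,000 tuples, which Python 2.6+ can estimate with
    sys.getsizeof:

    import sys

    n_keys, n_per_key = 65000, 50
    # one sample value: a list of 50 one-element tuples, as in the
    # reproduction below; real data may be considerably bigger
    sample = [(x,) for x in range(n_per_key)]
    per_value = sys.getsizeof(sample) + sum(sys.getsizeof(t) for t in sample)
    print "tuples in total:", n_keys * n_per_key
    print "rough size of the values alone: ~%d MB" % (
        n_keys * per_value / (1024 * 1024))

    Even before pickling, the values alone occupy a couple of hundred
    megabytes, and pickle.dumps() then builds the entire result string in
    memory on top of that.
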
  • Martin v. Löwis

    #2
    Re: Memory error while saving dictionary of size 65000X50 using pickle

    > I didn't have the problem with dumping as a string. When I try to
    > save this object to a file, a memory error pops up.

    That's not what the backtrace says. The backtrace says that the error
    occurs inside pickle.dumps() (and it is consistent with the functions
    being called, so it's plausible).

    > I am sorry for the confusion about the dictionary's size. What I meant by
    > 65000X50 is that it has 65000 keys and each key maps to a list of 50
    > tuples.
    > [...]
    > Your example works just fine on my side.

    I can get the program

    import pickle

    d = {}

    for i in xrange(65000):
        d[i] = [(x,) for x in range(50)]

    print "Starting dump"
    s = pickle.dumps(d)

    to complete successfully as well; however, it consumes a lot
    of memory. I can reduce memory usage slightly by
    a) dumping directly to a file, and
    b) using cPickle instead of pickle
    i.e.

    import cPickle as pickle

    d = {}

    for i in xrange(65000):
        d[i] = [(x,) for x in range(50)]

    print "Starting dump"
    pickle.dump(d, open("/tmp/t.pickle", "wb"))
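
    A side note, as a sketch (this mostly shrinks the stream on disk rather
    than the memo overhead discussed below): cPickle.dump() also accepts a
    protocol argument, and protocol 2 writes a much more compact binary
    stream than the default text protocol:

    import cPickle as pickle

    d = {}

    for i in xrange(65000):
        d[i] = [(x,) for x in range(50)]

    # same data as above, but pickled with binary protocol 2
    # (pickle.HIGHEST_PROTOCOL); the resulting file is considerably smaller
    pickle.dump(d, open("/tmp/t.pickle", "wb"), 2)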

    The memory consumed originates primarily from the need to determine
    shared references. If you are certain that no object sharing occurs
    in your graph, you can do

    import cPickle as pickle

    d = {}

    for i in xrange(65000):
        d[i] = [(x,) for x in range(50)]

    print "Starting dump"
    p = pickle.Pickler(open("/tmp/t.pickle", "wb"))
    p.fast = True
    p.dump(d)

    With that, I see no additional memory usage, and pickling completes
    really fast.
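
    To see why this is only safe without object sharing, here is a small
    sketch (the file name is just an example): with p.fast = True the
    pickler keeps no memo, so a sub-object referenced from two places is
    written out twice and comes back as two separate objects, and cyclic
    structures cannot be pickled at all in fast mode.

    import cPickle as pickle

    shared = [1, 2, 3]
    d = {"a": shared, "b": shared}    # the same list referenced twice

    f = open("/tmp/shared.pickle", "wb")
    p = pickle.Pickler(f)
    p.fast = True                     # disable the memo
    p.dump(d)
    f.close()

    loaded = pickle.load(open("/tmp/shared.pickle", "rb"))
    print loaded["a"] is loaded["b"]  # False; a normal pickle would print True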

    Regards,
    Martin
