Recursion limit of pickle?

This topic is closed.
  • Victor Lin

    Recursion limit of pickle?

    Hi,

    I have run into a problem with pickle.
    I downloaded an HTML page from:



    and parsed it with BeautifulSoup.
    The page is very large.
    When I use pickle to dump the parsed object, I get "RuntimeError:
    maximum recursion depth exceeded".
    At first I thought it was caused by this problem:



    But I no longer think so. I logged pickle's recursive calls to a file
    and found that the recursion limit is exceeded partway through
    expanding the whole BeautifulSoup object, not because some method is
    being called over and over.

    This is my test code:

    from BeautifulSoup import *

    import pickle
    import sys
    import urllib

    # Raise the interpreter's recursion limit before pickling.
    sys.setrecursionlimit(40000)

    # Download the page.
    doc = urllib.urlopen('http://www.amazon.com/Magellan-Maestro-4040-'
                         'Widescreen-Navigator/dp/B000NMKHW6/ref=sr_1_2?'
                         'ie=UTF8&s=electronics&qid=1202541889&sr=1-2')

    # Parse it and try to pickle the resulting tree.
    soup = BeautifulSoup(doc)
    print pickle.dumps(soup)

    -------------------
    What I want to ask is: is this caused by the recursion limit and the
    stack size?
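
One way to test the recursion-limit and stack-size theory is to run the dump in a worker thread that has a larger C stack and a higher recursion limit. This is only a sketch under assumptions: it targets Python 3, `call_with_big_stack` is a hypothetical helper name, and the 64 MiB stack and 40000 limit are arbitrary values. On newer CPython versions the C pickler may be guarded by a separate limit that `sys.setrecursionlimit` does not change, so this may not be enough on its own.

```python
import pickle
import sys
import threading


def call_with_big_stack(func, *args, stack_bytes=64 * 1024 * 1024,
                        recursion_limit=40000):
    """Run func(*args) in a worker thread created with a larger stack.

    Hypothetical helper. The recursion limit is interpreter-wide, so it is
    raised only while the worker runs and then restored.
    """
    outcome = {}

    def worker():
        old_limit = sys.getrecursionlimit()
        sys.setrecursionlimit(recursion_limit)
        try:
            outcome["value"] = func(*args)
        except BaseException as exc:  # hand any failure back to the caller
            outcome["error"] = exc
        finally:
            sys.setrecursionlimit(old_limit)

    # stack_size() affects threads created after the call and returns the
    # previous setting (0 means the platform default).
    old_size = threading.stack_size(stack_bytes)
    try:
        thread = threading.Thread(target=worker)
        thread.start()
    finally:
        threading.stack_size(old_size)
    thread.join()

    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]
```

Usage would look like `blob = call_with_big_stack(pickle.dumps, soup)`, where `soup` is the parsed page. If the dump succeeds this way but fails in the main thread, that points at the limits rather than at the object itself.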

    I tried cPickle first and then pickle. With cPickle, the program
    just stops running without any message.
    I think cPickle is also implemented recursively, so it too overflows
    the stack when dumping the soup.

    Is there any version of pickle that is implemented without recursion?
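
I am not aware of a non-recursive pickle in the standard library, but one workaround is to flatten the tree yourself with an explicit stack and pickle the flat result, which is only a couple of levels deep. The sketch below assumes Python 3 and uses a hypothetical `Node` class as a stand-in for a parsed tag; it is not BeautifulSoup's own data model, and a real tree would need its attributes and text carried along as well.

```python
import pickle


class Node:
    """Hypothetical stand-in for a parsed tag: a name plus child nodes."""

    def __init__(self, name):
        self.name = name
        self.children = []


def flatten(root):
    """Walk the tree iteratively; return [(name, parent_index), ...]."""
    rows = []
    stack = [(root, -1)]  # -1 marks the root, which has no parent
    while stack:
        node, parent_index = stack.pop()
        index = len(rows)
        rows.append((node.name, parent_index))
        # Push children in reverse so they are popped in document order.
        for child in reversed(node.children):
            stack.append((child, index))
    return rows


def rebuild(rows):
    """Recreate the tree; every parent row precedes its children."""
    nodes = []
    for name, parent_index in rows:
        node = Node(name)
        if parent_index >= 0:
            nodes[parent_index].children.append(node)
        nodes.append(node)
    return nodes[0]


def dump_tree(root):
    # The pickled object is a flat list of tuples, so nesting depth
    # no longer grows with the depth of the document.
    return pickle.dumps(flatten(root), pickle.HIGHEST_PROTOCOL)


def load_tree(blob):
    return rebuild(pickle.loads(blob))
```

The trade-off is that you have to write the flatten and rebuild steps for your own tree type, but neither step recurses, so the depth of the page stops mattering.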

    Thanks.

    Victor Lin.
  • Gabriel Genellina

    #2
    Re: Recursion limit of pickle?

    On Sat, 09 Feb 2008 09:49:46 -0200, Victor Lin <Bornstub@gmail.com>
    wrote:

    > I have run into a problem with pickle.
    > I downloaded an HTML page from:
    >
    > and parsed it with BeautifulSoup.
    > The page is very large.
    > When I use pickle to dump the parsed object, I get "RuntimeError:
    > maximum recursion depth exceeded".

    BeautifulSoup objects usually aren't pickleable, independently of your
    recursion error.

    py> import pickle
    py> import BeautifulSoup
    py> soup = BeautifulSoup.BeautifulSoup("<html><body>Hello, world!</html>")
    py> print pickle.dumps(soup)
    Traceback (most recent call last):
    ....
    TypeError: 'NoneType' object is not callable
    py>

    Why do you want to pickle it? Store the downloaded page instead, and
    rebuild the BeautifulSoup object later when needed.
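
Storing the downloaded page could be sketched roughly like this. It assumes Python 3; `cache_path` and `fetch_cached` are hypothetical names, and `download` is whatever callable you already use to fetch the raw bytes for a URL.

```python
import hashlib
import os


def cache_path(cache_dir, url):
    """Map a URL to a file name that is safe on any filesystem."""
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest + ".html")


def fetch_cached(url, cache_dir, download):
    """Return the raw page bytes, downloading only on a cache miss.

    `download` is a caller-supplied function taking a URL and
    returning bytes (an assumption of this sketch).
    """
    path = cache_path(cache_dir, url)
    if os.path.exists(path):
        with open(path, "rb") as f:
            return f.read()
    data = download(url)
    os.makedirs(cache_dir, exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
    return data
```

The cached bytes can then be handed to the parser whenever the soup is needed, and nothing has to be pickled at all.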

    --
    Gabriel Genellina


    • Victor Lin

      #3
      Re: Recursion limit of pickle?

      On Feb 10, 11:42 am, "Gabriel Genellina" <gagsl-...@yahoo.com.ar>
      wrote:
      > On Sat, 09 Feb 2008 09:49:46 -0200, Victor Lin <Borns...@gmail.com>
      > wrote:
      >
      >> I have run into a problem with pickle.
      >> I downloaded an HTML page from:
      >>
      >> and parsed it with BeautifulSoup.
      >> The page is very large.
      >> When I use pickle to dump the parsed object, I get "RuntimeError:
      >> maximum recursion depth exceeded".
      >
      > BeautifulSoup objects usually aren't pickleable, independently of your
      > recursion error.

      But I have pickled and unpickled other soup objects successfully.
      Only this object seems to be too deep to pickle.

      > py> import pickle
      > py> import BeautifulSoup
      > py> soup = BeautifulSoup.BeautifulSoup("<html><body>Hello, world!</html>")
      > py> print pickle.dumps(soup)
      > Traceback (most recent call last):
      > ...
      > TypeError: 'NoneType' object is not callable
      > py>
      >
      > Why do you want to pickle it? Store the downloaded page instead, and
      > rebuild the BeautifulSoup object later when needed.
      >
      > --
      > Gabriel Genellina

      Because parsing HTML costs a lot of CPU time, I want to cache the soup
      object in a file. If I need the same page again, I can get it from the
      cache, already parsed. My program's bottleneck is parsing HTML, so if
      I can parse once and unpickle the result later, it could save a lot of
      time.
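
The parse-once idea could look roughly like the sketch below. It assumes Python 3; `load_or_build` is a hypothetical helper, and `build` stands for whatever function does the expensive parse. It does not remove the recursion problem: a very deep object may still fail at the `pickle.dumps` step.

```python
import os
import pickle


def load_or_build(cache_file, build):
    """Return the cached object, or call build() and cache its result.

    `build` is a caller-supplied zero-argument function (an assumption
    of this sketch), for example one that parses a stored page.
    """
    if os.path.exists(cache_file):
        with open(cache_file, "rb") as f:
            return pickle.load(f)
    obj = build()
    # Serialise fully before opening the file, so a failure while
    # pickling does not leave a truncated cache file behind.
    blob = pickle.dumps(obj, pickle.HIGHEST_PROTOCOL)
    with open(cache_file, "wb") as f:
        f.write(blob)
    return obj
```

Whether this actually saves time depends on how unpickling compares with re-parsing for the pages in question, so it is worth timing both before committing to it.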
