Shelve operations are very slow and create huge files

  • Eric Wichterich

    Shelve operations are very slow and create huge files

    Hello Pythonistas,

    I use Python shelves to store results from MySQL queries (using Python
    for web scripting).
    One script searches the MySQL database and stores the result, the next
    script reads the shelf again and processes the result. But there is a
    problem: if the second script is called too early, the error "(11,
    'Resource temporarily unavailable')" occurs.
    So I took a closer look at the file that is generated by the shelf: the
    result list from the MySQL query contains 14,600 rows with 7 columns. But
    the saved file is over 3 MB in size and contains over 230,000 lines (!),
    which seems way too much!

    The following statements are used:
    dbase = shelve.open(filename)
    if dbase.has_key(key):  # overwrite objects stored with same key
        del dbase[key]
    dbase[key] = object
    dbase.close()
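
    The second script then does little more than this (simplified, with the
    file and key names changed):

    # second script, simplified: read the stored result back and process it
    # (this is the part that fails with "(11, 'Resource temporarily
    # unavailable')" when it is called too early)
    import shelve

    dbase = shelve.open("results_shelf")
    result = dbase["query1"]
    dbase.close()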

    Any ideas?

    Thanks,
    Eric


  • Peter Otten

    #2
    Re: Shelve operations are very slow and create huge files

    Eric Wichterich wrote:
    > Hello Pythonistas,
    >
    > I use Python shelves to store results from MySQL queries (using Python
    > for web scripting).
    > One script searches the MySQL database and stores the result, the next
    > script reads the shelf again and processes the result. But there is a
    > problem: if the second script is called too early, the error "(11,
    > 'Resource temporarily unavailable')" occurs.
    > So I took a closer look at the file that is generated by the shelf: the
    > result list from the MySQL query contains 14,600 rows with 7 columns. But
    > the saved file is over 3 MB in size and contains over 230,000 lines (!),
    > which seems way too much!

    Let's see:

    >>> 3*2**20/14600/7
    30.780117416829746
    >>>
    Are thirty bytes per field, including administrative data, that much?
    By the way, don't bother counting the lines in a file containing pickled
    data; the pickle protocol inserts a newline after each attribute, unless
    you specify the binary mode, e.g.:

    shelve.open(filename, binary=True)
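
    Untested, but a quick comparison along these lines should show the
    difference (the file names are made up, and the file that actually
    appears on disk may get an extension, depending on the dbm backend):

    import os
    import shelve

    # some stand-in data, roughly the size Eric mentions
    rows = [(i, "aaaa", "bbbb", "cccc", "dddd", "eeee", "ffff")
            for i in range(14600)]

    for fname, binary in (("shelf_text", False), ("shelf_binary", True)):
        db = shelve.open(fname, binary=binary)
        db["result"] = rows
        db.close()
        if os.path.exists(fname):
            print fname, os.path.getsize(fname), "bytes"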

    > The following statements are used:
    > dbase = shelve.open(filename)
    > if dbase.has_key(key):  # overwrite objects stored with same key
    >     del dbase[key]
    > dbase[key] = object
    > dbase.close()

    I've never used the shelve module so far, but the rule of least surprise
    would suggest that

    if dbase.has_key(key):
        del dbase[key]
    dbase[key] = data

    is the same as

    dbase[key] = data

    > Any ideas?

    Try to omit the shelf completely, preferably by moving the second script's
    operations into the first. If you want to keep two scripts, don't invoke
    them independently; make a little batch file or shell script instead.
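
    Or, staying in Python, a tiny driver that runs the two scripts strictly
    one after the other would do the same job (script names invented):

    # run_both.py -- never lets the two scripts overlap
    import os

    os.system("python query_script.py")    # first script: query MySQL, store result
    os.system("python process_script.py")  # second script: read and process it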

    If you need an intermediate step with a preprocessed snapshot of the MySQL
    table, and you have sufficient rights, use a MySQL table for the temporary
    data.
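
    Schematically, with MySQLdb and invented table and column names, the
    first script would do something like the following, and the second one
    would simply SELECT from snapshot again:

    # sketch only: park the preprocessed rows in their own MySQL table
    import MySQLdb

    conn = MySQLdb.connect(db="mydb", user="scraper", passwd="secret")
    cur = conn.cursor()
    cur.execute("DROP TABLE IF EXISTS snapshot")
    cur.execute("CREATE TABLE snapshot (id INT, name VARCHAR(100), price DECIMAL(10,2))")
    cur.execute("""INSERT INTO snapshot (id, name, price)
                   SELECT id, name, price FROM items WHERE price > 10""")
    conn.commit()
    conn.close()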

    Peter
