sqlite3 import performance

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Ben Lee

    sqlite3 import performance

    hi folks --

    a quick python and sqlite3 performance question. i find that
    inserting a million rows of in-memory data into an in-memory database
    via a single executemany() is about 30% slower than using the sqlite3
    CLI and the .import command (reading the same data from a disk file,
    even.) i find this surprising, executemany() i assume is using a
    prepared statement and this is exactly what the .import command does
    (based on my quick perusal of the source.)

    is this discrepancy to be expected? where is the overhead coming
    from?

    for full disclosure: the python code is at the end; run it first to
    generate the data file. to test the CLI, i couldn't find a better way
    than to create an init file "sqlcmds" containing

    create table test (k int primary key, v int not null);
    ..import data test

    and then run

    time sqlite3 -init sqlcmds ':memory:' '.quit'

    the python code is

    #!/usr/bin/env python

    import sqlite3, random, timeit

    con = None
    def prepare():
    global con, cur

    con=sqlite3.con nect(':memory:' )
    con.isolation_l evel="EXCLUSIVE "
    cur=con.cursor( )

    def ins():
    global con, data, cur

    try:
    cur.execute('dr op table test')
    except:
    pass
    cur.execute('cr eate table test (key int primary key, val int not
    null);')
    con.commit()

    cur.executemany ("INSERT into test (key, val) values (?, 10)",data)
    con.commit()

    hs = {}

    print 'generating data...'
    size = 1000000
    data = [[a] for a in random.sample(x range(10000000) , size)]
    print 'done!'

    # save the data for the sqlite3 CLI
    f = file('data', 'w')
    f.writelines([str(a[0])+"|10\n" for a in data])
    f.close()

    print 'testing ' + str(size) + ' inserts...'
    # test 100K inserts
    t = timeit.Timer(st mt='ins()',
    setup="import sqlite3\nfrom __main__ import prepare, ins
    \nprepare()")
    print t.repeat(3,1)

    thanks, ben
Working...