pySQLite Insert speed

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • mdboldin@gmail.com

    pySQLite Insert speed

    I hav read on this forum that SQL coding (A) below is preferred over
    (B), but I find (B) is much faster (20-40% faster)

    (A)

    sqla= 'INSERT INTO DTABLE1 VALUES (%d, %d, %d, %f)' % values
    curs.execute(sq la)

    (B)
    pf= '?, ?, ?, ?'
    sqlxb= 'INSERT INTO DTABLE2 VALUES ( %s ) ' % pf
    curs.execute( sqlxb, values )

    Any intution on why (A) is slower?
  • Carsten Haese

    #2
    Re: pySQLite Insert speed

    On Thu, 28 Feb 2008 19:35:03 -0800 (PST), mdboldin wrote
    I hav read on this forum that SQL coding (A) below is preferred over
    (B), but I find (B) is much faster (20-40% faster)
    >
    (A)
    >
    sqla= 'INSERT INTO DTABLE1 VALUES (%d, %d, %d, %f)' % values
    curs.execute(sq la)
    >
    (B)
    pf= '?, ?, ?, ?'
    sqlxb= 'INSERT INTO DTABLE2 VALUES ( %s ) ' % pf
    curs.execute( sqlxb, values )
    >
    Any intution on why (A) is slower?
    My only problem with (B) is that it should really be this:

    sqlxb= 'INSERT INTO DTABLE2 VALUES (?, ?, ?, ?)'
    curs.execute( sqlxb, values )

    Apart from that, (B) is better than (A). The parameter binding employed in (B)
    is not only faster on many databases, but more secure. See, for example,
    http://informixdb.blogspot.com/2007/...in-blanks.html for some
    in-depth explanations of why parameter binding is better than string
    formatting for performing SQL queries with variable values.

    HTH,

    --
    Carsten Haese

    Comment

    • mdboldin@gmail.com

      #3
      Re: pySQLite Insert speed

      (B) is better than (A). The parameter binding employed in (B)
      is not only faster on many databases, but more secure.
      See, for example,http://informixdb.blogspot.com/2007/07/filling-in-
      blanks.html

      Thx. The link was helpful, and I think I have read similar things
      before-- that B is faster.
      So ... I just rewrote the test code from scratch and B is faster. I
      must have had something wrong in my original timing.

      Comment

      • Tim Roberts

        #4
        Re: pySQLite Insert speed

        mdboldin@gmail. com wrote:
        >
        >I hav read on this forum that SQL coding (A) below is preferred over
        >(B), but I find (B) is much faster (20-40% faster)
        >
        >(A)
        >
        sqla= 'INSERT INTO DTABLE1 VALUES (%d, %d, %d, %f)' % values
        curs.execute(sq la)
        >
        >(B)
        pf= '?, ?, ?, ?'
        sqlxb= 'INSERT INTO DTABLE2 VALUES ( %s ) ' % pf
        curs.execute( sqlxb, values )
        >
        >Any intution on why (A) is slower?
        I think you misunderstood. (B) is *ALWAYS* the proper way of doing
        parameterized SQL queries. Unconditionally . The (A) style is way too
        vulnerable to SQL injection attacks.
        --
        Tim Roberts, timr@probo.com
        Providenza & Boekelheide, Inc.

        Comment

        • mdboldin@gmail.com

          #5
          Re: pySQLite Insert speed

          Steve, I want to make sure I understand. My test code is below, where
          ph serves as a placeholder. I am preparing for a case where the number
          of ? will be driven by the length of the insert record (dx)

          dtable= 'DTABLE3'
          print 'Insert data into table %s, version #3' % dtable
          ph= '?, ?, ?, ?'
          sqlx= 'INSERT INTO %s VALUES ( %s ) ' % (dtable,ph)
          t0a=time.time()
          for dx in d1:
          curs1.execute(s qlx,dx)
          print (time.time()-t0a)
          print curs1.lastrowid
          conn1.commit()

          I think you are saying that sqlx is re-evaluated in each loop, i.e.
          not the same as pure hard coding of
          sqlx= 'INSERT INTO DTABLE3 VALUES ( ?, ?, ?, ? ) '
          Is that right? Hence (if I understand python convention), this can be
          solved by adding
          sqlx= copy.copy(sqlx)
          before the looping. And in tests adding this step saved about 5-10% in
          time.

          And yes, I can see why (B) is always better from a security
          standpoint. The python solutions for problems such as there are a
          great help for people like me, in the sense that the most secure way
          does not have a speed penalty (and in this case is 3-4x faster).

          Comment

          • Steve Holden

            #6
            Re: pySQLite Insert speed

            mdboldin@gmail. com wrote:
            Steve, I want to make sure I understand. My test code is below, where
            ph serves as a placeholder. I am preparing for a case where the number
            of ? will be driven by the length of the insert record (dx)
            >
            dtable= 'DTABLE3'
            print 'Insert data into table %s, version #3' % dtable
            ph= '?, ?, ?, ?'
            sqlx= 'INSERT INTO %s VALUES ( %s ) ' % (dtable,ph)
            t0a=time.time()
            for dx in d1:
            curs1.execute(s qlx,dx)
            print (time.time()-t0a)
            print curs1.lastrowid
            conn1.commit()
            >
            I think you are saying that sqlx is re-evaluated in each loop, i.e.
            not the same as pure hard coding of
            sqlx= 'INSERT INTO DTABLE3 VALUES ( ?, ?, ?, ? ) '
            Is that right?
            Yes. If the sql is constant then you would be performing an unnecessary
            computation inside the loop. Not a biggie, but it all takes time. Is the
            loop above your original code? If so I was wrong about the loop.
            Hence (if I understand python convention), this can be
            solved by adding
            sqlx= copy.copy(sqlx)
            before the looping. And in tests adding this step saved about 5-10% in
            time.
            >
            Now this I don;t really understand at all. What's the point of trying to
            replace sqlx with a copy of itself? Perhaps if you explained what you
            hope this will achieve I could comment more intelligently.
            And yes, I can see why (B) is always better from a security
            standpoint. The python solutions for problems such as there are a
            great help for people like me, in the sense that the most secure way
            does not have a speed penalty (and in this case is 3-4x faster).
            Yes, it's a real win-win. Since both the table and the number of
            arguments appear to be variable one possible solution is to build a dict
            that would allow you to look up the right SQL using the table name. So,
            suppose you have the following tables and number of arguments:

            tables = (("table1", 3),
            ("table2", 5),
            ("table3", 2)
            )

            you could create a suitable dict as (untested):

            tdict = {}
            for tbl, ct in tables:
            tdict[tbl] = "INSERT INTO %s VALUES (%s)" % \
            (tbl, ", ".join(["?"] * ct))

            Then you can use the table to look up the right SQL, quite a fast
            operation compared with actually building it.

            regards
            Steve
            --
            Steve Holden +1 571 484 6266 +1 800 494 3119
            Holden Web LLC http://www.holdenweb.com/

            Comment

            • mmm

              #7
              Re: pySQLite Insert speed

              Hence (if I understand python convention), this can be
              solved by adding
              sqlx= copy.copy(sqlx)
              before the looping. And in tests adding this step saved about 5-10% in
              time.
              >
              Now this I don;t really understand at all. What's the point of trying to
              replace sqlx with a copy of itself? Perhaps if you explained what you
              hope this will achieve I could comment more intelligently.
              >
              I am/was attempting to convert

              sqlx= 'INSERT INTO %s VALUES ( %s ) ' % (dtable,ph)

              to code that did to need to be re-evaluated. i.e. to insert the
              dtable and ph values as if they were hard coded.

              copy.copy -- A shallow copy constructs a new compound object and
              then (to the extent possible) inserts references into it to the
              objects found in the original.

              Comment

              • Steve Holden

                #8
                Re: pySQLite Insert speed

                mmm wrote:
                >>Hence (if I understand python convention), this can be
                >>solved by adding
                >>sqlx= copy.copy(sqlx)
                >>before the looping. And in tests adding this step saved about 5-10% in
                >>time.
                >Now this I don;t really understand at all. What's the point of trying to
                >replace sqlx with a copy of itself? Perhaps if you explained what you
                >hope this will achieve I could comment more intelligently.
                >>
                >
                I am/was attempting to convert
                >
                sqlx= 'INSERT INTO %s VALUES ( %s ) ' % (dtable,ph)
                >
                to code that did to need to be re-evaluated. i.e. to insert the
                dtable and ph values as if they were hard coded.
                >
                copy.copy -- A shallow copy constructs a new compound object and
                then (to the extent possible) inserts references into it to the
                objects found in the original.
                Unfortunately you weren't dealing with a compound object object here, so
                all you are doing is creating a copy of the string you've just created
                and replacing the original with it. Copy.copy() is meant for creating
                (say) lists, tuples and dicts where the elements are references to the
                same objects that the elements of the original structure referred to.

                regards
                Steve
                --
                Steve Holden +1 571 484 6266 +1 800 494 3119
                Holden Web LLC http://www.holdenweb.com/

                Comment

                • mmm

                  #9
                  Re: pySQLite Insert speed


                  Steve, I think you were right the first time is saying
                  it should really be this:
                  sqlxb= 'INSERT INTO DTABLE2 VALUES (?, ?, ?, ?)'
                  my copy.copy() has the equivalent effect.

                  Running this test code produces the output below

                  import copy

                  print 'Test 1'
                  pf= '?,?,?,?'
                  sqlx1= 'INSERT INTO DTABLE2 VALUES ( %s ) ' % pf
                  print sqlx1

                  print
                  print 'Test 2'
                  sqlx2= copy.copy(sqlx1 )
                  sqlx3= sqlx1
                  pf= '?,?,?, ****'
                  sqlx1= 'INSERT INTO DTABLE2 VALUES ( %s ) ' % pf
                  print 'sqlx1= ', sqlx1
                  print 'sqlx2= ', sqlx2
                  print 'sqlx3= ', sqlx2

                  == output
                  Test group 1
                  INSERT INTO DTABLE2 VALUES ( ?,?,?,? )

                  Test group 2
                  sqlx1= INSERT INTO DTABLE2 VALUES ( ?,?,?, **** )
                  sqlx2= INSERT INTO DTABLE2 VALUES ( ?,?,?,? )
                  sqlx3= INSERT INTO DTABLE2 VALUES ( ?,?,?,? )

                  I interpret this to mean that sqlx1 is not a simple string

                  Comment

                  • Peter Otten

                    #10
                    Re: pySQLite Insert speed

                    Steve Holden wrote:
                    What I will repeat, however, is that while there is a *slight*
                    difference is semantics between
                    >
                    s = "some string"
                    s1 = s
                    >
                    and
                    >
                    s = "some string"
                    s1 = copy.copy(s)
                    >
                    that difference is only to ensure that s and s1 point to different
                    copies of the same string in the latter case, whereas in the former case
                    s and s1 point to the same string.
                    No, both "point" to the same string:
                    >>import copy
                    >>s = "some string"
                    >>s1 = s
                    >>s1 is s
                    True
                    >>s2 = copy.copy(s)
                    >>s2 is s
                    True

                    copy.copy() is just an expensive no-op here.

                    Peter

                    Comment

                    • Steve Holden

                      #11
                      Re: pySQLite Insert speed

                      Peter Otten wrote:
                      Steve Holden wrote:
                      >
                      >What I will repeat, however, is that while there is a *slight*
                      >difference is semantics between
                      >>
                      >s = "some string"
                      >s1 = s
                      >>
                      >and
                      >>
                      >s = "some string"
                      >s1 = copy.copy(s)
                      >>
                      >that difference is only to ensure that s and s1 point to different
                      >copies of the same string in the latter case, whereas in the former case
                      >s and s1 point to the same string.
                      >
                      No, both "point" to the same string:
                      >
                      >>>import copy
                      >>>s = "some string"
                      >>>s1 = s
                      >>>s1 is s
                      True
                      >>>s2 = copy.copy(s)
                      >>>s2 is s
                      True
                      >
                      copy.copy() is just an expensive no-op here.
                      >
                      I suppose wiht strings being immutable there is no need for copy.copy()
                      to actually return anything other than its argument for a string. Thanks
                      for pointing that out.

                      regards
                      Steve
                      --
                      Steve Holden +1 571 484 6266 +1 800 494 3119
                      Holden Web LLC http://www.holdenweb.com/

                      Comment

                      • Steve Holden

                        #12
                        Re: pySQLite Insert speed

                        Peter Otten wrote:
                        Steve Holden wrote:
                        >
                        >What I will repeat, however, is that while there is a *slight*
                        >difference is semantics between
                        >>
                        >s = "some string"
                        >s1 = s
                        >>
                        >and
                        >>
                        >s = "some string"
                        >s1 = copy.copy(s)
                        >>
                        >that difference is only to ensure that s and s1 point to different
                        >copies of the same string in the latter case, whereas in the former case
                        >s and s1 point to the same string.
                        >
                        No, both "point" to the same string:
                        >
                        >>>import copy
                        >>>s = "some string"
                        >>>s1 = s
                        >>>s1 is s
                        True
                        >>>s2 = copy.copy(s)
                        >>>s2 is s
                        True
                        >
                        copy.copy() is just an expensive no-op here.
                        >
                        I suppose wiht strings being immutable there is no need for copy.copy()
                        to actually return anything other than its argument for a string. Thanks
                        for pointing that out.

                        regards
                        Steve
                        --
                        Steve Holden +1 571 484 6266 +1 800 494 3119
                        Holden Web LLC http://www.holdenweb.com/

                        Comment

                        Working...