pymssql - execute loads all results into memory!

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • ChaosKCW

    pymssql - execute loads all results into memory!

    Hi

    I am trying to use pymssql, and have an issue where by the execute
    (not the fetch) is appearing to load all records into memory.

    if I execute

    con = pymssql.connect (...)
    cur = con.cursor()
    cur.execute(sql )
    rec = cur.fetchone()

    if I put in a query which returns a lot of records into "sql" then the
    execute never returns, pythons memory usage slowly ballons till the
    machine cant give anymore. If I put a sql query returning only few
    rows, then it works fine.

    So I am not sure why an execute would feel the need to load all rows,
    but its seriously crippling. Does anyone know if this is a bug or
    something I can "turn off"

    thanks,


  • Eric Wertman

    #2
    Re: pymssql - execute loads all results into memory!

    I am trying to use pymssql, and have an issue where by the execute
    (not the fetch) is appearing to load all records into memory.
    >
    if I execute
    >
    con = pymssql.connect (...)
    cur = con.cursor()
    cur.execute(sql )
    rec = cur.fetchone()
    >
    if I put in a query which returns a lot of records into "sql" then the
    execute never returns, pythons memory usage slowly ballons till the
    machine cant give anymore. If I put a sql query returning only few
    rows, then it works fine.
    >
    So I am not sure why an execute would feel the need to load all rows,
    but its seriously crippling. Does anyone know if this is a bug or
    something I can "turn off"
    I ran into this myself. After some digging I discovered that what you
    are after is a server-side cursor that isn't implemented yet in
    pymssql. There is one in MySQLdb, but it's not the default behavior.
    Regardless of your usage (fetchone vs fetchmany), the result set is
    held client side. AFAIK the only workaround is to keep your result
    set small (enough). If you use fetchmany and iterate over it
    directly, it may keep your memory usage down, I can't remember if
    that worked. I definitely tried making a generator with it, that did
    not help.

    Eric

    Comment

    • Tim Golden

      #3
      Re: pymssql - execute loads all results into memory!

      Eric Wertman wrote:
      >I am trying to use pymssql, and have an issue where by the execute
      >(not the fetch) is appearing to load all records into memory.
      >>
      >if I execute
      >>
      >con = pymssql.connect (...)
      >cur = con.cursor()
      >cur.execute(sq l)
      >rec = cur.fetchone()
      >>
      >if I put in a query which returns a lot of records into "sql" then the
      >execute never returns, pythons memory usage slowly ballons till the
      >machine cant give anymore. If I put a sql query returning only few
      >rows, then it works fine.
      >>
      >So I am not sure why an execute would feel the need to load all rows,
      >but its seriously crippling. Does anyone know if this is a bug or
      >something I can "turn off"
      >
      I ran into this myself. After some digging I discovered that what you
      are after is a server-side cursor that isn't implemented yet in
      pymssql. There is one in MySQLdb, but it's not the default behavior.
      Regardless of your usage (fetchone vs fetchmany), the result set is
      held client side. AFAIK the only workaround is to keep your result
      set small (enough). If you use fetchmany and iterate over it
      directly, it may keep your memory usage down, I can't remember if
      that worked. I definitely tried making a generator with it, that did
      not help.

      .... or just switch to pyodbc, for example, which behaves
      perfectly well with this snippet against a table of >24 million
      rows:

      <code>
      import pyodbc

      conn = [
      "Driver={SQ L Server}",
      "Server=SVR 17",
      "Database=T DI",
      "TrustedConnect ion=Yes"
      ]
      db = pyodbc.connect (";".join (conn))
      q = db.cursor ()
      q.execute ("SELECT * FROM revenue") # 24 million rows
      q.fetchone ()
      q.close ()

      </code>


      TJG

      Comment

      • ChaosKCW

        #4
        Re: pymssql - execute loads all results into memory!

        On Oct 20, 3:38 pm, Tim Golden <m...@timgolden .me.ukwrote:
        Eric Wertman wrote:
        I am trying to use pymssql, and have an issue where by the execute
        (not the fetch) is appearing to load all records into memory.
        >
        if I execute
        >
        con = pymssql.connect (...)
        cur = con.cursor()
        cur.execute(sql )
        rec  = cur.fetchone()
        >
        if I put in a query which returns a lot of records into "sql" then the
        execute never returns, pythons memory usage slowly ballons till the
        machine cant give anymore. If I put a sql query returning only few
        rows, then it works fine.
        >
        So I am not sure why an execute would feel the need to load all rows,
        but its seriously crippling. Does anyone know if this is a bug or
        something I can "turn off"
        >
        I ran into this myself.  After some digging I discovered that what you
        are after is a server-side cursor that isn't implemented yet in
        pymssql.  There is one in MySQLdb, but it's not the default behavior.
        Regardless of your usage (fetchone vs fetchmany), the result set is
        held client side.  AFAIK the only workaround is to keep your result
        set small (enough).  If you use fetchmany and iterate over it
        directly, it may keep your memory usage down,  I can't remember if
        that worked.  I definitely tried making a generator with it, that did
        not help.
        >
        ... or just switch to pyodbc, for example, which behaves
        perfectly well with this snippet against a table of >24 million
        rows:
        >
        <code>
        import pyodbc
        >
        conn = [
          "Driver={SQ L Server}",
          "Server=SVR 17",
          "Database=T DI",
          "TrustedConnect ion=Yes"
        ]
        db = pyodbc.connect (";".join (conn))
        q = db.cursor ()
        q.execute ("SELECT * FROM revenue") # 24 million rows
        q.fetchone ()
        q.close ()
        >
        </code>
        >
        TJG
        Thanks for the responses, I am astounded any db api tool doesnt
        support cursors! pymssql is mostly useless, I will switch to an odbc
        interface.

        Comment

        • Aspersieman

          #5
          Re: pymssql - execute loads all results into memory!

          On Tue, 21 Oct 2008 10:14:56 +0200, ChaosKCW <da.martian@gma il.comwrote:
          On Oct 20, 3:38 pm, Tim Golden <m...@timgolden .me.ukwrote:
          >Eric Wertman wrote:
          >I am trying to use pymssql, and have an issue where by the execute
          >(not the fetch) is appearing to load all records into memory.
          >>
          >if I execute
          >>
          >con = pymssql.connect (...)
          >cur = con.cursor()
          >cur.execute(sq l)
          >rec  = cur.fetchone()
          >>
          >if I put in a query which returns a lot of records into "sql" then
          >the
          >execute never returns, pythons memory usage slowly ballons till the
          >machine cant give anymore. If I put a sql query returning only few
          >rows, then it works fine.
          >>
          >So I am not sure why an execute would feel the need to load all rows,
          >but its seriously crippling. Does anyone know if this is a bug or
          >something I can "turn off"
          >>
          I ran into this myself.  After some digging I discovered that what you
          are after is a server-side cursor that isn't implemented yet in
          pymssql.  There is one in MySQLdb, but it's not the default behavior.
          Regardless of your usage (fetchone vs fetchmany), the result set is
          held client side.  AFAIK the only workaround is to keep your result
          set small (enough).  If you use fetchmany and iterate over it
          directly, it may keep your memory usage down,  I can't remember if
          that worked.  I definitely tried making a generator with it, that did
          not help.
          >>
          >... or just switch to pyodbc, for example, which behaves
          >perfectly well with this snippet against a table of >24 million
          >rows:
          >>
          ><code>
          >import pyodbc
          >>
          >conn = [
          >  "Driver={SQ L Server}",
          >  "Server=SVR 17",
          >  "Database=T DI",
          >  "TrustedConnect ion=Yes"
          >]
          >db = pyodbc.connect (";".join (conn))
          >q = db.cursor ()
          >q.execute ("SELECT * FROM revenue") # 24 million rows
          >q.fetchone ()
          >q.close ()
          >>
          ></code>
          >>
          >TJG
          >
          Thanks for the responses, I am astounded any db api tool doesnt
          support cursors! pymssql is mostly useless, I will switch to an odbc
          interface.
          --
          http://mail.python.org/mailman/listinfo/python-list
          Yes this is true, unfortunately.. .

          However, pyODBC doesn't support return variables(param eters) in stored
          procedures (at least with MS SQL). pymssql is the only db api for python
          that I've found that can reliably do this. I've tried adodbapi, pyodbc and
          one or two others (can't think of the names now... :-/).

          Regards

          Nicol

          --
          Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo

          Comment

          • Tim Golden

            #6
            Re: pymssql - execute loads all results into memory!

            Aspersieman wrote:
            However, pyODBC doesn't support return variables(param eters) in stored
            procedures (at least with MS SQL). pymssql is the only db api for python
            that I've found that can reliably do this. I've tried adodbapi, pyodbc
            and one or two others (can't think of the names now... :-/).
            That's a fair point (I'm assuming you're right; I've never tried). That
            tends to be a shortcoming of SQL interfaces. However, pymssql is based
            on the defunct ntwdblib.dll interface which is becoming harder and harder
            to find as of SQL2005. Have you tried putting a feature request in to
            the pyodbc team? I've not had anything to do with them for a bit, but
            they did implement the .nextset method at my request a couple of years
            ago. (That's assuming it's even possible with ODBC in general, which
            I also haven't checked).

            TJG

            Comment

            • Aspersieman

              #7
              Re: pymssql - execute loads all results into memory!

              On Tue, 21 Oct 2008 13:06:37 +0200, Tim Golden <mail@timgolden .me.uk>
              wrote:
              Aspersieman wrote:
              >However, pyODBC doesn't support return variables(param eters) in stored
              >procedures (at least with MS SQL). pymssql is the only db api for
              >python that I've found that can reliably do this. I've tried adodbapi,
              >pyodbc and one or two others (can't think of the names now... :-/).
              >
              That's a fair point (I'm assuming you're right; I've never tried). That
              tends to be a shortcoming of SQL interfaces. However, pymssql is based
              on the defunct ntwdblib.dll interface which is becoming harder and harder
              to find as of SQL2005. Have you tried putting a feature request in to
              the pyodbc team? I've not had anything to do with them for a bit, but
              they did implement the .nextset method at my request a couple of years
              ago. (That's assuming it's even possible with ODBC in general, which
              I also haven't checked).
              >
              TJG
              --

              Oh, trust me - I've tried a *LOT*. I aggree, submitting a feature request
              is a good idea. I think I'll do that. :)

              I did read somewhere (I think on the pyodbc site) that pyodbc isn't DB API
              2.0 feature complete yet - so I'll have to probably check if they haven't
              already planned this.

              Obtaining output parameters from stored procedures is something I have to
              do quite often, unfortunately - so maybe I can submit a patch even... :)

              Alternatively - I have found the adodbapi quite useful (although it also
              doesn't support output parameters), so I might ask them too.

              Thanks for the suggestion.

              Regards

              Nicol

              --
              Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo

              Comment

              • Tim Golden

                #8
                Re: pymssql - execute loads all results into memory!

                Aspersieman wrote:
                [... re output params in pymssql / pyodbc ...]
                Oh, trust me - I've tried a *LOT*. I aggree, submitting a feature
                request is a good idea. I think I'll do that. :)
                Looks like someone already has:



                I did read somewhere (I think on the pyodbc site) that pyodbc isn't DB
                API 2.0 feature complete yet - so I'll have to probably check if they
                haven't already planned this.
                >
                Obtaining output parameters from stored procedures is something I have
                to do quite often, unfortunately - so maybe I can submit a patch even... :)

                I'm sure that will be more than welcome. Looks like it's a
                one-man band (as I guess most projects are) and there aren't
                always enough minutes in the day.

                TJG

                Comment

                Working...