Database Query Contains Old Data

  • giraffeboy@gmail.com

    Database Query Contains Old Data

    Hi there,

    I'm having a problem with the Python db api, using MySQL.

    I've written a program with a GUI using wxPython, the GUI is contained
    in main.py which imports another module - reports.py. There are
    several reports that are selected using the gui, and each report is a
    class in the file reports.py. These classes contain a method which is
    passed data as arguments and produces a report, writing HTML to a file
    and using matplotlib to create a graph file.

    The report class methods are called from the GUI using the following
    code:
    agentlist = self.getselected()
    detail = self.cb1.GetValue()
    if self.getdates() != False:
        fromdate, todate = self.getdates()
        app.mainframe.SetStatusText("Generating Report...")
        if self.reportchoice.GetSelection() == 0:
            thereport = reports.VehicleRunningCost()
            thereport.producereport(agentlist, fromdate=fromdate,
                                    todate=todate, detail=detail)
        app.mainframe.SetStatusText("Report Complete")
        viewerframe = ViewerFrame(None, -1, "Mileage and Fuel Report Viewer")
        viewerframe.Show(True)

    The first time you run a report, everything works as expected but if
    you run it a second time, after modifying data, it seems that the data
    from before the modification is selected on the second report run.

    It would be much appreciated if anyone could give me any hints as to
    why the old data is selected. I haven't posted the full code because
    it's quite long but will post more if anyone wants to see specific
    parts.

    Thanks!
    Andrew
  • Jerry Hill

    #2
    Re: Database Query Contains Old Data

    On Wed, May 21, 2008 at 6:30 AM, <giraffeboy@gmail.com> wrote:
    > The first time you run a report, everything works as expected but if
    > you run it a second time, after modifying data, it seems that the data
    > from before the modification is selected on the second report run.
    Did you remember to commit your changes before re-running the report?
    Python's DB API requires that any auto-commit feature of the
    underlying database be turned off by default, so you are required to
    commit changes yourself. If you're used to having auto-commit turned
    on, this can be confusing.

    See http://www.python.org/dev/peps/pep-0249/ for more details of
    python's DB API.
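
    A minimal runnable sketch of the point above, using the stdlib sqlite3
    module to stand in for MySQLdb (both follow the DB-API 2.0 pattern;
    the table and column names are made up for illustration):

```python
import sqlite3  # stands in for MySQLdb here; both follow the DB-API 2.0 pattern

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE mileage (km INTEGER)")
cur.execute("INSERT INTO mileage (km) VALUES (100)")
conn.commit()  # the DB-API disables auto-commit, so this step is required
cur.execute("SELECT km FROM mileage")
row = cur.fetchone()
```

    Without the commit(), other connections may never observe the inserted row.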

    --
    Jerry


    • Paul Boddie

      #3
      Re: Database Query Contains Old Data

      On 21 May, 15:22, giraffe...@gmail.com wrote:
      > I did and I confirmed this by modifying the data, selecting it from
      > the mysql command line client to verify the changes, then running the
      > report again. If I exit the application and then start it again,
      > everything works as expected until the second instance of the report
      > is run.
      Note that if you have a connection open in your program, especially if
      that connection has already been used to select data, it may be the
      case that you then have to perform a rollback or commit before
      attempting to access newly added data. The reason for this behaviour
      is that the DB-API modules will have begun a transaction on your
      behalf, and while that transaction is open, changes committed in other
      transactions may be unavailable to your own transaction, depending on
      the transaction isolation level.

      MySQL appears to use "repeatable read" by default [1] as its
      transaction isolation level, whereas PostgreSQL (for example) uses
      "read committed" by default [2]. I would guess that if you were using
      PostgreSQL, this particular problem would not have occurred, but there
      are other reasons to be aware of the effects of long duration
      transactions in PostgreSQL, and the practice of periodically
      performing a rollback would still be worth considering with that
      database system.
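
      The rollback-before-requery pattern described above can be sketched
      like this (sqlite3 stands in for MySQLdb; the helper and table names
      are illustrative, not from the original program):

```python
import sqlite3  # any DB-API 2.0 module behaves the same way; names illustrative

def run_report_query(conn, sql):
    # Discard whatever transaction the DB-API module opened on our behalf,
    # so the next SELECT starts a fresh one and sees newly committed data.
    conn.rollback()
    cur = conn.cursor()
    cur.execute(sql)
    return cur.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE odometer (km INTEGER)")
conn.execute("INSERT INTO odometer (km) VALUES (42)")
conn.commit()
rows = run_report_query(conn, "SELECT km FROM odometer")
```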

      Paul

      [1] http://dev.mysql.com/doc/refman/5.1/...isolation.html
      [2] http://www.postgresql.org/docs/8.1/i...ction-iso.html


      • giraffeboy@gmail.com

        #4
        Re: Database Query Contains Old Data

        On May 21, 3:23 pm, Paul Boddie <p...@boddie.org.uk> wrote:
        > Note that if you have a connection open in your program, especially if
        > that connection has already been used to select data, it may be the
        > case that you then have to perform a rollback or commit before
        > attempting to access newly added data. The reason for this behaviour
        > is that the DB-API modules will have begun a transaction on your
        > behalf, and while that transaction is open, changes committed in other
        > transactions may be unavailable to your own transaction, depending on
        > the transaction isolation level.
        Thanks for that, Paul; it seems to have solved the problem perfectly. I
        had always assumed that querying a database would give you the most
        current data. I guess it just goes to show that things are never as
        simple as they first appear!


        • John Nagle

          #5
          Re: Database Query Contains Old Data

          Paul Boddie wrote:
          > On 21 May, 15:22, giraffe...@gmail.com wrote:
          >> I did and I confirmed this by modifying the data, selecting it from
          >> the mysql command line client to verify the changes, then running the
          >> report again. If I exit the application and then start it again,
          >> everything works as expected until the second instance of the report
          >> is run.
          >
          > Note that if you have a connection open in your program, especially if
          > that connection has already been used to select data, it may be the
          > case that you then have to perform a rollback or commit before
          > attempting to access newly added data.
          Exactly. Although it seems counterintuitive, it's not enough
          to do a COMMIT after UPDATE and INSERT operations. You also have to
          do a COMMIT after a SELECT if you're going to reuse the database handle
          and do another SELECT. Otherwise, you reread the same data forever.

          COMMIT, by the way, is per database handle, so if you have
          multiple database handles, each needs to handle its own COMMIT needs.
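
          The per-handle point can be demonstrated with two stdlib sqlite3
          connections to the same file (sqlite3 stands in for MySQLdb; the
          file and table names are made up):

```python
import os, sqlite3, tempfile

# Two handles to the same database file: each manages its own transactions.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
a = sqlite3.connect(path)
b = sqlite3.connect(path)
a.execute("CREATE TABLE t (x INTEGER)")
a.execute("INSERT INTO t (x) VALUES (1)")
a.commit()  # until handle a commits, handle b cannot see the new row
visible = b.execute("SELECT x FROM t").fetchone()
```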

          John Nagle


          • Paul Boddie

            #6
            Re: Database Query Contains Old Data

            On 23 May, 17:18, John Nagle <na...@animats.com> wrote:
            > Exactly. Although it seems counterintuitive, it's not enough
            > to do a COMMIT after UPDATE and INSERT operations. You also have to
            > do a COMMIT after a SELECT if you're going to reuse the database handle
            > and do another SELECT. Otherwise, you reread the same data forever.
            You can also do a rollback, as I noted, since the aim is merely to
            obtain a new transaction by discarding the current one. Upon
            performing a new select using the DB-API such a new transaction will
            then be obtained, since the rollback or commit will have ensured that
            no transaction is currently open.

            If the DB-API exposed the MySQL/PostgreSQL semantics with explicit
            transactions (see [1] for pertinent material), then the technique
            discussed above would either be superfluous (you might not be using
            transactions at all) or more obvious (you would have issued an
            explicit "start transaction" or "begin" command), but there are
            obviously good arguments for exposing the standard semantics through
            the API instead.
            > COMMIT, by the way, is per database handle, so if you have
            > multiple database handles, each needs to handle its own COMMIT needs.
            That's worth remembering, yes.

            Paul

            [1] http://www.postgresql.org/docs/8.1/s...ansaction.html


            • James A. Donald

              #7
              Re: Database Query Contains Old Data

              On Wed, 21 May 2008 07:23:04 -0700 (PDT), Paul Boddie wrote:
              > MySQL appears to use "repeatable read" by default [1] as its
              > transaction isolation level, whereas PostgreSQL (for example) uses
              > "read committed" by default [2]. I would guess that if you were using
              > PostgreSQL, this particular problem would not have occurred, but there
              > are other reasons to be aware of the effects of long duration
              > transactions in PostgreSQL, and the practice of periodically
              > performing a rollback would still be worth considering with that
              > database system.
              If one has transactions open for a long time, or transactions that
              involve a great deal of data, this will result in poor performance or
              poor scalability. But one may have such large transactions without
              being aware of it. Is there any way to make transaction size salient
              to the developer? Any way to make sure one is committing as early and
              often as possible?

              --
              ----------------------
              We have the right to defend ourselves and our property, because
              of the kind of animals that we are. True law derives from this
              right, not from the arbitrary power of the omnipotent state.

              http://www.jim.com/ James A. Donald


              • Paul Boddie

                #8
                Re: Database Query Contains Old Data

                On 3 Jun, 00:17, James A. Donald <jam...@echeque.com> wrote:
                > On Wed, 21 May 2008 07:23:04 -0700 (PDT), Paul Boddie wrote:
                >> MySQL appears to use "repeatable read" by default [1] as its
                >> transaction isolation level, whereas PostgreSQL (for example) uses
                >> "read committed" by default [2]. I would guess that if you were using
                >> PostgreSQL, this particular problem would not have occurred, but there
                >> are other reasons to be aware of the effects of long duration
                >> transactions in PostgreSQL, and the practice of periodically
                >> performing a rollback would still be worth considering with that
                >> database system.
                >
                > If one has transactions open for a long time, or transactions that
                > involve a great deal of data, this will result in poor performance or
                > poor scalability.
                I think you need to explain this to me. If there's a long-running
                transaction happening in the background and my own transactions get
                created and rolled back periodically, how would the long-running
                transaction be affected? If, on the other hand, my own transaction is
                long-running, I can see that rolling it back would incur a cost, but
                what choice do I have other than to perform a rollback more often (or
                to disable transactions, which might incur other costs)? I don't want
                to perform a commit instead merely for performance reasons, especially
                if it impacts correctness.

                I was actually thinking of lock acquisition in PostgreSQL when I made
                the remark. With lots of tables in a database, it's possible to
                acquire a large number of locks, and retaining locks can also prevent
                other operations from being carried out.
                > But one may have such large transactions without
                > being aware of it. Is there any way to make transaction size salient
                > to the developer? Any way to make sure one is committing as early and
                > often as possible?
                I'm not aware of anything which will tell you how big your transaction
                is, but there may be some kind of table or function which provides
                some details about such things. However, it is possible to see how
                many locks your transaction has, and on which tables.

                Paul


                • M.-A. Lemburg

                  #9
                  Re: Database Query Contains Old Data

                  On 2008-06-03 00:17, James A. Donald wrote:
                  > On Wed, 21 May 2008 07:23:04 -0700 (PDT), Paul Boddie wrote:
                  >> MySQL appears to use "repeatable read" by default [1] as its
                  >> transaction isolation level, whereas PostgreSQL (for example) uses
                  >> "read committed" by default [2]. I would guess that if you were using
                  >> PostgreSQL, this particular problem would not have occurred, but there
                  >> are other reasons to be aware of the effects of long duration
                  >> transactions in PostgreSQL, and the practice of periodically
                  >> performing a rollback would still be worth considering with that
                  >> database system.
                  >
                  > If one has transactions open for a long time, or transactions that
                  > involve a great deal of data, this will result in poor performance or
                  > poor scalability.
                  Poor performance is usually not an issue since databases are
                  optimized to work with transactions.

                  What's more important is that an open transaction will cause locks
                  on the tables you are writing to. Depending on the database
                  backend these locks may lock the entire table or just a few rows.

                  In any case, such locks prevent accessing the tables or rows
                  in question from other connections and that will quickly turn
                  into a major problem if you have more than just one connection
                  to the database.
                  > But one may have such large transactions without
                  > being aware of it. Is there any way to make transaction size salient
                  > to the developer? Any way to make sure one is committing as early and
                  > often as possible?
                  This depends on the database in question. By accessing system tables
                  directly you can usually find out a lot about the database and your
                  current transaction.

                  That said, it's not a good idea to commit a logical transaction (i.e.
                  a user entering data) in multiple chunks. You'd lose the most important
                  feature of transactions: that of being able to rollback to the start
                  of the transaction.

                  As others have mentioned, in systems that have long running logical
                  transactions, it's usually best to collect the data until the very
                  end and then apply all changes in one go (and one database
                  transaction).

                  Another problem with long running transactions is that the data
                  in the tables may change after the start of the transaction. This
                  can result in invalid data being committed (e.g. one part of a calculation
                  uses the data at time t1 and another at time t2). This can be
                  avoided by using snapshots, versioning and timestamps in the tables,
                  so that all queries use the same data.

                  And if this were not enough, you often run into conflicts during
                  the commit phase due to changes made by others to the same tables.

                  These can usually only be avoided by implementing merge strategies
                  in your application, unless you want to lock out all other users
                  during the transaction ... which would bring you back to the
                  original problem.

                  In summary: long running transactions are not easy to get right :-)
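
                  The "collect, then apply in one go" approach can be sketched
                  like this (sqlite3 as a stand-in DB-API module; the table and
                  the buffered edits are invented for illustration):

```python
import sqlite3  # sketch: buffer the user's edits, then one short transaction

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fuel_log (litres REAL)")

# The long-running "logical" transaction: edits held in application memory.
pending = [(40.5,), (38.2,)]

# Apply everything in one short database transaction at the very end.
conn.executemany("INSERT INTO fuel_log (litres) VALUES (?)", pending)
conn.commit()
count = conn.execute("SELECT COUNT(*) FROM fuel_log").fetchone()[0]
```

                  The database transaction is then only open for the final
                  executemany/commit, keeping lock duration minimal.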

                  --
                  Marc-Andre Lemburg
                  eGenix.com

                  Professional Python Services directly from the Source (#1, Jun 03 2008)
                  >> Python/Zope Consulting and Support ... http://www.egenix.com/
                  >> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
                  >> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
                  ________________________________________________________________________
                  2008-07-07: EuroPython 2008, Vilnius, Lithuania 33 days to go

                  :::: Try mxODBC.Zope.DA for Windows, Linux, Solaris, MacOSX for free! ::::


                  eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
                  D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
                  Registered at Amtsgericht Duesseldorf: HRB 46611


                  • James A. Donald

                    #10
                    Re: Database Query Contains Old Data

                    On Tue, 03 Jun 2008 12:07:07 +0200, "M.-A. Lemburg" <mal@egenix.com>
                    wrote:
                    > As others have mentioned, in systems that have long running logical
                    > transactions, it's usually best to collect the data until the very
                    > end and then apply all changes in one go (and one database
                    > transaction).
                    I understand you to mean that one should arrange matters so that what
                    is a lengthy transaction from the point of view of the user is a short
                    transaction from the point of view of the database.



                    • M.-A. Lemburg

                      #11
                      Re: Database Query Contains Old Data

                      On 2008-06-03 14:29, James A. Donald wrote:
                      > On Tue, 03 Jun 2008 12:07:07 +0200, "M.-A. Lemburg" <mal@egenix.com>
                      > wrote:
                      >> As others have mentioned, in systems that have long running logical
                      >> transactions, it's usually best to collect the data until the very
                      >> end and then apply all changes in one go (and one database
                      >> transaction).
                      >
                      > I understand you to mean that one should arrange matters so that what
                      > is a lengthy transaction from the point of view of the user is a short
                      > transaction from the point of view of the database.
                      Yes, mainly to avoid database locks.

                      In a multi-user environment, you will need to add some merge logic
                      in your application, to prevent conflict errors in the database.

                      --
                      Marc-Andre Lemburg
                      eGenix.com

