Database abstraction layers

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Colin McKinnon

    Database abstraction layers

    Hi all,

    There's lots of DB abstraction layers out there, but a quick look around
    them hasn't turned up anything which seems to met my requirements. Before I
    go off and write one I thought I'd ask here if anyone knows of such a
    beast...

    I want some code where I present an array of data, and the corresponding
    primary key and let the code work out whether to INSERT or UPDATE it, I
    also want to be able to present the data from a QBF or QBE then be able to
    step through the result set. However I don't want to have to configure the
    DBMS structure - after all most of it is already in the DBMS (OK so not the
    relationships in a MySQL db). It'd be really cool if I could throw SQL
    directly at it *too*.

    Anybody any suggestions?

    TIA,

    C.
  • Iván Sánchez Ortega

    #2
    Re: Database abstraction layers

    -----BEGIN PGP SIGNED MESSAGE-----
    Hash: SHA1

    Colin McKinnon wrote:
    [color=blue]
    > I want some code where I present an array of data, and the corresponding
    > primary key and let the code work out whether to INSERT or UPDATE it,[...][/color]
    [...][color=blue]
    > Anybody any suggestions?[/color]

    Use MySQL and REPLACE statements.

    - --
    - ----------------------------------
    Iván Sánchez Ortega -i-punto-sanchez--arroba-mirame-punto-net

    Un ordenador no es un televisor ni un microondas, es una herramienta
    compleja.
    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v1.4.3 (GNU/Linux)

    iD8DBQFEY8CT3jc Q2mg3Pc8RAl57AJ 901ARs69nLk3mpt IDYKlfLDCeBogCf ft10
    HOi1URDxXXfp8rN g1RoHIGI=
    =93vZ
    -----END PGP SIGNATURE-----

    Comment

    • Toby Inkster

      #3
      Re: Database abstraction layers

      Iván Sánchez Ortega wrote:[color=blue]
      > Colin McKinnon wrote:
      >[color=green]
      >> I want some code where I present an array of data, and the corresponding
      >> primary key and let the code work out whether to INSERT or UPDATE it,[...][/color]
      > [...][color=green]
      >> Anybody any suggestions?[/color]
      >
      > Use MySQL and REPLACE statements.[/color]

      Very abstract.

      One technique for the OP might be to prepare both an INSERT and an UPDATE
      statement. Run the INSERT statement; if it fails then there's probably
      already a row with that primary key, so run the UPDATE statement. Or you
      could run UPDATE first, check the number of rows updated; if 0, run the
      INSERT. Not particularly clean, but should be quite portable.

      --
      Toby A Inkster BSc (Hons) ARCS
      Contact Me ~ http://tobyinkster.co.uk/contact

      Comment

      • NC

        #4
        Re: Database abstraction layers

        Colin McKinnon wrote:[color=blue]
        >
        > There's lots of DB abstraction layers out there, but a quick look around
        > them hasn't turned up anything which seems to met my requirements.[/color]

        Which probably means that you shouldn't even attempt to solve your
        problem with an abstraction layer; easier solutions might be
        available...
        [color=blue]
        > I want some code where I present an array of data, and the corresponding
        > primary key and let the code work out whether to INSERT or UPDATE it[/color]

        You don't need PHP code for that; this can be handled on the database
        level:

        INSERT INTO the_table
        SET field1=value1, field2=value2, field3=value3, ...
        ON DUPLICATE KEY UPDATE field1=value1, field2=value2, field3=value3,
        ....

        This syntax is available since MySQL 4.1.0.

        You could also use a REPLACE query, but keep in mind that it works by
        deleting the old record and inserting a new one, so if there are fields
        that exist in the table, but not in the query, they will be lost.

        Cheers,
        NC

        Comment

        • Kenneth Downs

          #5
          Re: Database abstraction layers

          Colin McKinnon wrote:

          Are you married to mySQL?

          I've created a dictionary-based comprehensive tool that uses Postgres as the
          back end and runs on linux, and is GPL.



          I never code inserts or updates ever. My app code is full of stuff like:

          $row=array("col name"=>value,"c ol2"=>value)
          SQLX_Insert($ta ble_dd,$row);

          You can also do:

          SQLX_InsertOrUp date($table,$ro w)

          which looks for the existing row and does an appropriate insert or update as
          the case may be.

          You can make a big batch of rows like:

          $rows=array(
          "table1"=>array (
          array( .... row .....)
          ,array( .... row .....)
          ,array( .... row .....)
          )
          ,"table2"=>arra y(
          array( .... row .....)
          ,array( .... row .....)
          ,array( .... row .....)
          )
          );
          SQLX_Transactio nBegin();
          SQLX_InsertsOrU pdates($rows);
          SQLX_Transactio nEnd("NAME");

          If there is an error in the transaction, it gets logged and tagged with
          "NAME" so that it can be examined later.

          You can also grab anything you need to know from the online dictionary. The
          complete description of every column in every table is available in the
          "flat" array for each table:

          $dd=dd_tableref ("orders");
          echo "The description of order # is: ".$dd['flat']['order']['description']

          This includes detailed information about foreign keys.

          There is a huge amount of database-side automation as well.

          Our first system went live a few weeks ago and two more are coming soon. I
          consider it late beta for my own use because i know all of the foibles it
          has, but it is probably early alpha for somebody picking it up right now.




          [color=blue]
          > Hi all,
          >
          > There's lots of DB abstraction layers out there, but a quick look around
          > them hasn't turned up anything which seems to met my requirements. Before
          > I go off and write one I thought I'd ask here if anyone knows of such a
          > beast...[/color]

          [color=blue]
          >
          > I want some code where I present an array of data, and the corresponding
          > primary key and let the code work out whether to INSERT or UPDATE it, I
          > also want to be able to present the data from a QBF or QBE then be able to
          > step through the result set. However I don't want to have to configure the
          > DBMS structure - after all most of it is already in the DBMS (OK so not
          > the relationships in a MySQL db). It'd be really cool if I could throw SQL
          > directly at it *too*.
          >
          > Anybody any suggestions?
          >
          > TIA,
          >
          > C.[/color]

          --
          Kenneth Downs
          Secure Data Software, Inc.
          (Ken)nneth@(Sec )ure(Dat)a(.com )

          Comment

          • C.

            #6
            Re: Database abstraction layers

            Thanks Kenneth - sounds interesting.

            I had a quick look....

            How do you deal with the problem of updating the primary key of a
            record? Cascade update/deletes?

            TIA,

            C.

            Comment

            • Dikkie Dik

              #7
              Re: Database abstraction layers

              > I want some code where I present an array of data, and the corresponding[color=blue]
              > primary key and let the code work out whether to INSERT or UPDATE it,[/color]

              Quite simple. If the primary key in your application is numeric, you got
              it from the database. Otherwise, how would you know the key value?
              If the primary key value is NULL, it is a new record. even if it
              contains the same non-key data as an existing record, it is different
              anyhow. If you have more than one instance representing one record, you
              may look at http://www.w-p.dds.nl/article/wrtabrec.htm for how to fix
              that (shamelessly plugging my own article).
              [color=blue]
              > I also want to be able to present the data from a QBF or QBE then be able to
              > step through the result set. However I don't want to have to configure the
              > DBMS structure - after all most of it is already in the DBMS (OK so not the
              > relationships in a MySQL db). It'd be really cool if I could throw SQL
              > directly at it *too*.[/color]

              I think having to configure a mapper is a bad code smell. My biggest
              problem today is the dependency across systems. You probably have
              encountered situations where file names were in the database, or where
              constants in an application referred to enumerations in a database
              field. These dependencies cannot be enforced (no one will stop you from
              deleting a file if its name is in a database, for example). If you have
              a configuration file for a mapper, you introduce *yet another* non
              enforceable dependency, along with an untestable system.

              But then, I always write my mappings myself. Common code gets into
              superclasses and the strict typing (in languages that support it) and
              the communication stategies into the wrappers. So definition/lookup
              tables become read-only collections, for instance. For each table, I
              decide whether I want it to be lazy, greedy, preloadable (you can
              "schedule" a record without loading it yet, but it will be loaded at the
              first necessary database action), or whatever other combination of lazy
              and greedy.
              Always remember that a class should encapsulate and hide its internal
              structure. If a table strategy would change the interface of its
              wrapping collection, it is not a useful wrapper.

              Best regards

              Comment

              • Colin McKinnon

                #8
                Re: Database abstraction layers

                Dikkie Dik wrote:
                [color=blue][color=green]
                >> I want some code where I present an array of data, and the corresponding
                >> primary key and let the code work out whether to INSERT or UPDATE it,[/color]
                >
                > Quite simple. If the primary key in your application is numeric, you got
                > it from the database. Otherwise, how would you know the key value?
                > If the primary key value is NULL, it is a new record. even if it
                > contains the same non-key data as an existing record, it is different
                > anyhow.[/color]
                That only really works if you don't bother normalizing your data but chuck
                in an auto-incrementing record identifier (the rot started with MS-Access
                on that one, but its far too common in web apps).

                It all rather falls apart when you're dealing with very large sets of
                normalized data - what about updating the data within the PK? Cascade
                updates?
                [color=blue]
                > If you have more than one instance representing one record, you
                > may look at http://www.w-p.dds.nl/article/wrtabrec.htm for how to fix
                > that (shamelessly plugging my own article).
                >[color=green]
                >> I also want to be able to present the data from a QBF or QBE then be able
                >> to step through the result set. However I don't want to have to configure
                >> the DBMS structure - after all most of it is already in the DBMS (OK so
                >> not the relationships in a MySQL db). It'd be really cool if I could
                >> throw SQL directly at it *too*.[/color]
                >
                > I think having to configure a mapper is a bad code smell. My biggest
                > problem today is the dependency across systems. You probably have
                > encountered situations where file names were in the database, or where
                > constants in an application referred to enumerations in a database
                > field. These dependencies cannot be enforced (no one will stop you from
                > deleting a file if its name is in a database, for example). If you have
                > a configuration file for a mapper, you introduce *yet another* non
                > enforceable dependency, along with an untestable system.
                >[/color]

                I think that's the point I was making - there should be enough description
                of the structure accessible via the DDL to make that redundant. In fact
                using just SQL on its own its not sufficient - even with foreign key
                declarations and constraints. Having said that there is a lot you can do
                with what is avilable even in MySQL. Particularly if you maintain a strict
                domain naming strategy.

                In my search I came across a couple of systems which seemed to come close to
                what I wanted, but both used a static description of the data structure
                which was maintained independently of the DBMS - yes that causes problems.
                [color=blue]
                > But then, I always write my mappings myself. Common code gets into
                > superclasses and the strict typing (in languages that support it) and
                > the communication stategies into the wrappers. So definition/lookup
                > tables become read-only collections, for instance. For each table, I
                > decide whether I want it to be lazy, greedy, preloadable (you can
                > "schedule" a record without loading it yet, but it will be loaded at the
                > first necessary database action), or whatever other combination of lazy
                > and greedy.
                > Always remember that a class should encapsulate and hide its internal
                > structure. If a table strategy would change the interface of its
                > wrapping collection, it is not a useful wrapper.[/color]

                You're the one who started talking about OO. I spend my working hours fixing
                problems that have arisen from O-R impedance mismatch. I think the idea of
                using a relational database as a persistance layer for an OO application is
                fundamentally flawed. While I admit that it does take more work in
                producing the initial version of an application, its not really much effort
                to extrapolate a normalized database design and implement the factories to
                populate the runtime rather than simply a 1:1 object:record mapping - and
                the benefits are huge when you start looking at whole lifecycle -
                particularly post-release development. This was one of my reasons for
                tackling the problem in this way - to eliminate that hurdle. It also means
                that the abstraction is applicable in other idioms.

                But the article is interesting.

                C.

                Comment

                • Toby Inkster

                  #9
                  Re: Database abstraction layers

                  Dikkie Dik wrote:
                  [color=blue]
                  > Quite simple. If the primary key in your application is numeric, you got
                  > it from the database. Otherwise, how would you know the key value?[/color]

                  Because you know it?

                  For example, if I'm keeping a table of my customers' credit cards, I might
                  use:

                  customer_id (int, foreign key)
                  card_number (big int, primary key)
                  card_holder_nam e (varchar)
                  expiry_date (timestamp)

                  If I'm inserting a new record into the table then I already know the
                  primary key, don't I?

                  --
                  Toby A Inkster BSc (Hons) ARCS
                  Contact Me ~ http://tobyinkster.co.uk/contact

                  Comment

                  • Dikkie Dik

                    #10
                    Re: Database abstraction layers

                    >> Quite simple. If the primary key in your application is numeric, you got[color=blue][color=green]
                    >> it from the database. Otherwise, how would you know the key value?[/color]
                    >
                    > Because you know it?
                    >
                    > For example, if I'm keeping a table of my customers' credit cards, I might
                    > use:
                    >
                    > customer_id (int, foreign key)
                    > card_number (big int, primary key)
                    > card_holder_nam e (varchar)
                    > expiry_date (timestamp)
                    >
                    > If I'm inserting a new record into the table then I already know the
                    > primary key, don't I?
                    >[/color]

                    No, you don't. You know a FIELD value of your record, NOT its primary
                    key. That you wish to combine them (that is always going to get you in
                    trouble sooner or later) does not change anything. For clarity: A
                    primary key is only a unique pointer to a record and nothing more.
                    Especially, primary key values should never be related to the data in
                    the record.

                    Let me show it. Suppose this card object is a new one. In this case, you
                    do NOT know its primary key. You can't, because the primary key value
                    does not exist yet. Oh, you might know what it WILL be, but I could also
                    know what it would be if I used an autonumber. The card number column in
                    you table serves to purposes: it holds both the identity of the record
                    and some data.
                    You don't believe me? Just lookup your "known" record in the database.
                    You won't find it. It does not exist. You will only find it if you know
                    its primary key value. And for the record to have a primary key value,
                    it must exist in the database first.

                    But apart from that, your code does know whether the record comes from
                    the database or from another source. If you can't trust your primary
                    keys, you could have the table wrapper just set a boolean when getting
                    the record from the database. The original question was: how do you know
                    that a record comes from the database? If you want to know that, your
                    database code has to be held responsible for keeping track of that. If
                    you can't do that with the primary key because it interferes with data,
                    just find another way. Anyhow, your table wrapper knows if it came from
                    the storage.

                    Comment

                    • Jerry Stuckle

                      #11
                      Re: Database abstraction layers

                      Dikkie Dik wrote:[color=blue][color=green][color=darkred]
                      >>> Quite simple. If the primary key in your application is numeric, you
                      >>> got it from the database. Otherwise, how would you know the key value?[/color]
                      >>
                      >>
                      >> Because you know it?
                      >>
                      >> For example, if I'm keeping a table of my customers' credit cards, I
                      >> might
                      >> use:
                      >>
                      >> customer_id (int, foreign key)
                      >> card_number (big int, primary key)
                      >> card_holder_nam e (varchar)
                      >> expiry_date (timestamp)
                      >>
                      >> If I'm inserting a new record into the table then I already know the
                      >> primary key, don't I?
                      >>[/color]
                      >
                      > No, you don't. You know a FIELD value of your record, NOT its primary
                      > key. That you wish to combine them (that is always going to get you in
                      > trouble sooner or later) does not change anything. For clarity: A
                      > primary key is only a unique pointer to a record and nothing more.
                      > Especially, primary key values should never be related to the data in
                      > the record.
                      >[/color]

                      Actually, that depends on the primary key. It might not be an autoincrement
                      column, for instance. I've had it be a equipment id (inventory), part number,
                      (inventory again), SSN (banking transaction), phone number (non-profit call
                      list) and many other things. A primary key only identifies a unique row; it
                      does not have to be an auto-increment column, and in many cases something else
                      does make more sense.
                      [color=blue]
                      > Let me show it. Suppose this card object is a new one. In this case, you
                      > do NOT know its primary key. You can't, because the primary key value
                      > does not exist yet. Oh, you might know what it WILL be, but I could also
                      > know what it would be if I used an autonumber. The card number column in
                      > you table serves to purposes: it holds both the identity of the record
                      > and some data.[/color]

                      Again, it depends on your primary key. The program logic shouldn't even care
                      what the primary key is - just field values.
                      [color=blue]
                      > You don't believe me? Just lookup your "known" record in the database.
                      > You won't find it. It does not exist. You will only find it if you know
                      > its primary key value. And for the record to have a primary key value,
                      > it must exist in the database first.
                      >[/color]

                      True, it won't exist. But you don't necessarily know that unless you check.
                      [color=blue]
                      > But apart from that, your code does know whether the record comes from
                      > the database or from another source. If you can't trust your primary
                      > keys, you could have the table wrapper just set a boolean when getting
                      > the record from the database. The original question was: how do you know
                      > that a record comes from the database? If you want to know that, your
                      > database code has to be held responsible for keeping track of that. If
                      > you can't do that with the primary key because it interferes with data,
                      > just find another way. Anyhow, your table wrapper knows if it came from
                      > the storage.[/color]

                      Again, the program itself should not know nor should it care about primary keys.

                      --
                      =============== ===
                      Remove the "x" from my email address
                      Jerry Stuckle
                      JDS Computer Training Corp.
                      jstucklex@attgl obal.net
                      =============== ===

                      Comment

                      • Dikkie Dik

                        #12
                        Re: Database abstraction layers

                        <snip>[color=blue][color=green]
                        >> You don't believe me? Just lookup your "known" record in the database.
                        >> You won't find it. It does not exist. You will only find it if you
                        >> know its primary key value. And for the record to have a primary key
                        >> value, it must exist in the database first.
                        >>[/color]
                        >
                        > True, it won't exist. But you don't necessarily know that unless you
                        > check.[/color]

                        Wrong. That is what I am trying to tell you all the time. Whether you
                        should update or insert the record is something you know beforehand. If
                        the user selects a card from editing, that card is retrieved from the
                        database and marked (with any means you like) as existing. You should
                        now be able to change any DATA value of the record.
                        If the user clicked "enter new card" or just did not select an existing
                        record, he is entering a new card which is marked as such. So before I
                        edit anything, I already have checked the existence of the record.

                        And it's quite simple: existing records get updated, new records get
                        inserted.

                        By the way, if you take the credit card number as a primary key AND get
                        the existence from the database, you would only have to give someone
                        else's credit card number to hijack his card. If the credit card already
                        exists for another user in the database, you have a data validity
                        problem, NOT a key problem (unless you abuse data for a key, off course).
                        Oh, and if I correct a misspelled credit card in the system, I end up
                        with TWO credit cards. I told you this is trouble to no ends.
                        [color=blue]
                        >[color=green]
                        >> But apart from that, your code does know whether the record comes from
                        >> the database or from another source. If you can't trust your primary
                        >> keys, you could have the table wrapper just set a boolean when getting
                        >> the record from the database. The original question was: how do you
                        >> know that a record comes from the database? If you want to know that,
                        >> your database code has to be held responsible for keeping track of
                        >> that. If you can't do that with the primary key because it interferes
                        >> with data, just find another way. Anyhow, your table wrapper knows if
                        >> it came from the storage.[/color]
                        >
                        > Again, the program itself should not know nor should it care about
                        > primary keys.
                        >[/color]
                        Say what? The primary key uniquely identifies the record. Without it,
                        you don't even know which record to update. As shown above, you should
                        identify a record by its key, NOT by its data. That is why keys should
                        never relate to data.

                        Comment

                        • Toby Inkster

                          #13
                          Re: Database abstraction layers

                          Dikkie Dik wrote:
                          [color=blue]
                          > Say what? The primary key uniquely identifies the record. Without it,
                          > you don't even know which record to update. As shown above, you should
                          > identify a record by its key, NOT by its data. That is why keys should
                          > never relate to data.[/color]

                          This is complete rubbish from someone who's clearly learned their database
                          design principles from _The Muppets' Big Book of RDBMS Programming_.

                          Yes, when identifying a particular row in a database table, you should use
                          *a* key. (Not necessarily the primary key though, as in many cases a table
                          will have alternative candidate keys that may be used. All things being
                          equal though, might as well use the primary key if we have the choice.)

                          *But*, the primary key will often relate to data. In the general case
                          there is no need to introduce an arbitrary additional "pkey int" column to
                          the table. (Though in certain specific situations it might be necessary,
                          or simply more convenient.)


                          versus


                          --
                          Toby A Inkster BSc (Hons) ARCS
                          Contact Me ~ http://tobyinkster.co.uk/contact

                          Comment

                          • Toby Inkster

                            #14
                            Re: Database abstraction layers

                            Dikkie Dik wrote:[color=blue]
                            > Toby Inkster wrote:
                            >[color=green]
                            >> For example, if I'm keeping a table of my customers' credit cards, I might
                            >> use:
                            >>
                            >> customer_id (int, foreign key)
                            >> card_number (big int, primary key)
                            >> card_holder_nam e (varchar)
                            >> expiry_date (timestamp)
                            >>
                            >> If I'm inserting a new record into the table then I already know the
                            >> primary key, don't I?[/color]
                            >
                            > No, you don't. You know a FIELD value of your record, NOT its primary
                            > key.[/color]

                            Look at the table structure. The credit card number *is* the primary key.
                            If I know the card details, I know the primary key.
                            [color=blue]
                            > Especially, primary key values should never be related to the data in
                            > the record.[/color]

                            Yes it should.

                            Any field, or combination thereof, that will never be null and where
                            values will always be unique in a particular table is a "candidate key".
                            Any candidate key can be chosen to be the primary key.

                            In my example above, if I'm never going to need to store the same credit
                            card number twice, then I can use the card number as a primary key.

                            --
                            Toby A Inkster BSc (Hons) ARCS
                            Contact Me ~ http://tobyinkster.co.uk/contact

                            Comment

                            • Kenneth Downs

                              #15
                              Re: Database abstraction layers

                              Dikkie Dik wrote:
                              [color=blue]
                              >
                              > No, you don't. You know a FIELD value of your record, NOT its primary
                              > key. That you wish to combine them (that is always going to get you in
                              > trouble sooner or later) does not change anything. For clarity: A
                              > primary key is only a unique pointer to a record and nothing more.
                              > Especially, primary key values should never be related to the data in
                              > the record.
                              >[/color]

                              A natural primary key is one composed of one or more values that make up
                              part of the data being recorded. It sounds like our OP is describing a
                              natural key. Natural keys are a fundamental building block of relational
                              theory, going right back to Codd, allowing access to data only by the name
                              of the column values, w/o respect to any implementation-specific pointer
                              values or record numbers.

                              Your post describes what we usually call a surrogate key, to distinguish
                              from a natural primary key. The surrogate is generaly described as a
                              column that holds no business meaning, but which is unique. They are
                              usually implemented as sequential integers because those are handy, but
                              GUID's are sometimes used as well. Relational theorists often confuse
                              these with pointers and declare them to be evil, but they do not violate
                              any relational principles.

                              To make it more confusing, sometimes a natural key can be
                              computer-generated, such as a sales order #. That looks like a surrogate
                              to many people but it becomes meaningful outside of the system using it, so
                              it is really a natural key.



                              --
                              Kenneth Downs
                              Secure Data Software, Inc.
                              (Ken)nneth@(Sec )ure(Dat)a(.com )

                              Comment

                              Working...