most idiomatic way to "update or insert"?

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Mark Harrison

    most idiomatic way to "update or insert"?

    So I have some data that I want to put into a table. If the
    row already exists (as defined by the primary key), I would
    like to update the row. Otherwise, I would like to insert
    the row.

    I've been doing something like

    delete from foo where name = 'xx';
    insert into foo values('xx',1,2 ,...);

    but I've been wondering if there's a more idiomatic or canonical
    way to do this.

    Many TIA,
    Mark

    --
    Mark Harrison
    Pixar Animation Studios

    ---------------------------(end of broadcast)---------------------------
    TIP 7: don't forget to increase your free space map settings

  • Peter Darley

    #2
    Re: most idiomatic way to "update or insert"?

    Mark,
    It's not canonical by any means, but what I do is:

    update foo set thing='stuff' where name = 'xx' and thing<>'stuff';
    insert into foo (name, thing) (select 'xx' as name, 'stuff' as thing where
    not exists (select 1 from foo where name='xx'));

    I believe if you put these on the same line it will be a single
    transaction. It has the benefit of not updating the row if there aren't
    real changes. It's plenty quick too, if name is indexed.

    Thanks,
    Peter Darley

    -----Original Message-----
    From: pgsql-general-owner@postgresq l.org
    [mailto:pgsql-general-owner@postgresq l.org]On Behalf Of Mark Harrison
    Sent: Wednesday, August 04, 2004 4:26 PM
    To: pgsql-general@postgre sql.org
    Subject: [GENERAL] most idiomatic way to "update or insert"?


    So I have some data that I want to put into a table. If the
    row already exists (as defined by the primary key), I would
    like to update the row. Otherwise, I would like to insert
    the row.

    I've been doing something like

    delete from foo where name = 'xx';
    insert into foo values('xx',1,2 ,...);

    but I've been wondering if there's a more idiomatic or canonical
    way to do this.

    Many TIA,
    Mark

    --
    Mark Harrison
    Pixar Animation Studios

    ---------------------------(end of broadcast)---------------------------
    TIP 7: don't forget to increase your free space map settings


    ---------------------------(end of broadcast)---------------------------
    TIP 7: don't forget to increase your free space map settings

    Comment

    • Greg Stark

      #3
      Re: most idiomatic way to &quot;update or insert&quot;?


      I'll mention that often I do exactly what you're doing. I find deleting all
      existing records and then inserting what I want to appear to be cleaner than
      handling the various cases that can arise if you don't.

      This happens most often when I have a list of items and have a UI that allows
      the user to edit the entire list and commit a whole new list in one action.
      It's much easier to simply delete the old list and insert the entire new list
      in a single query than to try to figure out which rows to delete and which to
      insert.

      --
      greg


      ---------------------------(end of broadcast)---------------------------
      TIP 4: Don't 'kill -9' the postmaster

      Comment

      • Lincoln Yeoh

        #4
        Re: most idiomatic way to &quot;update or insert&quot;?

        I don't think that works - there's a race condition if you do not do any
        locking.

        Why:
        Before a transaction that inserts rows is committed, other transactions are
        not aware of the inserted rows, so the select returns no rows.

        So:
        You can either create a unique index and catch insert duplicate failures.

        Or:
        lock the relevant tables, then do the select ... update/insert or insert
        .... select , or whatever it is you want to do.

        Or:
        both.

        Test it out yourself.

        At 07:51 AM 8/5/2004, Peter Darley wrote:[color=blue]
        >Mark,
        > It's not canonical by any means, but what I do is:
        >
        >update foo set thing='stuff' where name = 'xx' and thing<>'stuff';
        >insert into foo (name, thing) (select 'xx' as name, 'stuff' as thing where
        >not exists (select 1 from foo where name='xx'));
        >
        > I believe if you put these on the same line it will be a single
        >transaction. It has the benefit of not updating the row if there aren't
        >real changes. It's plenty quick too, if name is indexed.
        >
        >Thanks,
        >Peter Darley[/color]



        ---------------------------(end of broadcast)---------------------------
        TIP 7: don't forget to increase your free space map settings

        Comment

        • Pierre-Frédéric Caillaud

          #5
          Re: most idiomatic way to &quot;update or insert&quot;?


          I use stored procedures :

          create function insertorupdate( ....)
          UPDATE mytable WHERE ... SET ...
          IF NOT FOUND THEN
          INSERT INTO mytable ...
          END IF;


          You lose flecibility in your request though.

          I wish Postgresql had an INSERT OR UPDATE like MySQL does. So far it's
          the only thing that I regret from MySQL.

          ---------------------------(end of broadcast)---------------------------
          TIP 3: if posting/reading through Usenet, please send an appropriate
          subscribe-nomail command to majordomo@postg resql.org so that your
          message can get through to the mailing list cleanly

          Comment

          • Richard Huxton

            #6
            Re: most idiomatic way to &quot;update or insert&quot;?

            Mark Harrison wrote:[color=blue]
            > I've been doing something like
            >
            > delete from foo where name = 'xx';
            > insert into foo values('xx',1,2 ,...);
            >
            > but I've been wondering if there's a more idiomatic or canonical
            > way to do this.[/color]

            The delete+insert isn't quite the same as an update since you might have
            foreign keys referring to foo with "ON DELETE CASCADE" - oops, just lost
            all your dependant rows. Other people have warned about race conditions
            with insert/test/update.

            An "update or insert" would be useful sometimes, but it's not always
            necessary. Indeed, if I find I don't know whether I'm adding or updating
            something I take a long hard look at my design - it ususally means I've
            not thought clearly about something.

            For a "running total" table it can make more sense to have an entry with
            a total of 0 created automatically via a trigger. Likewise with some
            other summary tables.

            Can you give an actual example of where you need this?

            --
            Richard Huxton
            Archonet Ltd

            ---------------------------(end of broadcast)---------------------------
            TIP 3: if posting/reading through Usenet, please send an appropriate
            subscribe-nomail command to majordomo@postg resql.org so that your
            message can get through to the mailing list cleanly

            Comment

            • Tommi Maekitalo

              #7
              Re: most idiomatic way to &quot;update or insert&quot;?

              Hi,

              I prefer to update and if the number of updated rows equals 0 do an insert. So
              in case of update I need only one roundtrip. If insert is far more common in
              this case it might be better try insert and catch the error. But I try to
              avoid running on an error intentionally.

              First delete and then insert works but needs 2 SQL-statements in every case.
              And the database need to update indexes at least once. There might be also
              problems with cascaded deletes.


              Tommi

              Am Donnerstag, 5. August 2004 01:25 schrieb Mark Harrison:[color=blue]
              > So I have some data that I want to put into a table. If the
              > row already exists (as defined by the primary key), I would
              > like to update the row. Otherwise, I would like to insert
              > the row.
              >
              > I've been doing something like
              >
              > delete from foo where name = 'xx';
              > insert into foo values('xx',1,2 ,...);
              >
              > but I've been wondering if there's a more idiomatic or canonical
              > way to do this.
              >
              > Many TIA,
              > Mark[/color]

              ---------------------------(end of broadcast)---------------------------
              TIP 2: you can get off all lists at once with the unregister command
              (send "unregister YourEmailAddres sHere" to majordomo@postg resql.org)

              Comment

              • Peter Darley

                #8
                Re: most idiomatic way to &quot;update or insert&quot;?

                Lincoln,
                It works for me...
                I think what you said is wrong because it updates first (if there is a row
                to update), then inserts. If there is a row to update the insert won't
                insert anything. If there is no row to update the insert inserts a row.
                Either way, the insert is the last thing in the transaction. Plus, as shown
                in the code to follow, I have almost this exact thing in my application and
                I know that it does work for me. :)

                Code (Perl):
                $Neo::DB::Neo->do ("UPDATE Sample_Settings SET Value=" . Quote($Args{Val ue})
                .. " WHERE Sample_ID=" . Quote($Args{Sam pleID}) . " AND Setting=" .
                Quote($Args{Set ting}) . "; INSERT INTO Sample_Settings (Sample_ID, Setting,
                Value) (SELECT " . Quote($Args{Sam pleID}) . ", " . Quote($Args{Set ting}) .
                ", " . Quote($Args{Val ue}) . " WHERE NOT EXISTS (SELECT 1 FROM
                Sample_Settings WHERE Sample_ID=" . Quote($Args{Sam pleID}) . " AND Setting="
                .. Quote($Args{Set ting}) . "));");

                Thanks,
                Peter Darley

                -----Original Message-----
                From: Lincoln Yeoh [mailto:lyeoh@po p.jaring.my]
                Sent: Wednesday, August 04, 2004 6:49 PM
                To: Peter Darley; Mark Harrison; pgsql-general@postgre sql.org
                Subject: Re: [GENERAL] most idiomatic way to "update or insert"?


                I don't think that works - there's a race condition if you do not do any
                locking.

                Why:
                Before a transaction that inserts rows is committed, other transactions are
                not aware of the inserted rows, so the select returns no rows.

                So:
                You can either create a unique index and catch insert duplicate failures.

                Or:
                lock the relevant tables, then do the select ... update/insert or insert
                .... select , or whatever it is you want to do.

                Or:
                both.

                Test it out yourself.

                At 07:51 AM 8/5/2004, Peter Darley wrote:[color=blue]
                >Mark,
                > It's not canonical by any means, but what I do is:
                >
                >update foo set thing='stuff' where name = 'xx' and thing<>'stuff';
                >insert into foo (name, thing) (select 'xx' as name, 'stuff' as thing where
                >not exists (select 1 from foo where name='xx'));
                >
                > I believe if you put these on the same line it will be a single
                >transaction. It has the benefit of not updating the row if there aren't
                >real changes. It's plenty quick too, if name is indexed.
                >
                >Thanks,
                >Peter Darley[/color]




                ---------------------------(end of broadcast)---------------------------
                TIP 1: subscribe and unsubscribe commands go to majordomo@postg resql.org

                Comment

                • Csaba Nagy

                  #9
                  Re: most idiomatic way to &quot;update or insert&quot;?

                  Peter,

                  The "does not work" part is not refering to the method not working at
                  all, but to the fact that it is not safe when you have multiple
                  transactions operating on the same row at the same time.
                  There are plenty of discussions about the insert-or-update race
                  conditions on this list, and the final conclusion was always that it is
                  not possible to solve the race condition without being prepared to catch
                  exceptions and retry the whole thing until it succedes...
                  The reason of the race condition: let's say 2 transactions A and B try
                  to insert-or-update the same row which does not exist. They do the
                  update statement at the same time, and BOTH OF THEM gets as a result
                  that no rows were updated, since the row does not exist yet. Now both
                  transactions try to insert the row, and obviously one of them will fail.
                  So your code must be prepared that the insert can fail, and in that case
                  it should retry with the update.
                  People tried to devise a method to avoid the race condition and throwing
                  exception, but it is just not possible.
                  Now the one bad thing in postgres which people complained about in this
                  context is that the transaction gets rolled back on any error, so
                  actually instead of just retrying the update, you will have to redo your
                  whole transaction.

                  HTH,
                  Csaba.


                  On Thu, 2004-08-05 at 15:28, Peter Darley wrote:[color=blue]
                  > Lincoln,
                  > It works for me...
                  > I think what you said is wrong because it updates first (if there is a row
                  > to update), then inserts. If there is a row to update the insert won't
                  > insert anything. If there is no row to update the insert inserts a row.
                  > Either way, the insert is the last thing in the transaction. Plus, as shown
                  > in the code to follow, I have almost this exact thing in my application and
                  > I know that it does work for me. :)
                  >
                  > Code (Perl):
                  > $Neo::DB::Neo->do ("UPDATE Sample_Settings SET Value=" . Quote($Args{Val ue})
                  > . " WHERE Sample_ID=" . Quote($Args{Sam pleID}) . " AND Setting=" .
                  > Quote($Args{Set ting}) . "; INSERT INTO Sample_Settings (Sample_ID, Setting,
                  > Value) (SELECT " . Quote($Args{Sam pleID}) . ", " . Quote($Args{Set ting}) .
                  > ", " . Quote($Args{Val ue}) . " WHERE NOT EXISTS (SELECT 1 FROM
                  > Sample_Settings WHERE Sample_ID=" . Quote($Args{Sam pleID}) . " AND Setting="
                  > . Quote($Args{Set ting}) . "));");
                  >
                  > Thanks,
                  > Peter Darley
                  >
                  > -----Original Message-----
                  > From: Lincoln Yeoh [mailto:lyeoh@po p.jaring.my]
                  > Sent: Wednesday, August 04, 2004 6:49 PM
                  > To: Peter Darley; Mark Harrison; pgsql-general@postgre sql.org
                  > Subject: Re: [GENERAL] most idiomatic way to "update or insert"?
                  >
                  >
                  > I don't think that works - there's a race condition if you do not do any
                  > locking.
                  >
                  > Why:
                  > Before a transaction that inserts rows is committed, other transactions are
                  > not aware of the inserted rows, so the select returns no rows.
                  >
                  > So:
                  > You can either create a unique index and catch insert duplicate failures.
                  >
                  > Or:
                  > lock the relevant tables, then do the select ... update/insert or insert
                  > ... select , or whatever it is you want to do.
                  >
                  > Or:
                  > both.
                  >
                  > Test it out yourself.
                  >
                  > At 07:51 AM 8/5/2004, Peter Darley wrote:[color=green]
                  > >Mark,
                  > > It's not canonical by any means, but what I do is:
                  > >
                  > >update foo set thing='stuff' where name = 'xx' and thing<>'stuff';
                  > >insert into foo (name, thing) (select 'xx' as name, 'stuff' as thing where
                  > >not exists (select 1 from foo where name='xx'));
                  > >
                  > > I believe if you put these on the same line it will be a single
                  > >transaction. It has the benefit of not updating the row if there aren't
                  > >real changes. It's plenty quick too, if name is indexed.
                  > >
                  > >Thanks,
                  > >Peter Darley[/color]
                  >
                  >
                  >
                  >
                  > ---------------------------(end of broadcast)---------------------------
                  > TIP 1: subscribe and unsubscribe commands go to majordomo@postg resql.org[/color]


                  ---------------------------(end of broadcast)---------------------------
                  TIP 6: Have you searched our list archives?



                  Comment

                  • Mike Mascari

                    #10
                    Re: most idiomatic way to &quot;update or insert&quot;?

                    Peter Darley wrote:[color=blue]
                    > Lincoln, It works for me... I think what you said is wrong
                    > because it updates first (if there is a row to update), then
                    > inserts. If there is a row to update the insert won't insert
                    > anything. If there is no row to update the insert inserts a row.
                    > Either way, the insert is the last thing in the transaction.
                    > Plus, as shown in the code to follow, I have almost this exact
                    > thing in my application and I know that it does work for me. :)[/color]

                    You're getting lucky. I suggested the same thing four years ago. The
                    race condition is still there:



                    HTH,

                    Mike Mascari


                    ---------------------------(end of broadcast)---------------------------
                    TIP 9: the planner will ignore your desire to choose an index scan if your
                    joining column's datatypes do not match

                    Comment

                    • Peter Darley

                      #11
                      Re: most idiomatic way to &quot;update or insert&quot;?

                      Mike,
                      Ahha! I didn't understand what the objection was. I guess I am getting
                      lucky. :)
                      It seems to me that this is true with any concurrent inserts, isn't it?
                      One will succeed and one will fail.
                      Thanks,
                      Peter Darley

                      -----Original Message-----
                      From: Mike Mascari [mailto:mascarm@ mascari.com]
                      Sent: Thursday, August 05, 2004 6:51 AM
                      To: Peter Darley
                      Cc: Lincoln Yeoh; Mark Harrison; pgsql-general@postgre sql.org
                      Subject: Re: [GENERAL] most idiomatic way to "update or insert"?


                      Peter Darley wrote:[color=blue]
                      > Lincoln, It works for me... I think what you said is wrong
                      > because it updates first (if there is a row to update), then
                      > inserts. If there is a row to update the insert won't insert
                      > anything. If there is no row to update the insert inserts a row.
                      > Either way, the insert is the last thing in the transaction.
                      > Plus, as shown in the code to follow, I have almost this exact
                      > thing in my application and I know that it does work for me. :)[/color]

                      You're getting lucky. I suggested the same thing four years ago. The
                      race condition is still there:


                      ..pgh.pa.us

                      HTH,

                      Mike Mascari



                      ---------------------------(end of broadcast)---------------------------
                      TIP 8: explain analyze is your friend

                      Comment

                      • award@dominionsciences.com

                        #12
                        Re: most idiomatic way to &quot;update or insert&quot;?

                        > An "update or insert" would be useful sometimes, but it's not always[color=blue]
                        > necessary. Indeed, if I find I don't know whether I'm adding or updating
                        > something I take a long hard look at my design - it ususally means I've
                        > not thought clearly about something.[/color]
                        ....[color=blue]
                        > Can you give an actual example of where you need this?[/color]

                        We have an environment where our data collection occurs by screen scraping
                        (er, web scraping?). Unfortunately, it takes two passes, once across
                        search results which provide partial data, then a second time over a
                        detail page loaded for each item in the search results we were given.
                        Since time is of the essence, we provide the partial data to our
                        customers, which means dealing with the insert or update. Additionally,
                        the process is multithreaded, so search results can be touching things
                        concurrently with details being loaded, otherwise we can't keep up.

                        I dealt with the problem by wrapping every touch of an item in a single
                        transaction with a loop around it, as has been recommended here many times
                        before. Any DB-exception (Python) inside the loop caused by concurrency
                        type problems causes a restart. As it turns out, the insert/update race
                        has yet to result in a retry. The real payoff in this design has proven to
                        be dealing with FK locking... without putting way more effort into fixing
                        it than the deadlocks are worth, we get around a dozen deadlocks a day
                        that are automatically retried.

                        ---------------------------(end of broadcast)---------------------------
                        TIP 9: the planner will ignore your desire to choose an index scan if your
                        joining column's datatypes do not match

                        Comment

                        • Ron St-Pierre

                          #13
                          Re: most idiomatic way to &quot;update or insert&quot;?

                          Richard Huxton wrote:
                          [color=blue]
                          >
                          > An "update or insert" would be useful sometimes, but it's not always
                          > necessary. Indeed, if I find I don't know whether I'm adding or
                          > updating something I take a long hard look at my design - it ususally
                          > means I've not thought clearly about something.[/color]

                          [color=blue]
                          > Can you give an actual example of where you need this?[/color]

                          How about stocks for a simple example? Let's say you have a simple table
                          with the stock symbol, stock exchange, high, low, open, close and
                          volume. Every day you update the data for each stock. But there are
                          always new stocks listed on an exchange, so when a new stock shows up
                          you have to do an insert instead of an update.

                          Ron



                          ---------------------------(end of broadcast)---------------------------
                          TIP 1: subscribe and unsubscribe commands go to majordomo@postg resql.org

                          Comment

                          • Richard Huxton

                            #14
                            Re: most idiomatic way to &quot;update or insert&quot;?

                            Ron St-Pierre wrote:[color=blue]
                            > Richard Huxton wrote:
                            >[color=green]
                            >>
                            >> An "update or insert" would be useful sometimes, but it's not always
                            >> necessary. Indeed, if I find I don't know whether I'm adding or
                            >> updating something I take a long hard look at my design - it ususally
                            >> means I've not thought clearly about something.[/color]
                            >[color=green]
                            >> Can you give an actual example of where you need this?[/color]
                            >
                            > How about stocks for a simple example? Let's say you have a simple table
                            > with the stock symbol, stock exchange, high, low, open, close and
                            > volume. Every day you update the data for each stock. But there are
                            > always new stocks listed on an exchange, so when a new stock shows up
                            > you have to do an insert instead of an update.[/color]

                            If it is just a simple table then delete all of them and insert from
                            scratch. If you wanted to track changes over time (perhaps more likely),
                            you'd have a separate table with the company name/address etc and a log
                            table. At which point you'll want to know if it's a new company or not...

                            --
                            Richard Huxton
                            Archonet Ltd

                            ---------------------------(end of broadcast)---------------------------
                            TIP 2: you can get off all lists at once with the unregister command
                            (send "unregister YourEmailAddres sHere" to majordomo@postg resql.org)

                            Comment

                            • Ron St-Pierre

                              #15
                              Re: most idiomatic way to &quot;update or insert&quot;?

                              Richard Huxton wrote:
                              [color=blue]
                              > Ron St-Pierre wrote:
                              >[color=green]
                              >> Richard Huxton wrote:
                              >>[color=darkred]
                              >>>
                              >>> An "update or insert" would be useful sometimes, but it's not always
                              >>> necessary. Indeed, if I find I don't know whether I'm adding or
                              >>> updating something I take a long hard look at my design - it
                              >>> ususally means I've not thought clearly about something.[/color]
                              >>
                              >>[color=darkred]
                              >>> Can you give an actual example of where you need this?[/color]
                              >>
                              >>
                              >> How about stocks for a simple example? Let's say you have a simple
                              >> table with the stock symbol, stock exchange, high, low, open, close
                              >> and volume. Every day you update the data for each stock. But there
                              >> are always new stocks listed on an exchange, so when a new stock
                              >> shows up you have to do an insert instead of an update.[/color]
                              >
                              >
                              > If it is just a simple table then delete all of them and insert from
                              > scratch. If you wanted to track changes over time (perhaps more
                              > likely), you'd have a separate table with the company name/address etc
                              > and a log table. At which point you'll want to know if it's a new
                              > company or not...
                              >[/color]
                              Okay, this simple example really exists, but the simple table also
                              includes a date that the stock was last traded, so we have:
                              stock symbol, stock exchange, high, low, open, close, volume, date, plus
                              a few more fields

                              But the data isn't always updated at one time, as we can update all
                              stocks for one exhange and possibly only some of the stocks for a
                              particular exchange in one go. Even if the data is received for only one
                              exchange we could delete all stocks for that exchange and insert new
                              ones, which would work fine. However some stocks are not traded every
                              day, so we need to show the trading information for the last date that
                              it was traded, so we can't delete them en masse even for the one exchange.

                              BTW these updates do take longer than we'd like so I would appreciate
                              more input on how this setup could be redesigned.

                              Ron




                              ---------------------------(end of broadcast)---------------------------
                              TIP 7: don't forget to increase your free space map settings

                              Comment

                              Working...