Buglist

  • Bruno Wolff III

    #31
    Re: Buglist

    On Wed, Aug 20, 2003 at 10:31:25 -0400,
    Vivek Khera <khera@kcilink.com> wrote:
    >
    > I care for the performance. And how are you so sure that it was
    > faster the way it is now? Are you sure it was not done this way
    > because of ease of implementation?
    >
    > Seriously, how much slower can it be if the backend were to do the
    > checking for external references upon updating/deleting a row? The
    > cost would be distributed across time as opposed to concentrated at
    > once within a vacuum process. I am fairly certain it would reduce
    > disk bandwidth requirements since at least one necessary page will
    > already be in memory.

    It would probably be a lot slower. Any transaction that has started
    but not yet finished would need to lock all rows that exist during
    the transaction (for serialized transaction isolation you would only
    need to worry about rows that existed at the start of the transaction
    or that were modified by the transaction). Immediately that is a big
    deal since a typical query may need to lock a bunch of rows that it
    will never actually touch (but you don't know that when the transaction
    starts). Managing all those locks would take up a lot of semiconductor
    memory or a lot of disk writes and be a major source of contention.
    The current system just has to mark rows when they are created and
    deleted (an update does both operations). The normal vacuum clean up
    actually isn't going to be much worse than what you would need to do
    at both the start and end of each transaction. The overhead of letting
    dead rows hang around for a while after they aren't needed isn't that high.
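
    To make the mechanism concrete, here is a minimal psql sketch (table and
    values are made up) of the row versioning described above; xmin and xmax
    are PostgreSQL's hidden system columns holding the creating and deleting
    transaction ids, and ctid is the physical location of a row version:

    CREATE TABLE demo (id integer PRIMARY KEY, val text);
    INSERT INTO demo VALUES (1, 'first');
    SELECT ctid, xmin, xmax, * FROM demo;   -- xmin is the creating transaction id

    UPDATE demo SET val = 'second' WHERE id = 1;
    SELECT ctid, xmin, xmax, * FROM demo;   -- a new ctid: the update inserted a new
                                             -- version and marked the old one deleted

    VACUUM demo;                             -- the dead version is reclaimed later,
                                             -- in the background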

    Also, since at least 7.3, normal vacuums aren't normally going to affect the
    performance of your database server that much.

    The main issue against the current vacuum system is that it requires the
    DBA to know what vacuum does and to figure out how it should be used in
    their situation to get reasonable performance. This makes it a bit harder
    for non-DBAs to jump right into Postgres without running into problems.

    However, the work on autovacuum seems to be providing a reasonable solution
    to that problem.


    • scott.marlowe

      #32
      Re: Buglist

      On Wed, 20 Aug 2003, Vivek Khera wrote:
      > >>>>> "BW" == Bruno Wolff, <Bruno> writes:
      >
      > >> to see it incremental. This would result in pretty much near zero
      > >> internal fragmentation, I think.
      >
      > BW> Why do you care about the details of the implementation (rather than
      > BW> the performance)? If it were faster to do it that way, that's how it would
      > BW> have been done in the first place. The cost of doing the above is almost
      > BW> certainly going to be an overall performance loser.
      >
      > I care for the performance. And how are you so sure that it was
      > faster the way it is now? Are you sure it was not done this way
      > because of ease of implementation?
      >
      > Seriously, how much slower can it be if the backend were to do the
      > checking for external references upon updating/deleting a row? The
      > cost would be distributed across time as opposed to concentrated at
      > once within a vacuum process. I am fairly certain it would reduce
      > disk bandwidth requirements since at least one necessary page will
      > already be in memory.

      Time for a mental exercise.

      Our server has 2 users. Each backend has to check with all the other
      backends when it deletes a tuple (every update is also a delete; remember,
      every change in an MVCC database is a create / delete cycle). Let's name
      the time it takes to do the update / mark deleted versus the time it takes
      to contact each of those other backends: Tw is the time to do the work
      here, and Tc is the time to do the cleanup (i.e. vacuum the tuple). Note
      that we'd also need a Ta for answering the requests of all the other
      backends, but we can assume that on average, for each request a child
      process makes, it will receive exactly that many from each other backend
      running. Let x represent the number of backends, so the answer time Ta is
      equal to x*Tc:

      Time = Tw + Tc + Ta

      Time = Tw + Tc + (x * Tc)

      Time = Tw + ((x+1) * Tc)

      and our cleanup time starts to grow at an ugly rate as the number of
      backends increases. Lazy vacuuming (the newer, non-full vacuum) instead
      lets the database reclaim that lost space in the background.
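
      To put hypothetical numbers on that: with 100 backends, every single
      delete would pay roughly 101 * Tc of coordination cost up front, whereas
      the vacuum approach defers about one Tc per dead tuple to a background
      pass.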

      Many folks mistake this vacuum process for its older, slower cousin, full
      vacuum, which does eat a lot more disk bandwidth and slows the machine
      down.
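
      For anyone who hasn't seen the distinction, a quick sketch (the table name
      is hypothetical) of the two commands being contrasted here:

      VACUUM VERBOSE mytable;       -- lazy vacuum: marks dead row versions as
                                    -- reusable space; runs alongside normal
                                    -- reads and writes
      VACUUM FULL VERBOSE mytable;  -- full vacuum: compacts the table, which
                                    -- takes an exclusive lock and much more
                                    -- disk I/O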

      On a Dual CPU X86 box a lazy vacuum running in a continuous loop will eat
      about 5% of one CPU and drop pgbench scores by 10 to 15%. The important
      thing here is that the machine will still run quite snappily when you
      throw several hundred clients at it, since the lazy vacuum just sits in
      the background using the spare cycles and not much more.

      That means your storage usage may balloon somewhat under intense usage, but
      you won't have an IPC storm kill the performance of the PostgreSQL server.

      Knowing the postgresql development team, I'm sure the reasons they chose
      this design are clearly stated in the hackers mailing list somewhere, so
      I'm gonna go look. But trust me on one thing: the folks programming this
      database rarely do something just because it's easier / faster to
      implement without putting an entry in the TODO list about making it
      better some day.



      • Stephan Szabo

        #33
        Re: Buglist

        On Wed, 20 Aug 2003, Vivek Khera wrote:
        > >>>>> "BW" == Bruno Wolff, <Bruno> writes:
        >
        > >> to see it incremental. This would result in pretty much near zero
        > >> internal fragmentation, I think.
        >
        > BW> Why do you care about the details of the implementation (rather than
        > BW> the performance)? If it were faster to do it that way, that's how it would
        > BW> have been done in the first place. The cost of doing the above is almost
        > BW> certainly going to be an overall performance loser.
        >
        > I care for the performance. And how are you so sure that it was
        > faster the way it is now? Are you sure it was not done this way
        > because of ease of implementation?
        >
        > Seriously, how much slower can it be if the backend were to do the
        > checking for external references upon updating/deleting a row? The
        > cost would be distributed across time as opposed to concentrated at
        > once within a vacuum process. I am fairly certain it would reduce
        > disk bandwidth requirements since at least one necessary page will
        > already be in memory.

        In general, excepting a case with only one session running, I would expect
        there to exist one or more transactions that can see the old state of the
        row.

        I would think it'd be something like:

        T1: begin;
        T1: update foo set bar=cheese where id=3;
        T2: begin;
        T2: select * from foo;
        -- this should see the old state of the row, so that update better
        -- not have removed the old one.
        T1: end;
        -- at this point read committed transactions may see the new
        -- state, unless it's currently running a query which is accessing
        -- that table, and in practice it's probably more complicated than
        -- that (since you might run functions that access the table).
        -- You'd presumably need to wait until say the end of that full
        -- statement at the very least. And any serializable transactions
        -- would probably mean you can't do it either.

        Under some circumstances it might be better, yes. And there are ways to
        do things that don't involve a non-overwriting storage manager (like
        Oracle's system), but they all have downsides. This might not be the best
        solution, but you're going to have to put a lot more work into showing
        that your solution is better. ;)




        • Jan Wieck

          #34
          Re: Buglist

          Alvaro Herrera wrote:
          > On Wed, Aug 20, 2003 at 10:31:25AM -0400, Vivek Khera wrote:
          >
          >> Seriously, how much slower can it be if the backend were to do the
          >> checking for external references upon updating/deleting a row? The
          >> cost would be distributed across time as opposed to concentrated at
          >> once within a vacuum process. I am fairly certain it would reduce
          >> disk bandwidth requirements since at least one necessary page will
          >> already be in memory.
          >
          > There's no way to check for "external references", because said
          > references are actually the running transactions. So you can't drop a
          > row until all the transactions that were active during your transaction
          > are finished. Certainly your own backend can't do the checking, because
          > there's no way to even assert that it will be live when those other
          > transactions finish. Who will? The last of said transactions? It
          > certainly will be expensive for a backend to keep track of the
          > deleted/updated tuples by all _other_ backends, just in case...
          > Long running transactions can't be neglected, so you can't keep it in
          > memory.

          This isn't so. It's regular backends that turn on the flag in a tuple's
          header saying it's committed deleted. And they do so during a normal
          scan, so even during a SELECT. It's not the backend that actually
          deleted a row that knows; it's a later one checking the tuple's
          visibility that can tell "this one's dead for all of us and will not
          come alive via rollback any more".

          But, that doesn't help for the index removal explained earlier.


          Jan

          --
          #==========================================================================#
          # It's easier to get forgiveness for being wrong than for being right.    #
          # Let's break this rule - forgive me.                                      #
          #===================================================== JanWieck@Yahoo.com #



          • Vivek Khera

            #35
            Re: Buglist

            >>>>> "BW" == Bruno Wolff, <Bruno> writes:

            BW> It would probably be a lot slower. Any transaction that has started
            BW> but not yet finished would need to lock all rows that exist during
            BW> the transaction (for serialized transaction isolation you would only

            Why would you need to lock rows? Does the current vacuum need it? I
            don't think it does. Why can't the functionality of vacuum be made to
            operate incrementally per row delete/update? I don't know if it is
            possible.

            BW> Also, since at least 7.3, normal vacuums aren't normally going to
            BW> affect the performance of your database server that much.

            I disagree. Triggering a vacuum on a db that is nearly saturating the
            disk bandwidth has a significant impact.

            BW> The main issue against the current vacuum system is that it requires the
            BW> DBA knowing what vacuum does and figuring out how it should be used in
            BW> their situation to get reasonable performance. This makes it a bit harder
            BW> for non-DBAs to jump right in to Postgres without running into problems.

            BW> However, the work on autovacuum seems to be providing a reasonable solution
            BW> to that problem.

            Yes, this is a good thing.


            • Bruno Wolff III

              #36
              Re: Buglist

              On Wed, Aug 20, 2003 at 12:40:03 -0400,
              Vivek Khera <khera@kcilink.com> wrote:
              > >>>>> "BW" == Bruno Wolff, <Bruno> writes:
              >
              > BW> It would probably be a lot slower. Any transaction that has started
              > BW> but not yet finished would need to lock all rows that exist during
              > BW> the transaction (for serialized transaction isolation you would only
              >
              > Why would you need to lock rows? Does the current vacuum need it? I
              > don't think it does. Why can't the functionality of vacuum be made to
              > operate incrementally per row delete/update? I don't know if it is
              > possible.

              How do you plan on knowing when no one is using a row any more?
              The approach vacuum uses is not suitable for deleting the row after the
              last transaction that can see the row completes. When rows are created
              they are marked with the transaction id of the transaction that created
              the row. When they are deleted they are marked with the transaction id
              of the transaction that deleted the row. Any transaction with an id
              between those two ids can see the row. So it isn't the transaction that
              deletes a row that needs to worry about marking its space as available,
              but the last transaction that the row is visible to that would have to
              do it. Naively, this entails scanning the whole database, just like a vacuum
              would, after each completed transaction.
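
              A rough two-session sketch of that visibility window (table and
              values are invented); as long as an older snapshot can still see
              the deleted row version, its space cannot be reclaimed:

              -- Session B: open a snapshot that can still see the row
              BEGIN;
              SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
              SELECT * FROM foo WHERE id = 3;

              -- Session A: delete the row and commit
              BEGIN;
              DELETE FROM foo WHERE id = 3;  -- the row version is stamped with A's id
              COMMIT;

              -- Session C: a vacuum at this point must keep the dead version,
              -- because B's snapshot can still see it
              VACUUM VERBOSE foo;

              -- Session B: only after this commit does the dead version become
              -- reclaimable by a later vacuum
              COMMIT;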

              > BW> Also, since at least 7.3, normal vacuums aren't normally going to
              > BW> affect the performance of your database server that much.
              >
              > I disagree. Triggering a vacuum on a db that is nearly saturating the
              > disk bandwidth has a significant impact.

              If you are running that close to the edge you have potential problems
              of which running vacuum is just one.


              • Vivek Khera

                #37
                Re: Buglist

                >>>>> "JW" == Jan Wieck <JanWieck@Yahoo .com> writes:

                JW> remove all the index entries pointing to these ctid's. Your idea is (so
                JW> far) lacking a place where to remember all the single removed rows and I
                JW> assume you're not planning to pay the cost of a full scan over all
                JW> indexes of a table to reclaim the space of one data row, are you?

                Well, that pretty much kills my idea... back to autovacuum ;-)



                • Dennis Gearon

                  #38
                  Re: Buglist

                  I don't think the problem will be deleting the tuple itself, if I've been following this correctly. If every transaction left its signature in some common, global array, where each transaction finishing its work could find out 'who's out there', killing unneeded transactions could be part of the exit routine.

                  But getting the indexes cleaned up, that's another story, from what I've been listening to.

                  Bruno Wolff III wrote:
                  > On Wed, Aug 20, 2003 at 12:40:03 -0400,
                  > Vivek Khera <khera@kcilink.com> wrote:
                  >
                  >>>>>>>"BW" == Bruno Wolff, <Bruno> writes:
                  >>
                  >>BW> It would probably be a lot slower. Any transaction that has started
                  >>BW> but not yet finished would need to lock all rows that exist during
                  >>BW> the transaction (for serialized transaction isolation you would only
                  >>
                  >>Why would you need to lock rows? Does the current vacuum need it? I
                  >>don't think it does. Why can't the functionality of vacuum be made to
                  >>operate incrementally per row delete/update? I don't know if it is
                  >>possible.
                  >
                  >
                  > How do you plan on knowing when no one is using a row any more?
                  > The approach vacuum uses is not suitable for deleting the row after the
                  > last transaction that can see the row completes. When rows are created
                  > they are marked with the transaction id of the transaction that created
                  > the row. When they are deleted they are marked with the transaction id
                  > of the transaction that deleted the row. Any transaction with an id
                  > between those two ids can see the row. So it isn't the transaction that
                  > deletes a row that needs to worry about marking its space as available,
                  > but the last transaction that the row is visible to that would have to
                  > do it. Naively, this entails scanning the whole database, just like a vacuum
                  > would, after each completed transaction.
                  >
                  >
                  >>BW> Also, since at least 7.3, normal vacuums aren't normally going to
                  >>BW> affect the performance of your database server that much.
                  >>
                  >>I disagree. Triggering a vacuum on a db that is nearly saturating the
                  >>disk bandwidth has a significant impact.
                  >
                  >
                  > If you are running that close to the edge you have potential problems
                  > of which running vacuum is just one.



                  • Jan Wieck

                    #39
                    Re: Buglist

                    Vivek Khera wrote:
                    >>>>>> "JW" == Jan Wieck <JanWieck@Yahoo.com> writes:
                    >
                    > JW> remove all the index entries pointing to these ctid's. Your idea is (so
                    > JW> far) lacking a place where to remember all the single removed rows and I
                    > JW> assume you're not planning to pay the cost of a full scan over all
                    > JW> indexes of a table to reclaim the space of one data row, are you?
                    >
                    > Well, that pretty much kills my idea... back to autovacuum ;-)

                    Sorry about that.

                    While at the "reclaim" and [non-] overwriting topic, the other day I had
                    some brilliant idea about index entries, so here comes your chance for
                    revenge ...

                    Assume we would have multiple ctid slots per index entry. An UPDATE on a
                    row that doesn't change indexed fields always inserts the same index
                    value with a new ctid, so it'll come across this index tuple anyway. If
                    said index tuple has an empty slot, just put the ctid there and you're
                    done. If it does not, check whether the ctids that are there point to
                    "known dead and ready to vacuum" tuples and make space. If that fails
                    too, well, then insert another index tuple.

                    This has some advantages. Updating rows without changing key fields does
                    not result in constant index growth. An index scan will be fast even for
                    a value with lots of dead heap tuples. For non-unique indexes, multiple
                    data rows can "share" one index entry.

                    Index cleanup for vacuum remains pretty easy. Clear the ctid slots
                    pointing to removed tuples. Remove index tuples that have all slots cleared.



                    Jan

                    --
                    #==========================================================================#
                    # It's easier to get forgiveness for being wrong than for being right.    #
                    # Let's break this rule - forgive me.                                      #
                    #===================================================== JanWieck@Yahoo.com #



                    • Tom Lane

                      #40
                      Re: Buglist

                      Vivek Khera <khera@kcilink.com> writes:
                      > "JW" == Jan Wieck <JanWieck@Yahoo.com> writes:
                      > JW> remove all the index entries pointing to these ctid's. Your idea is (so
                      > JW> far) lacking a place where to remember all the single removed rows and I
                      > JW> assume you're not planning to pay the cost of a full scan over all
                      > JW> indexes of a table to reclaim the space of one data row, are you?

                      > Well, that pretty much kills my idea... back to autovacuum ;-)

                      In addition to the index-cleanup issue that Jan explained, there are
                      locking problems. The tuple-is-dead hint bit mechanism is very
                      carefully designed so that a backend can set the hint bits while holding
                      only a shared lock on the page containing the tuple. Physically
                      removing a tuple requires a far stronger lock (see the source code for
                      details). Thus, having ordinary interactive backends remove tuples
                      would have bad consequences for concurrent performance.

                      But I think the real point here is that there's no reason to think that
                      doing tuple deletion on-the-fly in foreground transactions is superior
                      to doing it in background with a vacuum process. You're taking what
                      should be noncritical maintenance work and moving it into the critical
                      paths of your foreground applications. Not only that, but you're
                      probably doing more total work per tuple --- VACUUM "batches" its work
                      in more ways than just the index cleanup aspect, IIRC.

                      regards, tom lane


                      • Andrew Sullivan

                        #41
                        Re: Buglist

                        On Wed, Aug 20, 2003 at 12:40:03PM -0400, Vivek Khera wrote:
                        > >>>>> "BW" == Bruno Wolff, <Bruno> writes:
                        > BW> Also, since at least 7.3, normal vacuums aren't normally going to
                        > BW> affect the performance of your database server that much.
                        >
                        > I disagree. Triggering a vacuum on a db that is nearly saturating the
                        > disk bandwidth has a significant impact.

                        Vivek is right about this. If your system is already very busy, then
                        a vacuum on a largish table is painful.

                        I don't actually think having the process done in real time will
                        help, though -- it seems to me what would be more useful is an even
                        lazier vacuum: something that could be told "clean up as cycles are
                        available, but make sure you stay out of the way." Of course, that's
                        easy to say glibly, and mighty hard to do, I expect.

                        A

                        --
                        ----
                        Andrew Sullivan 204-4141 Yonge Street
                        Liberty RMS Toronto, Ontario Canada
                        <andrew@libertyrms.info> M2P 2A8
                        +1 416 646 3304 x110



                        • Jan Wieck

                          #42
                          Re: Buglist

                          Andrew Sullivan wrote:
                          > On Wed, Aug 20, 2003 at 12:40:03PM -0400, Vivek Khera wrote:
                          >> >>>>> "BW" == Bruno Wolff, <Bruno> writes:
                          >> BW> Also, since at least 7.3, normal vacuums aren't normally going to
                          >> BW> affect the performance of your database server that much.
                          >>
                          >> I disagree. Triggering a vacuum on a db that is nearly saturating the
                          >> disk bandwidth has a significant impact.
                          >
                          > Vivek is right about this. If your system is already very busy, then
                          > a vacuum on a largish table is painful.
                          >
                          > I don't actually think having the process done in real time will
                          > help, though -- it seems to me what would be more useful is an even
                          > lazier vacuum: something that could be told "clean up as cycles are
                          > available, but make sure you stay out of the way." Of course, that's
                          > easy to say glibly, and mighty hard to do, I expect.

                          What about a little hint to the buffer management that if it has to
                          evict another buffer to physically read this one (meaning the buffer
                          pool was full already) then it will not put this buffer at the top of
                          the LRU chain but rather at its end? This way a vacuum on a large table
                          will not cause a complete cache eviction.

                          Might be a useful hint for sequential scans too.


                          Jan

                          --
                          #==========================================================================#
                          # It's easier to get forgiveness for being wrong than for being right.    #
                          # Let's break this rule - forgive me.                                      #
                          #===================================================== JanWieck@Yahoo.com #



                          • Karsten Hilbert

                            #43
                            Re: Buglist

                            > Just nice'ing the VACUUM process is likely to be counterproductive
                            > because of locking issues (priority inversion).
                            OK, getting out the brown paper bag |-(

                            Is there any concept of db engine idleness obtainable from
                            states of PG internal variables that might be leveraged?

                            Karsten
                            --
                            GPG key ID E4071346 @ wwwkeys.pgp.net
                            E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346


                            • Tom Lane

                              #44
                              Re: Buglist

                              Jan Wieck <JanWieck@Yahoo.com> writes:
                              > What about a little hint to the buffer management that if it has to
                              > evict another buffer to physically read this one (meaning the buffer
                              > pool was full already) then it will not put this buffer at the top of
                              > the LRU chain but rather at its end? This way a vacuum on a large table
                              > will not cause a complete cache eviction.

                              I think what we really need is a way to schedule VACUUM's I/O at a lower
                              priority than normal I/Os. Wouldn't be very portable :-( ... but if the
                              OS offers a facility for requesting this, it'd be worth experimenting
                              with.

                              regards, tom lane


                              • Alvaro Herrera

                                #45
                                Re: Buglist

                                On Wed, Aug 20, 2003 at 05:21:35PM -0400, Jan Wieck wrote:
                                > What about a little hint to the buffer management that if it has to
                                > evict another buffer to physically read this one (meaning the buffer
                                > pool was full already) then it will not put this buffer at the top of
                                > the LRU chain but rather at its end? This way a vacuum on a large table
                                > will not cause a complete cache eviction.
                                >
                                > Might be a useful hint for sequential scans too.

                                Somebody was playing with using LRU-2 or ARC or some other algorithm for
                                page replacement instead of the current LRU... wouldn't it help with
                                this too?

                                --
                                Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
                                "La vida es para el que se aventura"
