disk performance benchmarks

  • Shane Wright

    disk performance benchmarks

    Hi,

    I've been trying to spec a new server for my company's database for a
    few weeks and one of the biggest problems I've had is trying to find
    meaningful performance information about how PostgreSQL will perform
    under various disk configurations.

    But, we have now taken the plunge and I'm in a position to do some
    benchmarking to actually get some data. Basically I was wondering if
    anyone else had any particular recommendations (or requests) about the
    most useful kinds of benchmarks to do.


    The hardware I'll be benchmarking on is...

    server 1: single 2.8GHz Xeon, 2GB RAM. Adaptec 2410SA SATA hardware
    RAID, with 4 x 200GB 7200rpm WD SATA drives. RAID in both RAID5 and
    RAID10 (currently RAID5, but want to experiment with write performance
    in RAID10). Gentoo Linux

    server 2: single 2.6GHz Xeon, 2GB RAM, single 80GB IDE drive. Red Hat
    Linux

    server 3: dual 2.6GHz Xeon, 6GB RAM, software RAID10 with 4 x 36GB
    10k rpm U320 SCSI drives. Red Hat Linux


    I realise the boxes aren't all identical, but some benchmarks on those
    should give some ballpark figures for anyone else speccing out a
    low-mid range box and wanting some performance figures on IDE vs IDE
    RAID vs SCSI RAID.

    I'd be more than happy to post any results back to the list, and if
    anyone else can contribute any other data points that'd be great.

    Otherwise, any pointers to a quick/easy setup for some vaguely useful
    benchmarks would be great. At the moment I'm thinking just along the
    lines of 'pgbench -c 10 -s 100 -v'.

    Cheers

    Shane
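
For anyone wanting to reproduce the pgbench run mentioned above, here is a minimal Python sketch that builds the two command lines involved. It assumes a scratch database named `bench` (a hypothetical name) and 1000 transactions per client (also an assumption; the post gives no `-t` value). In pgbench the scale factor takes effect when the tables are initialized with `-i`, so `-s 100` is passed to the initialization step rather than to the run itself.

```python
def pgbench_commands(dbname, scale=100, clients=10, transactions=1000):
    """Return (init_cmd, run_cmd) as argument lists suitable for subprocess.run()."""
    # Initialization: create and populate the standard tables at the given scale.
    init_cmd = ["pgbench", "-i", "-s", str(scale), dbname]
    # Benchmark run: N concurrent clients, T transactions each,
    # with -v to vacuum the standard tables before the test starts.
    run_cmd = ["pgbench", "-c", str(clients), "-t", str(transactions), "-v", dbname]
    return init_cmd, run_cmd


# "bench" is a hypothetical database name used only for illustration.
init_cmd, run_cmd = pgbench_commands("bench")
print(" ".join(init_cmd))
print(" ".join(run_cmd))
```

Running the same pair of commands on each of the three servers, and repeating the run a few times, would give figures that are at least comparable with each other.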

  • Vivek Khera

    #2
    Re: disk performance benchmarks

    >>>>> "SW" == Shane Wright <Shane> writes:

    SW> But, we have now taken the plunge and I'm in a position to do some
    SW> benchmarking to actually get some data. Basically I was wondering if
    SW> anyone else had any particular recommendations (or requests) about the
    SW> most useful kinds of benchmarks to do.

    I did a bunch of benchmarking on a 14 disk SCSI RAID array comparing
    RAID 5, 10, and 50. My tests consisted of doing a full restore of a
    30GB database (including indexes) and comparing the times to do the
    restore, the time to make the indexes, and the time to vacuum. Then I
    ran a bunch of queries.

    It was damn near impossible to pick a 'better' RAID config, so I just
    went with RAID5.

    You can find many of my posts on this topic in the list archives from
    about August to October of last year.

    Basically, you have to approach it holistically to tune the system: Pg
    config parameters, memory, and disk speed are the major factors.

    That and your schema needs to be not idiotic. :-)

    --
    =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
    Vivek Khera, Ph.D. Khera Communications, Inc.
    Internet: khera@kciLink.com Rockville, MD +1-301-869-4449 x806
    AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/

    ---------------------------(end of broadcast)---------------------------
    TIP 8: explain analyze is your friend


    • Jeffrey W. Baker

      #3
      Re: disk performance benchmarks

      On Tue, 2004-09-14 at 10:28, Vivek Khera wrote:
      > >>>>> "SW" == Shane Wright <Shane> writes:
      >
      > SW> But, we have now taken the plunge and I'm in a position to do some
      > SW> benchmarking to actually get some data. Basically I was wondering if
      > SW> anyone else had any particular recommendations (or requests) about the
      > SW> most useful kinds of benchmarks to do.
      >
      > I did a bunch of benchmarking on a 14 disk SCSI RAID array comparing
      > RAID 5, 10, and 50. My tests consisted of doing a full restore of a
      > 30GB database (including indexes) and comparing the times to do the
      > restore, the time to make the indexes, and the time to vacuum. Then I
      > ran a bunch of queries.
      >
      > It was damn near impossible to pick a 'better' RAID config, so I just
      > went with RAID5.
      >
      > You can find many of my posts on this topic in the list archives from
      > about August to October of last year.
      >
      > Basically, you have to approach it holistically to tune the system: Pg
      > config parameters, memory, and disk speed are the major factors.
      >
      > That and your schema needs to be not idiotic. :-)

      I've recently been frustrated by this topic, because it seems like you
      can design the hell out of a system, getting everything tuned with micro
      and macro benchmarks, but when you put it in production the thing falls
      apart.

      Current issue:

      A dual 64-bit Opteron 244 machine with 8GB main memory, two 4-disk RAID5
      arrays (one for database, one for xlogs). PG's config is extremely
      generous, and in isolated benchmarks it's very fast.

      But, in reality, performance is abysmal. There's something about what
      PG does inside commits and checkpoints that sends Linux into a catatonic
      state. For instance here's a snapshot of vmstat during a parallel heavy
      select/insert load:

      procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
       r  b  swpd   free   buff   cache   si   so    bi    bo   in    cs us sy id wa
       3  0   216  13852  39656 7739724    0    0   820  2664 2868  2557 16  2 74  7
       0  0   216  17580  39656 7736460    0    0  3024  4700 3458  4313 42  6 52  0
       0  0   216  16428  39676 7737324    0    0   840  4248 3930  4516  0  4 89  8
       0  1   216  18620  39672 7736920    0    0  7576   516 2738  3347  1  4 55 39
       0  0   216  14972  39672 7738960    0    0  1992  2532 2509  2288  2  3 93  3
       0  0   216  13564  39672 7740592    0    0  1640  2656 2581  2066  1  3 97  0
       0  0   216  12028  39672 7742292    0    0  1688  3576 2072  1626  1  2 96  0
       0  0   216  18364  39680 7736164    0    0  1804  3372 1836  1379  1  4 96  0
       0  0   216  16828  39684 7737588    0    0  1432  2756 2256  1720  1  3 94  2
       0  0   216  15452  39684 7738812    0    0  1188  2184 2384  1830  1  2 97  0
       0  1   216  15388  39684 7740104    0    0  1336  2628 2490  1974  2  3 94  2
       6  0   216  15424  39684 7740240    0    0   104  3472 2757  1940  3  2 92  2
       0  0   216  14784  39700 7741856    0    0  1668  3320 2718  2332  0  3 97  0

      You can see there's not much progress being made there. In the
      presence of a fairly pathetic writeout, there's a tiny trickle of disk
      reads, userspace isn't making any progress, the kernel isn't busy, and
      few processes are in iowait. So what the heck is going on?

      This state of non-progress persists as long as the checkpoint subprocess
      is active. I'm sure there's some magic way to improve this but I
      haven't found it yet.
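
One way to put numbers on the capture above is to average the columns. The following Python sketch parses the vmstat rows as posted and reports the mean of a few fields. The field order is taken from the header line in the post; reading `bi`/`bo` as blocks per second follows the usual Linux vmstat convention and is an assumption about this particular system.

```python
VMSTAT_FIELDS = "r b swpd free buff cache si so bi bo in cs us sy id wa".split()

# The capture from the post, header lines included.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
3 0 216 13852 39656 7739724 0 0 820 2664 2868 2557 16 2 74 7
0 0 216 17580 39656 7736460 0 0 3024 4700 3458 4313 42 6 52 0
0 0 216 16428 39676 7737324 0 0 840 4248 3930 4516 0 4 89 8
0 1 216 18620 39672 7736920 0 0 7576 516 2738 3347 1 4 55 39
0 0 216 14972 39672 7738960 0 0 1992 2532 2509 2288 2 3 93 3
0 0 216 13564 39672 7740592 0 0 1640 2656 2581 2066 1 3 97 0
0 0 216 12028 39672 7742292 0 0 1688 3576 2072 1626 1 2 96 0
0 0 216 18364 39680 7736164 0 0 1804 3372 1836 1379 1 4 96 0
0 0 216 16828 39684 7737588 0 0 1432 2756 2256 1720 1 3 94 2
0 0 216 15452 39684 7738812 0 0 1188 2184 2384 1830 1 2 97 0
0 1 216 15388 39684 7740104 0 0 1336 2628 2490 1974 2 3 94 2
6 0 216 15424 39684 7740240 0 0 104 3472 2757 1940 3 2 92 2
0 0 216 14784 39700 7741856 0 0 1668 3320 2718 2332 0 3 97 0
"""


def parse_vmstat(text):
    """Return one dict per data row, skipping the two header lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly 16 columns and start with a number.
        if len(parts) != len(VMSTAT_FIELDS) or not parts[0].isdigit():
            continue
        rows.append(dict(zip(VMSTAT_FIELDS, map(int, parts))))
    return rows


def mean(rows, field):
    """Arithmetic mean of one column across all samples."""
    return sum(row[field] for row in rows) / len(rows)


rows = parse_vmstat(SAMPLE)
for field in ("bi", "bo", "us", "sy", "id", "wa"):
    print(f"{field}: {mean(rows, field):.1f}")
```

Averages hide the spikes, so it is worth looking at the per-row maximum of `wa` and `bi` as well when comparing a good period against a bad one.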

      PS this is with Linux 2.6.7.

      Regards,
      jwb


      • Jim C. Nasby

        #4
        Re: disk performance benchmarks

        On Tue, Sep 14, 2004 at 11:11:38AM -0700, Jeffrey W. Baker wrote:
        > procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
        > r b swpd free buff cache si so bi bo in cs us sy id wa
        > 3 0 216 13852 39656 7739724 0 0 820 2664 2868 2557 16 2 74 7
        > 0 0 216 17580 39656 7736460 0 0 3024 4700 3458 4313 42 6 52 0
        > 0 0 216 16428 39676 7737324 0 0 840 4248 3930 4516 0 4 89 8
        > 0 1 216 18620 39672 7736920 0 0 7576 516 2738 3347 1 4 55 39
        > 0 0 216 14972 39672 7738960 0 0 1992 2532 2509 2288 2 3 93 3
        > 0 0 216 13564 39672 7740592 0 0 1640 2656 2581 2066 1 3 97 0
        > 0 0 216 12028 39672 7742292 0 0 1688 3576 2072 1626 1 2 96 0
        > 0 0 216 18364 39680 7736164 0 0 1804 3372 1836 1379 1 4 96 0
        > 0 0 216 16828 39684 7737588 0 0 1432 2756 2256 1720 1 3 94 2
        > 0 0 216 15452 39684 7738812 0 0 1188 2184 2384 1830 1 2 97 0
        > 0 1 216 15388 39684 7740104 0 0 1336 2628 2490 1974 2 3 94 2
        > 6 0 216 15424 39684 7740240 0 0 104 3472 2757 1940 3 2 92 2
        > 0 0 216 14784 39700 7741856 0 0 1668 3320 2718 2332 0 3 97 0
        >
        > You can see there's not much progress being made there. In the

        Those IO numbers look pretty high for nothing going on. Are you sure
        you're not IO bound?
        --
        Jim C. Nasby, Database Consultant decibel@decibel.org
        Give your computer some brain candy! www.distributed.net Team #1828

        Windows: "Where do you want to go today?"
        Linux: "Where do you want to go tomorrow?"
        FreeBSD: "Are you guys coming, or what?"


        • Jeffrey W. Baker

          #5
          Re: disk performance benchmarks

          On Tue, 2004-09-14 at 14:45, Jim C. Nasby wrote:
          > On Tue, Sep 14, 2004 at 11:11:38AM -0700, Jeffrey W. Baker wrote:
          > > procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
          > > r b swpd free buff cache si so bi bo in cs us sy id wa
          > > 3 0 216 13852 39656 7739724 0 0 820 2664 2868 2557 16 2 74 7
          > > 0 0 216 17580 39656 7736460 0 0 3024 4700 3458 4313 42 6 52 0
          > > 0 0 216 16428 39676 7737324 0 0 840 4248 3930 4516 0 4 89 8
          > > 0 1 216 18620 39672 7736920 0 0 7576 516 2738 3347 1 4 55 39
          > > 0 0 216 14972 39672 7738960 0 0 1992 2532 2509 2288 2 3 93 3
          > > 0 0 216 13564 39672 7740592 0 0 1640 2656 2581 2066 1 3 97 0
          > > 0 0 216 12028 39672 7742292 0 0 1688 3576 2072 1626 1 2 96 0
          > > 0 0 216 18364 39680 7736164 0 0 1804 3372 1836 1379 1 4 96 0
          > > 0 0 216 16828 39684 7737588 0 0 1432 2756 2256 1720 1 3 94 2
          > > 0 0 216 15452 39684 7738812 0 0 1188 2184 2384 1830 1 2 97 0
          > > 0 1 216 15388 39684 7740104 0 0 1336 2628 2490 1974 2 3 94 2
          > > 6 0 216 15424 39684 7740240 0 0 104 3472 2757 1940 3 2 92 2
          > > 0 0 216 14784 39700 7741856 0 0 1668 3320 2718 2332 0 3 97 0
          > >
          > > You can see there's not much progress being made there. In the
          >
          > Those IO numbers look pretty high for nothing going on. Are you sure
          > you're not IO bound?

          Just for the list to get an idea of the kinds of performance problems
          I'm trying to eliminate, check out these vmstat captures:



          Performance is okay-ish for about three minutes at a stretch and then
          extremely bad during the fourth minute, and the cycle repeats all day.
          During the bad periods everything involving the database just blocks.

          -jwb


          • Joshua D. Drake

            #6
            Re: disk performance benchmarks

            >You can see there's not much progress being made there. In the
            >presence of a fairly pathetic writeout, there's a tiny trickle of disk
            >reads, userspace isn't making any progress, the kernel isn't busy, and
            >few processes are in iowait. So what the heck is going on?
            >
            >This state of non-progress persists as long as the checkpoint subprocess
            >is active. I'm sure there's some magic way to improve this but I
            >haven't found it yet.
            >
            Hello,

            It is my experience that RAID 5 is not that great for heavy write
            situations and that RAID 10 is better.
            Also, as you are on Linux, you may want to take a look at what file
            system you are using. EXT3, for example, is known to be stable, if a
            very slow piggy.

            J




            >PS this is with Linux 2.6.7.
            >
            >Regards,
            >jwb


            --
            Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
            Postgresql support, programming shared hosting and dedicated hosting.
            +1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
            PostgreSQL Replicator -- production quality replication for PostgreSQL



            • Greg Stark

              #7
              Re: disk performance benchmarks


              Vivek Khera <khera@kcilink.com> writes:

              > On Sep 14, 2004, at 9:49 PM, Joshua D. Drake wrote:
              >
              > > It is my experience that RAID 5 is not that great for heavy write situations
              > > and that RAID 10 is better.
              > >
              > It is my experience that this depends entirely on how many spindles you have in
              > your RAID. For 4 or 5 spindles, I find RAID10 faster. With 14 spindles, it
              > was more or less a toss-up for me.

              I think this depends massively on the hardware involved and the applications
              involved.

              For write-heavy applications I would expect RAID5 to be a loss on any
              software-RAID based solution. Only with good hardware RAID systems with very
              large battery-backed cache would it begin to be effective.
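
The usual back-of-the-envelope reasoning behind this is the small-write penalty: a random write costs roughly two disk I/Os on RAID10 (one per mirror side) and four on RAID5 (read data, read parity, write data, write parity). The Python sketch below applies that rule of thumb. The figure of 80 IOPS per disk is an assumed value for illustration, not a measurement from this thread, and the model deliberately ignores controller cache and full-stripe writes.

```python
# Disk I/Os needed per small random write, by RAID level (textbook rule of thumb).
WRITE_PENALTY = {"raid10": 2, "raid5": 4}


def random_write_iops(disks, per_disk_iops, level):
    """Estimate array-wide random write IOPS with no write cache in front."""
    return disks * per_disk_iops / WRITE_PENALTY[level]


# Assumption: about 80 random IOPS per spindle, 4 spindles per array.
for level in ("raid10", "raid5"):
    print(level, random_write_iops(disks=4, per_disk_iops=80, level=level))
```

A large battery-backed write cache changes the picture because it can absorb bursts and coalesce writes into full stripes, which this simple model does not capture.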

              --
              greg



              • Michael Paesold

                #8
                Re: disk performance benchmarks

                Jeffrey W. Baker wrote:
                > Current issue:
                >
                > A dual 64-bit Opteron 244 machine with 8GB main memory, two 4-disk RAID5
                > arrays (one for database, one for xlogs). PG's config is extremely
                > generous, and in isolated benchmarks it's very fast.

                It depends on the controller, but usually I would expect better
                performance if xlogs are just on a two-disk mirror and the rest of the disks
                are used for data (6 spindles instead of 4 then).

                I don't think RAID5 is a benefit for xlogs.

                Regards,
                Michael Paesold
                > But, in reality, performance is abysmal. There's something about what
                > PG does inside commits and checkpoints that sends Linux into a catatonic
                > state. For instance here's a snapshot of vmstat during a parallel heavy
                > select/insert load:
                ....



                • Jeffrey W. Baker

                  #9
                  Re: disk performance benchmarks

                  On Wed, 2004-09-15 at 02:39, Michael Paesold wrote:
                  > Jeffrey W. Baker wrote:
                  >
                  > > Current issue:
                  > >
                  > > A dual 64-bit Opteron 244 machine with 8GB main memory, two 4-disk RAID5
                  > > arrays (one for database, one for xlogs). PG's config is extremely
                  > > generous, and in isolated benchmarks it's very fast.
                  >
                  > It depends on the controller, but usually I would expect better
                  > performance if xlogs are just on a two-disk mirror and the rest of the disks
                  > are used for data (6 spindles instead of 4 then).
                  >
                  > I don't think RAID5 is a benefit for xlogs.

                  All these replies are really interesting, but the point is not that my
                  RAIDs are too slow, or that my CPUs are too slow. My point is that, for
                  long stretches of time, my database doesn't come anywhere near using the
                  capacity of the hardware. And I think that's odd and would like to
                  config it to "false".

                  -jwb


                  • Vivek Khera

                    #10
                    Re: disk performance benchmarks

                    >>>>> "GS" == Greg Stark <gsstark@mit.edu> writes:

                    GS> For write-heavy applications I would expect RAID5 to be a loss on
                    GS> any software-RAID based solution. Only with good hardware RAID
                    GS> systems with very large battery-backed cache would it begin to be
                    GS> effective.

                    Who in their right mind would run a 14 spindle RAID in software? :-)

                    Battery-backed write-back cache is definitely mandatory for performance.

                    --
                    =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
                    Vivek Khera, Ph.D. Khera Communications, Inc.
                    Internet: khera@kciLink.com Rockville, MD +1-301-869-4449 x806
                    AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/


                    • Vivek Khera

                      #11
                      Re: disk performance benchmarks

                      >>>>> "JWB" == Jeffrey W Baker <jwbaker@acm.org> writes:

                      JWB> All these replies are really interesting, but the point is not that my
                      JWB> RAIDs are too slow, or that my CPUs are too slow. My point is that, for
                      JWB> long stretches of time, my database doesn't come anywhere near using the
                      JWB> capacity of the hardware. And I think that's odd and would like to
                      JWB> config it to "false".

                      Have you tried to increase your checkpoint_segments? I get the
                      suspicion that you're checkpointing every 3 minutes constantly.
                      You'll have to restart the postmaster for this setting to take effect,
                      I believe.


                      --
                      =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
                      Vivek Khera, Ph.D. Khera Communications, Inc.
                      Internet: khera@kciLink.com Rockville, MD +1-301-869-4449 x806
                      AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/


                      • Jeffrey W. Baker

                        #12
                        Re: disk performance benchmarks

                        On Wed, 2004-09-15 at 10:53, Vivek Khera wrote:
                        > >>>>> "JWB" == Jeffrey W Baker <jwbaker@acm.org> writes:
                        >
                        > JWB> All these replies are really interesting, but the point is not that my
                        > JWB> RAIDs are too slow, or that my CPUs are too slow. My point is that, for
                        > JWB> long stretches of time, my database doesn't come anywhere near using the
                        > JWB> capacity of the hardware. And I think that's odd and would like to
                        > JWB> config it to "false".
                        >
                        > Have you tried to increase your checkpoint_segments? I get the
                        > suspicion that you're checkpointing every 3 minutes constantly.
                        > You'll have to restart the postmaster for this setting to take effect,
                        > I believe.

                        I have checkpoint_segments set to 24, but I get the feeling that making
                        it larger may have the opposite effect of what I want, by extending the
                        period during which the DB makes no progress.
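
To put rough numbers on this: each WAL segment is 16MB by default, so checkpoint_segments = 24 allows about 384MB of WAL between segment-triggered checkpoints. The Python sketch below works that out and estimates the implied WAL write rate. It assumes that the four-minute cycle described earlier in the thread is one checkpoint interval, which is a guess: checkpoint_timeout can also trigger checkpoints, so the real interval may be governed by that setting instead.

```python
WAL_SEGMENT_MB = 16  # default WAL segment size


def wal_between_checkpoints_mb(checkpoint_segments):
    """WAL volume that accumulates before a segment-triggered checkpoint."""
    return checkpoint_segments * WAL_SEGMENT_MB


def implied_wal_rate_mb_s(checkpoint_segments, seconds_between_checkpoints):
    """Average WAL write rate if checkpoints are triggered by segments alone."""
    return wal_between_checkpoints_mb(checkpoint_segments) / seconds_between_checkpoints


segments = 24        # value reported in the post
cycle_seconds = 240  # assumption: three "okay" minutes plus one bad minute

print(wal_between_checkpoints_mb(segments), "MB of WAL per checkpoint")
print(round(implied_wal_rate_mb_s(segments, cycle_seconds), 2), "MB/s of WAL on average")
```

If the estimate is in the right range, raising checkpoint_segments would spread checkpoints further apart but leave more dirty data to flush at each one, which matches the concern in the post.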

                        -jwb


                        • Alvaro Herrera

                          #13
                          Re: disk performance benchmarks

                          On Wed, Sep 15, 2004 at 11:36:18AM -0700, Jeffrey W. Baker wrote:
                          > On Wed, 2004-09-15 at 10:53, Vivek Khera wrote:
                          > > >>>>> "JWB" == Jeffrey W Baker <jwbaker@acm.org> writes:
                          > >
                          > > JWB> All these replies are really interesting, but the point is not that my
                          > > JWB> RAIDs are too slow, or that my CPUs are too slow. My point is that, for
                          > > JWB> long stretches of time, my database doesn't come anywhere near using the
                          > > JWB> capacity of the hardware. And I think that's odd and would like to
                          > > JWB> config it to "false".
                          > >
                          > > Have you tried to increase your checkpoint_segments? I get the
                          > > suspicion that you're checkpointing every 3 minutes constantly.
                          > > You'll have to restart the postmaster for this setting to take effect,
                          > > I believe.
                          >
                          > I have checkpoint_segments set to 24, but I get the feeling that making
                          > it larger may have the opposite effect of what I want, by extending the
                          > period during which the DB makes no progress.

                          It sounds strange that the DB stops doing anything while the checkpoint
                          is in progress. Have you tried poking at pg_locks during that interval?

                          --
                          Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
                          "La naturaleza, tan frágil, tan expuesta a la muerte... y tan viva"



                          • Marc Slemko

                            #14
                            Re: disk performance benchmarks

                            On Wed, 15 Sep 2004 09:11:37 -0700, Jeffrey W. Baker <jwbaker@acm.org> wrote:
                            > On Wed, 2004-09-15 at 02:39, Michael Paesold wrote:
                            > > Jeffrey W. Baker wrote:
                            > >
                            > > > Current issue:
                            > > >
                            > > > A dual 64-bit Opteron 244 machine with 8GB main memory, two 4-disk RAID5
                            > > > arrays (one for database, one for xlogs). PG's config is extremely
                            > > > generous, and in isolated benchmarks it's very fast.
                            > >
                            > > It depends on the controller, but usually I would expect better
                            > > performance if xlogs are just on a two-disk mirror and the rest of the disks
                            > > are used for data (6 spindles instead of 4 then).
                            > >
                            > > I don't think RAID5 is a benefit for xlogs.
                            >
                            > All these replies are really interesting, but the point is not that my
                            > RAIDs are too slow, or that my CPUs are too slow. My point is that, for
                            > long stretches of time, my database doesn't come anywhere near using the
                            > capacity of the hardware. And I think that's odd and would like to
                            > config it to "false".

                            Umh, I don't think you have shown any numbers to show if the database
                            is using the capacity of the hardware or not...

                            If this is a seek-heavy operation, the raw throughput is irrelevant;
                            you are limited by the number of seeks your disks can do. Run
                            iostat and look at the number of transactions per second.
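
As a rough illustration of why seeks rather than bandwidth set the ceiling, the Python sketch below estimates random I/Os per second for a single spindle from its rotational speed and average seek time. The seek times used (8.5 ms for a 7200rpm drive, 4.7 ms for a 10k rpm drive) are assumed, typical-looking values, not figures from anyone's hardware in this thread.

```python
def seek_limited_iops(rpm, avg_seek_ms):
    """Approximate random I/Os per second for one spindle."""
    # On average the platter has to turn half a revolution after the seek.
    half_rotation_ms = 0.5 * 60_000 / rpm
    return 1000 / (avg_seek_ms + half_rotation_ms)


# Assumed, illustrative drive figures (rpm, average seek in ms).
drives = {
    "7200rpm SATA": (7200, 8.5),
    "10k rpm SCSI": (10_000, 4.7),
}

for name, (rpm, seek_ms) in drives.items():
    iops = seek_limited_iops(rpm, seek_ms)
    # 8KB is PostgreSQL's default page size.
    mb_per_s = iops * 8 / 1024
    print(f"{name}: ~{iops:.0f} random IOPS, ~{mb_per_s:.2f} MB/s at 8KB per I/O")
```

Under these assumptions a single disk manages on the order of a hundred random I/Os per second, which is well under 1 MB/s of 8KB pages, so low throughput figures in vmstat do not by themselves show that the disks are idle.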

                            Using RAID 5 can just destroy the number of write transactions per
                            second you can do, especially if it is software RAID or a cheap RAID
                            controller.

                            You can't just say "the hardware is fine and not stressed so I don't
                            want to discuss that, but everything is too slow so please make it
                            faster".
