Beginner: Enhancing PostgreSQL performance

  • Vittorio

    Beginner: Enhancing PostgreSQL performance

    Context:
    Pentium III, 128 MB RAM, 5400 RPM 10 GB HD (of which 1.8 GB as swap)
    Debian GNU/Linux testing, PostgreSQL 7.4.2 from the Debian packages.
    Beginner with PostgreSQL but not with Linux.

    At the office I'm about to receive (split across text files) the
    electricity load readings, taken every quarter of an hour, of 1,800
    recorders over a couple of years. In a nutshell, that means storing
    and retrieving for study 1,800 * 365 * 2 = 1,314,000 records exactly,
    each made of at least the 96 readings + recorder id + date.

    After having convinced my IT department, morbidly prone to Micro$oft,
    to install on our LAN an old PC doomed to perish, with Debian and
    PostgreSQL, I built on it a fake db, similar to the one I'm going to
    work with, to have a go with PG and "taste" it. To be on the safe
    side, you know.....

    The db is made of two tables: the cumbersome one I described above,
    with its 1,314,000 records, each composed of a date field, an int4
    recorder code and 96 decimal fields with 6 digits after the decimal
    point for the readings; and a second one, linked to it, with all the
    information on each recorder such as location, date of installation,
    precision, etc. By and large the db is 3.8 GB.
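
    To fix ideas, the structure is roughly like this (table and column
    names are invented for the example, the numeric precision is a guess,
    and only three of the 96 reading columns are written out):

    create table recorder (
        id_recorder int4 primary key,
        location    text,
        installed   date
        -- plus precision and the other descriptive fields
    );

    create table reading (
        reading_date date,
        id_recorder  int4 references recorder (id_recorder),
        r01 numeric(12,6),
        r02 numeric(12,6),
        r03 numeric(12,6),   -- ... and so on up to r96
        primary key (id_recorder, reading_date)
    );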

    I'm astonished by the fact that, in spite of the poor hardware, a:

    select sum(field1), avg(field2) from db;

    takes only roughly 5 minutes to execute (mean of 20 trials, always
    changing the fields), leaving in the process 2.5 MB (out of 128 MB)
    of memory free and a mere 4 MB of the 1.8 GB swap area occupied.

    Nonetheless I'm asking for your most revered help to try to improve
    this performance "dramatically".

    Is there any action I could take to improve the performance of this
    specific db?

    Vittorio
  • Colin Coghill

    #2
    Re: Beginner: Enhancing PostgreSQL performance

    On 2004-07-26, Vittorio <vic50@email.it> wrote:
    >
    > I'm astonished by the fact that, in spite of the poor hardware, a:
    >
    > select sum(field1), avg(field2) from db;
    >
    > takes roughly 5 min. only to be executed (mean of 20 trials always
    > changing the fields), leaving - in the process - 2.5 MB (out of 128
    > MB) of memory free and a mere 4 MB of the 1.8 GB of the swap area
    > occupied.
    >
    > Is there any action I could take to improve the performance of this
    > specific db?

    I know it's a bit non-purist, but something I've done for speeding up
    the performance of a similar set of data is to have an extra table
    containing pre-calculated values.

    So in your case you might have a table containing the sum and avg for
    each *day*. To calculate the sum and avg over a given time period you
    then only need a bit of extra work with the raw values for the partial
    start and end days. A little more manipulation up front, but a
    significantly faster query behind it.
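
    Something along these lines (names and values invented, adapt to your
    schema):

    -- one row per recorder per day, kept alongside the raw table
    create table daily_summary (
        id_recorder int4,
        day         date,
        day_sum     numeric,
        day_avg     numeric,
        primary key (id_recorder, day)
    );

    -- a period of whole days then reads the small table, not the 3.8 GB one
    -- (avg of daily avgs is fine here: every day has exactly 96 readings)
    select sum(day_sum), avg(day_avg)
    from daily_summary
    where id_recorder = 42
      and day between '2002-01-01' and '2002-12-31';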

    In my case (around 1,000,000 records), this brought an individual
    calculation down to under a second, from around a minute.

    This assumes the "historical" data doesn't change much, perhaps only
    being appended to. For cases where it can change, even a nightly task
    to recalculate the extra table can be worthwhile.
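
    The nightly rebuild can be as dumb as this (again invented names, and
    only three of the 96 reading columns written out):

    -- 7.4 has no materialized views, so rebuild the summary by hand
    begin;
    delete from daily_summary;
    insert into daily_summary
    select id_recorder, reading_date,
           r01 + r02 + r03,            -- ... through r96
           (r01 + r02 + r03) / 96.0    -- ... through r96
    from reading;
    commit;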

    - Colin


    • Vittorio

      #3
      Re: Beginner: Enhancing PostgreSQL performance

      "Colin Coghill" <mugginsm@und er-the-fridge.com> wrote in message news:<slrncgbap t.m4i.mugginsm@ www.under-the-fridge.com>...[color=blue]
      > On 2004-07-26, Vittorio <vic50@email.it > wrote:[color=green]
      >> ............... ............... ............... ........[/color]
      > I know it's a bit non-purist, but something I've done for speeding up
      > the performance of a similar set of data is to have an extra table
      > containing pre-calculated values.
      >
      > So in your case you might have a table containing the sum and avg for
      > each *day*. To calculate the sum and avg over a given time period you
      > then only need a bit of extra work with the raw values for the partial
      > start and end days. A little more manipulation up front, but a
      > significantly faster query behind it.
      >
      > In my case (around 1,000,000 records), this brought an individual
      > calculation down to under a second, from around a minute.
      >
      > This assumes the "historical" data doesn't change much, perhaps only
      > being appended to. For cases where it can change, even a nightly task
      > to recalculate the extra table can be worthwhile.
      >
      > - Colin

      As a matter of fact I already wrote, and have been using, a function
      (fired upon insert & update) which puts into another table the sum
      of the 96 readings (therefore the daily energy) by recorder and day.
      I could do the same following your suggestion, summing up by day
      each of the 96 readings.
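
      For the record, my function is roughly like this (simplified: names
      are invented and only three of the 96 reading columns are written
      out):

      create table daily_energy (
          id_recorder int4,
          day         date,
          energy      numeric,
          primary key (id_recorder, day)
      );

      -- 7.4 wants the plpgsql body as a quoted string
      create or replace function update_daily_energy() returns trigger as '
      begin
          delete from daily_energy
           where id_recorder = NEW.id_recorder
             and day = NEW.reading_date;
          insert into daily_energy
          values (NEW.id_recorder, NEW.reading_date,
                  NEW.r01 + NEW.r02 + NEW.r03);   -- ... through r96
          return NEW;
      end;
      ' language plpgsql;

      create trigger trg_daily_energy
      after insert or update on reading
      for each row execute procedure update_daily_energy();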

      But before resorting to this "dirty", non-purist solution, is there
      any FURTHER optimization I could carry out in terms of indexes,
      memory management, etc.?
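
      For instance, would it make sense to touch the memory settings in
      postgresql.conf? Something like this, where the numbers are pure
      guesses for a 128 MB machine that I haven't tested:

      # postgresql.conf (7.4) -- untested guesses for a 128 MB box
      shared_buffers = 3000           # ~24 MB of shared buffers (8 kB pages)
      sort_mem = 8192                 # kB available per sort operation
      effective_cache_size = 8000     # ~64 MB: hint about the OS disk cache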

      Again Vittorio
