Question about SQL table definitions

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Kevin Thorpe

    Question about SQL table definitions

    I think I know the answer to this, but suggestions welcomed.

    I have to store lots of information. An electricity reading every half
    hour for hundreds of meters for several years.

    The ways I thought of storing this in SQL were:
    1. One table row per day with 48 fields for the readings.
    144M table, 3M index
    saves space and having a long skinny table
    however, getting the min/max reading in a day is a pain. I
    think I'll have to store it which goes against the grain
    2. One table row per reading.
    376M table, 231M index
    big time long skinny table with more control information than
    content, huge waste of disk space. However it is first normal
    form so queries are easier.
    3. Two tables, one to hold control information and one for the readings.
    added complications because of the join and creation of an
    internal link field.

    A quick trial proved 1 to be cumbersome but space saving. I'm currently
    using option 2.

    any suggestions or recommenndation s?




  • David Mackenzie

    #2
    Re: Question about SQL table definitions

    On Wed, 26 Nov 2003 14:48:58 +0000, Kevin Thorpe <kevin@pricetra k.com>
    wrote:
    [color=blue]
    >I think I know the answer to this, but suggestions welcomed.
    >
    >I have to store lots of information. An electricity reading every half
    >hour for hundreds of meters for several years.
    >
    >The ways I thought of storing this in SQL were:
    >1. One table row per day with 48 fields for the readings.
    > 144M table, 3M index
    > saves space and having a long skinny table
    > however, getting the min/max reading in a day is a pain. I
    > think I'll have to store it which goes against the grain[/color]

    Ouch!
    [color=blue]
    >2. One table row per reading.
    > 376M table, 231M index
    > big time long skinny table with more control information than
    > content, huge waste of disk space. However it is first normal
    > form so queries are easier.
    >3. Two tables, one to hold control information and one for the readings.
    > added complications because of the join and creation of an
    > internal link field.[/color]

    It really ought to be option 3 (which is in 3NF), as you're
    duplicating your control info with option 2. Indexing the date and
    control id should help performance.

    Depending on what kinds of reports you want to run, you may be able to
    archive off old data with option 3 thereby keeping the database to a
    manageable size. We do that with some of our customer's history
    records whenever they notice the system becoming slow.

    If most queries will concern, say, the previous week's readings you
    could create a separate database for those and have a scheduled job
    move the records into an archive database each day/week.

    Are your controls grouped (say, geographically, by purpose, etc)? Will
    you want to combine the readings of several controls?

    --
    David ( @priz.co.uk )
    The Internet Prisoner Database: http://www.priz.co.uk/ipdb/
    The Tarbrax Chronicle: http://www.tarbraxchronicle.com/

    Comment

    • Andy Hassall

      #3
      Re: Question about SQL table definitions

      On Wed, 26 Nov 2003 14:48:58 +0000, Kevin Thorpe <kevin@pricetra k.com> wrote:
      [color=blue]
      >I think I know the answer to this, but suggestions welcomed.
      >
      >I have to store lots of information. An electricity reading every half
      >hour for hundreds of meters for several years.
      >
      >The ways I thought of storing this in SQL were:
      >1. One table row per day with 48 fields for the readings.
      > 144M table, 3M index
      > saves space and having a long skinny table
      > however, getting the min/max reading in a day is a pain. I
      > think I'll have to store it which goes against the grain
      >2. One table row per reading.
      > 376M table, 231M index
      > big time long skinny table with more control information than
      > content, huge waste of disk space. However it is first normal
      > form so queries are easier.
      >3. Two tables, one to hold control information and one for the readings.
      > added complications because of the join and creation of an
      > internal link field.
      >
      >A quick trial proved 1 to be cumbersome but space saving. I'm currently
      >using option 2.
      >
      >any suggestions or recommenndation s?[/color]

      What database are you implementing this on? If it were Oracle, it would look
      like a prime candidate for an Index-Organised table, which would cut down your
      storage requirements. I don't think MySQL has an equivalent. I vaguely recall
      the SQL Server equivalent being a 'clustered index' or something like that.

      --
      Andy Hassall (andy@andyh.co. uk) icq(5747695) (http://www.andyh.co.uk)
      Space: disk usage analysis tool (http://www.andyhsoftware.co.uk/space)

      Comment

      • Kevin Thorpe

        #4
        Re: Question about SQL table definitions

        David Mackenzie wrote:[color=blue]
        > On Wed, 26 Nov 2003 14:48:58 +0000, Kevin Thorpe <kevin@pricetra k.com>
        > wrote:
        >
        >[color=green]
        >>I think I know the answer to this, but suggestions welcomed.
        >>
        >>I have to store lots of information. An electricity reading every half
        >>hour for hundreds of meters for several years.
        >>
        >>The ways I thought of storing this in SQL were:
        >>1. One table row per day with 48 fields for the readings.
        >> 144M table, 3M index
        >> saves space and having a long skinny table
        >> however, getting the min/max reading in a day is a pain. I
        >> think I'll have to store it which goes against the grain[/color]
        >
        >
        > Ouch!
        >
        >[color=green]
        >>2. One table row per reading.
        >> 376M table, 231M index
        >> big time long skinny table with more control information than
        >> content, huge waste of disk space. However it is first normal
        >> form so queries are easier.
        >>3. Two tables, one to hold control information and one for the readings.
        >> added complications because of the join and creation of an
        >> internal link field.[/color]
        >
        >
        > It really ought to be option 3 (which is in 3NF), as you're
        > duplicating your control info with option 2. Indexing the date and
        > control id should help performance.[/color]

        That's what I thought. I started off using option 2 because my control
        information was fairly small (meter number, date, reading number) but
        I've now had to add extra dimensions to it (some readings are estimated
        or modelled) so a migration to two tables seems required.
        [color=blue]
        > Depending on what kinds of reports you want to run, you may be able to
        > archive off old data with option 3 thereby keeping the database to a
        > manageable size. We do that with some of our customer's history
        > records whenever they notice the system becoming slow.[/color]

        I try not to archive information. Disk space is cheap and properly
        indexed speed shouldn't be an issue. I've found empirically that the
        databases I've used fill up only at the rate hardware speeds up.
        [color=blue]
        > If most queries will concern, say, the previous week's readings you
        > could create a separate database for those and have a scheduled job
        > move the records into an archive database each day/week.
        >
        > Are your controls grouped (say, geographically, by purpose, etc)? Will
        > you want to combine the readings of several controls?[/color]

        The control fields are orthogonal so indexing is important. Lots of
        EXPLAIN queries I think.

        Comment

        Working...