How to store annual hourly data in a database

  • Johnny Denver
    New Member
    • Feb 2011
    • 3

    How to store annual hourly data in a database

    I'm not that experienced with database design, and I can't figure out the best way to design this. I have annual hourly data (8760 values) for a million sites. The total size of the data is not a problem, but how should I store the data so that I can manipulate annual sets of data?

    I often read that designing tables with hundreds of columns is a mistake. Clearly, I cannot store the data the way I would in a program, i.e., as an array with 8760 columns. So, would monthly tables work (744 columns), or tables with one week of data (168 columns)? Or perhaps 365 tables with a single day of data per record?

    Any advice is appreciated.
  • NetDynamic
    New Member
    • Feb 2011
    • 27

    #2
    Did you mean columns or rows? I see no reason not to store one table with millions of rows.

    • Johnny Denver
      New Member
      • Feb 2011
      • 3

      #3
      To clarify, I have a million sets of data, and each set contains 8760 values. This is hourly data for an entire year for 1 million sites.

      The question is how to store the data efficiently, in a way that allows me to quickly "view" the data for a particular set of sites. My application needs to load all the hourly data for a subset of rows. The simplest thing to do would be to store the data in a table with 8760 columns, one for every hour of the year. Apparently, this is not only bad practice, it's not even possible in most database architectures.

      So, I could store the data in 365 tables, one for each day of the year and each with 24 columns (plus a column for the ID of course). There would still be 1 million rows in each table. To load the data for one site, however, I would have to query 365 tables. That seems painful. Is there a standard way of dealing with data of these dimensions?
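
      Just to make the pain concrete, loading one site from hypothetical per-day tables (day_001 through day_365, with hour columns h00 through h23; all names here are made up) would look something like this:

          -- Hypothetical per-day layout: 365 tables like day_001(site_id, h00 ... h23).
          -- Reassembling one site's year touches every one of them:
          SELECT 1 AS day_of_year, h00, h01, h02, /* ... */ h23
          FROM day_001 WHERE site_id = 42
          UNION ALL
          SELECT 2, h00, h01, h02, /* ... */ h23
          FROM day_002 WHERE site_id = 42
          -- ... and 363 more UNION ALL branches ...
          ;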

      • NetDynamic
        New Member
        • Feb 2011
        • 27

        #4
        I'm posting from my phone, so I'll help more later, but I'll just say there's no reason to have separate tables for each day. Just have a field in the row for the day and sort them out afterwards.

        • Rabbit
          Recognized Expert MVP
          • Jan 2007
          • 12517

          #5
          To elaborate on what NetDynamic said, you only need one table with three fields: Site, DateTimeField, and ValueField.
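
          A minimal sketch of that in PostgreSQL, with placeholder names:

              -- One row per site per hour; 8760 rows per site per year.
              CREATE TABLE hourly_reading (
                  site       text        NOT NULL,  -- site identifier
                  reading_ts timestamptz NOT NULL,  -- the hour the value covers
                  value      real        NOT NULL,  -- the measurement for that hour
                  PRIMARY KEY (site, reading_ts)
              );

              -- Loading a year for a subset of sites is then a single query:
              SELECT site, reading_ts, value
              FROM hourly_reading
              WHERE site IN ('SITE-A', 'SITE-B')
              ORDER BY site, reading_ts;

          The composite primary key also gives you the index needed to pull one site's year back out in order.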

          • Johnny Denver
            New Member
            • Feb 2011
            • 3

            #6
            Ahhhh ... I hadn't thought of that solution. So, instead of a wide table with a million rows, I'd have a narrow table with 8.7 billion rows. I have to admit, the idea of "billions and billions" of rows (with a nod to Carl Sagan) sort of scares me, but that's probably my problem, not Postgres's problem.

            Since the current Site ID is a long character string, I imagine I should create a Site ID lookup table with an auto-increment key, and use that key in the large data table.
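
            As a sketch, again assuming PostgreSQL and made-up names (this revises the earlier table to carry the integer key):

                -- Lookup table: maps the long external string to a small integer key.
                CREATE TABLE site (
                    site_id   serial PRIMARY KEY,    -- auto-incrementing key
                    site_code text   UNIQUE NOT NULL -- the original long string
                );

                -- The big table then stores the 4-byte key instead of repeating
                -- the long string 8760 times per site.
                CREATE TABLE hourly_reading (
                    site_id    integer     NOT NULL REFERENCES site (site_id),
                    reading_ts timestamptz NOT NULL,
                    value      real        NOT NULL,
                    PRIMARY KEY (site_id, reading_ts)
                );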

            Hmmm, makes sense. Thanks for the clue.

            • NetDynamic
              New Member
              • Feb 2011
              • 27

              #7
              If you have a ton of tiny data sets, just use a delimiter and use only one column for each set; then, in your script, explode it into an array or something and use it as you wish.
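
              In PostgreSQL that could look roughly like this, with string_to_array doing the explode server-side if you'd rather not do it in your script (names are just examples):

                  -- One row per site; the whole year packed into one delimited column.
                  CREATE TABLE hourly_blob (
                      site_id     text PRIMARY KEY,
                      year_values text NOT NULL  -- e.g. '0.1,0.3,0.2,...' (8760 values)
                  );

                  -- Explode the string back into a numeric array on the way out.
                  SELECT site_id, string_to_array(year_values, ',')::real[] AS hours
                  FROM hourly_blob
                  WHERE site_id = 'SITE-A';

              The trade-off is that individual hours can no longer be indexed or filtered in SQL.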
