How can I structure my tables well to cope with this data ?

  • jeddiki
    Contributor
    • Jan 2009
    • 290

    I am struggling with how I should store this data.

    I get an XML update every day which contains about 20,000 rows of data.
    There are some new rows, maybe 150 a week, and some of the existing rows' data changes.

    Approximately 97% of the data is the same each day, with just 3% changing.

    I am using MySQL database tables.
    So I was thinking that the best way to store the data would be:

    Base_Data_Table - 20,000 rows (static)
    Structure:
    id, title, desc, data1, data2, ... data18

    Current_Data_Table - 20,000 rows (replaced daily)
    Structure:
    id, title, desc, data1, data2, ... data18

    Transaction_Data_Table - 60 rows per day (grows daily)
    Structure:
    date, id, record, new_data

    --- record is the field that changed
    --- new_data is the new value for that field
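
    The layout above can be sketched as MySQL DDL. The column types here are assumptions, since the post doesn't specify them:

    ```sql
    -- Sketch only: column types are assumed, not taken from the post.
    CREATE TABLE Base_Data_Table (
        id     INT PRIMARY KEY,
        title  VARCHAR(255),
        `desc` TEXT,           -- "desc" is a reserved word in MySQL, hence the backticks
        data1  DECIMAL(12,2),
        -- ... data2 through data17 ...
        data18 DECIMAL(12,2)
    );

    -- Same structure, replaced by the daily XML load.
    CREATE TABLE Current_Data_Table LIKE Base_Data_Table;

    -- One row per changed field per day (about 60/day).
    CREATE TABLE Transaction_Data_Table (
        change_date DATE NOT NULL,
        id          INT NOT NULL,          -- row that changed
        record      VARCHAR(32) NOT NULL,  -- which field changed, e.g. 'data8'
        new_data    DECIMAL(12,2),         -- the new value for that field
        PRIMARY KEY (change_date, id, record)
    );
    ```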

    Although the above is nice and efficient for data storage, it is going to be
    a nightmare to extract the data.

    Every day I need to be able to display the current data
    (no problem with the Current Data table), but I also want to be able to calculate:

    For every row ( could be done dynamically when needed)
    a 2-day average for today
    a 2-day average for 7 days ago
    a 2-day average for 30 days ago

    and grab the last 120 days data points for two records ( data8 and data9)
    so that I can chart them.

    The easiest way to extract the data would be to just write a new table
    for each day, but that would mean duplicating 97% of the data - a big waste of disc space.

    I think that there must be a good compromise - but I am not sure what to do.

    Does anyone have any suggestions on how to organize the data?

    Thanks.
  • dlite922
    Recognized Expert Top Contributor
    • Dec 2007
    • 1586

    #2
    20,000 rows is nothing for MySQL to update, even daily. I don't understand why you have 3 tables and are overcomplicating it.

    Just add another TIMESTAMP field to your table that gets updated when the row is inserted/updated.

    Data_Table
    id, title, data1, ... data18, last_modified_date
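
    In MySQL that single-table layout could look like this (column types are assumed; the DEFAULT / ON UPDATE clause makes MySQL maintain the timestamp automatically):

    ```sql
    CREATE TABLE Data_Table (
        id     INT PRIMARY KEY,
        title  VARCHAR(255),
        data1  DECIMAL(12,2),
        -- ... data2 through data17 ...
        data18 DECIMAL(12,2),
        last_modified_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP
                           ON UPDATE CURRENT_TIMESTAMP
    );
    ```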

    Now, let's understand your reporting needs.

    What do you mean by a "2-day average for today"? Do you mean, get the number of records that have changed in the last 1 day, 7 days, and 30 days?

    The 120-day query is easy. Just query for last_modified_date within the last 120 days and select data8 and data9.
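
    Assuming the single-table layout above, that query is a sketch like:

    ```sql
    -- Rows touched in the last 120 days, with just the two charted columns.
    SELECT id, data8, data9, last_modified_date
    FROM   Data_Table
    WHERE  last_modified_date >= NOW() - INTERVAL 120 DAY
    ORDER  BY last_modified_date;
    ```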





    Dan


    • jeddiki
      Contributor
      • Jan 2009
      • 290

      #3
      Thanks for your reply.

      I just looked at the size of my daily table.

      A daily file takes up 2.4 MB.

      That means a year is 876 MB - isn't that quite a lot?

      I must admit that it would make the reporting and charting
      much easier if I only have the one table.

      It just seems odd that 95% of that data is redundant.
      But maybe it doesn't matter ?

      My server has over 200 GB and I've only used 17 GB.

      BUT - doesn't a big file take a lot longer to search through?

      Or maybe it's quicker to search one large table rather than do several searches on smaller tables?

      Any experience with this ?


      Thanks


      • Frinavale
        Recognized Expert Expert
        • Oct 2006
        • 9749

        #4
        I've moved this question to the MySQL forum since it's more of a database question than a PHP question. I think you'll get more help with your questions from the experts there.

        Happy coding,

        -Frinny


        • jeddiki
          Contributor
          • Jan 2009
          • 290

          #5
          Thanks,

          Would really appreciate any input from experienced database
          coders.


          • RedSon
            Recognized Expert Expert
            • Jan 2007
            • 4980

            #6
            Bigger files do not necessarily take longer to search through for a database, no. A database creates indexes, and indexes are designed to be small and easy to use for looking things up.
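
            For example, a date-range query stays fast on a large table only if the column it filters on is indexed. Assuming dlite's single-table layout, that is one statement:

            ```sql
            -- Lets MySQL satisfy last_modified_date range scans from the index
            -- instead of reading the whole table.
            CREATE INDEX idx_last_modified ON Data_Table (last_modified_date);
            ```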

            I would expect MySQL to easily and quickly perform its tasks on files in the 10 GB range as well, so don't worry about that.

            You should try to limit redundant data. Are any of the suggestions dlite made valuable to you?


            • MMcCarthy
              Recognized Expert MVP
              • Aug 2006
              • 14387

              #7
              OK, my understanding of what you are asking is this: you have a "flat" file of data (by this I mean you haven't organised your data into relational tables as per a standard database). This article will give you more information on data structures and normalisation of data.

              Now I understand you want to keep a transaction log of all updates to the data in your table. Exactly what do you need to store in that log?
              1. Do you need to keep a complete history of all changes, or just the last change made to the data?
              2. Do you need to record all changes to a record (i.e. all the new data values), or just record the fact that a change was made and the date of that change?

