Smarter Table build - might split the table

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • rcamarda

    Smarter Table build - might split the table

    Hello,
    I am using SQL 2005 and Cognos' Data Manager. It is an ETL tool for
    data warehousing.
    I have a problem with time it takes to load new changes, and I am
    seeking advice on a better way to manage the data.

    I have a table that tracks student attendance and it contains about 13
    million records. On a daily basis, there are 5,000 - 20,000 inserts and
    10,000 - 50,000 updates.
    The daily data comes for two different text files from my operation
    system; current and historical (CLSFIL and CLSHIS).
    The data is loaded into a staging area from the operational system,
    where data cleansing and other fields are added to the table.

    The final step is delivering the table to my target database, which is
    used for reporting.
    Heres the situation: I find it takes 45 minutes to do a relational
    update, where only the records that changed in the last day will be
    loaded. However, if I choose the native API load instead of a
    Relational Load, it can load all 13M records in 7 minutes. The table is
    heavly indexed
    At some point, the API load will take more time than the relational
    load, (the changes and new records will remain a constant, but the file
    will continue to grow).
    I'm seeking another solution is more efficient. I'm considering two
    tables for history and current and creating a view for reporting via a
    union.

    This a good idea? How can I make the view effeicent to use the where
    clause? Looking to bounce around ideas.

    Other Ideas?
    Thanks in Advance
    Rob

    (I maintain the key relationships in the tool, not the tables. I know
    I have lots to learn and improvments)
    CREATE TABLE "dbo"."F_BI_Cla ss_Attendance_D etail"
    (
    "CLASS_ATTENDAN CE_ID" VARCHAR(50) NULL,
    "CLASSES_OFFERE D_ID" VARCHAR(26) NULL,
    "CLASS_CAMPUS_I D" VARCHAR(10) NULL,
    "STUDENT_ID " CHAR(20) NULL,
    "FULL_CLASS _ID" CHAR(15) NOT NULL,
    "SESSION_ID " CHAR(10) NULL,
    "SECTION_ID " VARCHAR(5) NULL,
    "MEET_DT" DATETIME NULL,
    "MEETING" SMALLINT NULL,
    "PRESENT" CHAR(2) NOT NULL,
    "SESSION_SK EY" BIGINT NULL,
    "STUDENT_SK EY" BIGINT NULL,
    "CLASS_CAMPUS_S KEY" BIGINT NULL,
    "CLASSES_OFFERE D_SKEY" BIGINT NULL,
    "LOAD_DT" DATETIME NULL,
    "COMPUTED_D T" DATETIME NULL
    )
    ;

  • rcamarda

    #2
    Re: Smarter Table build - might split the table

    I have not partitioned the table. Here are the indexes:
    CREATE INDEX CLASS_ATTENDANC E_ID ON
    "dbo"."F_BI_Cla ss_Attendance_D etail" ( "CLASS_ATTENDAN CE_ID" );
    CREATE INDEX CLASS_CAMPUS_ID ON "dbo"."F_BI_Cla ss_Attendance_D etail" (
    "CLASS_CAMPUS_I D" );
    CREATE INDEX STUDENT_ID ON "dbo"."F_BI_Cla ss_Attendance_D etail" (
    "STUDENT_ID " );
    CREATE INDEX FULL_CLASS_ID ON "dbo"."F_BI_Cla ss_Attendance_D etail" (
    "FULL_CLASS _ID" );
    CREATE INDEX SESSION_ID ON "dbo"."F_BI_Cla ss_Attendance_D etail" (
    "SESSION_ID " );
    CREATE INDEX MEETING ON "dbo"."F_BI_Cla ss_Attendance_D etail" (
    "MEETING" );
    CREATE INDEX PRESENT ON "dbo"."F_BI_Cla ss_Attendance_D etail" (
    "PRESENT" );
    CREATE INDEX SESSION_SKEY ON "dbo"."F_BI_Cla ss_Attendance_D etail" (
    "SESSION_SK EY" );
    CREATE INDEX STUDENT_SKEY ON "dbo"."F_BI_Cla ss_Attendance_D etail" (
    "STUDENT_SK EY" );
    CREATE INDEX CLASS_CAMPUS_SK EY ON "dbo"."F_BI_Cla ss_Attendance_D etail"
    ( "CLASS_CAMPUS_S KEY" );

    Comment

    • rcamarda

      #3
      Re: Smarter Table build - might split the table

      Upon more research in this news group, it looks like I should look at
      horizontal partitioning.
      I am using SQL 2005 Enterprise, so I should be able to use partitioned
      data and indexes.
      If I partition on year, what should I consider to make this as
      maintenance free as possible (what happens where a new year comes?)
      TIA
      Rob

      Comment

      • Hugo Kornelis

        #4
        Re: Smarter Table build - might split the table

        On 10 Jul 2006 07:31:10 -0700, rcamarda wrote:

        (snip)
        >Heres the situation: I find it takes 45 minutes to do a relational
        >update, where only the records that changed in the last day will be
        >loaded.
        Hi Rob,

        What column(s) is/are used to relate rows in the staging tables to the
        matching rows in the target table? Are there any indexes on the staging
        table? Can you post the query you use for this relational update?

        --
        Hugo Kornelis, SQL Server MVP

        Comment

        Working...