DTS that does HouseHolding with Millions of records

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • rperetz@gmail.com

    DTS that does HouseHolding with Millions of records

    Hi all,
    I was given a task to create a houseHolding logic under a table that
    have millions records.
    first let me explain what is a house holding:

    let's say I have 2 records that have the same phone number, that mean
    that both records are under the same household, but this can get more
    complicated
    this article explain it


    if anyone worked with household he knows that you need to scan the
    table many time to get all the house holds, I used a dts to do it.

    I tested the dts on 11 records like the article did and that work
    great, but once I went to million records each loop is taking me 2 hour
    or so....a and I have no idea how how many loops I will have to do.

    if anyone out there worked with household queries and used sql, your
    imput would help me allot

    thanks.

  • John Bell

    #2
    Re: DTS that does HouseHolding with Millions of records

    Hi

    I assume there is an index on this column (or a covering index on the ones
    you are using)? Have you checked the query plan to see if other indexing is
    needed, or have passed it through the index tuning wizard?

    John


    <rperetz@gmail. comwrote in message
    news:1157148982 .155829.212030@ p79g2000cwp.goo glegroups.com.. .
    Hi all,
    I was given a task to create a houseHolding logic under a table that
    have millions records.
    first let me explain what is a house holding:
    >
    let's say I have 2 records that have the same phone number, that mean
    that both records are under the same household, but this can get more
    complicated
    this article explain it

    >
    if anyone worked with household he knows that you need to scan the
    table many time to get all the house holds, I used a dts to do it.
    >
    I tested the dts on 11 records like the article did and that work
    great, but once I went to million records each loop is taking me 2 hour
    or so....a and I have no idea how how many loops I will have to do.
    >
    if anyone out there worked with household queries and used sql, your
    imput would help me allot
    >
    thanks.
    >

    Comment

    • rperetz@gmail.com

      #3
      Re: DTS that does House Holding with Millions of records

      Hi John,
      Thanks for your replay, but yes I push data to the table with no index
      in order to make the insert faster and then I created Indexes on all
      the fields.

      the dts is a process of 4 queries that pump data to 4 tables, I start
      with table called Household2 and process it all the way to table Chain4
      (view article). and I redo the process by moving the results I got in
      table chain4 back to Household2 table and rerun the whole thing until
      both table are the same (household2 and chain4)

      if you test this on a small number of records (say 20) it would take
      seconds but for 2 million well that is still running, and I have no
      idea when it would end.

      Now that process take 2 hours for one scan, but you have to re do the
      same process until you don't find anymore households...so far I running
      it for 40 hours and it did about 19 loops on the 4


      I will look into index tuning wizard, I never heard of it.
      can you give me more info about it?


      John Bell wrote:
      Hi
      >
      I assume there is an index on this column (or a covering index on the ones
      you are using)? Have you checked the query plan to see if other indexing is
      needed, or have passed it through the index tuning wizard?
      >
      John
      >
      >
      <rperetz@gmail. comwrote in message
      news:1157148982 .155829.212030@ p79g2000cwp.goo glegroups.com.. .
      Hi all,
      I was given a task to create a houseHolding logic under a table that
      have millions records.
      first let me explain what is a house holding:

      let's say I have 2 records that have the same phone number, that mean
      that both records are under the same household, but this can get more
      complicated
      this article explain it


      if anyone worked with household he knows that you need to scan the
      table many time to get all the house holds, I used a dts to do it.

      I tested the dts on 11 records like the article did and that work
      great, but once I went to million records each loop is taking me 2 hour
      or so....a and I have no idea how how many loops I will have to do.

      if anyone out there worked with household queries and used sql, your
      imput would help me allot

      thanks.

      Comment

      • John Bell

        #4
        Re: DTS that does House Holding with Millions of records

        Hi

        Maintaining indexes will slow down inserts, but as your process is also
        doing alot of reads it could be justify to either re-create the index after
        loading data and before sorting, or leaving it there.

        If you want to post the queries and DDL (see
        http://www.aspfaq.com/etiquette.asp?id=5006 on how to do this) someone may
        be able to see a way to improve the matching.

        John




        <rperetz@gmail. comwrote in message
        news:1157317318 .437902.8430@i3 g2000cwc.google groups.com...
        Hi John,
        Thanks for your replay, but yes I push data to the table with no index
        in order to make the insert faster and then I created Indexes on all
        the fields.
        >
        the dts is a process of 4 queries that pump data to 4 tables, I start
        with table called Household2 and process it all the way to table Chain4
        (view article). and I redo the process by moving the results I got in
        table chain4 back to Household2 table and rerun the whole thing until
        both table are the same (household2 and chain4)
        >
        if you test this on a small number of records (say 20) it would take
        seconds but for 2 million well that is still running, and I have no
        idea when it would end.
        >
        Now that process take 2 hours for one scan, but you have to re do the
        same process until you don't find anymore households...so far I running
        it for 40 hours and it did about 19 loops on the 4
        >
        >
        I will look into index tuning wizard, I never heard of it.
        can you give me more info about it?
        >
        >
        John Bell wrote:
        >Hi
        >>
        >I assume there is an index on this column (or a covering index on the
        >ones
        >you are using)? Have you checked the query plan to see if other indexing
        >is
        >needed, or have passed it through the index tuning wizard?
        >>
        >John
        >>
        >>
        ><rperetz@gmail .comwrote in message
        >news:115714898 2.155829.212030 @p79g2000cwp.go oglegroups.com. ..
        Hi all,
        I was given a task to create a houseHolding logic under a table that
        have millions records.
        first let me explain what is a house holding:
        >
        let's say I have 2 records that have the same phone number, that mean
        that both records are under the same household, but this can get more
        complicated
        this article explain it

        >
        if anyone worked with household he knows that you need to scan the
        table many time to get all the house holds, I used a dts to do it.
        >
        I tested the dts on 11 records like the article did and that work
        great, but once I went to million records each loop is taking me 2 hour
        or so....a and I have no idea how how many loops I will have to do.
        >
        if anyone out there worked with household queries and used sql, your
        imput would help me allot
        >
        thanks.
        >
        >

        Comment

        Working...