selecting changed rows only

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • remote89@hotmail.com

    selecting changed rows only

    Hi experts,

    I have been trying to limit the table rows in the following situation,
    any suggestions will be appreciated.

    we have table called tempTb has columns id, c_id, c_name, rating, date
    columns.
    id is an identity column.
    date is a datetime column, the rest are varchar datatype.
    Here is the table structure with sample data,

    id c_id c_name rating date
    1 ao amer onli 1 1/1/2002
    2 ao amer onli 1 3/1/2002
    3 ao amer onli 1 6/1/2002
    4 ao amer onli 3 9/1/2002
    5 ao amer onli 3 12/1/2002
    6 ao amer onli 3 3/1/2003
    7 ao amer onli 3 6/1/2003
    8 ao amer onli 3 9/1/2003
    9 ao amer onli 2 12/1/2003
    10 ao amer onli 1 6/1/2004
    11 ao amer onli 1 12/1/2004
    12 xy xabs yasd 1 1/1/2002
    13 xy xabs yasd 2 3/1/2002
    14 xy xabs yasd 2 6/1/2002
    15 xy xabs yasd 2 9/1/2002
    16 xy xabs yasd 1 12/1/2002
    17 xy xabs yasd 1 3/1/2003
    18 xy xabs yasd 3 6/1/2003
    19 xy xabs yasd 3 9/1/2003
    20 xy xabs yasd 2 12/1/2003
    21 xy xabs yasd 1 6/1/2004
    22 xy xabs yasd 1 12/1/2004
    [color=blue]
    >From this table I need to select the rows with rating changes only,[/color]
    i.e if two or three consecutive rows have same rating only the first
    row should be selected.

    the selection should look like...
    id c_id c_name rating date
    1 ao amer onli 1 1/1/2002
    4 ao amer onli 3 9/1/2002
    9 ao amer onli 2 12/1/2003
    10 ao amer onli 1 6/1/2004
    12 xy xabs yasd 1 1/1/2002
    13 xy xabs yasd 2 3/1/2002
    16 xy xabs yasd 1 12/1/2002
    18 xy xabs yasd 3 6/1/2003
    20 xy xabs yasd 2 12/1/2003
    21 xy xabs yasd 1 6/1/2004

    I was trying to do this by self-joining the table like....

    select t1.* from tempTb t1, tempTb t2
    where t1.id!=t2.id,
    t1.c_id=t2.c_id ,
    t1.c_name=t2.c_ name,
    t1.rating!=t2.r ating.
    But this is generating cartesian products,
    I have tried some other combinations after where clause with date colmn
    wtc,
    but none seems to give the required result.

    so if anybody can guide me in the right direction I would appreciate
    it.

    Thanks alot,
    Remote

  • Carl Kayser

    #2
    Re: selecting changed rows only

    Off the top of my head, I would conjecture

    select min (id), c_id, c_name, rating, min (rating_date)
    from tempTb
    group by c_id, c_name, rating

    By "first" I assume that you mean "lowest id value" - else Joe Celko will
    point out the error of your terminology. You may also need "select
    distinct" instead of "select". Upon further reflection, the above will
    include rows without a rating change. So ...

    select distinct min (T1.id), T1.c_id, T1.c_name, T1.rating, min
    (T1.rating_date )
    from tempTb T1
    where exists (select T2.id
    from tempTb T2
    where T2.c_id = T1.c_id
    and T2.rating <> T1.rating)
    group by T1.c_id, T1.c_name, T1.rating
    having exists (select T2.id
    from tempTb T2
    where T2.c_id = T1.c_id
    and T2.rating <> T1.rating)

    I'm not sure as to whether "having exists" is valid or not. "distinct" is
    definitely needed with this SQL.

    <remote89@hotma il.com> wrote in message
    news:1106362103 .355644.244670@ c13g2000cwb.goo glegroups.com.. .[color=blue]
    > Hi experts,
    >
    > I have been trying to limit the table rows in the following situation,
    > any suggestions will be appreciated.
    >
    > we have table called tempTb has columns id, c_id, c_name, rating, date
    > columns.
    > id is an identity column.
    > date is a datetime column, the rest are varchar datatype.
    > Here is the table structure with sample data,
    >
    > id c_id c_name rating date
    > 1 ao amer onli 1 1/1/2002
    > 2 ao amer onli 1 3/1/2002
    > 3 ao amer onli 1 6/1/2002
    > 4 ao amer onli 3 9/1/2002
    > 5 ao amer onli 3 12/1/2002
    > 6 ao amer onli 3 3/1/2003
    > 7 ao amer onli 3 6/1/2003
    > 8 ao amer onli 3 9/1/2003
    > 9 ao amer onli 2 12/1/2003
    > 10 ao amer onli 1 6/1/2004
    > 11 ao amer onli 1 12/1/2004
    > 12 xy xabs yasd 1 1/1/2002
    > 13 xy xabs yasd 2 3/1/2002
    > 14 xy xabs yasd 2 6/1/2002
    > 15 xy xabs yasd 2 9/1/2002
    > 16 xy xabs yasd 1 12/1/2002
    > 17 xy xabs yasd 1 3/1/2003
    > 18 xy xabs yasd 3 6/1/2003
    > 19 xy xabs yasd 3 9/1/2003
    > 20 xy xabs yasd 2 12/1/2003
    > 21 xy xabs yasd 1 6/1/2004
    > 22 xy xabs yasd 1 12/1/2004
    >[color=green]
    > >From this table I need to select the rows with rating changes only,[/color]
    > i.e if two or three consecutive rows have same rating only the first
    > row should be selected.
    >
    > the selection should look like...
    > id c_id c_name rating date
    > 1 ao amer onli 1 1/1/2002
    > 4 ao amer onli 3 9/1/2002
    > 9 ao amer onli 2 12/1/2003
    > 10 ao amer onli 1 6/1/2004
    > 12 xy xabs yasd 1 1/1/2002
    > 13 xy xabs yasd 2 3/1/2002
    > 16 xy xabs yasd 1 12/1/2002
    > 18 xy xabs yasd 3 6/1/2003
    > 20 xy xabs yasd 2 12/1/2003
    > 21 xy xabs yasd 1 6/1/2004
    >
    > I was trying to do this by self-joining the table like....
    >
    > select t1.* from tempTb t1, tempTb t2
    > where t1.id!=t2.id,
    > t1.c_id=t2.c_id ,
    > t1.c_name=t2.c_ name,
    > t1.rating!=t2.r ating.
    > But this is generating cartesian products,
    > I have tried some other combinations after where clause with date colmn
    > wtc,
    > but none seems to give the required result.
    >
    > so if anybody can guide me in the right direction I would appreciate
    > it.
    >
    > Thanks alot,
    > Remote
    >[/color]


    Comment

    • Lee Tudor

      #3
      Re: selecting changed rows only

      >>From this table I need to select the rows with rating changes only,[color=blue]
      > i.e if two or three consecutive rows have same rating only the first
      > row should be selected.[/color]

      From your brief it looks like you want rows where there is no matching row
      preceeding it.

      SELECT
      id, c_id, c_name, rating, date
      FROM tempTb out
      WHERE NOT EXISTS(SELECT * FROM tempTb pre WHERE pre.id=(out.id-1) AND
      pre.c_id=out.c_ id AND pre.c_name=out. c_name AND pre.rating=out. rating)

      there should be an index(preferabl y clustered) on the id column for best
      performance.

      Mr Tea
      Blogger ist ein Veröffentlichungs-Tool von Google, mit dem du ganz einfach deine Gedanken der Welt mitteilen kannst. Mit Blogger kannst du problemlos Texte, Fotos und Videos in deinem persönlichen Blog oder deinem Team-Blog veröffentlichen.


      <remote89@hotma il.com> wrote in message
      news:1106362103 .355644.244670@ c13g2000cwb.goo glegroups.com.. .[color=blue]
      > Hi experts,
      >
      > I have been trying to limit the table rows in the following situation,
      > any suggestions will be appreciated.
      >
      > we have table called tempTb has columns id, c_id, c_name, rating, date
      > columns.
      > id is an identity column.
      > date is a datetime column, the rest are varchar datatype.
      > Here is the table structure with sample data,
      >
      > id c_id c_name rating date
      > 1 ao amer onli 1 1/1/2002
      > 2 ao amer onli 1 3/1/2002
      > 3 ao amer onli 1 6/1/2002
      > 4 ao amer onli 3 9/1/2002
      > 5 ao amer onli 3 12/1/2002
      > 6 ao amer onli 3 3/1/2003
      > 7 ao amer onli 3 6/1/2003
      > 8 ao amer onli 3 9/1/2003
      > 9 ao amer onli 2 12/1/2003
      > 10 ao amer onli 1 6/1/2004
      > 11 ao amer onli 1 12/1/2004
      > 12 xy xabs yasd 1 1/1/2002
      > 13 xy xabs yasd 2 3/1/2002
      > 14 xy xabs yasd 2 6/1/2002
      > 15 xy xabs yasd 2 9/1/2002
      > 16 xy xabs yasd 1 12/1/2002
      > 17 xy xabs yasd 1 3/1/2003
      > 18 xy xabs yasd 3 6/1/2003
      > 19 xy xabs yasd 3 9/1/2003
      > 20 xy xabs yasd 2 12/1/2003
      > 21 xy xabs yasd 1 6/1/2004
      > 22 xy xabs yasd 1 12/1/2004
      >[color=green]
      >>From this table I need to select the rows with rating changes only,[/color]
      > i.e if two or three consecutive rows have same rating only the first
      > row should be selected.
      >
      > the selection should look like...
      > id c_id c_name rating date
      > 1 ao amer onli 1 1/1/2002
      > 4 ao amer onli 3 9/1/2002
      > 9 ao amer onli 2 12/1/2003
      > 10 ao amer onli 1 6/1/2004
      > 12 xy xabs yasd 1 1/1/2002
      > 13 xy xabs yasd 2 3/1/2002
      > 16 xy xabs yasd 1 12/1/2002
      > 18 xy xabs yasd 3 6/1/2003
      > 20 xy xabs yasd 2 12/1/2003
      > 21 xy xabs yasd 1 6/1/2004
      >
      > I was trying to do this by self-joining the table like....
      >
      > select t1.* from tempTb t1, tempTb t2
      > where t1.id!=t2.id,
      > t1.c_id=t2.c_id ,
      > t1.c_name=t2.c_ name,
      > t1.rating!=t2.r ating.
      > But this is generating cartesian products,
      > I have tried some other combinations after where clause with date colmn
      > wtc,
      > but none seems to give the required result.
      >
      > so if anybody can guide me in the right direction I would appreciate
      > it.
      >
      > Thanks alot,
      > Remote
      >[/color]


      Comment

      • remote89@hotmail.com

        #4
        Re: selecting changed rows only

        Thank you very much Mr Tea, I appreciate your help.
        I was looking for something that can compare the present row data with
        its neighbors
        like pre.id=out.id-1
        you got the bullseye right away.

        Thanks for trying Carl,
        I tried the query you have suggested.
        Its eliminating the duplicate records but it doesn't detect the change
        as I needed.
        just for the record, it gives out the resuts like....

        id c_id c_name rating ratingdate
        1 ao amer onli 1 2002-01-01
        9 ao amer onli 2 2003-12-01
        4 ao amer onli 3 2002-09-01
        12 xy xabs yasd 1 2002-01-01
        13 xy xabs yasd 2 2002-03-01
        18 xy xabs yasd 3 2003-06-01

        Comment

        • DBAGAL

          #5
          Re: selecting changed rows only

          Hi All,

          Here is a more generalized approach ... you assume that id is
          sequential (it may not be as identity columns can contain gaps) ... I
          would also worry about the ordering of the data as well (I agree with
          Mr Tea on the index, for sure) ... the query works with gaps ... let me
          know if you see a flaw ...

          Good Luck ...

          Sara (DBAGAL 4 ever!!)

          ....

          SELECT
          id, c_id, c_name, rating, date1
          FROM test1 t1
          WHERE NOT EXISTS (SELECT 1
          FROM test1 t2
          WHERE t2.id=(select max(t3.id)
          from test1 t3
          where t2.c_id=t1.c_id
          AND t2.c_name=t1.c_ name
          AND t2.rating=t1.ra ting
          and t3.id < t1.id)
          and t2.c_name = t1.c_name
          and t2.rating = t1.rating)

          Comment

          Working...