MSSQL - DTS Package - Find distinct rows - Output to TXT file - ActiveX?

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • jb1

    MSSQL - DTS Package - Find distinct rows - Output to TXT file - ActiveX?

    Hello All,

    I am trying to create a DTS package.
    I have two tables tbl_A and tbl_B with similar data/rows but no
    primary keys.
    tbl_A is master.

    I would like this package to query tbl_A and tbl_B and find
    1)all rows in tbl_A that are different in tbl_B, 2)all rows in tbl_A
    that are not present in tbl_B and
    3)all rows in tbl_B that are not present in tbl_A, and then just show
    those rows.

    Can this be done with a simple UNION?

    Perhaps this could produce a temp Table that can be dropped once the
    DTS package exists successfully.

    The 2nd part after all the above rows are retrieved is that I would
    like to add an addional Column to the retrieved data called STATUS
    which has 3 possible values(letters) at the end of each row...

    M (modified) means that row exists in tbl_B but has 1 or more
    different columns
    A (add) means this row exists in tbl_A but not in tbl_B
    D (delete) means this row exists in tbl_B but not in tbl_A

    I'm hopping this DTS package would output a nice comma seperated TXT
    file with only...
    1) rows from tbl_A that are different in tbl_B (STATUS M)
    2) rows from tbl_A that are not present in tbl_B (STATUS A)
    3) rows from tbl_B that are not present in tbl_A (STATUS D)


    Can a DTS package in MS SQL be used to perfom all of the above tasks?
    I would very much appreciate any help or any advise.

    Thanks in advance :-)

  • Ed Murphy

    #2
    Re: MSSQL - DTS Package - Find distinct rows - Output to TXT file- ActiveX?

    jb1 wrote:
    I am trying to create a DTS package.
    I have two tables tbl_A and tbl_B with similar data/rows but no
    primary keys.
    tbl_A is master.
    >
    I would like this package to query tbl_A and tbl_B and find
    1)all rows in tbl_A that are different in tbl_B, 2)all rows in tbl_A
    that are not present in tbl_B and
    3)all rows in tbl_B that are not present in tbl_A, and then just show
    those rows.
    In the absence of primary keys, what conceptual rule do you use to
    associate rows in tbl_A with rows in tbl_B? For instance, consider
    this hypothetical data:

    [tbl_A]

    col1 | col2
    -----+-----
    1 | 2
    3 | 4

    [tbl_B]

    col1 | col2
    -----+-----
    1 | 4
    3 | 2

    What associations would you draw here, and why?
    Can this be done with a simple UNION?
    >
    Perhaps this could produce a temp Table that can be dropped once the
    DTS package exists successfully.
    Why not a view?

    Comment

    • undercups

      #3
      Re: MSSQL - DTS Package - Find distinct rows - Output to TXT file - ActiveX?

      The answer is to write 3 queries, 1 for each situation you describe
      and link the output using the UNION ALL statement

      Something like

      SELECT a.id, a.col1, a.col2, a.col3, b.col4 FROM table1 a JOIN table2
      b on a.id = b.id
      WHERE a.col1 <b.col1
      UNION ALL
      SELECT a.id, a.col1, a.col2, a.col3, b.col4 FROM table1 a LEFT JOIN
      table2 b on a.id = b.id
      WHERE b.col4 IS NULL
      UNION ALL
      SELECT b.id, b.col1, b.col2, b.col3, a.col4 FROM table2 b LEFT JOIN
      table1 a on b.id = a.id
      WHERE a.col4 IS NULL

      Note that the col1, col2, col3 and col4 in each of the queries must be
      of the same datatypes or can be converted to the same datatypes e.g.
      VARCHAR(20) is the same as CHAR(5). The where clauses can be more
      specific and varied than I have shown but the important part is the
      test for IS NULL in the last 2 queries as this is the way to find rows
      in one table and that are not in another.

      Hope this helps


      Comment

      • jb1

        #4
        Re: MSSQL - DTS Package - Find distinct rows - Output to TXT file - ActiveX?

        On 12 Jun, 16:24, undercups <d...@woodace.c o.ukwrote:
        The answer is to write 3 queries, 1 for each situation you describe
        and link the output using the UNION ALL statement
        >
        Something like
        >
        SELECT a.id, a.col1, a.col2, a.col3, b.col4 FROM table1 a JOIN table2
        b on a.id = b.id
        WHERE a.col1 <b.col1
        UNION ALL
        SELECT a.id, a.col1, a.col2, a.col3, b.col4 FROM table1 a LEFT JOIN
        table2 b on a.id = b.id
        WHERE b.col4 IS NULL
        UNION ALL
        SELECT b.id, b.col1, b.col2, b.col3, a.col4 FROM table2 b LEFT JOIN
        table1 a on b.id = a.id
        WHERE a.col4 IS NULL
        >
        That's great! Thank you. I'll have a look into this.
        Do you have any ideas how I can add the above mentioned STATUS column
        after each returned row? eg. If row is different in tbl_B from tbl_A
        then add 'M' at the end of the row, as has Master data. something
        like.. ("col1.value"," col2.value","co l3.value","M")

        Thanks again :-)
        Note that the col1, col2, col3 and col4 in each of the queries must be
        of the same datatypes or can be converted to the same datatypes e.g.
        VARCHAR(20) is the same as CHAR(5). The where clauses can be more
        specific and varied than I have shown but the important part is the
        test for IS NULL in the last 2 queries as this is the way to find rows
        in one table and that are not in another.
        >
        Hope this helps

        Comment

        • Erland Sommarskog

          #5
          Re: MSSQL - DTS Package - Find distinct rows - Output to TXT file - ActiveX?

          jb1 (james.benson1@ googlemail.com) writes:
          On 12 Jun, 16:24, undercups <d...@woodace.c o.ukwrote:
          >The answer is to write 3 queries, 1 for each situation you describe
          >and link the output using the UNION ALL statement
          >>
          >Something like
          >>
          >SELECT a.id, a.col1, a.col2, a.col3, b.col4 FROM table1 a JOIN table2
          >b on a.id = b.id
          >WHERE a.col1 <b.col1
          Note here that you should probably compare all columns. If columns can
          be nullable, this can be quite hairy.
          That's great! Thank you. I'll have a look into this.
          Do you have any ideas how I can add the above mentioned STATUS column
          after each returned row? eg. If row is different in tbl_B from tbl_A
          then add 'M' at the end of the row, as has Master data. something
          like.. ("col1.value"," col2.value","co l3.value","M")
          Just att the desired status letter to the column list of the queries,
          for instance:

          SELECT a.id, a.col1, a.col2, a.col3, b.col4, 'M' AS status



          --
          Erland Sommarskog, SQL Server MVP, esquel@sommarsk og.se

          Books Online for SQL Server 2005 at

          Books Online for SQL Server 2000 at

          Comment

          • Ed Murphy

            #6
            Re: MSSQL - DTS Package - Find distinct rows - Output to TXT file- ActiveX?

            undercups wrote:
            The answer is to write 3 queries, 1 for each situation you describe
            and link the output using the UNION ALL statement
            >
            Something like
            >
            SELECT a.id, a.col1, a.col2, a.col3, b.col4 FROM table1 a JOIN table2
            b on a.id = b.id
            WHERE a.col1 <b.col1
            UNION ALL
            (etc.)

            The problem is that, if the tables don't have primary keys, then they
            may not have a column like 'id'. (Or maybe they do, and it just isn't
            defined as a primary key - in which case, why not?)

            Comment

            • jb1

              #7
              Re: MSSQL - DTS Package - Find distinct rows - Output to TXT file - ActiveX?

              On 13 Jun, 06:25, Ed Murphy <emurph...@soca l.rr.comwrote:
              undercups wrote:
              The answer is to write 3 queries, 1 for each situation you describe
              and link the output using the UNION ALL statement
              >
              Something like
              >
              SELECT a.id, a.col1, a.col2, a.col3, b.col4 FROM table1 a JOIN table2
              b on a.id = b.id
              WHERE a.col1 <b.col1
              UNION ALL
              >
              (etc.)
              >
              The problem is that, if the tables don't have primary keys, then they
              may not have a column like 'id'. (Or maybe they do, and it just isn't
              defined as a primary key - in which case, why not?)
              Well there are about 5 or 6 cols in each table and I need to check
              each one to see if anything has changed. There is 1 column in each
              that is kind of like the ID you are talking about though it is not set
              to Primary Key.

              Anyway, this is what I have for the 1st query to find Modified rows in
              TableMaster...

              SELECT a.id,
              a.col2, a.col3, a.col4, a.col5,
              b.id, b.col2, b.col3, b.col4, b.col5
              'M' AS status

              FROM TableMaster a JOIN TableSlave b on a.id = b.id

              WHERE (a.col2 <b.col2) OR (a.col3 <b.col3) OR (a.col4 <b.col4)
              OR (a.col5 <b.col5)
              UNION ALL

              Now does the UNION ALL command at the end imply that another query
              will follow and that the results from this query and the other query
              be joined?


              Will the above view query return what I'm looking for? - rows in
              TableMaster that are different in TableSlave? so tha later I can
              update TableSlave with these new modified row.


              How could I continue and query Rows that are present in TableMaster
              but not in TableSlave? - Status 'A' (add)


              Finally, I would like to put a 3rd query in for TableMaster and
              TableGrandMaste r, which tells me which rows are present in
              TableGrandMaste r but missing in TableMaster - Status 'D' (delete)

              Thank you again everyone for all your help and advise! :-)

              JB

              Comment

              • Ed Murphy

                #8
                Re: MSSQL - DTS Package - Find distinct rows - Output to TXT file- ActiveX?

                jb1 wrote:
                On 13 Jun, 06:25, Ed Murphy <emurph...@soca l.rr.comwrote:
                >undercups wrote:
                >>The answer is to write 3 queries, 1 for each situation you describe
                >>and link the output using the UNION ALL statement
                >>Something like
                >>SELECT a.id, a.col1, a.col2, a.col3, b.col4 FROM table1 a JOIN table2
                >>b on a.id = b.id
                >>WHERE a.col1 <b.col1
                >>UNION ALL
                >(etc.)
                >>
                >The problem is that, if the tables don't have primary keys, then they
                >may not have a column like 'id'. (Or maybe they do, and it just isn't
                >defined as a primary key - in which case, why not?)
                >
                Well there are about 5 or 6 cols in each table and I need to check
                each one to see if anything has changed. There is 1 column in each
                that is kind of like the ID you are talking about though it is not set
                to Primary Key.
                Like I said, why isn't it? Is there a good reason not to set it
                to Primary Key now?
                Anyway, this is what I have for the 1st query to find Modified rows in
                TableMaster...
                >
                SELECT a.id,
                a.col2, a.col3, a.col4, a.col5,
                b.id, b.col2, b.col3, b.col4, b.col5
                'M' AS status
                >
                FROM TableMaster a JOIN TableSlave b on a.id = b.id
                >
                WHERE (a.col2 <b.col2) OR (a.col3 <b.col3) OR (a.col4 <b.col4)
                OR (a.col5 <b.col5)
                UNION ALL
                >
                Now does the UNION ALL command at the end imply that another query
                will follow and that the results from this query and the other query
                be joined?
                Yes. ALL means to include duplicates, which is unnecessary in this
                case (the first section of the query will contribute rows with
                status = M, the second will contribute rows with status = A, etc.).
                Will the above view query return what I'm looking for? - rows in
                TableMaster that are different in TableSlave? so tha later I can
                update TableSlave with these new modified row.
                Yes, assuming that 'id' values are not changed or duplicated.
                How could I continue and query Rows that are present in TableMaster
                but not in TableSlave? - Status 'A' (add)
                For parallelism with the part before UNION ALL:

                select a.id, a.col2, a.col3, a.col4, a.col5,
                b.id, b.col2, b.col3, b.col4, b.col5
                'A' AS status
                from TableMaster a
                left join TableSlave b on a.id = b.id
                where b.id is null
                Finally, I would like to put a 3rd query in for TableMaster and
                TableGrandMaste r, which tells me which rows are present in
                TableGrandMaste r but missing in TableMaster - Status 'D' (delete)
                select a.id, a.col2, a.col3, a.col4, a.col5,
                b.id, b.col2, b.col3, b.col4, b.col5
                'D' AS status
                from TableSlave b
                left join TableMaster a on a.id = b.id
                where a.id is null

                There's also RIGHT JOIN, but I avoid it because it's confusing, and
                one of the main packages I work on doesn't allow it anyway.

                Comment

                Working...