Finding duplicates rows in a table

**ck9663** · Nov 5 '08, 07:33 PM

Originally posted by Gayatree

I have to concatenate 2 colimns in a table and find duplicates in them.
I already used the method
select * from tableA a
where (select count(*) from TableA b
where acol1+ +col2 = b.col1+ +col2)>1
But the performance is very bad.
Data is also huge
Can you help me
Thanks in advance

try this to show the duplicates.

Code:

select a.*
from tableA a
inner join (select col1, col2 from tableA b group by col1, col2 having count(*) > 1) s on a.col1 = s.col1 and s.col2 = a.col2

Happy coding!

-- CK

**Gayatree** · Nov 5 '08, 07:54 PM

Hi ck9663,
Thank u for u r immediate reply.
But when I run the below query

select Col1 + + Col2
from TableA
group by Col1 + + Col2
having(count(*) )>1

I should get the duplicates .
But the the duplicate which I got from the query which u suggested is
different from the above qurey.
Is the above query wrong?
can u please help me
Thanks in advance

**ck9663** · Nov 5 '08, 08:14 PM

try:

Code:

select Col1, Col2
from TableA
group by Col, Col2
having count(*) >1

By the way, use the tags when posting.

-- CK

**Gayatree** · Nov 5 '08, 08:32 PM

Hi CK
Thank u for u r reply.
The 2 query's which u have suggested r working well.
But The count which both of them return should be the same.
But I am getting 1000 more records with the first one.

Next time when i post a query I will use tags

Can u please help me
If I am not clear with the question me please let me know
Thank u in advance

**ck9663** · Nov 5 '08, 09:34 PM

This query will give the duplicate:

Code:

select Col1, Col2, count(*) as cnt
from TableA
group by Col, Col2
having count(*) >1

Look at cnt column. If it says "2" it means you have two records on your table with the same col1 and col2. This means it will return the unique combination of the two columns and the number of rows that are duplicate. Not the actual duplicate rows. Make sense?

-- CK

**Gayatree** · Nov 5 '08, 10:02 PM

YUP ,GOT U

Thank u So much.

**Gayatree** · Nov 6 '08, 06:32 PM

Sorry for troubling you again.
I am loading the data using SSIS.
If there r 2 duplicates only one should be loaded into the
target table ,the remaining should go into another table.
Is this possible in SSIS.

Thank you for helping me and giving me all the above query's
Thank you

**ck9663** · Nov 6 '08, 09:15 PM

Am not sure if it's possible in SSIS.

What you could do is to create a trigger on your target table. If the value you're trying to insert on your table is already existing, insert it to another table and rollback the transaction.

On T-SQL, it would've been different. Depending on your case, a DISTINCT might be sufficient enough. If not, you might need to further analyze your table and establish an algorithm on which record to keep and which record to throw.

-- CK

**Gayatree** · Nov 6 '08, 10:22 PM

Thank you for u r immedaite answer.
I will try creating trigger.

In SSMS,when we right click on the table
we will get options as script table as create to ...ect
But for me alter to is disabled,what should I do to enable that.
why will that be disabled when I have permissions to devlop in that box
and even execute is disabled.

Can u please help me

Thank you in advance

**ck9663** · Nov 6 '08, 10:49 PM

Permission to see the table and do a SELECT on it, yes. Other rights and permissions, no. Call your admin to raise your permission rights.

-- CK

**Gayatree** · Nov 12 '08, 07:29 PM

Hi Ck,
Sorry for asking u the same question again.
the query above which u suggested will return all the duplicates in a table
But I want a select statement that will return only one duplicate.
Can u please help me
Thanks in advance

**ck9663** · Nov 12 '08, 08:53 PM

Which query?

This:

Code:

select Col1, Col2
from TableA
group by Col, Col2
having count(*) >1

will only return one row for each pair.

-- CK

**Gayatree** · Nov 12 '08, 09:33 PM

Thank you for u r immediate reply.

But i need to select all the fileds in the table.
For that I got the query like

Code:

CREATE TABLE [dbo].[Employee] (
            [id] [int] NULL ,
            [name] [Varchar] (50) COLLATE Latin1_General_CI_AS NULL ,
            [salary] [Numeric](18, 2) NULL 
 ) ON [PRIMARY]

Insert into employee values (1,'Ram', 1000.00)
Insert into employee values (1,'Ram', 1000.00)
Insert into employee values (2,'Joe', 2000.00)
Insert into employee values (2,'Joe', 1000.00)
Insert into employee values (3,'Mary', 1000.00)
Insert into employee values (4,'Julie', 5000.00)
Insert into employee values (2,'Joe', 1000.00)
Insert into employee values (1,'Ram', 1000.00)

--The above is the table and I am trying to find out duplicates

--My query is
SELECT *  FROM employee
WHERE ID IN  (SELECT ID  FROM employee   
WHERE EXIST(SELECT NULL FROM employee AS tmptable     
 WHERE employee.id = tmptable.id
 group by id          
HAVING (employee.id > MIN(tmptable.id))))

But it throwing me errors
Sorry for troubling u again

Thanks in advance

**ck9663** · Nov 13 '08, 03:12 AM

Joe (ID = 2) has three records with 2 salary values (1000.00 and 2000.00). Which one do you want to return? Is there a pattern?

-- CK

Finding duplicates rows in a table

Finding duplicates rows in a table

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment