finding data for duplicates in one field only

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • Sandra Walsh
    New Member
    • Nov 2011
    • 62

    finding data for duplicates in one field only

    Hello =

    I have a very large table of addresses. I want to find only those records where the company name is the same but the address is different. In other words I want to eliminate records that only have one address for a company name.

    I could put the data into Excel and group by company name, then eliminate all records that have a count of only 1. Trouble is, I have almost 150,000 records and this won't fit in an Excel table.

    Thank you for your help,
    Sandra
    Last edited by Sandra Walsh; Nov 8 '11, 05:18 PM. Reason: spelling
  • Sandra Walsh
    New Member
    • Nov 2011
    • 62

    #2
    I figured it out.

    I created a sub-query called "q_CompanyCount " that had just the "company name" and "count of company name".

    Code:
    SELECT   LISTING.COMPANY_NM
           , Count(LISTING.COMPANY_NM) AS CountOfCOMPANY_NM
    FROM     LISTING
    GROUP BY LISTING.COMPANY_NM
    ORDER BY Count(LISTING.COMPANY_NM) DESC;
    I added this sub-query to the main query, linked by company name, and criteria
    Code:
    HAVING (((q_CompanyCount.CountOfCOMPANY_NM)>1))
    Worked great!
    Last edited by NeoPa; Nov 11 '11, 05:46 PM. Reason: Added mandatory [CODE] tags for you

    Comment

    • NeoPa
      Recognized Expert Moderator MVP
      • Oct 2006
      • 32648

      #3
      Nice answer Sandra, but you can simplify the code somewhat, and/or you could even include it in the main SQL as an embedded sub-query (See Subqueries in SQL) instead.

      Code:
      SELECT tL.*
      FROM   [tLISTING] AS [tL]
             INNER JOIN
             (SELECT   [COMPANY_NM]
              FROM     [tLISTING]
              GROUP BY [COMPANY_NM]
              HAVING   (Count(*) > 1)
             ) AS subQ
        ON   tL.COMPANY_NM = subQ.COMPANY_NM
      BTW: Sorting a sub-query is unnecessary and is ignored.

      PS. I had to use a different table name as the site treats[LISTING] as recognised tags and loses all spaces prior to it for some reason.
      Last edited by NeoPa; Nov 11 '11, 05:58 PM.

      Comment

      Working...