Finding duplicates in an array

  • Thekid
    New Member
    • Feb 2007
    • 145

    I'm trying to figure out how to tell whether an array contains duplicates. My idea was to take the array as 'a', build a second array 'b' with the duplicates removed using 'set', and then compare a to b. If they differ, it prints 'duplicates found'. The problem is that 'b' rearranges the numbers, no matter which array I try, with or without duplicates. Here's an example:

    Code:
    a='1934, 2311, 1001, 4056, 1001, 3459, 9078'
    b=list(set(a))
    if a != b:
        print "duplicates found"
    else:
        print "nothing found"
    Is there a simpler way to find if there are duplicates?
    Thanks
  • bvdet
    Recognized Expert Specialist
    • Oct 2006
    • 2851

    #2
    In your code, you have assigned a string to the variable 'a', so set() splits it into individual characters:
    Code:
    >>> list(set(a))
    [' ', ',', '1', '0', '3', '2', '5', '4', '7', '6', '9', '8']
    >>>
    To see if there are any duplicates, let's start with a list. Sets are unordered, so you can't compare the list to the set directly, but you can compare the length of the list to the length of the set.
    Code:
    >>> a=[1934, 2311, 1001, 4056, 1001, 3459, 9078]
    >>> b = set(a)
    >>> len(b)
    6
    >>> len(a)
    7
    >>>
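For what it's worth, the length comparison above can be wrapped in a small helper. This is only a sketch: has_duplicates is a made-up name (not a built-in), and it assumes every item in the list is hashable, which is true for numbers and strings.

```python
def has_duplicates(items):
    """Return True if any value occurs more than once in items.

    Assumes every item is hashable (ints, strings, tuples, ...).
    """
    # A set keeps one copy of each value, so it is shorter than the
    # original list exactly when something was repeated.
    return len(set(items)) != len(items)


a = [1934, 2311, 1001, 4056, 1001, 3459, 9078]
if has_duplicates(a):
    print("duplicates found")
else:
    print("nothing found")
```

Because only the lengths are compared, the order in which the set stores its items doesn't matter.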


    • bvdet
      Recognized Expert Specialist
      • Oct 2006
      • 2851

      #3
      To find the items that have duplicates:
      Code:
      >>> dd = {}
      >>> for item in a:
      ... 	dd[item] = dd.get(item, 0) + 1
      ... 	
      >>> dd
      {3459: 1, 2311: 1, 1001: 2, 1934: 1, 9078: 1, 4056: 1}
      >>>
      Or, less efficiently, since a.count() rescans the whole list for each distinct item:
      Code:
      >>> for item in set(a):
      ... 	if a.count(item) > 1:
      ... 		print "Duplicate found: %s" % (item)
      ... 		
      Duplicate found: 1001
      >>>
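If you are on Python 2.7 or later, collections.Counter does the same tallying as the dictionary approach in one call. A rough sketch, assuming hashable items; find_duplicates is just an illustrative name, not a library function.

```python
from collections import Counter


def find_duplicates(items):
    """Return a dict mapping each repeated value to how often it occurs."""
    counts = Counter(items)  # tallies every item in a single pass
    # Keep only the values that were seen more than once.
    return {value: n for value, n in counts.items() if n > 1}


a = [1934, 2311, 1001, 4056, 1001, 3459, 9078]
print(find_duplicates(a))
```

For the list above this reports that 1001 occurs twice. Like the dictionary version, it walks the list once instead of calling a.count() for each distinct item.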


      • Thekid
        New Member
        • Feb 2007
        • 145

        #4
        Thanks! I went with your first suggestion. It was along the lines of what I was thinking, but I hadn't considered comparing the lengths, since set() is unordered.
