Removing duplicates from a dict

  • vjjoshua
    New Member
    • Feb 2012
    • 1

    Removing duplicates from a dict

    Hi, I have a dictionary that contains data like this:

    dict = {'file1.txt': ['A', 'B' , 'C' , 'D' , 'E' ] , 'file2.txt': ['A', 'F' , 'C' , 'G' , 'E' ] , 'file3.txt': ['T', 'F' , 'C']}

    Could someone please help me write code that removes the duplicate values and changes the dictionary to:

    dict = {'file1.txt': ['B' , 'D' ] , 'file2.txt': [ 'G' ] , 'file3.txt': [ 'T' ]}

    (only the unique values should remain)
  • bvdet
    Recognized Expert Specialist
    • Oct 2006
    • 2851

    #2
    Don't use dict as an identifier. It masks the built-in type dict().
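
To illustrate the masking problem, here is a minimal sketch. The function name shadow_demo is hypothetical and exists only for this example; the rebinding is kept inside a function so the built-in stays usable elsewhere.

```python
def shadow_demo():
    """Show what happens once the name `dict` is rebound to a dictionary."""
    # Inside this function, `dict` now refers to an ordinary dictionary,
    # not to the built-in type.
    dict = {'file1.txt': ['A', 'B']}
    try:
        # Calling the rebound name fails, because a dictionary
        # instance is not callable.
        dict(a=1)
    except TypeError as exc:
        return type(exc).__name__
    return None


print(shadow_demo())
```

Choosing a different name, such as dd, avoids the problem entirely.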

    Following are a couple of ways:
    Code:
    dd = {'file1.txt': ['A', 'B' , 'C' , 'D' , 'E' ] ,
          'file2.txt': ['A', 'F' , 'C' , 'G' , 'E' ] ,
          'file3.txt': ['T', 'F' , 'C']}
    
    # Create a dictionary with a count of labels
    dd1 = {}
    for seq in dd.values():
        for label in seq:
            dd1[label] = dd1.get(label, 0) + 1
    
    # Keep only the labels that occur exactly once across all lists
    for key in dd:
        dd[key] = [label for label in dd[key] if dd1[label] == 1]
    print(dd)
    
    
    dd = {'file1.txt': ['A', 'B' , 'C' , 'D' , 'E' ] ,
          'file2.txt': ['A', 'F' , 'C' , 'G' , 'E' ] ,
          'file3.txt': ['T', 'F' , 'C']}
    
    # Create an extended list from dd.values()
    extended = []
    for seq in dd.values():
        extended.extend(seq)
    # Keep only the labels that occur exactly once in the extended list
    for key in dd:
        dd[key] = [label for label in dd[key] if extended.count(label) == 1]
    print(dd)
    Output:
    Code:
    {'file1.txt': ['B', 'D'], 'file3.txt': ['T'], 'file2.txt': ['G']}
    {'file1.txt': ['B', 'D'], 'file3.txt': ['T'], 'file2.txt': ['G']}
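
The counting approach can also be written with collections.Counter from the standard library. This is only a sketch of the same idea, not a replacement for the code above; the function name keep_unique_labels is an assumption made for illustration, and it returns a new dictionary rather than modifying the input.

```python
from collections import Counter
from itertools import chain


def keep_unique_labels(files):
    """Return a new dict keeping only labels that occur exactly once overall."""
    # Count every label across all of the lists in one pass.
    counts = Counter(chain.from_iterable(files.values()))
    # Rebuild each list, preserving order, with the repeated labels dropped.
    return {
        name: [label for label in labels if counts[label] == 1]
        for name, labels in files.items()
    }


dd = {'file1.txt': ['A', 'B', 'C', 'D', 'E'],
      'file2.txt': ['A', 'F', 'C', 'G', 'E'],
      'file3.txt': ['T', 'F', 'C']}

result = keep_unique_labels(dd)
print(result)
```

Note that a label repeated within a single list is also counted more than once, so it would be removed as well. The same is true of both approaches above.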
