find repetitions in text file and delete repeated lines

  • Mehr21
    New Member
    • Jul 2010
    • 2

    find repetitions in text file and delete repeated lines

    Hello

    I am new to Python. I have merged two text files, and I am trying to write a Python script that finds the lines repeated across the two files and deletes the repeats. Can anyone give me some help with that? Thanks.
  • bvdet
    Recognized Expert Specialist
    • Oct 2006
    • 2851

    #2
    Let's assume you have read the file in as a list of lines and assigned it to the name lineList, which you can do with the file object method readlines(). You can then remove the duplicate lines with a for loop and the in operator. This keeps the first occurrence of each line and preserves the original order.
    Code:
    output = []
    for item in lineList:
        if item not in output:
            output.append(item)
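    For a large merged file, the "not in output" test rescans the whole list for every line. A minimal sketch of the same idea with a set for the membership test follows. The names unique_lines, dedupe_file, in_path and out_path are made up for illustration, and it assumes that lines differing only by a missing final newline should count as the same line.

```python
def unique_lines(lines):
    """Return lines with repeats removed, keeping first occurrences in order."""
    seen = set()      # lines already emitted (fast membership test)
    output = []       # result, in original order
    for line in lines:
        key = line.rstrip("\n")  # so "C" and "C\n" count as the same line
        if key not in seen:
            seen.add(key)
            output.append(key)
    return output


def dedupe_file(in_path, out_path):
    """Read in_path, drop repeated lines, and write the rest to out_path."""
    # in_path and out_path are hypothetical file names supplied by the caller
    with open(in_path) as src:
        lines = unique_lines(src)
    with open(out_path, "w") as dst:
        dst.writelines(line + "\n" for line in lines)
```

    Calling dedupe_file("merged.txt", "unique.txt") would then write the de-duplicated lines to a second file and leave the merged file untouched; both file names here are placeholders.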
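    This list comprehension also removes duplicates while keeping order, and works in Python 2.3. It relies on an undocumented name that CPython 2 used internally for the list under construction, so it does not work in Python 3: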
    Code:
    [s for s in lineList if s not in locals()['_[1]'].__self__]
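    If order is not important, you can use set() or dict(). Example using dict() (the output shown is from Python 2, where key order is arbitrary; since Python 3.7, dicts keep insertion order and keys() returns a view rather than a list):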
    Code:
    >>> lineList = ["A", "B", "C", "C"]
    >>> dict.fromkeys(lineList).keys()
    ['A', 'C', 'B']
    >>>
    Example using set():
    Code:
    >>> lineList = ["A", "B", "C", "C"]
    >>> list(set(lineList))
    ['A', 'C', 'B']
    >>>
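    On Python 3.7 or later, the dict() approach can also be used when order does matter, because dict keys keep their insertion order there. A short sketch, assuming Python 3.7+ and reusing the lineList name from above with an extra repeated "A" added for illustration:

```python
lineList = ["A", "B", "C", "C", "A"]

# dict keys are unique and, from Python 3.7 on, keep insertion order,
# so this drops repeats while keeping the first occurrence of each line
unique_ordered = list(dict.fromkeys(lineList))

# a set also drops repeats but makes no promise about order
unique_unordered = set(lineList)

print(unique_ordered)  # ['A', 'B', 'C']
```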


    • Mehr21
      New Member
      • Jul 2010
      • 2

      #3
      Thank you so much for your help.
