Removing duplicate rows from a text file, then adding to a DataTable

  • starlight849
    New Member
    • Jun 2009
    • 82

    Hi, I am looking for advice on the best way to achieve my current goal.
    I am reading a text file and extracting information from each line, for example string1 and string2. I then write these strings to a DataTable.
    However, sometimes string1 and string2 will be repeated on later lines.
    When lines repeat, I would like to delete the repeated lines and keep a counter of the total number of times each line occurs.
    So if a string repeats 3 times, I would like a column showing 3 next to that line in my DataTable.
    If anyone has a suggestion on the best approach, or some example code, I would very much appreciate it.

    Best,
    Starlight849
  • Joseph Martell
    Recognized Expert New Member
    • Jan 2010
    • 198

    #2
    Take a look at the Dictionary collection; this data structure may be what you are looking for. Each entry has a key and a value. In your case, the key would be a string and the value would be an integer that counts the number of occurrences:
    Code:
    Dim myLines As New System.Collections.Generic.Dictionary(Of String, Integer)()

    • starlight849
      New Member
      • Jun 2009
      • 82

      #3
      I will look into this, but I'm comparing two strings per line. If the same string1 and string2 pair appears more than once, I delete the repeated rows and then add a counter showing how many times the pair appears.

      Example:
      string1 string2
      apple orange
      apple banana
      apple pear
      mango banana
      mango banana

      I only want to delete the second occurrence of mango banana. I then need a total count for mango banana (2).

      I've added the strings to a DataTable and am going to loop through the DataTable to do this search. If anyone has an example of declaring and comparing the previous and current rows in a DataTable, I would appreciate it.

      • Joseph Martell
        Recognized Expert New Member
        • Jan 2010
        • 198

        #4
        The dictionary collection maintains a unique set of keys. In other words, the dictionary collection will not allow you to add the same key more than once.

        In your example, your logic would look something like this:

        Code:
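        ' Key: the combined string1/string2 text. Value: how many times it has been seen.
        Dim myLines As New System.Collections.Generic.Dictionary(Of String, Integer)()
        Dim key As String = "Mango Banana"
        If (myLines.ContainsKey(key)) Then
            ' Repeated line: increment the counter instead of storing the line again.
            myLines(key) += 1
        Else
            ' First occurrence: start the count at 1.
            myLines(key) = 1
        End If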
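
        To show how the snippet above could fit the whole task (read lines, collapse repeats, fill a DataTable with a count column), here is a rough sketch rather than a definitive implementation. The module name, the function name BuildCountedTable, the column names, and the assumption that string1 and string2 are separated by whitespace are all mine and not from the original question. A separate list records first-seen order, because a Dictionary does not promise to enumerate keys in insertion order.

        ```vb
        Imports System
        Imports System.Collections.Generic
        Imports System.Data
        Imports Microsoft.VisualBasic

        Module DuplicateCounter
            ' Hypothetical helper: counts each (string1, string2) pair and returns
            ' one DataTable row per distinct pair, in first-seen order.
            Function BuildCountedTable(lines As IEnumerable(Of String)) As DataTable
                Dim counts As New Dictionary(Of String, Integer)()
                Dim order As New List(Of String)()   ' distinct keys, first-seen order

                For Each line As String In lines
                    ' Assumption: the two values are separated by spaces or tabs.
                    Dim parts() As String = line.Split(New Char() {" "c, ControlChars.Tab},
                                                       StringSplitOptions.RemoveEmptyEntries)
                    If parts.Length < 2 Then Continue For   ' skip blank or malformed lines

                    ' A tab cannot occur inside either part, so it is a safe joiner.
                    Dim key As String = parts(0) & ControlChars.Tab & parts(1)
                    If counts.ContainsKey(key) Then
                        counts(key) += 1      ' repeat: count it, do not store it again
                    Else
                        counts(key) = 1       ' first occurrence
                        order.Add(key)
                    End If
                Next

                Dim table As New DataTable()
                table.Columns.Add("String1", GetType(String))
                table.Columns.Add("String2", GetType(String))
                table.Columns.Add("Count", GetType(Integer))

                For Each key As String In order
                    Dim parts() As String = key.Split(ControlChars.Tab)
                    table.Rows.Add(parts(0), parts(1), counts(key))
                Next
                Return table
            End Function
        End Module
        ```

        You could call it with something like BuildCountedTable(System.IO.File.ReadLines(path)), where path is whatever file you are reading. Given the five data lines from post #3, this should produce four rows, with the mango/banana row carrying a count of 2. Note that the keys are case-sensitive by default, so "Mango" and "mango" would be counted separately unless you pass a comparer such as StringComparer.OrdinalIgnoreCase to the Dictionary constructor.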

        • starlight849
          New Member
          • Jun 2009
          • 82

          #5
          Ok, I figured it out. Thanks for your help, Joseph.
