Algorithm to merge ip address

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • jota69
    New Member
    • Mar 2010
    • 9

    Algorithm to merge ip address

    I am developing application that manage a plain text, containing ip address ranges.
    User will be able to add/delete ip from ranges.

    Example to delete ip:
    Ip entered by user through a text box : 25.2.0.58
    Ip Range container: 25.2.0.10 - 25.255.255.255

    the Algorithm should determine whether there is in text file, a range that may contain the ip entered by the user. If so, remove it from container ip range.

    Results, (using previous example )
    25.2.0.10 - 25.2.0.57
    25.2.0.59 - 25.255.255.255

    As you can see, Ip entered by user it has been extracted from, and two
    new ranges has been created.

    Any Algorithm that can be able to do so?

    Thanks in advance.
    Jota69
  • jkmyoung
    Recognized Expert Top Contributor
    • Mar 2006
    • 2057

    #2
    This seems like a search and split algorithm. I'm not sure how you're implementing it, thus I'm not sure where the problem is.

    1. How are you holding the ranges? Are they always sorted?
    2. How are you searching the ranges?

    Comment

    • jota69
      New Member
      • Mar 2010
      • 9

      #3
      1. How are you holding the ranges? Are they always sorted?
      Explanation:
      There are 10 lists,(lis1.txt , list2.txt, list3.txt and so on.) involved in the process.
      They all contain IP ranges.
      Now you imagine such an application like a firewall, that uses these lists to filter harmful ips for network, and its users.
      Well, Ipfile.txt file is the sum of all these lists.
      Obviously ip ranges have to order, when added to the final file. (Ipfile.txt).

      Example:

      List1.txt
      0.0.0.1-0.0.0.255
      3.0.0.0-3.0.0.255
      6.1.125.0-6.255.255.255

      List2.txt
      2.0.0.0-2.0.0.255
      3.0.0.0-3.0.0.255
      5.1.125.0-6.255.255.255

      List3.txt
      4.0.0.125-4.0.0.255
      7.0.0.0-7.0.0.155
      8.1.125.0-8.15.255.255

      Ipfile.txt
      0.0.0.1 - 0.0.0.255 - List1.txt
      2.0.0.0 - 2.0.0.255 - List2.txt
      3.0.0.0 - 3.0.0.255 - List2.txt
      4.0.0.125 - 4.0.0.255 - List3.txt
      5.1.125.0 -6.255.255.255 - List2.txt
      6.1.125.0 - 6.255.255.255 - List1.txt
      7.0.0.0 - 7.0.0.155 - List3.txt
      8.1.125.0 - 8.15.255.255 - List3.txt

      As you can see the ranges in the final file, are properly sorted.
      For you to have an idea, read this:
      The user of my application, they will can search, they can also add and remove IP ranges.
      To do this, only you must enter the ip you want to delete, and press a button.
      When you press the button "delete", this button should execute an algorithm capable of:
      1-check the file, if there is a range of values that includes user-entered ip .
      2 - In case if there is, then algorithm it should remove the ip range of values. As explained above.(my firts post).


      2. How are you searching the ranges?
      That my friend, is the origin of the question. I need an algorithm to do it for me.

      Comment

      • jota69
        New Member
        • Mar 2010
        • 9

        #4
        adding something...

        Since....

        IPv4 addresses are usually represented in dot-decimal notation (four numbers, each ranging from 0 to 255, separated by dots, e.g. 208.77.188.166) . Each part represents 8 bits of the address, and is therefore called an octet. In less common cases of technical writing, IPv4 addresses may be presented in hexadecimal, octal, or binary representations . In most representations each octet is converted individually.
        I get octets splitting (as you point below) ip address, entered by user. and same with each file line. (delimited file(-)). e.g.

        208.77.188.166 - 208.77.188.183
        208.77.188.195 - 208.77.188.206
        208.77.188.210 - 208.77.188.255

        Ip to excluded from range, (entered by user) : 208.77.188.236
        Range container: 208.77.188.210 - 208.77.188.255
        Notice that we can to merge overlapping and adjacent ranges. (I need logaritm do it so)
        e.g. 208.77.188.166 - 208.77.188.255.

        Note all ranges are included in only one line, saving file size. (We are talking about 500.000 lines)

        But, lets come back to mean target = Find possible range container, extract user ip from, and delete it.

        How..?
        Creating two new ranges, e.g.

        Ip to excluded from range, (entered by user) : 208.77.188.236
        Range container: 208.77.188.210 - 208.77.188.255

        Results (I want to get, using any algoritm)
        208.77.188.166 - 208.77.188.235
        208.77.188.237 - 208.77.188.255

        Also well be nice, if algoritm locate and delete duplicate ranges.

        Firts at all, split file lines.
        208.77.188.166 - 208.77.188.235, you get, IpStart and IpEnd.
        IpStart = 208.77.188.166
        IpEnd = 208.77.188.235

        Code:
        Split(currentField, "-")
        'MsgBox("Ip: " & currentField)
        pos = InStrRev(currentField, "-")
        IpEnd = Strings.Mid(currentField, pos + 1)
        IpStart = Strings.Left(currentField, currentField.Length -IpEnd.Length -1)
        Getting ip address Octets...

        Code:
        sIP = IpStart
        arrIP = Split(sIP, ".")
        sFormattedIP = Format(CLng(arrIP(0)), "000") & "." & Format(CLng(arrIP(1)), "000") _
        & "." & Format(CLng(arrIP(2)), "000") & "." & Format(CLng(arrIP(3)), "000")
        IpStart = sFormattedIP
        Note: same process to get user ip and IpEnd.

        When you get ip address octets, you are able to compare to.
        208.77.188.236
        208
        77
        188
        236

        How..?
        Maybe we feel most comfortable working with long values instead of ip address value.

        OK, Lets converte ip address to long value.

        Firts, user ip:

        Code:
        user0 = CLng(CLng(arrIP2(0)) * (256 ^ 3))
        user1 = CLng(CLng(arrIP2(1)) * (256 ^ 2))
        user2 = CLng(CLng(arrIP2(2)) * (256 ^ 1))
        user3 = CLng(CLng(arrIP2(3)) * (256 ^ 0))
        user = CLng(user0 + user1 + user2 + user3)
        IpStart:

        Code:
        arr0 = CLng(CLng(arrIP(0)) * (256 ^ 3))
        arr1 = CLng(CLng(arrIP(1)) * (256 ^ 2))
        arr2 = CLng(CLng(arrIP(2)) * (256 ^ 1))
        arr3 = CLng(CLng(arrIP(3)) * (256 ^ 0))
        arr = CLng(arr0 + arr1 + arr2 + arr3)

        IpEnd:

        Code:
        arra0 = CLng(CLng(arrIP1(0)) * (256 ^ 3))
        arra1 = CLng(CLng(arrIP1(1)) * (256 ^ 2))
        arra2 = CLng(CLng(arrIP1(2)) * (256 ^ 1))
        arra3 = CLng(CLng(arrIP1(3)) * (256 ^ 0))
        arra = CLng(arra0 + arra1 + arra2 + arra3)
        Well, now we are able to CompareTo..
        Is user>IpStart and user<IpEnd...?? ???

        Code:
        If CULng(user) > CULng(arr) And CULng(user) < CULng(arra) Then
        gString = String.Concat(IpStart, Dash, IpEnd)
        MsgBox("gString: " & gString)
        If so, then...

        Code:
        If (MaxValue - (CLng(arrIP(3))) < MaxValue) Then
        sFormattedIP2 = Format(CLng(arrIP2(0)), "000") & "." & Format(CLng(arrIP2(1)), "000") _
        & "." & Format(CLng(arrIP2(2)), "000") & "." & Format(CLng(arrIP2(3) - 1), "000")
        SearchLine = sFormattedIP2
        LineFile = String.Concat(IpStart & " - " & SearchLine & " " & Level & " " & Description)
        
        ElseIf (MaxValue - (CLng(arrIP(3))) = MaxValue) And _
        (MaxValue - (CLng(arrIP(2))) < MaxValue) Then
        sFormattedIP2 = Format(CLng(arrIP2(0)), "000") & "." & Format(CLng(arrIP2(1)), "000") _
        & "." & Format(CLng(arrIP2(2) - 1), "000") & "." & Format(CLng(arrIP2(3) + 9), "000")
        SearchLine = sFormattedIP2
        LineFile = String.Concat(IpStart & " - " & SearchLine & " " & Level & " " & Description)
        
        ElseIf (MaxValue - (CLng(arrIP(3))) = MaxValue) And _
        (MaxValue - (CLng(arrIP(2))) = MaxValue) And (MaxValue - (CLng(arrIP(1))) < MaxValue) Then
                                        
        sFormattedIP2 = Format(CLng(arrIP2(0)), "000") & "." & Format(CLng(arrIP2(1) - 1), "000") _
        & "." & Format(CLng(arrIP2(2)), "000") & "." & Format(CLng(arrIP2(3)), "000")
        SearchLine = sFormattedIP2
        LineFile = String.Concat(IpStart & " - " & SearchLine & " " & Level & " " & Description)
        End If
        
        sFormattedIP3 = Format(CLng(arrIP2(0)), "000") & "." & Format(CLng(arrIP2(1)), "000") _
        & "." & Format(CLng(arrIP2(2)), "000") & "." & Format(CLng(arrIP2(3) + 1), "000")
        SearchLine = sFormattedIP3
        LineFile1 = String.Concat(SearchLine & " - " & IpEnd)
        Note: MaxValue is a Const = 255
        It work ok.

        So, what I'm looking for?
        Although this code works, I'm not satisfied, I am sure that somewhere there is an algorithm that does the same, but in a more technical and elegant way.

        Any idea??

        Thanks in advance.

        Comment

        • jkmyoung
          Recognized Expert Top Contributor
          • Mar 2006
          • 2057

          #5
          What language are you using? Is it ASP, or .NET C++?

          Note, when you compare, you need to check the endpoints too:
          Code:
          You have:
           CULng(user) > CULng(arr) And CULng(user) < CULng(arra)
          Should be:
          CULng(arr) <= CULng(user)  And CULng(user) <= CULng(arra)
          Also rearranged the direction to use <= so it's easier to understand.

          Determining overlapping of two ranges:
          Code:
          1st range: start1 - end1
          2nd range: start2 - end2
          
          Pseudo: (because I'm lazy)
          if (start1 <= start2 <= end1) || (start1 <= end2 <= end1) we overlap
          Merging 2 ranges:
          Code:
          min = (start1 < start2 ? start1 : start2)
          max = (end1 < end2 ? end2 : end1)
          New range: (min --- max)
          I don't understand why the author has chosen to do it that way in the last part of the code. If there is an intersection, we should still use the longs we had before! Convert it backwards.
          eg.
          Code:
          long ip = some value
          ip0 = ip >> 24;
          ip1 = (ip >> 16) %256
          ip2 = (ip >> 8) % 256
          ip3 = ip % 256
          Since you're dealing exclusively within ranges, creating the new ranges will never put you out of range.

          Removing ip from range:
          Code:
          long ip = some value to be removed
          long start
          long end
          
          // do range before ip
          if(! (ip == start)){
            start1 = start;
            end1 = ip--;
            // save this range.
          }
          //do range after ip
          if (!(ip == end)){
            start2 = ip++;
            end2 = end;
            //save this range too.
          }

          Comment

          • jota69
            New Member
            • Mar 2010
            • 9

            #6
            I use Visual basic (vs 2005) and WXP sp3.
            I have no knowledge of c #. I'm trying to adapt your code to mine.

            I must assume that this is correct?
            1st range = Ip range entered by user, (208.77.188.236 - 208.77.188.243)
            2nd range = Ip range in textline
            start1 = userIpStart = 208.77.188.236
            end1= userIpEnd = 208.77.188.243
            start2 = arr
            end2 = arra

            Now, adapting your code to mine, would be as follows:

            if (start1 <= start2 <= end1) || (start1 <= end2 <= end1) we overlap
            Code:
             if (userIpStart <= arr <= userIpEnd) OrElse _
            (userIpStart <= arra <= userIpEnd) then
            
            End If
            Is this correct?

            Comment

            • jota69
              New Member
              • Mar 2010
              • 9

              #7
              Another one..

              Code:
              ' do range before ip
              If Not (ip = start) Then
              start1 = start
              ' save this range.
              end1 = System.Math.Max(System.Threading.Interlocked.Decrement(ip),ip + 1)
              End If
              
              'do range after ip
              If Not (ip = [end]) Then
              start2 = System.Math.Max(System.Threading.Interlocked.Increment(ip),ip - 1)
              'save this range too.
              end2 = [end]
              End If
              is it correct..?

              Comment

              • jkmyoung
                Recognized Expert Top Contributor
                • Mar 2006
                • 2057

                #8
                My Vb is rusty. I think that looks right, although not sure why you have [end] in brackets.
                Also when I say 'Save this range, I mean: actually put the range back into your array of ranges. call a function to do it with the 2 arguments, if needed. eg
                AddRange(start1 , end1)
                ...
                ...
                AddRange(start2 , end2)

                Comment

                • jota69
                  New Member
                  • Mar 2010
                  • 9

                  #9
                  About brackets, You are right.
                  I modified the code, but forgot to change it in the post.

                  Now I could not find the equivalent of these operators in vb. (?) and (:) and (---)
                  Can you help me, with this block of code?

                  Code:
                  min = (start1 < start2 ? start1 : start2)
                  max = (end1 < end2 ? end2 : end1)
                  New range: (min --- max)
                  I am storing start,end,start 1,end1 values, in SortedList.

                  My knowledge about the language C # are zero, and your knowledge of vb is rusty, I think we makes a good team. :)

                  Comment

                  • jkmyoung
                    Recognized Expert Top Contributor
                    • Mar 2006
                    • 2057

                    #10
                    Seriously? That's pseudocode. I'm done here.

                    Comment

                    • jota69
                      New Member
                      • Mar 2010
                      • 9

                      #11
                      Seriously, :)
                      Well, it is something like that:

                      Code:
                      Dim NewRangeList As New SortedList()
                      
                      If start1<start2
                      min=start1
                      else: min=start2
                      end if
                      
                      If end1<end2
                      max=end2
                      else: max=end1
                      end if
                      
                      NewRangeList.add (min,max)
                      PrintKeysAndValues(NewRangeList)
                      
                      End sub
                      
                      'Getting final txt file, with sorted ranges.
                      
                      Public Shared Sub PrintKeysAndValues(ByVal myList As SortedList)
                              Dim Path1 As String = "C:\ipFile.txt"
                              Dim i As Integer
                              For i = 0 To myList.Count - 1
                                  My.Computer.FileSystem.WriteAllText(Path1, myList.GetKey(i) & " - " & myList.GetByIndex(i) & vbCrLf, True)
                              Next i
                          End Sub
                      at least I think so...
                      Thanks for you support!!!

                      cheers

                      Comment

                      • Glenton
                        Recognized Expert Contributor
                        • Nov 2008
                        • 391

                        #12
                        It seems to me that this is less of an algorithm problem and more of an implementation problem.

                        Creating a class structure to deal with ip's and with ranges would probably simplify your life.

                        I don't think this is a terribly difficult problem, but perhaps using something like perl or python would make it easier for you, since they are high-level and simple to use with text and files.

                        In terms of an algorithm, you've got something like this:
                        -IPLIST is a list of numbers in ascending order, where the odd ones represent the bottom of a range, and the even ones represent the top of a range.
                        -NEWIP is a number.
                        1. Do a bisect operation to find where NEWIP sits in the odd values of IPLIST.
                        2. Check whether it's less than or equal to the next value in IPLIST:
                        a. if yes: split the range and insert the next range.
                        b. if no: return not available.

                        In terms of the merging of the different files, something like this might do
                        (assume the files are, as before, an ascending list of an even number of numbers, with odd numbers as the bottom of the range, and even numbers as the top of the range):
                        1. Look at the smallest numbers at the top of each list (hint: it's the first number in each list!).
                        2. Put the smallest number into the main file, and save the second number (the upper part of the range) as UPPER. Remove those top two numbers from the list they originally came from.
                        3. Compare UPPER with the smallest numbers in the rest of the lists. Let's say we're comparing UPPER with N1, and the next number is N2.
                        a. if UPPER is bigger than or equal to N1 minus 1, then set UPPER equal to the maximum of UPPER and N2. Also remove N1 and N2 from the list they came from.
                        b. if not, then move on to the next file.
                        4. Put UPPER into the main file.
                        5. Repeat from 1 until the files are all empty.

                        Good luck!

                        Comment

                        • jota69
                          New Member
                          • Mar 2010
                          • 9

                          #13
                          Continue...

                          Work with IP addresses at least in visual basic, is a difficult task, given that there are no types to classify an ip address, since it is not a natural number, either integer or decimal.
                          That is why in most cases, you must convert ip address to an integer type Long, to work more comfortably.

                          But in other cases, you can work directly with the ip address.
                          Let's suppose that the first two lines of the file, containing the following IP ranges:
                          0.0.0.15 - 0.0.0.26
                          0.0.0.27 - 0.0.0.52
                          As you can see, they are correlated values.
                          Whereas correlative value as a value plus the unit, with respect to another one.
                          The approach in this case, may be easier. Returning to the example above:

                          start1 = 0.0.0.15
                          end1 = 0.0.0.26
                          start2 = 0.0.0.27
                          end2 = 0.0.0.52

                          Note that, in this case:
                          0.0.0.15 - 0.0.0.26
                          0.0.0.27 - 0.0.0.52

                          This is true:
                          If (end1 + 1 = start2 <= end2) Then
                          First of all we consider the following factors:
                          In a sorted list of ips ranges upwards, as is the case, never can occur any of the following cases:

                          a- start1>=end2
                          b- end1>end2
                          And of course, "start1" never can to be greater than "end1".

                          Considering the above, here are some rules that must be met. And we can use to our advantage.

                          Summarizing the above.
                          You have a plain text file which contains all the ip ranges.
                          You open the txt file and with a loop through all lines of file, comparing the first with the second, the second to third, and so on.
                          When find two lines which meet the standard ranges mentioned above, will replace the two lines, with the new range.

                          Example: 0.0.0.15 - 0.0.0.52

                          If (end1 + 1 = start2 <= end2) Then
                          NewRange = String.Concat(s tart1 & "-" & end2)
                          Now you only have to replace :
                          0.0.0.15 - 0.0.0.26
                          0.0.0.27 - 0.0.0.52

                          By NewRange
                          0.0.0.15 - 0.0.0.52, and..
                          Continue For..

                          Edit: adding something...

                          Comment

                          Working...