Ignoring similar entries in a column of many rows

  • Atrisa
    New Member
    • Sep 2010
    • 22

    #16
    Unfortunately that didn't work either. I changed a few lines in the code I posted previously, like this:
    Code:
    .
    .
    .
    port_table[port[0]]=line1[0]
            keycount +=1
            content=f.readline()
    
    ports_name={}
    get_port_name ('Ports', ports_name)
    and now the dictionary looks like this:
    -----------------------------------------------
    {'8732': 'dtp-net', '4026': 'as-debug', '4448': 'asc-slmd', '4024': 'tnp1-port', '4025': 'partimage', '38203': 'agpolicy', '4023': 'esnm-zoning', '4020': 'trap', '4021': 'nexus-portal', '9418': 'git', '4028': 'dtserver-port', '4029': 'ip-qsig'}
    -----------------------------------------------
    The output line is like this:

    Code:
    outfile.write("\n".join(["%s %s %s" % (key, port_dic[key], ports_name[key]) for key in port_dic]))
    but it gives this error message: KeyError: '62084', which doesn't tell me anything.


    • bvdet
      Recognized Expert Specialist
      • Oct 2006
      • 2851

      #17
      "KeyError" indicates that you are attempting to retrieve a dictionary value using a dictionary key that is not in the dictionary. In order for the script to work, both dictionaries must have the same keys. The alternative would be to create one dictionary instead. Assuming the 'keycode' is in column 4 and the 'description' is in column 5:
      Code:
      outfile = open('capture25000-column3.txt', 'w')
      
      f = open('capture25000.txt')
      # initialize a dictionary
      dd = {}
      # iterate on the file object
      for line in f:
          # get the port number from the third item in line
          lineList = line.strip().split()
          column3 = lineList[2].split(".")[-1]
          keycode = lineList[3]
          description = lineList[4]
          # if column3 not in dd, add to dd and set quantity to 0
          # only first occurring value of keycode and description will be saved
          dd.setdefault(column3, [0, keycode, description])
          # increment dd[port number][0] by one
          dd[column3][0] += 1
      
      f.close()
      # write the dictionary to disk
      outfile.write("\n".join(["%s %s %s" % (key, dd[key][0], dd[key][1]) for key in dd]))
      outfile.close()
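
On the KeyError itself: a minimal sketch, assuming the two-dictionary approach from post #16 is kept, of how the output line could tolerate ports that have no entry in the names table. `dict.get` returns a fallback instead of raising. The sample data and the 'unknown' placeholder are invented for illustration; only the names `port_dic` and `ports_name` come from the posts above.

```python
# Sample data invented for illustration; real values come from the capture and ports files.
port_dic = {'4024': 4, '62084': 1}      # port -> number of occurrences
ports_name = {'4024': 'tnp1-port'}      # port -> service name ('62084' has no entry)

# ports_name['62084'] would raise KeyError; .get() returns the fallback instead.
lines = ["%s %s %s" % (key, port_dic[key], ports_name.get(key, 'unknown'))
         for key in sorted(port_dic)]
print("\n".join(lines))
```

Whether unnamed ports should be labelled or skipped depends on what the output file is for; the fallback string is only one option.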


      • Atrisa
        New Member
        • Sep 2010
        • 22

        #18
        Now I get the port description in the third column of the output file 'capture25000-column3.txt', but each description is repeated a few times instead of appearing once per port number. Some output:

        25910 1 pangolin-laser
        55149 1 pangolin-laser
        4024 4 www-http
        13734 2 www-http
        2451 2 www-http
        55617 6 iapp
        61510 1 iapp

        which is not correct. The last few lines of the following code are supposed to do that, but they don't work properly. I guess I have done something wrong with the looping?

        Code:
        def get_port_name (name, port_table):
        
            pf= open('ports.txt','r')
            content=pf.readline()
            keycount=1
            while content:
                line1=content.split()  
                if len(line1)>3 and line1[0]!="#":
                    port=line1[1].split('/')
                    port_table[port[0]]=line1[0]
                keycount +=1
                content=pf.readline()
        
        # initialize a dictionary
        dd = {}
        
        # iterate on the file object
        for line in f:
            # get the port number from the third item in line
            column3 = line.strip().split()[2].split(".")[-1]
            # if column3 not in dd, add to dd and set quantity to 0
            dd.setdefault(column3, 0)
            # increment dd[port number] by one
            dd[column3] += 1
        
        ports_name={}
        get_port_name ('Ports', ports_name)
        # if the key in dd equals the key (item) in ports_name, insert the value of that item to the third column in the output file:
        for key in dd.keys():
            for item in ports_name.keys():
                if key == item:
                    outfile.write("\n".join(["%s %s %s" % (key, dd[key], ports_name[item]) for key in dd]))
        
        #f.close()
        #outfile.write("\n".join(["%s %s %s" % (key, dd[key], ports_name[item]) for key in dd]))
        #outfile.close()
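
A scaled-down sketch of why the descriptions repeat, using made-up data rather than the real files. The list comprehension inside the `if` loops over every key in `dd` again while `item` stays fixed, so each match produces the whole table paired with a single name. The comprehension variable is renamed to `k` here (an assumption for clarity) so it cannot be confused with the outer `key`.

```python
# Made-up sample data; the real dictionaries are built from the capture and ports files.
dd = {'4024': 4, '9418': 2}
ports_name = {'4024': 'tnp1-port', '9418': 'git'}

written = []  # collects what each outfile.write() call would have received
for key in dd.keys():
    for item in ports_name.keys():
        if key == item:
            # Re-iterates all of dd; item does not change inside the comprehension.
            written.append(["%s %s %s" % (k, dd[k], ports_name[item]) for k in dd])

for chunk in written:
    print(chunk)
```

Each chunk lists every port with the same service name, which matches the pattern in the sample output above.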


        • bvdet
          Recognized Expert Specialist
          • Oct 2006
          • 2851

          #19
          Your code doesn't look anything like the code I posted. It looks like you are reading the data file twice.


          • Atrisa
            New Member
            • Sep 2010
            • 22

            #20
            Finally, these are the lines that did it:

            Code:
            for key in dd.keys():
                for item in ports_name.keys():
                    if key == item:
                        outfile.write("\n".join(["%s %s %s" % (key, dd[key], ports_name[item])])+ "\n")
            Thanks bvdet for your help.
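
For reference, a sketch of a shorter form that should behave the same way, assuming `dd` maps port to count and `ports_name` maps port to service name as in the posts above. A membership test replaces the inner loop, and the one-element join is dropped. The sample data and the `io.StringIO` object standing in for `outfile` are assumptions for illustration.

```python
import io

# Made-up sample data; '62084' deliberately has no entry in ports_name.
dd = {'4024': 4, '62084': 1, '9418': 2}
ports_name = {'4024': 'tnp1-port', '9418': 'git'}

outfile = io.StringIO()  # stand-in for the real output file
for key in dd:
    if key in ports_name:  # same effect as the inner loop testing key == item
        outfile.write("%s %s %s\n" % (key, dd[key], ports_name[key]))

result = outfile.getvalue()
print(result)
```

Ports without a known name are skipped, as in the nested-loop version, so no KeyError is raised.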


            • bvdet
              Recognized Expert Specialist
              • Oct 2006
              • 2851

              #21
              You are welcome, and I'm glad that you got it working!

              BV
