File Processing in Python

  • electric916
    New Member
    • Feb 2008
    • 7

    File Processing in Python

    I have just started using Python and am stuck on a particular project. I have been staring at my computer, not knowing what to do. Details are below.

    I am trying to use a CSV file named country.csv in which information about countries is recorded. The first line in the file is the header line and describes the content of each column. The first value in a line is the name of the country, the second is an abbreviation of the country's name, the third is the name of the capital, the fourth is the total population, and the last is the total area of the country. The program has to determine the name and capital of every country whose population is at or above a number entered by the user. The name and capital of each of those countries need to be written to a file named results.txt.
    Here is what it should look like:
    Please enter the minimum population: 200000000
    4 Records found.
    Please open the file results.txt to see the results.
    Based on the above input, the program should create the file results.txt with the following content:
    China Beijing
    India New Delhi
    Indonesia Jakarta
    United States Washington DC

    The file I am reading from looks like this:
    NAME,ISO,CAPITAL,POPULATION,AREA
    Afghanistan ,AF ,Kabul,28513677 ,647500
    Albania ,AL ,Episkopi,3544808,28748
    Algeria ,AG ,Tirana,32129324,2381740
    American Samoa ,AQ ,Algiers,57902, 199
    Andorra ,AN ,Pago Pago,69865,468
    Angola ,AO ,Andorra la Vella,10978552, 1246700


    I believe I should be using a for loop and an if statement. Here is what I have so far, but I am totally stumped.

    Code:
    p = input("Please enter the minimum population: ")
    infile = open("country.csv", "r")
    outfile = open("results.txt", "w")
    
    #
    for line in infile:
        if "NAME" in line:
            continue
        line = line.rstrip()
        fields = line.split(',')
        name, iso, capital, population, area = fields
        if int(population) >= p:
            print name, capital
    I believe I have figured out how to get the names of countries whose population is greater than or equal to the number entered by the user. A couple of questions:

    1. What code do I use to tell the user how many records were found?
    2. How do I write the results to another file?

    THANKS IN ADVANCE FOR HELP
  • bvdet
    Recognized Expert Specialist
    • Oct 2006
    • 2851

    #2
    To skip the first line and avoid the if statement, you can do this:
    [code=Python]
    infile.readline()  # read and discard the header line
    for line in infile:
        # ... process each data line ...
    [/code]
    One way is to save your results in a list. Example (untested):
    [code=Python]
    resultsList = []
    for line in infile:
        name, iso, capital, population, area = line.rstrip().split(',')
        if int(population) >= p:
            resultsList.append([name, capital])
    # Build the output string for printing or writing
    outStr = '\n'.join([' '.join(item) for item in resultsList])
    print outStr
    outfile.write(outStr)
    # Close open files!
    infile.close()
    outfile.close()
    [/code]
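
    For anyone reading this later with Python 3, here is a minimal sketch of the list-based approach, including the record count asked about in the first post (it is simply the length of the list). The function names find_countries and write_results are made up for illustration, the csv module is swapped in for the manual split(','), and io.StringIO stands in for the real country.csv and results.txt so the example runs on its own. The sample rows are taken from the first post.

```python
import csv
import io

# Sample rows in the same layout as country.csv (header line first).
SAMPLE = """NAME,ISO,CAPITAL,POPULATION,AREA
Afghanistan,AF,Kabul,28513677,647500
American Samoa,AQ,Algiers,57902,199
Angola,AO,Andorra la Vella,10978552,1246700
"""


def find_countries(infile, minimum):
    """Return [name, capital] pairs whose population is >= minimum."""
    reader = csv.reader(infile)
    next(reader, None)  # skip the header line
    results = []
    for row in reader:
        if len(row) != 5:
            continue  # ignore blank or malformed lines
        name, iso, capital, population, area = [field.strip() for field in row]
        if int(population) >= minimum:
            results.append([name, capital])
    return results


def write_results(results, outfile):
    """Write one 'name capital' line per record."""
    for item in results:
        outfile.write(' '.join(item) + '\n')


# io.StringIO objects stand in for open("country.csv") and open("results.txt", "w").
results_list = find_countries(io.StringIO(SAMPLE), 10000000)
out = io.StringIO()
write_results(results_list, out)

# The number of records found is the length of the list.
print("%d Records found." % len(results_list))
print(out.getvalue(), end='')
```

    With real files, the two StringIO objects would be replaced by something like `with open("country.csv", newline="") as infile` and `with open("results.txt", "w") as outfile`, which also closes the files automatically.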


    • electric916
      New Member
      • Feb 2008
      • 7

      #3
      So I got it to print perfectly in Python, but the file I am writing to is messed up. It looks like this:

      China BeijingIndia New Delhi

      It should look like:
      China Beijing
      India New Delhi

      And how do I tell the user how many results were found? I'm so close to the end, I can smell it. Just a little more help.
      Code:
      p = input("Please enter the minimum population: ")
      infile = open("country.csv", "r")
      outfile = open("results.txt", "w")
      
      infile.readline()
      resultsList = []
      for line in infile:
          line = line.rstrip()
          fields = line.split(',')
          name, iso, capital, population, area = fields
          if int(population) >= p:
              resultsList =([name, capital])
              outStr ='  '.join([''.join(item) for item in resultsList])
              outfile.write(outStr)
              print outStr
      infile.close()
      outfile.close()


      • dshimer
        Recognized Expert New Member
        • Dec 2006
        • 136

        #4
        Originally posted by electric916
        Code:
        resultsList =([name, capital])
        outStr ='  '.join([''.join(item) for item in resultsList])
        outfile.write(outStr)
        print outStr
        How about processing each record as it comes, using string formatting? Replace the code above with this:
        Code:
        outfile.write('%s %s\n' % (name, capital))
        print '%s %s' % (name, capital)
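
        A rough Python 3 sketch of this write-as-you-go idea, with a counter added so the number of records can be reported at the end. The function name filter_stream is hypothetical, and io.StringIO is used in place of the real files so the snippet is self-contained. The sample rows come from the first post.

```python
import io

SAMPLE = (
    "NAME,ISO,CAPITAL,POPULATION,AREA\n"
    "Afghanistan ,AF ,Kabul,28513677 ,647500\n"
    "Andorra ,AN ,Pago Pago,69865,468\n"
)


def filter_stream(infile, outfile, minimum):
    """Write matching 'name capital' lines as they are found; return the count."""
    infile.readline()  # skip the header line
    count = 0
    for line in infile:
        fields = line.rstrip().split(',')
        if len(fields) != 5:
            continue  # ignore blank or malformed lines
        name, iso, capital, population, area = fields
        # int() tolerates surrounding whitespace, e.g. '28513677 '.
        if int(population) >= minimum:
            # The explicit '\n' puts each record on its own line in the file.
            outfile.write('%s %s\n' % (name.strip(), capital.strip()))
            count += 1
    return count


out = io.StringIO()  # stands in for open("results.txt", "w")
count = filter_stream(io.StringIO(SAMPLE), out, 1000000)
print("%d Records found." % count)
```

        Nothing is held in memory here beyond the current line, which is the main difference from building a list first.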


        • bvdet
          Recognized Expert Specialist
          • Oct 2006
          • 2851

          #5
          Per my suggested code, resultsList should end up looking like this:
          [code=Python]
          resultsList = [['China', 'Beijing'], ['India', 'New Delhi'], ['Indonesia', 'Jakarta'], ['United States', 'Washington DC']]
          outStr = '\n'.join([' '.join(item) for item in resultsList])
          print outStr
          [/code]
          Output:

          >>> China Beijing
          India New Delhi
          Indonesia Jakarta
          United States Washington DC
          >>>

          I noticed you are not using the list append method to build resultsList. Each assignment replaces the list with the latest [name, capital] pair, so it never holds more than one record.
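
          To make the difference concrete, here is a small Python 3 sketch (the variable names replaced and appended are only for illustration). It also shows why the file in post #3 came out on one line: the joined string has no newline in it, and print adds one for the console while write does not.

```python
rows = [('China', 'Beijing'), ('India', 'New Delhi')]

# Assignment: every pass throws away the previous value.
replaced = []
for name, capital in rows:
    replaced = [name, capital]

# append: every pass adds one more [name, capital] record.
appended = []
for name, capital in rows:
    appended.append([name, capital])

print(replaced)  # only the last pair survives
print(appended)  # both pairs are kept

# The join from post #3, applied to a single pair: each item is a plain
# string, so ''.join(item) returns it unchanged, and the outer join puts
# two spaces between name and capital. There is no '\n' anywhere.
one_record = '  '.join([''.join(item) for item in replaced])

# The suggested join, applied to the accumulated list: a space inside
# each record and a newline between records.
all_records = '\n'.join(' '.join(item) for item in appended)
print(all_records)
```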


          • electric916
            New Member
            • Feb 2008
            • 7

            #6
            Thanks so much for all of your help! I am slowly starting to learn. I ended up getting it to work. One last question: how would I write "Sorry, no records found" to the other file if there were no results? I am done with this project, I just thought I'd add a little bonus.


            • bvdet
              Recognized Expert Specialist
              • Oct 2006
              • 2851

              #7
              If resultsList is empty, it evaluates False. Therefore:
              [code=Python]
              if resultsList:
                  outStr = '\n'.join([' '.join(item) for item in resultsList])
              else:
                  outStr = 'Sorry, no results were found.'
              print outStr
              [/code]
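
              A short Python 3 sketch of the same truthiness check, extended so the message ends up in the output file as the question asked. The helper name build_output and its empty_message parameter are hypothetical, and io.StringIO stands in for results.txt.

```python
import io


def build_output(results_list, empty_message='Sorry, no records found.'):
    """Return the text to write: one record per line, or a fallback message."""
    if results_list:  # an empty list evaluates as False
        return '\n'.join(' '.join(item) for item in results_list)
    return empty_message


out = io.StringIO()  # stands in for open("results.txt", "w")
out.write(build_output([]) + '\n')
print(out.getvalue(), end='')

# With records present, the same call returns the joined lines instead.
print(build_output([['China', 'Beijing'], ['India', 'New Delhi']]))
```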


              • anonymous
                Banned
                New Member
                • Sep 2005
                • 99

                #8
                Wow...this is why our class has a Google Group!
