For loop headache

  • TMS
    New Member
    • Sep 2006
    • 119

    For loop headache

    I'm working on the definitive igpay atinlay (Pig Latin) assignment. I've defined a module that has one function, def igpay(word), whose sole purpose is to convert a single word into Pig Latin. It seems to work well.

    Now, the 2nd part of the assignment: define a module that takes a filename as an argument on the command line (thanks to previous questions, this is complete) and converts the entire file into Pig Latin.

    First I went through some basic tests to process the file: find a space (for the delimiter), find punctuation, and test for a capital. When I put this into a loop, my logic doesn't seem to get past the first word.

    Code:
    import sys
    import igpay
    
    filename = sys.argv[1]
    data = open( filename ) .read()
    print data
    
    def atinlay(data):
        for i in range(len(data)):             #begin the loop
            space = data.find(" ")            #find the first space
            period = data.find(".")            #determine punctuation to handle later
            comma = data.find(",")          
            temp = data[0:space]           #get the first slice
            a = temp[:1]                        #copy the first letter to see if it is capital
            capital = a.isupper()
            if capital:
                temp = temp.lower()         #make it lower case if it is capital
            newData = igpay.igpay(temp)    #new temp variable 
            if capital:                             #capital flag set? Handle it
                a = newData[:1]
                a = a.upper()
            newData = a + newData[1:]  #put new cap back on word
            space += 1                         #increment to new space?
            i = space                            #increment i?
            temp = newData[space:]      #thought I needed another variable here.... :(
        return newData    
    
    c = atinlay(data)
    print c
    I think part of my problem is the temp assignment at the end. But I could use a gentle nudge to get this loop going because all it will do is process the first word right now.

    Thank you
  • bartonc
    Recognized Expert Expert
    • Sep 2006
    • 6478

    #2
    I'll look at your code in a bit. In the meantime, something to consider is that file objects are themselves iterable: looping over a text file yields one newline-terminated string per line. So you can write:
    Code:
    dataFile = open("filename")
    for line in dataFile:
        for word in line.split():    # split() defaults to any whitespace
            print word
    dataFile.close()


    • TMS
      New Member
      • Sep 2006
      • 119

      #3
      But when it's converted to a list, Python sees each word as one unit, not as individual characters like in a string. I tried something similar but wasn't able to process the word, so I gave up and went in this direction. I appreciate your help.
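
      For what it's worth, each item produced by split() is still an ordinary string, so it can be indexed and sliced character by character just like the original text. A minimal sketch (written for Python 3; the sample sentence is borrowed from the test file shown later in the thread):

```python
line = "NewFile and, more new file.\n"

words = line.split()          # split on any whitespace; the newline is dropped
first = words[0]              # each element is a plain str

first_letter = first[0]       # indexing still works on the word
rest = first[1:]              # so does slicing

print(words)
print(first_letter, rest, first_letter.isupper())
```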


      • bartonc
        Recognized Expert Expert
        • Sep 2006
        • 6478

        #4
        Code:
        import sys
        import igpay
        
        filename = sys.argv[1]
        data = open( filename ) .read()
        print data
        
        def atinlay(data):
            pos = 0    # need to keep track of where you are in order to "move" through data
        ### you can then add pos to args of find()
            for i in range(len(data)):             #begin the loop
                space = data.find(" ")            #find the first space
                period = data.find(".")            #determine punctuation to handle later
                comma = data.find(",")          
        
        ### data[0] should be data[pos] or you'll always start at the beginning
                #### temp assigned at the end of the loop gets reassigned here ####
                temp = data[0:space]           #get the first slice
                a = temp[:1]                        #copy the first letter to see if it is capital
                capital = a.isupper()
        
        ### you can use str.capitalize() to upper the first letter
                if capital:
                    temp = temp.lower()         #make it lower case if it is capital
                newData = igpay.igpay(temp)    #new temp variable 
                if capital:                             #capital flag set? Handle it
                    a = newData[:1]
                    a = a.upper()
                newData = a + newData[1:]  #put new cap back on word
        
        ### This would be your next pos
                pos = space + 1
        #        space += 1                         #increment to new space?
        
        ### Even if this works, it's bad practice to change the for loop's variable
        ### I'm not actually sure what happens
        #        i = space                            #increment i?
        ### and besides, I don't see it used anywhere
        
        
                temp = newData[space:]      # this will be replaced at the top of the next loop
            return newData    
        
        c = atinlay(data)
        print c
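
        The hint about passing pos to find() could be sketched as follows. This is only an illustration under stated assumptions: find_words is a hypothetical helper name, a single space is treated as the only delimiter, and igpay is left out entirely. It is written for Python 3.

```python
def find_words(data):
    """Collect space-delimited words by walking through data with str.find()."""
    words = []
    pos = 0                                # where the next search starts
    while pos < len(data):
        space = data.find(" ", pos)        # search from pos, not from index 0
        if space == -1:                    # no more spaces: the rest is the last word
            space = len(data)
        word = data[pos:space]
        if word:                           # skip empty slices caused by repeated spaces
            words.append(word)
        pos = space + 1                    # move past the space just found
    return words


print(find_words("NewFile and, more new file."))
```

        A while loop is used here instead of for i in range(len(data)) because the number of passes depends on how many words there are, not on how many characters.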


        • TMS
          New Member
          • Sep 2006
          • 119

          #5
          I am still only getting one word processed (at least printed).

          The text file I'm working with is nonsensical, intended for testing. The result I get is this:

          C:\Python25>atinlay.py someFile.txt
          NewFile and, more new file.

          Ewfilenay

          C:\Python25>

          It is capitalizing appropriately the first letter of the only word it processes. This is the same problem I was having before. Any ideas? :)


          • bartonc
            Recognized Expert Expert
            • Sep 2006
            • 6478

            #6
            This'll get you started. Create an empty list, then append() to it and return the list.
            Code:
            import sys
            import igpay
            
            filename = sys.argv[1]
            data = open( filename ) .read()
            print data
            
            def atinlay(data):
                pos = 0    # need to keep track of where you are in order to "move" through data
                resultList = []
            ### you can then add pos to args of find()
                for i in range(len(data)):             #begin the loop
                    space = data.find(" ")            #find the first space
                    period = data.find(".")            #determine punctuation to handle later
                    comma = data.find(",")          
            
            ### data[0] should be data[pos] or you'll always start at the beginning
                    #### temp assigned at the end of the loop gets reassigned here ####
                    temp = data[0:space]           #get the first slice
                    a = temp[:1]                        #copy the first letter to see if it is capital
                    capital = a.isupper()
            
            ### you can use str.capitalize() to upper the first letter
                    if capital:
                        temp = temp.lower()         #make it lower case if it is capital
                    newData = igpay.igpay(temp)    #new temp variable 
                    if capital:                             #capital flag set? Handle it
                        a = newData[:1]
                        a = a.upper()
                    newData = a + newData[1:]  #put new cap back on word
            
            ### This would be your next pos
                    pos = space + 1
            #        space += 1                         #increment to new space?
            
            ### Even if this works, it's bad practice to change the for loop's variable
            ### I'm not actually sure what happens
            #        i = space                            #increment i?
            ### and besides, I don't see it used anywhere
            
                    resultList.append(newData + " ")
            #        temp = newData[space:]      # this will be replaced at the top of the next loop
                return resultList
            
            c = atinlay(data)
            print c
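
            On the question of what happens when the for variable is reassigned: in Python the loop simply takes the next value from range() on the next pass, so the assignment is overwritten and has no effect on how many times the body runs. A small sketch (Python 3):

```python
seen = []
for i in range(5):
    seen.append(i)     # record the value the loop handed us
    i = 100            # reassigning i does not skip ahead...

print(seen)            # ...the loop still visits 0, 1, 2, 3, 4
print(i)               # i keeps the last assignment made in the body
```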

            • dshimer
              Recognized Expert New Member
              • Dec 2006
              • 136

              #7
              I haven't studied every snippet of code, but based on what I understand, may I offer a view from another direction? As I understand it, igpay() is supposed to take any word you send it and convert it to the new string, and atinlay() should read through a whole file of text, converting each word and capitalizing it if the word falls after a period (or is already capitalized).

              Could it possibly be easier to

              1) write igpay so that if you send it a properly capitalized word it translates it to a properly capitalized word in the new language, or if you send it a word that has punctuation at the end it returns the translated word with the same punctuation.

              Then..

              2) Just take the whole data stream, split it at whitespace (which will keep the punctuation with the word preceding it), and process it in a linear fashion from beginning to end. It could even test each word so that, if a period is found in it, the next word is capitalized before being sent to igpay().

              It seems that if it were approached from this direction, igpay() would need a couple more lines, but atinlay() would just be a simple..

              read data
              split it
              for each word in that list
              convert it and append to the output capitalizing if the previous word contained a period.
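
              That outline might look something like the sketch below. It is a guess at the shape, not the assignment's solution: the igpay here is a simplified stand-in (the real one lives in TMS's igpay module and isn't shown in the thread) that is assumed to accept one word with optional trailing punctuation, and atinlay takes the text directly rather than a filename. Python 3.

```python
def igpay(word):
    """Stand-in translator (hypothetical): one word plus trailing punctuation."""
    core = word.rstrip(".,;:!?")
    tail = word[len(core):]                  # punctuation that was stripped off
    if not core:
        return word
    lower = core.lower()
    # index of the first vowel; everything before it moves to the end
    first_vowel = next((n for n, ch in enumerate(lower) if ch in "aeiou"), 0)
    if first_vowel == 0:
        out = lower + "way"
    else:
        out = lower[first_vowel:] + lower[:first_vowel] + "ay"
    if core[0].isupper():                    # keep the original capitalization
        out = out.capitalize()
    return out + tail


def atinlay(data):
    """Translate every whitespace-separated word, capitalizing after a period."""
    output = []
    capitalize_next = False
    for word in data.split():                # punctuation stays attached to its word
        new_word = igpay(word)
        if capitalize_next:
            new_word = new_word[:1].upper() + new_word[1:]
        capitalize_next = "." in word        # the next word starts a sentence
        output.append(new_word)
    return " ".join(output)


print(atinlay("NewFile and, more new file. then more."))
```

              With this stand-in, the sample text comes out as "Ewfilenay andway, oremay ewnay ilefay. Enthay oremay."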


              • TMS
                New Member
                • Sep 2006
                • 119

                #8
                OK, so now I've changed igpay to do capitalization. The same problem still remains: I can't seem to process the list. It only does one word, the first word. If I use split(), the text file will be made into a list and I won't be able to process it because it will no longer be a string. Is there a way to change it back into a string after making it a list? Lists are tuples, right?

                Here is my code:
                Code:
                import sys
                import igpay
                
                filename = sys.argv[1]
                data = open( filename ) .read()
                print data
                
                def atinlay(data):
                    pos = 0                             # begin position
                    for i in range(len(data)):       # begin loop
                        space = data.find(" ")      # find space for delimiter
                        #period = data.find(".")      # set a flag for punctuation period
                        #comma = data.find(",")    # set a flag for punctuation comma
                        temp = data[0:space]      # slice the first word
                        newData = igpay.igpay(temp)   #place to put processed words
                        pos = space + 1
                        temp = newData[space:]
                    return newData    
                
                c = atinlay(data)
                print c


                • bvdet
                  Recognized Expert Specialist
                  • Oct 2006
                  • 2851

                  #9
                  Lists and tuples are similar but different. Lists are mutable and tuples are not. To make a string from a list:
                  Code:
                  >>> lst = ['I', 'am', 'a', 'detailer']
                  >>> " ".join(lst)
                  'I am a detailer'
                  >>>
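
                  To illustrate the mutable/immutable difference, here is a small sketch (Python 3; the variable names are arbitrary):

```python
lst = ['I', 'am', 'a', 'detailer']
tup = tuple(lst)              # same items, but in an immutable container

lst[3] = 'drafter'            # fine: lists can be changed in place

try:
    tup[3] = 'drafter'        # tuples cannot; this raises TypeError
    tuple_changed = True
except TypeError:
    tuple_changed = False

print(" ".join(lst))          # join accepts any sequence of strings
print(" ".join(tup))
print(tuple_changed)
```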
                  It looks like you are only processing the first word in each loop. I do not see where you are accumulating an output string. You could do something like this, but your igpay function would need to handle the capitalization and punctuation:
                  Code:
                  def process_file(fn):
                      f = open(fn)
                      outStr = ""
                      for line in f:
                          lineList = line.split(" ") # split on space character
                          lineListOut = []
                          for word in lineList:
                              lineListOut.append(igpay.igpay(word))
                          outStr += " ".join(lineListOut)
                      print outStr
                  HTH


                  • dshimer
                    Recognized Expert New Member
                    • Dec 2006
                    • 136

                    #10
                    Originally posted by bvdet
                    You could do something like this, but your igpay function would need to handle the capitalization and punctuation:
                    Very clean. Now the for loop is simply doing what it does best, working through the sequence of words. And since split() should send in the punctuation along with the single word that precedes it, "handling" it could be as simple as...
                    Test if it's there.
                    If so remove it.
                    Process the string.
                    Replace punctuation and return.
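
                    Those four steps could be sketched roughly like this. The names are made up for illustration: translate_keeping_punctuation is a hypothetical wrapper, and bare_igpay is a deliberately naive placeholder that only handles a lowercase word starting with a single consonant. Python 3.

```python
import string


def translate_keeping_punctuation(word, translate):
    """Strip trailing punctuation, translate the bare word, then put it back."""
    core = word.rstrip(string.punctuation)   # steps 1-2: test for it and remove it
    tail = word[len(core):]                  # whatever was stripped (may be empty)
    if not core:                             # the token was punctuation only
        return word
    return translate(core) + tail            # steps 3-4: process, then restore


def bare_igpay(word):
    """Placeholder translator: move the first letter to the end and add 'ay'."""
    return word[1:] + word[0] + "ay"


print(translate_keeping_punctuation("file.", bare_igpay))
print(translate_keeping_punctuation("more", bare_igpay))
```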


                    • TMS
                      New Member
                      • Sep 2006
                      • 119

                      #11
                      Wow... very nice. It processes the list, but appends a bunch of stuff. The list after processing looks like this:

                      NewaywFiwaylewa yway wayawayd,way wyamovwayrewayw ay waynewaywway wayfiwayleway.

                      LOL, it's an entirely new language. I should name it...

                      I need to go through and see what is happening, but it is processing the list. I think I can handle it from here. Thank you!


                      • dshimer
                        Recognized Expert New Member
                        • Dec 2006
                        • 136

                        #12
                        One more thing: it blows me away the little things you can miss as you go along. In case you didn't see it, look at the last few posts in the "how to convert gpr file to csv format: using python" thread. The fileinput tip is worth the price of admission all by itself, and I had never looked at it before ghostdog74 mentioned it.
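
                        For readers who haven't met it, fileinput is a standard-library module that loops over the lines of one or more files. The sketch below is a generic illustration, not the tip from the other thread; it writes a throwaway file first so that it can run on its own (Python 3).

```python
import fileinput
import os
import tempfile

# create a small throwaway file to read back (illustration only)
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as handle:
    handle.write("NewFile and, more new file.\nsecond line\n")

words = []
for line in fileinput.input(files=[path]):   # yields each line of each listed file
    words.extend(line.split())
fileinput.close()                            # release fileinput's global state

os.remove(path)
print(words)
```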


                        • bvdet
                          Recognized Expert Specialist
                          • Oct 2006
                          • 2851

                          #13
                          The fileinput module was new to me also. Good tip. One more thing: it is good practice to close each file you open when you are through with it:
                          Code:
                          f.close()
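
                          In newer Python versions a with block is a common way to get the same effect, since the file is closed automatically when the block ends, even if an error occurs partway through. A sketch under that assumption (Python 3; count_words is a made-up helper, and the temporary file exists only so the example runs on its own):

```python
import os
import tempfile


def count_words(fn):
    """Return (word count, whether the file ended up closed) for the file fn."""
    with open(fn) as f:          # f.close() is called automatically on exit
        total = sum(len(line.split()) for line in f)
    return total, f.closed       # f.closed is True once the block has ended


fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as handle:
    handle.write("NewFile and, more new file.\n")

result = count_words(path)
os.remove(path)
print(result)
```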


                          • TMS
                            New Member
                            • Sep 2006
                            • 119

                            #14
                            It's all done. Thank you so much. It works very well, thanks to all your help!


                            • bartonc
                              Recognized Expert Expert
                              • Sep 2006
                              • 6478

                              #15
                              Thanks for the update. I'm glad the experts here were of help to you. Keep posting.
