searching for words in a file

  • texas22
    New Member
    • Jun 2007
    • 26

    searching for words in a file

    I have a large number of files. How would I search all of them for a particular word and have the system print out every line that contains that word?

    What would have to change if I wanted a second program that searches for two words of my choice and prints only the lines that contain both words?
  • bartonc
    Recognized Expert Expert
    • Sep 2006
    • 6478

    #2
    1) Use the glob module to look through directory trees for the files. Open each file and read it into a buffer. Close the file. Use the re module to create a Regular Expression object. Print the (file name?) that glob just gave you (or the buffer if you meant "print the file") if your Regular Expression gets a match.

    2) For the two-word search, build your Regular Expression from variables.
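The steps in 1) can be sketched roughly as follows. This is only an illustration written for Python 3, not bartonc's own code: the function name `find_lines` and its parameters are made up for the example, it prints matching lines rather than file names, and it matches the word only as a whole word.

```python
import glob
import os
import re


def find_lines(directory, word):
    """Return (file name, line) pairs for lines containing `word` as a whole word."""
    # re.escape keeps characters such as '.' in the word from acting as regex syntax.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    matches = []
    # glob expands the wildcard into a list of paths inside `directory`.
    for path in sorted(glob.glob(os.path.join(directory, "*"))):
        if not os.path.isfile(path):
            continue  # skip subdirectories
        with open(path) as handle:  # the file is closed when the block ends
            for line in handle:
                if pattern.search(line):
                    matches.append((os.path.basename(path), line.rstrip("\n")))
    return matches
```

Calling `find_lines(r"C:\files", "Moses")` would return one pair per matching line, which can then be printed in a loop.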

    • texas22
      New Member
      • Jun 2007
      • 26

      #3
      I am not sure what you mean by the glob module. What I am trying to do is write a script that searches 60 files for a certain word and prints out all the lines that contain it. Then, in a separate program, I want a script that takes two words and prints out only the lines in those files that contain both.

      • texas22
        New Member
        • Jun 2007
        • 26

        #4
        I have about 70 different files in a folder called "files". Each file, if opened in a program like Word, contains a list of sentences. I need to be able to type in a search word and have the program return every sentence, from all the files, that contains that word.

        • bvdet
          Recognized Expert Specialist
          • Oct 2006
          • 2851

          #5
          Compile a list of file names in memory. The os module can be used for that. Create a for loop to iterate on the list of file names:
          [code=Python]for file_name in fileList:[/code]
          Open and iterate on each file:
          [code=Python]for line in file_object:[/code]
          Use the 'in' operator to test if the word is contained in line.
          [code=Python]if word in line:
              print line[/code]
          For multiple keywords, a regex solution can be implemented. Note that this pattern matches a line containing any one of the keywords:
          [code=Python]import re
          keyList = ['word1', 'word2']
          # Join the keywords with '|' so the pattern matches any of them
          patt = re.compile('|'.join(keyList), re.IGNORECASE)
          for fn in fileList:
              f = open(fn)
              for line in f:
                  # re.IGNORECASE already makes the search case-insensitive
                  if patt.search(line):
                      print line
              f.close()[/code]
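The original question also asked for lines that contain both words. One simple way to express that, without a regex, is to test every keyword against the line. This is a sketch with hypothetical helper names (`contains_all`, `lines_with_all`); it does plain substring matching, so "he" would also match inside "the".

```python
def contains_all(line, keywords):
    """Return True if every keyword occurs somewhere in the line (case-insensitive)."""
    lowered = line.lower()
    return all(word.lower() in lowered for word in keywords)


def lines_with_all(lines, keywords):
    """Keep only the lines that contain all of the keywords."""
    return [line for line in lines if contains_all(line, keywords)]
```

`lines` can be any iterable of strings, including an open file object.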

          • texas22
            New Member
            • Jun 2007
            • 26

            #6
            All of the files I need to search are in a folder called 'files' on my C: drive. How would I tell the program to look in that folder, go through each of the text files, and search for a word that I give it?

            • bvdet
              Recognized Expert Specialist
              • Oct 2006
              • 2851

              #7
              This returns a list of entries in 'dir_name':
              [code=Python]>>> import os
              >>> dir_name = r'C:\files'
              >>> entryList = os.listdir(dir_name)[/code]
              Subdirectories are included in the list. Iterate on the list to search for your keywords.
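Because `os.listdir` returns subdirectories along with files, and returns bare names rather than full paths, it can help to filter the list before opening anything. A minimal sketch for Python 3, with `list_files` as a made-up helper name:

```python
import os


def list_files(dir_name):
    """Return full paths of the regular files directly inside dir_name."""
    paths = []
    for entry in sorted(os.listdir(dir_name)):
        # os.listdir gives names only, so rebuild the full path first.
        full_path = os.path.join(dir_name, entry)
        if os.path.isfile(full_path):  # leave out subdirectories
            paths.append(full_path)
    return paths
```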

              • bartonc
                Recognized Expert Expert
                • Sep 2006
                • 6478

                #8
                Hi texas22. I'm glad to see that bvdet has provided the os module call needed for your purposes.

                I'd like to explain what I see going on in this thread:
                1) You have provided lots of words describing your goal. That's good.
                2) I have replied with lots of words (instead of code) describing a solution. That's due to #1, above.
                3) bvdet has provided the bare minimum to get you going. That's the way things work around here: we help, but we don't do it for you.
                4) You have not shown any attempt (by posting code) to make this work yourself. That's not good.

                So give it an honest try. You'll see the level of participation from members go up!

                • grantstech
                  New Member
                  • Jun 2007
                  • 16

                  #9
                  I'm at about the same stage as texas22 here. You all know what you are talking about, but it's all Greek to me. You assume we know what the glob module is, when I don't even have a concept of what needs to happen. How can we try it? I don't mean to offend, and I really appreciate your help, but could you simplify your explanation for us Python toddlers?



                  • texas22
                    New Member
                    • Jun 2007
                    • 26

                    #10
                    Thanks for all the input. Yes, you are right: I am trying to use the help I am given, so I will start posting my code in order to become more involved. My big problem is just getting started and then figuring out what each step means by seeing it, so here is what I've got so far. I am new to programming and have no prior knowledge of Python whatsoever, so this is a big learning curve for me. The comments next to each line are what I assume each line does; please clarify them for me if you can.

                    (code)
                    import os

                    dir_name = r'C:\books'  # This tells the program where to look

                    entryList = os.listdir(dir_name)  # not sure what exactly this does

                    for file_name in fileList:  # not sure what this does, but when I run the program it says this is not defined
                        for line in file_object:  # what exactly does this line do?
                            if word in line:
                                print line

                    So I guess my question is: is there a place in the program that prompts the user for the word they want to search for? What does it take to prompt the user for such a thing? As far as I can tell, there doesn't seem to be a place that does that. Right now I guess all this does is tell the program where to look for the files, but it won't yet conduct the search. Am I seeing it right, or am I way off base?
                    Thanks for your help.

                    • grantstech
                      New Member
                      • Jun 2007
                      • 16

                      #11
                      I have the following code. It prints out the list of files inside the folder but doesn't seem to be searching the files.

                      #!C:\PYTHON25\PYTHON.EXE

                      import os
                      dir_name = r'C:\python25\books\books\books'
                      entryList = os.listdir(dir_name)
                      print entryList
                      searchWord = "Moses"
                      for file_name in entryList:
                          for line in file_name:
                              if searchWord in line:
                                  print line


                      Am I even close? Do I have to open and close each file?

                      Thanks for your help.

                      • ilikepython
                        Recognized Expert Contributor
                        • Feb 2007
                        • 844

                        #12
                        [code=python]
                        entryList = os.listdir(dir_name)
                        [/code]
                        That line lists all the files and folders in dir_name.

                        In the next piece of code you have your variable names mixed up a bit:
                        [code=python]
                        for file_name in fileList: # should be entryList instead of fileList
                            for line in file_object: # should be "for line in file(file_name).readlines():"
                                if word in line:
                                    print line
                        [/code]
                        So the corrected code would look like this:
                        [code=python]
                        import os

                        dir_name = r"C:\Books"
                        entryList = os.listdir(dir_name)

                        for file_name in entryList: # iterate over files in entryList
                            for line in file(file_name).readlines(): # iterate over lines in each file
                                if word in line:
                                    print line
                        [/code]

                        If you want to prompt the user for a word to search for, use raw_input().
                        Where to put it depends on how often you want to change the search word. If the word stays the same for the whole program, you can just add this line at the top of your file:
                        [code=python]
                        word = raw_input("Enter a word to search for: ")
                        [/code]
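For anyone reading this on a newer interpreter: `raw_input()` was renamed `input()` in Python 3. The sketch below picks whichever one exists; `ask_for_word` and its `reader` parameter are hypothetical names added for illustration, with `reader` there only so the prompt can be replaced when trying the function out.

```python
try:
    read_line = raw_input  # Python 2
except NameError:
    read_line = input  # Python 3: raw_input was renamed to input


def ask_for_word(prompt="Enter a word to search for: ", reader=read_line):
    """Prompt for a search word and strip surrounding whitespace."""
    return reader(prompt).strip()
```

Calling `ask_for_word()` with no arguments prompts at the console.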

                        • ilikepython
                          Recognized Expert Contributor
                          • Feb 2007
                          • 844

                          #13
                          You are iterating over the actual string of the file name. You want to iterate over the file object. Please see my previous reply and use python code tags ([code=python ][/code ]).
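To see why nothing is printed, note that a `for` loop over a string yields one character at a time. The short sketch below mimics the loop from the question with a made-up function name, `search_name_by_mistake`, and works the same in Python 2 and 3:

```python
def search_name_by_mistake(file_name, search_word):
    """Mimic the loop in the question: it walks the name, not the file contents."""
    found = []
    for piece in file_name:  # each piece is a single character of the name
        if search_word in piece:
            found.append(piece)
    return found


# A file name is just a string, so iterating over it gives its characters.
characters = list("1ch")
```

A multi-letter word such as "Moses" can never be found inside a single character, so the `if` test never succeeds.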

                          • texas22
                            New Member
                            • Jun 2007
                            • 26

                            #14
                            [code=python]
                            import os

                            dir_name = r"C:\books"

                            word = raw_input("Enter a word to search for: ")

                            entryList = os.listdirs(dir_name)

                            for file_name in entryList:
                                for line in file(file_name).readlines():
                                    if word in line:
                                        print line
                            [/code]

                            This is what I have so far. When I run it I get the error:
                            "AttributeError: 'module' object has no attribute 'listdirs'"

                            What does this mean, and am I on the right track?

                            • texas22
                              New Member
                              • Jun 2007
                              • 26

                              #15
                              Sorry guys, I should not have troubled you with that one. I just had to take the 's' off the end of the line:
                              [code=python]
                              entryList = os.listdir(dir_name) # no 's' on the end of listdir
                              [/code]

                              The error I'm getting now is:
                              "for line in file(file_name).readlines():
                              IOError: [Errno 2] No such file or directory: '1ch'"

                              '1ch' is the name of one of the files contained in the folder books.
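The thread ends without a reply, but the error is consistent with how `os.listdir` behaves: it returns bare names such as '1ch', not full paths, so opening that name looks in the current working directory instead of the books folder. Joining the folder and the name with `os.path.join` before opening avoids this. The following is a sketch for Python 3 (so `open` replaces `file`); `search_directory` is a hypothetical name, and it returns the matching lines instead of printing them.

```python
import os


def search_directory(dir_name, word):
    """Return every line containing `word` from the files directly inside dir_name."""
    results = []
    for file_name in sorted(os.listdir(dir_name)):
        # Build the full path; the bare name alone is relative to the current directory.
        full_path = os.path.join(dir_name, file_name)
        if not os.path.isfile(full_path):
            continue  # os.listdir also returns subdirectories
        with open(full_path) as handle:
            for line in handle:
                if word in line:
                    results.append(line.rstrip("\n"))
    return results
```

With the folder from the posts above, `search_directory(r"C:\books", word)` would return the matching lines, which can then be printed one per line.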
