Iterating over a file in python

  • moconno5
    New Member
    • Jul 2007
    • 19

    Hello everyone, I wrote a post a while ago about automating a local client to access a BLAT web server, but today I have a much easier one. I want to take a batch file and delete every odd line. See below:

    Sample File:
    >MusmusculusmiR-344
    UGAUCUAGCCAAAGCCUGACUGU
    >MusmusculusmiR-345
    UGCUGACCCCUAGUCCAGUGC
    >MusmusculusmiR-346
    UGUCUGCCCGAGUGCCUGCCUCU
    >MusmusculusmiR-350
    UUCACAAAGCCCAUACACUUUCA

    I need to delete all the lines that start with '>' and end with a '\n'. I have some code below, but it just isolates the part of the string I want to delete. I need to do the reverse... I know there is some easy way that I am totally missing here!

    Code:
    #!/usr/bin/env python
    # written 7/28/2007
    # by Mark O'Connor
    
    def Resize( filename ):
        line = 0
        collect = []
        fp = file( filename )
        data = fp.read()
        fp.close()
        #print data
        while line != -1:
            start = data.find('>', line+1)
            end = data.find ('/n', start)
            chunk = data[start:end]
        return chunk
    Thanks,

    Mark
  • bartonc
    Recognized Expert Expert
    • Sep 2006
    • 6478

    #2
    Hey Mark...
    I'd use something like this:

    Code:
    outList = []
    f = open(fileName)
    for line in f:
        # skip the header lines that start with '>'
        if line.startswith('>'):
            continue
        outList.append(line)
    f.close()
    f = open(newFileName, 'w')  # or old one to replace it
    f.writelines(outList)
    f.close()
    Untested, but generally sound.
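
For anyone who wants to see the effect before touching a real file, here is a small sketch of the same loop run against two records in the style of the sample file. It assumes `io.StringIO` as a stand-in for the opened file, and the name `drop_header_lines` is made up for this illustration.

```python
import io

def drop_header_lines(lines):
    """Return the lines that do not start with '>' (hypothetical helper)."""
    kept = []
    for line in lines:
        if line.startswith('>'):
            continue  # skip the '>' header line
        kept.append(line)
    return kept

# Stand-in for open(fileName): iterating a StringIO also yields one line at a time.
sample = io.StringIO(
    ">MusmusculusmiR-344\n"
    "UGAUCUAGCCAAAGCCUGACUGU\n"
    ">MusmusculusmiR-345\n"
    "UGCUGACCCCUAGUCCAGUGC\n"
)
result = drop_header_lines(sample)
print("".join(result), end="")
```

Only the two sequence lines should remain, each still ending in its newline, which is why `writelines` can write them back out unchanged.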

    • moconno5
      New Member
      • Jul 2007
      • 19

      #3
      Thanks! That did the trick

      Mark

      • bartonc
        Recognized Expert Expert
        • Sep 2006
        • 6478

        #4
        Files, like all iterators, have some pretty cool methods hung on them. That example just scratches the surface.

        Any time,
        Barton

        • robin746
          New Member
          • Aug 2007
          • 5

          #5
          Or even:

          Code:
          out = []
          with open(fileName) as f:
          	for line in f:
          		if line.startswith('>'):
          			continue
          		out.append(line)
          
          with open(newFileName, 'w') as f:
          	f.writelines(out)
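
Both versions so far collect the kept lines in a list before writing. For large files, a variation is to write each line as it is read, so only one line is held in memory at a time. This is a minimal sketch, assuming the input and output are different files; the function name, the returned count, and the demo file contents are my own additions, not anything from the posts above.

```python
import os
import tempfile

def copy_without_headers(src_path, dst_path):
    """Copy src_path to dst_path, skipping lines that start with '>'.

    Returns the number of lines skipped. Hypothetical helper for
    illustration; src_path and dst_path must not be the same file.
    """
    skipped = 0
    with open(src_path) as src, open(dst_path, 'w') as dst:
        for line in src:
            if line.startswith('>'):
                skipped += 1
                continue
            dst.write(line)
    return skipped

# Demo on throwaway files with made-up contents.
demo_dir = tempfile.mkdtemp()
src_path = os.path.join(demo_dir, "in.txt")
dst_path = os.path.join(demo_dir, "out.txt")
with open(src_path, "w") as f:
    f.write(">rec1\nAAAA\n>rec2\nCCCC\n")

skipped = copy_without_headers(src_path, dst_path)
with open(dst_path) as f:
    output_text = f.read()
```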

          • twohot
            New Member
            • Dec 2010
            • 8

            #6
            @bartonc

            "Files, like all iterators, have some pretty cool methods hung on them. That example just scratches the surface."

            Now I'm listening. What are those cool methods?
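
The thread never returned to this question. As a hedged illustration rather than bartonc's own list: a file object is its own iterator, so the built-ins `next()` and `enumerate()` and `itertools.islice()` all work on it directly, alongside file methods such as `readline()` and `readlines()`. The sketch below assumes Python 3 and uses a temporary file with made-up contents.

```python
import itertools
import os
import tempfile

# Made-up demo file: two header/sequence records.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write(">rec1\nAAAA\n>rec2\nCCCC\n")

with open(path) as f:
    first = next(f)                       # consume a single line
    second = f.readline()                 # continues where next() stopped
    rest = list(itertools.islice(f, 2))   # at most the next two lines

with open(path) as f:
    # enumerate() numbers the lines as they are read, starting from 1 here.
    header_numbers = [n for n, line in enumerate(f, 1) if line.startswith('>')]

with open(path) as f:
    all_lines = f.readlines()             # the whole file as a list of lines

os.remove(path)
```

Because each call picks up where the previous one left off, these can be mixed to skip a preamble, peek at a first line, or process a file in fixed-size chunks.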

            • Michael Colon
              New Member
              • Dec 2010
              • 2

              #7
              Personally, I love Python for many reasons, especially when it comes to parsing and filtering text. This next piece of code does the filtering in a single expression.

              Code:
              # Copy every line that is not a '>' header; 'with' closes both files.
              with open("inputfile.txt") as infile, open("outputfile.txt", 'w') as outfile:
                  outfile.writelines(line for line in infile if not line.startswith('>'))
              And that's it!
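
One caveat with this pattern: the output has to be a different file from the input, because opening a file with 'w' empties it before any line has been read. If the goal is to replace the original, one option (an assumption on my part, not something from the thread) is to write to a temporary file in the same directory and then swap it in with `os.replace`. The helper name and the demo contents are hypothetical.

```python
import os
import tempfile

def strip_headers_in_place(path):
    """Rewrite `path` without its '>' lines (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Temporary file in the same directory so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, text=True)
    try:
        with os.fdopen(fd, 'w') as tmp, open(path) as src:
            tmp.writelines(line for line in src if not line.startswith('>'))
        os.replace(tmp_path, path)  # swap the filtered copy in for the original
    except BaseException:
        os.remove(tmp_path)  # clean up the partial copy, keep the original
        raise

# Demo on a throwaway file with made-up contents.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo.txt")
with open(demo_path, "w") as f:
    f.write(">rec1\nAAAA\n>rec2\nCCCC\n")

strip_headers_in_place(demo_path)
with open(demo_path) as f:
    remaining = f.read()
```

Note that the replaced file takes on the temporary file's permissions, which `mkstemp` restricts to the current user.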
