File Parsing Question

  • Shankarjee Krishnamoorthi

    File Parsing Question

    Hi,
    I am new to Python. I am trying to do the following:

    inp = open(my_file, 'r')

    for line in inp:
        # Perform some operations with line
        if some_condition:
            # Start reading again from that position
            for line in inp:
                if some_other_condition:
                    break
            # I need to go back one line and use that line value.
            # I need to perform the operations listed at the top
            # with this line value. I cannot push that operation here.
            # I cannot do this with seek or tell.

    In Perl this is what I have:

    while (<inp>) {
        # my_operations
        next if /pattern/;
        while (<inp>) {
            # operations again
            last if /pattern2/;
        }
        seek(inp, (-1-length), 1);
    }

    This works perfectly in Perl. Can I do the same in Python?

    Thanks
    Jee
  • Zentrader

    #2
    Re: File Parsing Question


    Save the previous line in a variable if you only want the previous
    line.

    prev_line = None
    for line in inp:
        # Perform some operations with line
        if some_condition:
            print prev_line
            print line
            break
        # Remember this line so the next iteration can go back to it
        prev_line = line

    If you want to do more than that, use data = inp.readlines(), or
    data = open(myfile, "r").readlines(). The data will be stored in a
    list, so you can access each line individually.


    • Zentrader

      #3
      Re: File Parsing Question

      I'm assuming you know that Python has a file.seek(), but you have to
      know the number of bytes you want to move from the beginning of the
      file or from the current location. You could save the length of the
      previous record and use file.seek() to back up and then move forward,
      but it is simpler to save the previous record, or to use readlines()
      if the file will fit into a reasonable amount of memory.
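
The "save the length of the previous record and seek back" idea can be sketched roughly as follows. This is only an illustration written for Python 3, not code from the thread: the function name `reread_last_line` and the sample data are hypothetical, and the file is opened in binary mode on the assumption that byte offsets should match line lengths exactly.

```python
import os
import tempfile

def reread_last_line(path):
    """Read two lines, step back over the second one, and read it again."""
    # Binary mode: offsets are plain byte counts, so a relative seek by
    # -len(line) lands exactly at the start of the line just read.
    with open(path, "rb") as f:
        first = f.readline()
        second = f.readline()
        f.seek(-len(second), os.SEEK_CUR)  # back up one record
        again = f.readline()
    return first, second, again

# Hypothetical sample data, written to a temporary file for the demo.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"alpha\nbeta\ngamma\n")
    sample_path = tmp.name

first, second, again = reread_last_line(sample_path)
os.remove(sample_path)
print(first, second, again)
```

The relative seek mirrors the Perl `seek(inp, ..., 1)` call in the question; it relies on readline() rather than `for line in f`, so the file position always sits at the end of the line just returned.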


      • Shankarjee Krishnamoorthi

        #4
        Re: File Parsing Question

        I would prefer to use something with seek. I am not able to use seek()
        with "for line in inp". Using tell and seek does not seem to do
        anything in the code. When I try to do

        for line in inp.readlines():
            # Top of Loop
            if not condition in line:
                do_something
            else:
                for lines in inp.readlines():
                    if not condition:
                        do_something
                    else:
                        break
                pos = inp.tell()
                inp.seek(pos)    # <-- This line has no effect in the program

        Not sure if I am missing something very basic. Also, the previous line
        needs to be used at the position I call # Top of Loop.

        Thanks



        • Zentrader

          #5
          Re: File Parsing Question

          > for line in inp.readlines():

          If you are now using readlines() instead of readline(), then
          a) it is only called once, to read all of the data into a list
          b) you can access each element/line by its index

          data = open(filename, "r").readlines()
          for eachline in data:    # not inp.readlines()
              ## do something

          so try
          print data[0]    ## first rec
          print data[9]    ## 10th rec, etc.

          you can use
          ctr = 0
          for eachline in data:
              ## do something
              if ctr > 0:
                  print "this line is", eachline    ## or data[ctr]
                  print "prev_line = ", data[ctr-1]
              ctr += 1

          or a slightly different way
          stop = len(data)
          ctr = 0
          while ctr < stop:
              ## do something
              if ctr > 0:
                  this_line = data[ctr]
                  prev_line = data[ctr-1]
              ctr += 1

          Sorry, I don't use file.seek() so can't help there.
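
The counter-based loops above can also be written with enumerate(), which supplies the index automatically. The following is a minimal Python 3 sketch of that idea; the helper name `lines_with_previous` and the sample list are made up for illustration and assume the whole file has already been read into a list with readlines().

```python
def lines_with_previous(lines):
    """Yield (previous_line, current_line); previous is None for the first line."""
    for index, current in enumerate(lines):
        previous = lines[index - 1] if index > 0 else None
        yield previous, current

# Stand-in for data = open(filename).readlines()
data = ["first\n", "second\n", "third\n"]

pairs = list(lines_with_previous(data))
for previous, current in pairs:
    print(repr(previous), "->", repr(current))
```

Because the lines live in a list, going back more than one line is just a matter of using a smaller index, at the cost of holding the whole file in memory.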


          • Peter Otten

            #6
            Re: File Parsing Question

            On Wed, 12 Sep 2007 17:28:08 -0500, Shankarjee Krishnamoorthi wrote:

            > I would prefer to use something with seek.

            Writing Perl in any language?

            > I am not able to use seek()
            > with "for line in inp". Use tell and seek does not seem to do anything
            > with the code. When I try to do
            >
            > for line in inp.readlines():

            readlines() reads the whole file at once, so inp.tell() will give the
            position at the end of the file from now on.

            >     # Top of Loop
            >     if not condition in line:
            >         do_something
            >     else:
            >         for lines in inp.readlines():
            >             if not condition:
            >                 do_something
            >             else:
            >                 break
            >         pos = inp.tell()
            >         inp.seek(pos) ---This line has not effect in the program
            >
            > Not sure if Iam missing something very basic. Also the previous line
            > needs to be used in the position I call # Top of Loop.

            If you want to use seek/tell you can't iterate over the file directly
            because

            for line in inp:
                # ...

            reads ahead to make that iteration highly efficient -- so you will often
            get a position further ahead than the end of the current line.

            But you can use readline() (which doesn't read ahead) in conjunction with
            tell/seek; just replace all occurrences of

            for line in inp:
                # ...

            with

            for line in iter(inp.readline, ""):
                # ...

            Peter
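
Putting the readline() advice together with tell()/seek(), the nested-loop structure from the original question might look something like the sketch below. It is written for Python 3 and is not code from the thread: the function name `parse`, the `starts_block`/`ends_block` predicates, and the BEGIN/END sample data are all hypothetical, and binary mode is an assumption made so that offsets are plain byte counts.

```python
import os
import tempfile

def parse(path, starts_block, ends_block):
    """Return (loop, line) pairs recording which loop handled each line."""
    seen = []
    with open(path, "rb") as inp:
        # iter(inp.readline, b"") keeps calling readline() until it
        # returns b"" at end of file, so no read-ahead iterator is used.
        for line in iter(inp.readline, b""):
            seen.append(("outer", line))  # the "top of loop" operations
            if starts_block(line):
                while True:
                    pos = inp.tell()  # offset of the line about to be read
                    inner = inp.readline()
                    if inner == b"":
                        break  # end of file inside the block
                    if ends_block(inner):
                        inp.seek(pos)  # push the line back for the outer loop
                        break
                    seen.append(("inner", inner))
    return seen

# Hypothetical sample input, written to a temporary file for the demo.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"a\nBEGIN\nx\ny\nEND\nb\n")
    sample_path = tmp.name

result = parse(
    sample_path,
    starts_block=lambda l: l.startswith(b"BEGIN"),
    ends_block=lambda l: l.startswith(b"END"),
)
os.remove(sample_path)
for loop, line in result:
    print(loop, line)
```

Saving the offset before each inner readline() avoids having to compute line lengths; the line that ends the block is read twice, once by the inner loop and once more by the outer loop after the seek.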


            • Peter Otten

              #7
              Re: File Parsing Question

              Dennis Lee Bieber wrote:

              > for line in inp:
              >
              > will read one line at a time (I'm fairly sure the iterator doesn't
              > attempt to buffer multiple lines behind the scenes)

              You are wrong:

              >>> open("tmp.txt", "w").writelines("%s\n" % (9*c) for c in "ABCDE")
              >>> instream = open("tmp.txt")
              >>> for line in instream:
              ...     print instream.tell(), line.strip()
              ...
              50 AAAAAAAAA
              50 BBBBBBBBB
              50 CCCCCCCCC
              50 DDDDDDDDD
              50 EEEEEEEEE
              >>>

              Here's the workaround:

              >>> instream = open("tmp.txt")
              >>> for line in iter(instream.readline, ""):
              ...     print instream.tell(), line.strip()
              ...
              10 AAAAAAAAA
              20 BBBBBBBBB
              30 CCCCCCCCC
              40 DDDDDDDDD
              50 EEEEEEEEE
              >>>

              Peter
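
The session above is Python 2. For readers on Python 3, the workaround can be reproduced along the following lines. This is a sketch under a few assumptions: a temporary file stands in for tmp.txt, and the file is opened in binary mode so tell() reports byte offsets. In Python 3 text mode, calling tell() inside a plain `for line in instream` loop is expected to raise OSError instead of returning a read-ahead position, which is another reason to go through readline().

```python
import os
import tempfile

# Recreate the five 10-byte lines (nine letters plus a newline each).
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.writelines((c * 9 + "\n").encode("ascii") for c in "ABCDE")
    path = tmp.name

offsets = []
with open(path, "rb") as instream:
    # readline() leaves the position at the end of the line just returned.
    for line in iter(instream.readline, b""):
        offsets.append(instream.tell())
        print(instream.tell(), line.strip().decode("ascii"))

os.remove(path)
```

With this input the recorded offsets advance in steps of ten bytes, matching the second listing in the session above.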


              • Shankarjee Krishnamoorthi

                #8
                Re: File Parsing Question

                Great. That worked for me. I had some of my routines implemented in
                Perl earlier. Now that I have started using Python, I am trying to
                write all my automation scripts in Python. Thanks a ton.

                Jee

