calling tar commands

  • tcurdts
    New Member
    • Jun 2007
    • 12

    calling tar commands

    I think what I’m trying to do is very basic but I’m a rank beginner and I’m stuck. I’m using Python 2.4.1 on WindowsNT. I’ve installed UnxUtils, which includes the tar commands for Windows.

    Immediate objective:
    I have a text file containing a list of tar files, one filename per line. For now I’m just trying to loop through my text file and untar each file in the list.

    Background/big picture:
    Ultimately I’d like to
    • Loop through the list and untar each tar file (~1700 files total),
    • import some of the un-tarred components of the original tar file into an image processing program (ERDAS Imagine),
    • zip the output from Imagine, and
    • delete the unneeded files from the original tar.

    So far, I’m able to loop through the list and print filenames. Also, the tar command is working from the command line (and from a batch file), but I’m missing something when I try to integrate the tar step into Python.

    Here’s what I’ve got so far:

    #####Loop through and print each line (works fine):
    Code:
     
    import fileinput
    
    path = r"C:\WorkSpace\LTC\MRLC_test\NZT\tmp"
    inFile = open(path + "\\tarList7.txt", 'r')
    
    for line in inFile:
        print line
    
    inFile.close()
    
    print "Done!"
    #####Trying to add tar component:
    Code:
    import fileinput
    import tarfile
    
    path = r"C:\WorkSpace\LTC\MRLC_test\NZT\tmp"
    inFile = open(path + "\\tarList2.txt", "r")
    
    line = inFile.readline()
    tar = tarfile.open(line)
    
    for line in inFile:
        print line
        tar.extractall()
        tar.close()
    
    inFile.close()
    Thanks in advance...
  • ilikepython
    Recognized Expert Contributor
    • Feb 2007
    • 844

    #2
    What's the error?

    I haven't used this module at all but why are you closing the tar file in the for loop?

    • ghostdog74
      Recognized Expert Contributor
      • Apr 2006
      • 511

      #3
      extractall() takes in arguments...

      • ilikepython
        Recognized Expert Contributor
        • Feb 2007
        • 844

        #4
        Yea but they're optional. If a path is not given, it uses the current directory.

        • ghostdog74
          Recognized Expert Contributor
          • Apr 2006
          • 511

          #5
          oops, my bad for misreading.
          Code:
          import tarfile

          for line in open("tarlist.txt"):
              line = line.strip()       # drop the trailing newline
              tar = tarfile.open(line)
              for t in tar:             # extract each member in turn
                  tar.extract(t)
              tar.close()
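Iterating over the archive member by member, as in the loop above, also makes it possible to extract only part of each tar file, which is one of the goals in the original post. The following is a minimal sketch in Python 3, not the code used in this thread; the function name `extract_matching` and the idea of selecting members by file suffix are assumptions made for illustration.

```python
import tarfile


def extract_matching(archive_path, suffixes, out_dir="."):
    """Extract only the members whose names end with one of `suffixes`.

    `suffixes` must be a tuple of strings, e.g. (".tif",).
    Returns the names of the members that were extracted.
    """
    extracted = []
    with tarfile.open(archive_path) as tar:
        for member in tar:  # iterating a TarFile yields TarInfo objects
            if member.isfile() and member.name.lower().endswith(suffixes):
                tar.extract(member, path=out_dir)
                extracted.append(member.name)
    return extracted
```

Skipping unwanted members at extraction time avoids having to delete them afterwards. As with any extraction, this should only be run on archives from a trusted source.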

          • tcurdts
            New Member
            • Jun 2007
            • 12

            #6
            Sorry... meant to include the error. I included my code in my original posting to show that I've at least been trying, but I'm so green that I suspect there's a better, cleaner way to do this.

            Here's the error msg (trailing newline):

            Error Message:
            Traceback (most recent call last):
              File "C:\Python24\Lib\site-packages\pythonwin\pywin\framework\scriptutils.py", line 310, in RunScript
                exec codeObject in __main__.__dict__
              File "C:\PythonScripts\MyScripts\Script3.py", line 9, in ?
                tar = tarfile.open(line)
              File "C:\Python24\lib\tarfile.py", line 916, in open
                return func(name, "r", fileobj)
              File "C:\Python24\lib\tarfile.py", line 959, in gzopen
                fileobj = file(name, mode + "b")
            IOError: [Errno 2] No such file or directory: 'NZT050130311116199500.tar\n'

            • ilikepython
              Recognized Expert Contributor
              • Feb 2007
              • 844

              #7
              Try:
              [code=python]
              tar = tarfile.open(line[:-1])
              [/code]
              That will remove the trailing newline.

              • bartonc
                Recognized Expert Expert
                • Sep 2006
                • 6478

                #8
                As will:
                [code=python]
                tar = tarfile.open(line.strip())
                [/code]
                That also works in case there is no newline at the end.
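The difference between slicing and stripping shows up on a line that has no newline, which is often the case for the last line of a list file. A small Python 3 illustration, using one of the file names from this thread as sample data:

```python
with_newline = "NZT070830130627200100.tar\n"
without_newline = "NZT070830130627200100.tar"  # e.g. the last line of the file

# Slicing always removes the final character, whether or not it is a newline.
sliced = [with_newline[:-1], without_newline[:-1]]

# rstrip() only removes the listed characters, and only if they are present.
stripped = [with_newline.rstrip("\r\n"), without_newline.rstrip("\r\n")]

print(sliced)    # the second name has lost its final "r"
print(stripped)  # both names are intact
```

A bare `strip()` also removes leading and trailing spaces and tabs, which is usually harmless for a list of file names.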

                • tcurdts
                  New Member
                  • Jun 2007
                  • 12

                  #9
                  Thanks much!! I've got the code working (untarring files contained in a txt file) when the python script is located in the same directory as the tar files.

                  Two more questions:
                  1) How do I specify different input and output directories so I can run the script from a different location and put my output (untarred files) in yet another directory?
                  2) Is there redundancy in lines 5 and 8?

                  Here's the code that's working as described above:
                  Code:
                  import fileinput
                  import tarfile
                  path = r"C:\WorkSpace\LTC\MRLC_test\NZT\tmp"
                  tfile = "\\tarList2.txt"
                  inFile = open(path + tfile)
                  line = inFile.readline()
                  
                  for line in open(path + tfile):
                      line = line.strip()
                      print line
                      tar = tarfile.open(line)
                      for t in tar:
                          tar.extract(t)

                  • ilikepython
                    Recognized Expert Contributor
                    • Feb 2007
                    • 844

                    #10
                    You don't need to open the file twice: the open() call in the for loop is enough, so the earlier open() and readline() calls are redundant. It is also a good idea to use the os module to build file and directory paths, for portability:
                    [code=python]
                    import os
                    import tarfile

                    path = raw_input("Enter a directory: ")
                    tfile = raw_input("Enter the file: ")

                    for line in open(os.path.join(path, tfile)):
                        line = line.strip()
                        print line
                        tar = tarfile.open(line)
                        for t in tar:
                            tar.extract(t)
                    [/code]
                    I replaced your path with raw_input() calls; is that what you need?

                    Also, it might be a good idea to check if the path exists so your program doesn't raise an exception.
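One way to do that check, sketched in Python 3: test each entry before trying to open it, and collect the bad ones instead of stopping at the first failure. The helper name `find_usable_archives` and its parameters are made up for this example.

```python
import os
import tarfile


def find_usable_archives(names, tar_dir):
    """Split `names` into (usable, rejected) lists of full paths."""
    usable, rejected = [], []
    for name in names:
        full_path = os.path.join(tar_dir, name)
        # isfile() catches typos and missing files; is_tarfile() catches
        # files that exist but are not readable tar archives.
        if os.path.isfile(full_path) and tarfile.is_tarfile(full_path):
            usable.append(full_path)
        else:
            rejected.append(full_path)
    return usable, rejected
```

With roughly 1700 archives to process, reporting the rejected entries at the end is probably more convenient than having the script stop partway through.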

                    • tcurdts
                      New Member
                      • Jun 2007
                      • 12

                      #11
                      Nope, I don't really want interactive raw_input, but thanks anyway.

                      As I've got it set up now, the text file (tarList2.txt) just has a list of file names without paths:
                      NZT070830130627200100.tar
                      NZT070830140710200000.tar
                      etc...

                      When I have the txt file and the script in the same directory as the tar files, it works fine and extracts the files in the same directory.

                      What I'd like to do is
                      1) specify a different output directory for the extracted tar files
                      and
                      2) (of less importance) be able to store the script in a different location from the text and tar files and reference the directory where the tar files reside. I added the full path to the entries in the text file:
                      C:\WorkSpace\tmp\NZT070830130627200100.tar
                      C:\WorkSpace\tmp\NZT070830140710200000.tar

                      but I'm having trouble managing my strings. I get a "no such file" error with double backslashes in the path when it tries to read the txt file. I also tried using the r"path\file.tar" format in the text file:
                      r"C:\WorkSpace\tmp\NZT070830130627200100.tar"
                      r"C:\WorkSpace\tmp\NZT070830140710200000.tar"

                      but get the same "double-slash" error:
                      IOError: [Errno 2] No such file or directory: 'r"C:\\WorkSpace\\tmp\\NZT070830130627200100.tar"

                      How do I get rid of the extra back-slash? I appreciate your help.

                      • ilikepython
                        Recognized Expert Contributor
                        • Feb 2007
                        • 844

                        #12
                        1. The tarfile methods extract() and extractall() take an optional path parameter that sets the destination directory. If you don't give one, files are extracted to the current working directory.
                        [code=python]
                        for t in tar:
                            tar.extract(t, path)
                        [/code]
                        2. There are several ways you can fix this. One is to write the paths in the text file like this:
                        Code:
                        C:\\WorkSpace\\tmp\\NZT070830140710200000.tar
                        (Remember that text files aren't Python files; you don't need to enclose the path in quotes.)
                        Then you just open it normally.
                        Another way is to call the open function like this:
                        [code=python]
                        tar = tarfile.open(`line`)
                        [/code]
                        That makes it a raw string.
                        For that way, the lines in the text file should be:
                        Code:
                        C:\WorkSpace\tmp\NZT070830140710200000.tar
                        Hope that helps.
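For the first question (separate input and output directories), one possible arrangement is to keep bare file names in the list and pass both directories to the script. This is a Python 3 sketch under that assumption; `extract_all` and its three parameters are names chosen for the example, not part of the tarfile module.

```python
import os
import tarfile


def extract_all(list_path, tar_dir, out_dir):
    """Extract each archive named in `list_path` from `tar_dir` into `out_dir`."""
    os.makedirs(out_dir, exist_ok=True)  # create the output directory if needed
    extracted = []
    with open(list_path) as list_file:
        for line in list_file:
            name = line.strip()
            if not name:  # skip blank lines
                continue
            # Bare names are resolved against tar_dir; os.path.join returns
            # the entry unchanged if it is already an absolute path.
            archive_path = os.path.join(tar_dir, name)
            with tarfile.open(archive_path) as tar:
                tar.extractall(path=out_dir)  # only for archives you trust
            extracted.append(archive_path)
    return extracted
```

With the directories mentioned in this thread, a call might look like `extract_all(r"C:\WorkSpace\LTC\MRLC_test\NZT\tmp\tarList2.txt", r"C:\WorkSpace\LTC\MRLC_test\NZT\tmp", r"C:\WorkSpace\LTC\MRLC_test\NZTunzipped")`. Because nothing depends on the current working directory, the script itself can live anywhere.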

                        • tcurdts
                          New Member
                          • Jun 2007
                          • 12

                          #13
                          Dear ilikepython,

                          Thanks for hanging in there w/ me. I'm still screwing up the strings; now I'm getting quadruple back-slashes!

                          I tried to make your changes and here's the code followed by the error msg and my text file example:

                          Code:
                          import os, os.path
                          import fileinput
                          import tarfile
                          
                          path = r"C:\WorkSpace\LTC\MRLC_test\NZT\tmp"
                          tfile = r"tarList2p.txt"
                          
                          for line in open(os.path.join(path, tfile)):
                              line = line.strip()
                              print line
                              tar = tarfile.open(`line`)
                              for t in tar:
                                  tar.extract(t, r"C:\WorkSpace\LTC\MRLC_test\NZTunzipped")
                          IOError: [Errno 2] No such file or directory: "'C:\\\\WorkSpace\\\\LTC\\\\MRLC_test\\\\NZT\\\\tmp\\\\NZT070830130627200100.tar'"

                          Example of C:\WorkSpace\LTC\MRLC_test\NZT\tmp\tarList2p.txt:
                          C:\WorkSpace\LTC\MRLC_test\NZT\tmp\NZT070830130627200100.tar
                          C:\WorkSpace\LTC\MRLC_test\NZT\tmp\NZT070830140710200000.tar

                          What am I overlooking??

                          • ilikepython
                            Recognized Expert Contributor
                            • Feb 2007
                            • 844

                            #14
                            Oops, I'm really sorry, I gave you bad information. When you read a line from a file, the string stays exactly as it is in the file, so no extra escaping is needed. Try it without the backticks:
                            [code=python]
                            tar = tarfile.open(line)
                            [/code]
                            If it doesn't work, please post back here, I'm interested.
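Some background on where the extra backslashes came from: in Python 2, backticks were shorthand for repr(), which returns the display form of a string with surrounding quotes added and each backslash doubled. Error messages also show file names in that display form, so a path with single backslashes appears with doubled ones even though the string itself is unchanged. The Python 3 sketch below uses repr() directly, since backticks no longer exist there; the path is sample data from this thread.

```python
import ast

line = r"C:\WorkSpace\tmp\NZT070830130627200100.tar"  # as read from the text file

quoted = repr(line)    # what the Python 2 backticks produced
twice = repr(quoted)   # what an error message shows for the quoted name

print(line)    # single backslashes: the real string
print(quoted)  # quotes added, backslashes doubled
print(twice)   # quadruple backslashes, as in the IOError above

# The display form is only a representation; evaluating it gives back
# the original string, unchanged.
assert ast.literal_eval(quoted) == line
```

So the doubled backslashes in the earlier error were never the problem. In that case the file name passed to open() still contained the literal `r` and quote characters copied from the text file.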

                            • ilikepython
                              Recognized Expert Contributor
                              • Feb 2007
                              • 844

                              #15
                              Ok, I did some tests and it works, so I'm pretty sure it will work for you.
