Perl code for parsing a text file and output a text file

  • jhamb
    New Member
    • Feb 2008
    • 18

    Hi,
    This Perl code (just a trial, not tested) is meant to parse a text file and write the result to another file.
    It should delete the lines that are not required and write the lines the user wants to a new file.

    The problems are:
    1. This script produces blank lines in the new file.
    2. I have to remove a whole delimiter (#if FOR_LAB) from the text file.
    3. The tool should be able to parse the file and remove FOR_LAB'ed code without affecting the surrounding code, i.e.
    I have to remove the whole sequence of lines:

    Code:
    #if FOR_LAB
       DWORD dwSize = file.getFileSize();
       DWORD dwMinFileSize = sizeof(CVTableCfgFileHeader) +
                sizeof(CVTableCfgFileFracHdr) + sizeof(CVTableCfgFileSigDesc);
          if (dwSize < dwMinFileSize)
          {
             Error.Log(__FILE__, "CVT config file %s is too short.", pszFileName);
             return false;     // need at least one signal description to be useful
          }
    #endif
    not just the first line.



    Code:
    #!/usr/local/bin/perl
    
    #program to read cvtablebuild.cpp
    #and write to cvtablebuild_final.cpp
    #
    
    $file = '/users/aj/files/cvtablebuild.cpp';
    open(INFO, $file);                                   #opens file cvtablebuild.cpp
    open(DATA, ">cvtablebuild_final.cpp");     #file to write data to
    @lines = <INFO>;                                  #assigns lines to array
    
    foreach $line (@lines) #go through each line in file
    {
    	if ($line ^#if FOR_LAB)
    	{
    		$line = ~ s/$line//;
    	}
    
    	DATA == $line;
    	print DATA "\n";
    }
    
    close(INFO);                                             #closes file
    close(DATA);
    Thank you in advance for any help... :):)


    Regards,
    Anubhav Jhamb
    Last edited by eWish; Feb 21 '08, 02:10 PM. Reason: Please use [CODE][CODE] tags
  • rohitbasu77
    New Member
    • Feb 2008
    • 89

    #2
    [CODE=perl]#!/usr/bin/perl
    use strict;
    use warnings;

    my $file = 'read';     # input file name
    my $wrt  = 'write';    # output file name
    open(my $info, '<', $file) or die "Cannot open $file: $!";
    open(my $data, '>>', $wrt) or die "Cannot open $wrt: $!";

    my @lines = <$info>;

    foreach my $lin (@lines) {
        # corrected search condition: match the "#if FOR_LAB" line
        if ($lin =~ /^#if FOR_LAB/) {
            # make the line blank
            $lin = "";
        }

        print $data $lin;
    }
    close($info);
    close($data);[/CODE]


    regards
    rohit
    Last edited by eWish; Feb 21 '08, 02:09 PM. Reason: Added Code Tags - Removed Email Address

    • nithinpes
      Recognized Expert Contributor
      • Dec 2007
      • 410

      #3
      If you want to skip the blank lines, simply do not print them. Also, you need not print a "\n" after each line: because you never remove the newlines after reading the input (with chomp), each line still ends with its own newline, and the extra "\n" adds unnecessary blank lines.
      See the modified foreach loop:
      [CODE=text]
      foreach $line (@lines) #go through each line in file
      {
          next if ($line=~/^#if FOR_LAB/);
          print DATA $line;
      }
      [/CODE]

      This will remove lines beginning with #if FOR_LAB. If your requirement is to remove all lines from '#if FOR_LAB' to '#endif', use the following method:

      [CODE=text]
      $i=0; ## 1 while inside a FOR_LAB block
      foreach $line (@lines) #go through each line in file
      {
          if ($line=~/^#if FOR_LAB/) {
              $i=1; next; ## start of block: skip this line
          }
          if($i==1)
          {
              next unless($line=~/#endif/); ## skip up to #endif
              $i=0; ##re-initialize
              next; ## to skip #endif
          }

          print DATA $line;
      }
      [/CODE]
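
      As a side note, the same "skip from #if FOR_LAB to #endif" idea can probably be written more compactly with Perl's range (flip-flop) operator. The following is only a minimal sketch under the assumption that the blocks are never nested; the helper name strip_for_lab and the sample lines are made up for illustration and are not from the original script.

      ```perl
      #!/usr/bin/perl
      use strict;
      use warnings;

      # Hypothetical helper: returns the lines that fall outside every
      # "#if FOR_LAB" ... "#endif" block (assumes blocks are not nested).
      sub strip_for_lab {
          my @lines = @_;
          my @kept;
          foreach my $line (@lines) {
              # In scalar context the range operator becomes true on the line
              # matching the left pattern and stays true through the line
              # matching the right pattern, inclusive.
              next if ($line =~ /^#if FOR_LAB/) .. ($line =~ /^#endif/);
              push @kept, $line;
          }
          return @kept;
      }

      # Made-up sample input, for illustration only.
      my @sample = ("a = 1;\n", "#if FOR_LAB\n", "check();\n", "#endif\n", "b = 2;\n");
      print strip_for_lab(@sample);   # prints the "a = 1;" and "b = 2;" lines
      ```

      One design caveat: the range operator remembers its state between iterations, so a block that is opened but never closed would swallow everything up to the end of the input, just as the flag version above does.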
      Last edited by nithinpes; Feb 21 '08, 02:05 PM. Reason: edited text

      • KevinADC
        Recognized Expert Specialist
        • Jan 2007
        • 4092

        #4
        rohitbasu77, you should read the question more carefully.

        • jhamb
          New Member
          • Feb 2008
          • 18

          #5
          Nithinpes,
          Thanks for that loop code; that is exactly what I was looking for. :)

          Rohit,
          Thank you too for correcting the search condition. Now I know about it. :)

          • jhamb
            New Member
            • Feb 2008
            • 18

            #6
            One more thing I have to look into:
            The code should be able to do this for multiple files and directories, recursively.

            But I would like to see it working for a single file first, before going further.
            I will try writing some code myself. If I get stuck, I'll post...

            Anubhav Jhamb
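
            For the recursive part, one possible starting point is the core File::Find module. This is only a rough sketch, assuming the files of interest end in .cpp; the helper name find_cpp_files is hypothetical, and the list it returns would still have to be fed to whatever per-file filter is used.

            ```perl
            #!/usr/bin/perl
            use strict;
            use warnings;
            use File::Find;

            # Hypothetical helper: collect every .cpp file under the given directories.
            sub find_cpp_files {
                my @dirs = @_;
                return () unless @dirs;     # nothing to search
                my @found;
                find(
                    sub {
                        # $_ is the bare file name; $File::Find::name includes the directory.
                        push @found, $File::Find::name if -f $_ && /\.cpp$/;
                    },
                    @dirs
                );
                return sort @found;
            }

            # Usage (assumed): perl find_cpp.pl dir1 dir2 ...
            print "$_\n" for find_cpp_files(@ARGV);
            ```

            Keeping the directory walk separate from the line filtering means the single-file version can be tested on its own first and reused unchanged afterwards.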

            • rohitbasu77
              New Member
              • Feb 2008
              • 89

              #7
              Originally posted by KevinADC
              You should read the question more carefully.
              What did you find wrong there?

              • KevinADC
                Recognized Expert Specialist
                • Jan 2007
                • 4092

                #8
                Originally posted by rohitbasu77
                What did you find wrong there?

                I have to remove the whole sequence of lines:

                not just the first line.

                • rohitbasu77
                  New Member
                  • Feb 2008
                  • 89

                  #9
                  Originally posted by KevinADC
                  I have to remove the whole sequence of lines:

                  not just the first line.
                  Your question is not clear to me.
                  Please post the original text and the modified text.

                  • KevinADC
                    Recognized Expert Specialist
                    • Jan 2007
                    • 4092

                    #10
                    Originally posted by rohitbasu77
                    Your question is not clear to me.
                    Please post the original text and the modified text.
                    I did not write a question. It really does not matter anyway; nithinpes already posted a working solution to the original question.

                    • jhamb
                      New Member
                      • Feb 2008
                      • 18

                      #11
                      Sorry for repeating the same post.
                      Please ignore this post... :):)

                      • jhamb
                        New Member
                        • Feb 2008
                        • 18

                        #12
                        Originally posted by rohitbasu77
                        Your question is not clear to me.
                        Please post the original text and the modified text.
                        Rohit,
                        KevinADC was just telling you to add lines to the loop so that it removes the whole sequence of lines in the FOR_LAB'ed code. :):)

                        For instance, the input file content is as below:

                        Code:
                        if (file.open(pszFileName))
                           {
                                                pby = (char*)file.getBufferPointer();
                        #if FOR_LAB
                           DWORD dwSize = file.getFileSize();
                           DWORD dwMinFileSize = sizeof(CVTableCfgFileHeader) +
                                    sizeof(CVTableCfgFileFracHdr) + sizeof(CVTableCfgFileSigDesc);
                              if (dwSize < dwMinFileSize)
                              {
                                 Error.Log(__FILE__, "CVT config file %s is too short.", pszFileName);
                                 return false;     // need at least one signal description to be useful
                              }
                        #endif
                        
                              pFileHdr = (CVTableCfgFileHeader*)pby;
                              pby += sizeof(CVTableCfgFileHeader);
                        #if FOR_LAB
                              if (strcmp(pFileHdr->szSignature, "CVT-CFG") != 0)
                              {
                                 Error.Log(__FILE__, "CVT cfg file %s has wrong signature.", pszFileName);
                                 return false;
                              }
                              if (pFileHdr->version != iCVTCFGVERSION)
                              {
                                 Error.Log(__FILE__, "Wrong version of CVT cfg. Found %d expecting %d in %s",
                                    pFileHdr->version, iCVTCFGVERSION, pszFileName);
                                 return false;
                              }
                              if (pFileHdr->iNumFrac > CVTableIndex::iMAX_FRACTIONS)
                              {
                                 Error.Log(__FILE__, "CVT cfg file %s has too many fractions", pszFileName);
                                 return false;
                              }
                        #endif
                                 // --- ok... ready to build the whole table ---
                        
                           CVTableIndex *pIdx = m_hTblIdx.Init(CVTableIndex::pszTBLIDXNAME, 
                                                               SharedMemObjHandle<CVTableIndex>::omCREATE_RO);
                        #if FOR_LAB
                              if (pIdx == NULL)
                              {
                                 Error.Log(__FILE__, "Unable to create CVT table index.");
                                 return false;
                              }
                        #endif
                        
                           int iTotalSignalCt = 0;
                              for (int iFrac=0; iFrac < pFileHdr->iNumFrac; ++iFrac)
                              {
                                 pFracHdr = (CVTableCfgFileFracHdr*)pby;
                                 pby += sizeof(CVTableCfgFileFracHdr);
                        
                        #if FOR_LAB
                                 if (*pFracHdr->szFracName == 0 ||
                                      pFracHdr->iNumSigs > 5000    )
                                 {
                                    Error.Log(__FILE__, "CVT cfg file '%s' fraction header %d malformed.", 
                                              pszFileName, iFrac);
                                    break;
                                 }
                        #endif

                        After parsing, the output file should be devoid of FOR_LAB code:

                        Code:
                        if (file.open(pszFileName))
                           {
                                                pby = (char*)file.getBufferPointer();
                        
                              pFileHdr = (CVTableCfgFileHeader*)pby;
                              pby += sizeof(CVTableCfgFileHeader);
                                 // --- ok... ready to build the whole table ---
                        
                           CVTableIndex *pIdx = m_hTblIdx.Init(CVTableIndex::pszTBLIDXNAME, 
                                                               SharedMemObjHandle<CVTableIndex>::omCREATE_RO);
                        
                           int iTotalSignalCt = 0;
                              for (int iFrac=0; iFrac < pFileHdr->iNumFrac; ++iFrac)
                              {
                                 pFracHdr = (CVTableCfgFileFracHdr*)pby;
                                 pby += sizeof(CVTableCfgFileFracHdr);
                        -- Jhamb
                        Last edited by eWish; Feb 25 '08, 02:24 PM. Reason: Please use code tags

                        • rohitbasu77
                          New Member
                          • Feb 2008
                          • 89

                          #13
                          OK, thanks. As it is a hashed statement (a line starting with #), I thought you needed to remove only the hashed statement,

                          #if FOR_LAB, which is repeated any number of times in the code.

                          Anyway, thanks to everyone.
                          Last edited by eWish; Feb 25 '08, 02:25 PM. Reason: Removed Quote

                          • jhamb
                            New Member
                            • Feb 2008
                            • 18

                            #14
                            Hey all,
                            I am done with the complete task..! :)

                            Code file name: perl_code.pl
                            Code:
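                            #!/usr/bin/perl
                            use strict;
                            use warnings;

                            my $Start_Str = shift @ARGV;                 # string that starts a block to remove
                            my $End_Str   = shift @ARGV;                 # string that ends the block
                            foreach my $file (@ARGV)                     # repeat the task for each file
                            {
                               my $Flag = 0;                             # 1 while inside a block being removed
                               my $filename = "Final_".$file;
                               my $in;
                               unless (open($in, '<', $file)) {          # skip files that cannot be read
                                  warn "Cannot read $file: $!\n";
                                  next;
                               }
                               open(my $out, '>', $filename) or die "Cannot write $filename: $!\n";
                               while (my $line = <$in>) {                # go through each line in file
                                  # \Q...\E treats the arguments as literal text, not as regex syntax
                                  if (($line =~ /\Q$Start_Str\E/) || ($Flag == 1))
                                  {
                                     $Flag = 1;
                                     if ($line =~ /\Q$End_Str\E/)
                                     {
                                        $Flag = 0;
                                     }
                                  }
                                  else
                                  {
                                     print $out $line;
                                  }
                               }
                               close($in);                               # close both files before the next one
                               close($out) or die "Cannot close $filename: $!\n";
                            }
                            print "Parsing is completed\n";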
                            This runs from the command line and can be used with any start and end strings, on any number of files, without affecting the surrounding code.

                            I tried the following at the command prompt:

                            C:\>PERL perl_code.pl "#if FOR_LAB" endif cvtablebuild.cpp 1.txt
                            Parsing is completed

                            Or it can be tried with any other input:

                            C:\>PERL perl_code.pl ANYTHING_ELSE endif cvtablebuild.cpp 1.txt
                            Parsing is completed

                            The output files are created as Final_cvtablebuild.cpp and Final_1.txt

                            1.txt:
                            Code:
                               if (file.open(pszFileName))
                               {
                                                    pby = (char*)file.getBufferPointer();
                            #if FOR_LAB
                               DWORD dwSize = file.getFileSize();
                               DWORD dwMinFileSize = sizeof(CVTableCfgFileHeader) +
                                        sizeof(CVTableCfgFileFracHdr) + sizeof(CVTableCfgFileSigDesc);
                                  if (dwSize < dwMinFileSize)
                                  {
                                     Error.Log(__FILE__, "CVT config file %s is too short.", pszFileName);
                                     return false;     // need at least one signal description to be useful
                                  }
                            #endif
                            
                                  pFileHdr = (CVTableCfgFileHeader*)pby;
                                  pby += sizeof(CVTableCfgFileHeader);
                            #if FOR_LAB
                                  if (strcmp(pFileHdr->szSignature, "CVT-CFG") != 0)
                                  {
                                     Error.Log(__FILE__, "CVT cfg file %s has wrong signature.", pszFileName);
                                     return false;
                                  }
                                  if (pFileHdr->version != iCVTCFGVERSION)
                                  {
                                     Error.Log(__FILE__, "Wrong version of CVT cfg. Found %d expecting %d in %s",
                                        pFileHdr->version, iCVTCFGVERSION, pszFileName);
                                     return false;
                                  }
                                  if (pFileHdr->iNumFrac > CVTableIndex::iMAX_FRACTIONS)
                                  {
                                     Error.Log(__FILE__, "CVT cfg file %s has too many fractions", pszFileName);
                                     return false;
                                  }
                            #endif
                                     // --- ok... ready to build the whole table ---
                            
                               CVTableIndex *pIdx = m_hTblIdx.Init(CVTableIndex::pszTBLIDXNAME, 
                                                                   SharedMemObjHandle<CVTableIndex>::omCREATE_RO);
                            #if FOR_LAB
                                  if (pIdx == NULL)
                                  {
                                     Error.Log(__FILE__, "Unable to create CVT table index.");
                                     return false;
                                  }
                            #endif
                            
                               int iTotalSignalCt = 0;
                                  for (int iFrac=0; iFrac < pFileHdr->iNumFrac; ++iFrac)
                                  {
                                     pFracHdr = (CVTableCfgFileFracHdr*)pby;
                                     pby += sizeof(CVTableCfgFileFracHdr);
                            
                            #if FOR_LAB
                                     if (*pFracHdr->szFracName == 0 ||
                                          pFracHdr->iNumSigs > 5000    )
                                     {
                                        Error.Log(__FILE__, "CVT cfg file '%s' fraction header %d malformed.", 
                                                  pszFileName, iFrac);
                                        break;
                                     }
                            #endif
                            Final_1.txt:
                            Code:
                               if (file.open(pszFileName))
                               {
                                                    pby = (char*)file.getBufferPointer();
                            
                                  pFileHdr = (CVTableCfgFileHeader*)pby;
                                  pby += sizeof(CVTableCfgFileHeader);
                                     // --- ok... ready to build the whole table ---
                            
                               CVTableIndex *pIdx = m_hTblIdx.Init(CVTableIndex::pszTBLIDXNAME, 
                                                                   SharedMemObjHandle<CVTableIndex>::omCREATE_RO);
                            
                               int iTotalSignalCt = 0;
                                  for (int iFrac=0; iFrac < pFileHdr->iNumFrac; ++iFrac)
                                  {
                                     pFracHdr = (CVTableCfgFileFracHdr*)pby;
                                     pby += sizeof(CVTableCfgFileFracHdr);
                            Thanks to everyone who helped.. :):)
                            Please let me know if you find something more to add in the code.
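
                            One thing that might be worth adding: the flag approach stops at the first #endif it sees, so a FOR_LAB block that itself contains another #if/#ifdef would end too early. A possible way around that is to count nesting depth. The sketch below is only an illustration under assumptions: the helper name strip_nested is hypothetical, it hard-codes FOR_LAB instead of taking command-line strings, and it does not try to handle #else or #elif branches.

                            ```perl
                            #!/usr/bin/perl
                            use strict;
                            use warnings;

                            # Hypothetical helper: drop "#if FOR_LAB" blocks, counting nested
                            # #if/#ifdef/#ifndef directives so the matching #endif is found.
                            sub strip_nested {
                                my @lines = @_;
                                my @kept;
                                my $depth = 0;                  # 0 means outside any FOR_LAB block
                                foreach my $line (@lines) {
                                    if ($depth == 0) {
                                        if ($line =~ /^\s*#\s*if\s+FOR_LAB\b/) {
                                            $depth = 1;         # entering the block; drop this line
                                        } else {
                                            push @kept, $line;  # ordinary line outside the block
                                        }
                                    } elsif ($line =~ /^\s*#\s*if/) {
                                        $depth++;               # nested #if, #ifdef or #ifndef
                                    } elsif ($line =~ /^\s*#\s*endif\b/) {
                                        $depth--;               # closes the innermost open directive
                                    }
                                    # any other line inside the block is silently dropped
                                }
                                return @kept;
                            }

                            # Made-up sample with one nested #ifdef, for illustration only.
                            my @sample = ("a;\n", "#if FOR_LAB\n", "x;\n", "#ifdef DEBUG\n",
                                          "y;\n", "#endif\n", "z;\n", "#endif\n", "b;\n");
                            print strip_nested(@sample);        # prints the "a;" and "b;" lines
                            ```

                            Conditional blocks that are not FOR_LAB and sit outside a FOR_LAB block are left untouched, since the counter only starts once a FOR_LAB line has been seen.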