Problem with regex in the script ???

  • vijayarl
    New Member
    • Sep 2008
    • 65

    Hi All,

    Thanks in advance.

    Problem statement: I need to display a report of the raw stats files, but the script also counts stats files that have an .xls extension.

    I am reading the raw stats input files from a directory, comparing the stats type, pushing each file name onto an array, and later counting the occurrences.

    The problem is that if the directory contains a raw stats Excel file, the script reads that file too and increments the count, which is not correct.

    All I need is to print the count of raw stats files only (files which don't have any file type extension).

    How can I achieve this?

    I know the problem is with the regex "if" statements in the script.

    The directory content looks like this:
    prstat-Ls-20080118-1800
    prstat-Ls-20080118-1900
    prstat-Ls-20080118-1900.xls
    prstat-Lvs-20080118-1800
    prstat-Lvs-20080118-1900


    The output should look like this:
    There are totally 4 files present in C:\Performance_svap\INPUT_FILES\
    There are totally 2 prstat_Ls files present (even though we have a prstat*.xls file, the script should discard it)
    There are totally 2 prstat_Lvs files present


    But I get this output:
    There are totally 5 files present in C:\Performance_svap\INPUT_FILES\
    There are totally 3 prstat_Ls files present (the script counted the prstat*.xls file too)
    There are totally 2 prstat_Lvs files present

    As far as the script goes, the output I am getting is correct, but I want the script to count only the raw files, not the *.xls ones.
    How can I do this?

    Please can anyone help me with this?

    The script goes like this:
    Code:
    my $dir = "C:\\Performance_svap\\INPUT_FILES\\";
    
    &raw_stats_report;
    
    #######function to display the raw stats type report ###################
    
    sub raw_stats_report(){
    		my $f;
    		opendir(D, "$dir") || die "Can't opendir $dir: $!\n";
    		my @list = readdir(D);
    		closedir(D);
    		foreach my $f (@list){
    			if ($f =~ /sar-d/){
    				push (@sar_d,$f);
    				}
    			if ($f =~ /sar-g/){
    				push (@sar_g,$f);
    				}
    			if ($f =~ /sar-u/){
    				push (@sar_u,$f);
    				}
    			if ($f =~ /sar-r/){
    				push (@sar_r,$f);
    				}
    			if ($f =~ /vmstat/){
    				push (@vmstat,$f);
    				}
    			if ($f =~ /mpstat/){
    				push (@mpstat,$f);
    				}
    			if ($f =~ /prstat-mLV/){
    				push (@prstat_mLV,$f);
    				}
    			if ($f =~ /prstat-Ls/){
    				push (@prstat_Ls,$f);
    				}
    			if ($f =~ /prstat-Lvs/){
    				push (@prstat_Lvs,$f);
    				}
    			if ($f =~ /netstat/){
    				push (@netstat,$f);
    				}
    			if ($f =~ /iostat/){
    				push (@iostat,$f);
    				}
    		}		
    	chdir $dir;
    	@files =<*>;
    	print "############# Raw_Stats_Report ################\n \n ";
    	print "There are totally ",scalar(@files)," files present in $dir \n";
    	print "\n \n There are totally ",scalar(@iostat)," iostat files present \n";
    	print "\n There are totally ",scalar(@netstat)," netstat files present \n";
    	print "\n There are totally ",scalar(@prstat_Ls)," prstat_Ls files present \n";
    	print "\n There are totally ",scalar(@prstat_Lvs)," prstat_Lvs files present \n";
    	print "\n There are totally ",scalar(@sar_d)," sar-d files present \n";
    	print "\n There are totally ",scalar(@sar_g)," sar-g files present \n";
    	print "\n There are totally ",scalar(@sar_u)," sar-u files present \n";
    	print "\n There are totally ",scalar(@sar_r)," sar-r files present \n";
    	print "\n There are totally ",scalar(@prstat_mLV)," prstat_mLV files present \n";
    	print "\n There are totally ",scalar(@mpstat),"  mpstat files present \n";
    	print "\n There are totally ",scalar(@vmstat)," vmstat files present \n\n";
    	print "################################################\n \n ";
    	print "Do you want me to continue ? Y | N \n";
    	chomp(my $pick = <STDIN>);
    	if($pick =~/y/){
    	print "Process execution will continue !!! \n";
    	}
    	else{
    	print "Process execution stopped !!! \n";die;
    	}
    }
    Regards,
    Vijayarl
  • KevinADC
    Recognized Expert Specialist
    • Jan 2007
    • 4092

    #2
    Code:
            foreach my $f (@list){
                next if ($f =~ /\.xls$/i); #<-- skip files with a .xls extension
    As a side note, a hash would be better to count the files instead of using arrays for each different filetype.

    • vijayarl
      New Member
      • Sep 2008
      • 65

      #3
      Thanks Kevin!

      I would like to implement a hash as you said, but I am still learning Perl. Jeff also told me to go through the hash method in another post of mine.

      I would be very grateful if you could show me how to use a hash to count the files. One example, or just a step-by-step explanation, would be sufficient.

      I would like to try it myself, so just tell me how to go ahead. I hope you won't mind.

      Anyway, thanks once again.

      Regards,
      Vijayarl

      • KevinADC
        Recognized Expert Specialist
        • Jan 2007
        • 4092

        #4
        Here is a general rewrite of your code, including using a hash to store the counts, plus some other changes. Notably, it uses if/elsif/elsif instead of if/if/if. When a string or line can satisfy only one of the conditions, don't use if/if/if, as perl has to evaluate all the 'if' conditions even after it finds the only true one. if/elsif lets perl stop evaluating the conditions once the first true one is found. If you ever needed a fall-through condition, you would add an 'else' to the end to catch anything that matched none of them. In your case there is no need that I can see for a fall-through condition.

        I also cleaned up your regexps, mostly just to show you ways of writing them to check for patterns. You were really checking for substrings instead of patterns, in which case index() would have been better to use than regular expressions. But since we want to capture the value of the pattern match and use it as the hash key, I went with pure regexps instead of index() and predefined keys, which is also a good way to do what you are doing.

        Untested code:

        Code:
        use strict;
        use warnings;
        my $dir = 'C:/Performance_svap/INPUT_FILES';#<-- windows supports forward slashes in directory paths
        my %count = (); #<-- hash to store counts
        raw_stats_report($dir);#<-- call the function with $dir as its argument
         
        #######function to display the raw stats type report ###################
         
        sub raw_stats_report {
            my $dir = $_[0] or die "No start directory defined\n";
            chdir($dir) or die "Can't chdir to $dir: $!\n";
            opendir(D, '.') or die "Can't opendir $dir: $!\n";
            my @list = readdir(D);
            closedir(D);
            foreach my $f (@list){
                next if ($f =~ /\.xls$/i);
                if ($f =~ /(sar-[dgur])/){
                   $count{$1}++;
                }
                elsif ($f =~ /(vmstat|mpstat)/){
                    $count{$1}++;
                }
                elsif ($f =~ /(prstat-(?:mLV|Ls|Lvs))/){
                   $count{$1}++;
                }
                elsif ($f =~ /((?:net|io)stat)/){
                   $count{$1}++;
                }
            }        
            print "############# Raw_Stats_Report ################\n \n ";
            print 'There are total ',scalar(@list)," files present in $dir\n";
            foreach my $c (sort keys %count) {
                print "There are total $count{$c} $c files present in $dir\n";
            }
            print "################################################\n   \n ";
            print "Do you want me to continue ? Y | N \n";
            chomp(my $pick = <STDIN>);
            if ($pick =~ /y/i){
                print "Process execution will continue !!! \n";
            }
            else{
                print "Process execution stopped !!! \n";
                exit(0); # <-- use exit instead of 'die' to end a script early
            }
        }
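
The index() alternative mentioned above could look roughly like the sketch below. It is not code from this thread: the helper name stat_type and the @types list are made up for illustration, and the file names are copied from the listing in the first post so that the example runs without touching a real directory.

```perl
use strict;
use warnings;

# Predefined stat types (assumption: these are all the types of interest).
my @types = qw(sar-d sar-g sar-u sar-r vmstat mpstat
               prstat-mLV prstat-Ls prstat-Lvs netstat iostat);

# Hypothetical helper: return the first type whose name occurs as a
# substring of $file, or undef if none does.
sub stat_type {
    my ($file) = @_;
    for my $type (@types) {
        return $type if index($file, $type) >= 0;
    }
    return;
}

# Sample names taken from the directory listing in the first post.
my @list = qw(
    prstat-Ls-20080118-1800
    prstat-Ls-20080118-1900
    prstat-Ls-20080118-1900.xls
    prstat-Lvs-20080118-1800
    prstat-Lvs-20080118-1900
);

my %count;
for my $f (@list) {
    next if $f =~ /\.xls$/i;             # skip the Excel copies
    my $type = stat_type($f);
    $count{$type}++ if defined $type;    # ignore names of unknown type
}

print "$_: $count{$_}\n" for sort keys %count;
```

With predefined keys there is nothing to capture, so the hash key comes from the list rather than from $1. The trade-off is that every new stat type has to be added to the list by hand.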

        • KevinADC
          Recognized Expert Specialist
          • Jan 2007
          • 4092

          #5
          Another thing to keep in mind is that the scalar value of @list:

          scalar(@list)

          will include '.' and '..' in the count/length of the array. If you don't want those, you can subtract 2 from the length:

          scalar(@list)-2
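
Subtracting 2 assumes readdir returned both '.' and '..', and it still counts any subdirectories. A possible alternative, sketched below, is to filter the list instead. The temporary directory and the file names in it are invented so the example can run anywhere; in the real script $dir would be the input directory.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a throwaway directory so the example is self-contained.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(prstat-Ls-20080118-1800 prstat-Ls-20080118-1900.xls)) {
    open(my $fh, '>', "$dir/$name") or die "Can't create $dir/$name: $!\n";
    close($fh);
}
mkdir("$dir/subdir") or die "Can't mkdir $dir/subdir: $!\n";

opendir(my $dh, $dir) or die "Can't opendir $dir: $!\n";
my @list = readdir($dh);
closedir($dh);

# Drop the '.' and '..' entries by name instead of subtracting 2.
my @entries = grep { $_ ne '.' && $_ ne '..' } @list;

# Keep only plain files (-f), which also leaves out subdirectories.
my @files = grep { -f "$dir/$_" } @entries;

print 'readdir returned ', scalar(@list), " entries\n";
print 'without . and ..: ', scalar(@entries), "\n";
print 'plain files only: ', scalar(@files), "\n";
```

Note that -f needs the directory prefix (or a prior chdir into the directory), because readdir returns bare names.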

          • vijayarl
            New Member
            • Sep 2008
            • 65

            #6
            Thanks Kevin !!!!!

            It worked successfully, thank you very much.

            One last question:
            can we skip any file type extension instead of skipping only *.xls?

            I changed this part of the script
            Code:
            next if ($f =~ /\.xls$/i); to next if ($f =~ /\.*$/i);
            but didn't get the desired result.

            What I thought was: the script skips files which have *.xls, but what if I have extensions other than *.xls in the directory, like
            Code:
            prstat-Ls-20080118-1800
            prstat-Ls-20080118-1800.doc
            prstat-Ls-20080118-1900
            prstat-Ls-20080118-1900.txt
            prstat-Ls-20080118-1900.xls
            prstat-Lvs-20080118-1800
            prstat-Lvs-20080118-1900
            prstat-Lvs-20080118-1900.txt
            The script still counts all the *.txt and *.doc entries. So instead of telling the script to skip *.xls, can't we just skip every file which has a file type extension?

            I know it's a lot to ask; I just want to know whether we can do this.

            Anyway, thanks for your patient reply.
            You're just too good. Lots and lots left to learn from you people :-)

            Regards,
            Vijayarl

            • nithinpes
              Recognized Expert Contributor
              • Dec 2007
              • 410

              #7
              Originally posted by vijayarl
              last one question:
              can we skip for any filetype extension instead of only skipping *.xls

              i did change this part in the script
              Code:
              next if ($f =~ /\.xls$/i); to next if ($f =~ /\.*$/i);
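              This does not work because the character before the '*' quantifier is \. (a literal dot), so the pattern means zero or more occurrences of '.' at the end of the name. Zero occurrences always match, hence it matches every file, with or without an extension, and every file gets skipped.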


              The script still counts all the *.txt and *.doc entries. So instead of telling the script to skip *.xls, can't we just skip every file which has a file type extension?

              Regards,
              Vijayarl
              You may use:
              Code:
              next if ($f =~ /\..+$/i);
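
To see the difference between the two patterns, here is a small sketch run against names in the style of the listing above. It assumes, as in this thread, that raw stats file names never contain a dot; the array names are made up for the example.

```perl
use strict;
use warnings;

my @names = qw(
    prstat-Ls-20080118-1800
    prstat-Ls-20080118-1900.txt
    prstat-Ls-20080118-1900.xls
    prstat-Lvs-20080118-1800
);

# \.* allows zero dots, so it matches at the end of every name.
my @skipped_by_star = grep { /\.*$/ } @names;

# \..+ needs a literal dot followed by at least one more character.
my @skipped_by_plus = grep { /\..+$/ } @names;

# What the loop would go on to count when using the second pattern.
my @kept = grep { !/\..+$/ } @names;

print 'skipped by /\.*$/:  ', scalar(@skipped_by_star), "\n";
print 'skipped by /\..+$/: ', scalar(@skipped_by_plus), "\n";
print "kept: $_\n" for @kept;
```

The /i modifier in the original line is harmless but has no effect here, since the pattern contains no letters.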

              • vijayarl
                New Member
                • Sep 2008
                • 65

                #8
                Thanks nithinpes !!!!

                It worked fine...

                Regards,
                Vijayarl

                • vijayarl
                  New Member
                  • Sep 2008
                  • 65

                  #9
                  Another one:

                  As we are printing the total number of files present in the directory,
                  Code:
                  print 'There are total ',scalar(@list)-2," files present in $dir\n \n";
                  this line gives the correct value, but what I want is to print only the total count of
                  raw stat files. The above line prints the count of all the files.

                  I changed the script:
                  Code:
                  print "############# Raw_Stats_Report ################\n \n "; 
                      print 'There are total ',scalar(@list)-2," files present in $dir\n \n";
                  	print "#############***********################\n \n";
                  	my @rawstat; 
                      foreach my $c (sort keys %count) {
                  		@rawstat = %count;
                          print "\n There are total $count{$c} $c files present in $dir\n \n ";
                      }
                  	print 'There are totally ',scalar(@rawstat)," raw stats files present in $dir\n \n";
                  but I get an incorrect count: it reports only 18 raw stat files even though we have 36.
                  Is this the correct way to do it:
                  Code:
                  @rawstat = %count; ## added inside the for loop
                  
                  
                  print 'There are totally ',scalar(@rawstat)," raw stats files present in $dir\n \n"; ## kept out side the for loop
                  The output looks like this:
                  Code:
                  C:\Performance_svap\misc>perl chkempty.pl
                  ############# Raw_Stats_Report ################
                  
                   There are total 46 files present in C:/Performance_svap/INPUT_FILES
                  
                  #############***********################
                  
                  
                   There are total 4 iostat files present in C:/Performance_svap/INPUT_FILES
                  
                  
                   There are total 4 netstat files present in C:/Performance_svap/INPUT_FILES
                  
                  
                   There are total 4 prstat-Ls files present in C:/Performance_svap/INPUT_FILES
                  
                  
                   There are total 4 prstat-Lvs files present in C:/Performance_svap/INPUT_FILES
                  
                  
                   There are total 4 sar-d files present in C:/Performance_svap/INPUT_FILES
                  
                  
                   There are total 4 sar-g files present in C:/Performance_svap/INPUT_FILES
                  
                  
                   There are total 4 sar-r files present in C:/Performance_svap/INPUT_FILES
                  
                  
                   There are total 4 sar-u files present in C:/Performance_svap/INPUT_FILES
                  
                  
                   There are total 4 vmstat files present in C:/Performance_svap/INPUT_FILES
                  
                   There are totally 18 raw stats files present in C:/Performance_svap/INPUT_FILES
                  
                  
                  ################################################
                  Regards,
                  Vijayarl

                  • nithinpes
                    Recognized Expert Contributor
                    • Dec 2007
                    • 410

                    #10
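                    In your script, the %count hash has the type of file (vmstat, ...) as keys and the counts as values. Assigning it to an array (@rawstat = %count) flattens it into a list of keys and values, so scalar(@rawstat) is twice the number of keys: 9 types give 18. Hence, to get the total count of files, you should sum the values of all the keys in the hash.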
                    Code:
                    my $raw_count=0;
                    foreach (keys %count)  {
                     $raw_count += $count{$_} ; ## sum up the values
                    }
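
As a sketch of why the array assignment gave 18 and the sum gives 36, the example below rebuilds a hash with the nine types and counts shown in the output of post #9. Using sum from the core List::Util module is a choice made for this example and is equivalent to the loop above.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Nine stat types with four files each, as in the output shown in post #9.
my %count = map { ($_ => 4) } qw(iostat netstat prstat-Ls prstat-Lvs
                                 sar-d sar-g sar-r sar-u vmstat);

# Assigning a hash to an array flattens it into key, value, key, value, ...
my @rawstat = %count;
print 'elements in flattened list: ', scalar(@rawstat), "\n";

# Number of distinct types (keys in scalar context).
my $types = keys %count;
print "number of types: $types\n";

# Total number of raw stats files: add up the values.
# The leading 0 keeps the result defined even for an empty hash.
my $raw_count = sum(0, values %count);
print "total raw stats files: $raw_count\n";
```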

                    • vijayarl
                      New Member
                      • Sep 2008
                      • 65

                      #11
                      Thanks nithinpes !!!!

                      It worked fine...thanks once again...

                      Working Code:
                      Code:
                      use strict; 
                      use warnings; 
                      my $dir = 'C:/Performance_svap/INPUT_FILES';#<-- windows supports forward slashes in directory paths 
                      my %count = (); #<-- hash to store counts 
                      raw_stats_report($dir);#<-- call the function with $dir as its argument 
                        
                      #######function to display the raw stats type report ################### 
                      sub raw_stats_report { 
                          my $dir = $_[0] or die "No start directory defined\n"; 
                          chdir($dir) or die "Can't chdir to $dir: $!\n"; 
                          opendir(D, '.') or die "Can't opendir $dir: $!\n"; 
                          my @list = readdir(D); 
                          closedir(D);
                          foreach my $f (@list){
                              next if ($f =~ /\..+$/i);
                              if ($f =~ /(sar-[dgur])/){
                                  $count{$1}++;
                              }
                              elsif ($f =~ /(vmstat|mpstat)/){
                                  $count{$1}++;
                              }
                              elsif ($f =~ /(prstat-(?:mLV|Ls|Lvs))/){
                                  $count{$1}++;
                              }
                              elsif ($f =~ /((?:net|io)stat)/){
                                  $count{$1}++;
                              }
                          }
                          print "############# Raw_Stats_Report ################\n \n ";
                          print 'There are total ',scalar(@list)-2," files present in $dir\n \n";
                          print "#############***********################\n \n";

                          foreach my $c (sort keys %count) {
                              print "\n There are total $count{$c} $c files present in $dir\n \n ";
                          }
                          my $raw_count=0;
                          foreach (keys %count){
                              $raw_count+= $count{$_} ; ## sum up the values
                          }
                          print 'There are totally ',scalar($raw_count)," raw stats files present in $dir\n \n";
                          print "################################################\n \n "; 
                          print "Do you want me to continue ? Y | N \n"; 
                          chomp(my $pick = <STDIN>); 
                          if ($pick =~ /y/i){ 
                              print "Process execution will continue !!! \n"; 
                          } 
                          else{ 
                              print "Process execution stopped !!! \n"; 
                              exit(0); # <-- use exit instead of 'die' to end a script early 
                          } 
                      }
                      Regards,
                      Vijayarl
