average by user defined cutoff

  • kumarboston
    New Member
    • Sep 2007
    • 55

    Hi all,
    I am trying to calculate average values from different parts of the same data file. For example, given the numbers 1 to 10, I want the average of the first 3 values, then the next 4 values, then the last 3 values, and finally the average of those three averages. I have written some code, but I guess it is not a very good way to calculate it, and sometimes I get garbage values as well.

    Here is a data file:
    [CODE=perl]
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    [/CODE]

    so when the user specifies the different parts as arguments, the average should be calculated.

    Here is my perl code:
    [CODE=perl]
    #!/usr/bin/perl

    use strict;
    use warnings;

    my $file = $ARGV[0];
    my $cut1 = $ARGV[1];
    my $cut2 = $ARGV[2];
    my $cut3 = $ARGV[3];
    my (@tt,$result1,$result2,$result3);

    open (A,$file);
    my ($count,$total) = 0;
    my ($val,$result);

    while (<A>)
    {
    my @temp = split (/\s+/,$_);
    $val = $temp[1];
    $total = $total + $val;
    $count++;

    if ($count == $cut1)
    {
    $result1 = sprintf("%.3f", $total/$count);
    push(@tt,$result1);
    my $s1= $total;
    my $r1 = $total/$count;
    print "$s1\t$count\t$r1\n";
    }

    if ($count == ($cut2+ $cut1))
    {

    $result2 = sprintf("%.3f", ($total-($result1*$cut1))/($count-$cut1));
    push(@tt,$result2);
    my $s2=($total -($result1*$cut1));
    my $c2 = ($count - $cut1);
    my $r2 = $s2/$c2;
    print "$s2\t$c2\t$r2\n";

    }
    if ($count == ($cut3+$cut2+$cut1))
    {
    $result3 = $total- (($result1*$cut1) + ($result2*$cut2)) / ($count- ($cut1+$cut2));
    push(@tt,$result3);
    my $s3 = ($total- (($result1*$cut1) + ($result2*$cut2)));
    my $c3 = $count - ($cut1+$cut2);
    my $r3 = $s3/$c3;
    print "$s3\t$c3\t$r3\n";
    }
    }
    my $add = 0;
    foreach my $r(@tt)
    {
    $add = $add +$r;
    }
    print "$add/scalar(@tt)\t";
    my $final = sprintf("%.3f", $add/scalar(@tt));
    print "$final\n";
    [/CODE]

    Right now I can take 3 user cutoffs, but I want to make this program take any number of cutoffs to calculate the averages.
    I guess there must be a better way to calculate the averages from different sections of the same data file.
    Here I have used just the numbers 1 to 10 as an example, but my actual data files have 16900 lines and I have to calculate the averages using different parts of the file.
    Any help in this regard is appreciated.
    Thanks
    Kumar
  • KevinADC
    Recognized Expert Specialist
    • Jan 2007
    • 4092

    #2
    If you have a file with just numbers on each line, why are you using split?

    Code:
     my @temp = split (/\s+/,$_);
    Anyway, something like this seems easier:

    Code:
    #!/usr/bin/perl

    use strict;
    use warnings;

    my $file = $ARGV[0];
    my $cut1 = $ARGV[1];
    my $cut2 = $ARGV[2];
    my $cut3 = $ARGV[3];
    
    my (@cut1, @cut2, @cut3);
    
    open (my $IN, $file) or die "$!";
    push @cut1, chomp <$IN> for (1..$cut1);
    push @cut2, chomp <$IN> for ($cut1+1..$cut2);
    push @cut3, chomp <$IN> for ($cut2+1..$cut3);
    close $IN;
    
    average(\@cut1,\@cut2,\@cut3);
    
    sub average {
       my @arrays = @_;
       my $sum;
       foreach my $list (@arrays) {
          $sum += $_ for @{$list};
          my $avg = $sum / @{$list};
          print "Average = $avg\n";
          $sum = 0;
       }
    }

    • kumarboston
      New Member
      • Sep 2007
      • 55

      #3
      Thanks KevinADC for the reply,
      I ran your code on a file with the numbers 1 to 10, but it throws an error: "Can't modify <HANDLE> in chomp at new.pl line 14, near "<$IN> for "
      Execution of new.pl aborted due to compilation errors."
      I looked into the error, and when I removed the chomp that error went away, but now I get "Illegal division by zero at new.pl line 27." and I was not able to get rid of it.

      Thanks
      Kumar


      • KevinADC
        Recognized Expert Specialist
        • Jan 2007
        • 4092

        #4
        Oops, my bad. This new version of the code assumes (for a 10-line file) that cut1, cut2 and cut3 are equal to 3, 4 and 3 respectively; if not, the for() loop conditions need to be adjusted.

        Code:
        use strict;
        use warnings;
        
        my $file = $ARGV[0];
        my $cut1 = $ARGV[1];
        my $cut2 = $ARGV[2];
        my $cut3 = $ARGV[3];
         
        my (@cut1, @cut2, @cut3);
         
        open (my $IN, $file) or die "$!";
        for (1 .. $cut1) {
           chomp ($_ = <$IN>);
           push @cut1,$_;
        }
        for ($cut1+1 .. $cut1+$cut2){
           chomp ($_ = <$IN>);
           push @cut2, $_;
        }
        for ($cut1+$cut2+1 .. $cut1+$cut2+$cut3){
           chomp ($_ = <$IN>);
           push @cut3, $_;
        }	
        close $IN;
         
        average(\@cut1,\@cut2,\@cut3);
         
        sub average {
           my @arrays = @_;
           my $sum;
           foreach my $list (@arrays) {
              $sum += $_ for @{$list};
              my $avg = $sum / @{$list};
              print "Average = $avg\n";
              $sum = 0;
           }
        }
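
A small sketch of why the first version could hit "Illegal division by zero" once the chomp was removed, assuming the arguments were 3, 4 and 3. The in-memory data and the variable names here are illustrative only, not part of the original script. Two things appear to go wrong: the old range bounds treat the cutoffs as end positions rather than counts, and `push @array, <$IN>` evaluates the read in list context, so the first push takes every line in the file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my ($cut1, $cut2, $cut3) = (3, 4, 3);   # assumed command-line arguments

# Number of iterations each for() loop performs in the two versions.
my @old = map { scalar @$_ }
    [1 .. $cut1], [$cut1 + 1 .. $cut2], [$cut2 + 1 .. $cut3];
my @new = map { scalar @$_ }
    [1 .. $cut1], [$cut1 + 1 .. $cut1 + $cut2],
    [$cut1 + $cut2 + 1 .. $cut1 + $cut2 + $cut3];

print "first version:  @old\n";    # 3 1 0  (5 .. 3 is an empty range)
print "second version: @new\n";    # 3 4 3

# push supplies list context, so <$in> returns all remaining lines at once.
my $data = join '', map { "$_\n" } 1 .. 10;   # stand-in for the data file
open(my $in, '<', \$data) or die "$!";
my @first;
push @first, <$in>;
my $left = <$in>;                  # undef: nothing is left to read
close $in;

print scalar(@first), " lines read by a single push\n";   # 10
```

Either effect leaves at least one array empty, and dividing by the size of an empty array is what triggers the error. Reading with `$_ = <$IN>` inside the loop, as in the version above, forces scalar context and reads one line per iteration.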


        • kumarboston
          New Member
          • Sep 2007
          • 55

          #5
          Thanks so much Kevin,
          The output for the numbers 1 to 10, "Average = 2; Average = 5.5; Average = 9", is correct. One last small request: if I want to take the average of these values at the end, how do I modify the subroutine?

          Thanks
          Kumar


          • KevinADC
            Recognized Expert Specialist
            • Jan 2007
            • 4092

            #6
            change the subroutine to something like this:

            Code:
            sub average {
               my @arrays = @_;
               my $sum;
               my @averages;
               my $total;
               foreach my $list (@arrays) {
                  $sum += $_ for @{$list};
                  my $avg = $sum / @{$list};
                  push @averages, $avg;   # store the group average, not the raw sum
                  print "Average = $avg\n";
                  $sum = 0;
               }
               ($total += $_) for @averages;
               printf "Total Average = %.3f\n", $total / @averages;
            }
            Next time, please show some effort on your part first to solve the problem.
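
The original question also asked for any number of cutoffs. A minimal sketch of one way to do that, assuming one number per line and that each argument after the file name is a group size (a count of lines, as in the version above). The helper names `chunk_averages` and `mean` and the script name in the usage comment are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split @$values into consecutive chunks whose
# lengths are listed in @$sizes, and return the average of each chunk.
sub chunk_averages {
    my ($values, $sizes) = @_;
    my @averages;
    my $start = 0;
    for my $size (@$sizes) {
        die "Chunk size '$size' is not a positive integer\n"
            unless $size =~ /^[1-9][0-9]*$/;
        my $end = $start + $size - 1;
        die "Not enough values for a chunk of size $size\n"
            if $end > $#{$values};
        my $sum = 0;
        $sum += $_ for @{$values}[$start .. $end];
        push @averages, $sum / $size;
        $start = $end + 1;
    }
    return @averages;
}

# Hypothetical helper: plain arithmetic mean of a list.
sub mean {
    my @nums = @_;
    die "Cannot average an empty list\n" unless @nums;
    my $sum = 0;
    $sum += $_ for @nums;
    return $sum / @nums;
}

# Usage: perl chunk_avg.pl datafile size1 size2 ...
if (@ARGV >= 2) {
    my ($file, @sizes) = @ARGV;
    open(my $in, '<', $file) or die "Cannot open $file: $!";
    chomp(my @values = <$in>);
    close $in;

    my @averages = chunk_averages(\@values, \@sizes);
    print "Average = $_\n" for @averages;
    printf "Total Average = %.3f\n", mean(@averages);
}
```

With the 1 to 10 file and sizes 3 4 3, this prints averages of 2, 5.5 and 9, followed by "Total Average = 5.500". Checking the sizes up front means a short file or a zero-length group produces a clear message instead of a division by zero.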
