Hi all,
I've written the script below to calculate the per-position coverage of short strings. It works fine, but when I tried to speed things up for large files using Parallel::ForkManager, the script takes an awfully long time when I set 2 parallel forks. Setting it to 0 works fine. Does anybody have an idea what is going wrong?
Kind Regards
Jaap
Code:
#!/usr/local/bin/perl -w
use Parallel::ForkManager;
use strict;

# Expect exactly three arguments: input file, output file, number of processes.
if (scalar(@ARGV) != 3) {
    print "Please use the correct parameters\nUsage: bowtie2wiggle.pl -infile- -outfile- -processes-\n";
    exit();
}
my $pm = Parallel::ForkManager->new($ARGV[2]);
my %hash = ();
open(IN, '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";
while (<IN>) {
    $pm->start and next;
    $_ =~ s/[\r\n]+$//;    # strip trailing line endings (LF or CRLF)
    my @element = split(/\t/, $_);
    $element[2] =~ s/T//;
    # Count every position covered by this read.
    for (my $i = 1; $i <= length($element[4]); $i++) {
        $hash{$element[2]}{($element[3] + $i)}{coverage}++;
    }
    $pm->finish;
}
$pm->wait_all_children;
close IN;
open(OUT, '>', $ARGV[1]) or die "Cannot write $ARGV[1]: $!";
print OUT 'track type=wiggle_0 name="'.$ARGV[0].'" description="theoretical coverage from '.$ARGV[0].'" visibility=full autoScale=on color=50,150,255'."\n";
foreach my $chromosome (sort keys %hash) {
    print OUT "variableStep chrom=chr$chromosome span=1\n";
    my $old = "";    # reset per chromosome so a new chromosome does not get a duplicate header
    foreach my $position (sort {$a <=> $b} keys %{$hash{$chromosome}}) {
        # Start a new block whenever the positions are not contiguous.
        if ($old ne "" && $old != ($position - 1)) {
            print OUT "variableStep chrom=chr$chromosome span=1\n";
        }
        print OUT "$position $hash{$chromosome}{$position}{coverage}\n";
        $old = $position;
    }
}
close OUT;
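A possible explanation, offered as a sketch rather than a confirmed diagnosis: `$pm->start` forks a new child for every input line, and each child increments its own copy of `%hash`. The parent's hash therefore never receives the counts, and the cost of one fork per line dominates the run time. With the process limit set to 0, Parallel::ForkManager does not fork at all, so the loop runs serially in a single process and the hash is filled as expected. The example below illustrates the general idea of forking once per chunk of reads and sending the counts back to the parent. It uses plain `fork` and `pipe` from core Perl instead of Parallel::ForkManager, and the sample reads, `count_coverage`, and `$workers` are hypothetical names chosen for illustration. It also drops the `{coverage}` sub-key for brevity.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample reads: [chromosome, start position, sequence].
my @reads = (
    [ '1', 100, 'ACGT' ],
    [ '1', 102, 'ACG'  ],
    [ '2',  10, 'AC'   ],
);

# Count coverage for a list of reads, using the same offsets as the script above.
sub count_coverage {
    my (@chunk) = @_;
    my %cov;
    for my $read (@chunk) {
        my ($chrom, $start, $seq) = @$read;
        $cov{$chrom}{ $start + $_ }++ for 1 .. length $seq;
    }
    return \%cov;
}

my $workers = 2;    # assumed number of child processes
my %hash;
my @pipes;

for my $w (0 .. $workers - 1) {
    # Give each worker every $workers-th read: one fork per chunk, not per line.
    my @chunk = @reads[ grep { $_ % $workers == $w } 0 .. $#reads ];
    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: it works on its own copy of memory, so it must send results back.
        close $reader;
        my $cov = count_coverage(@chunk);
        for my $chrom (keys %$cov) {
            print {$writer} join("\t", $chrom, $_, $cov->{$chrom}{$_}), "\n"
                for keys %{ $cov->{$chrom} };
        }
        close $writer;
        exit 0;
    }
    # Parent: keep only the read end of this worker's pipe.
    close $writer;
    push @pipes, $reader;
}

# Parent: merge the counts reported by each child into one hash.
for my $reader (@pipes) {
    while (my $line = <$reader>) {
        chomp $line;
        my ($chrom, $pos, $count) = split /\t/, $line;
        $hash{$chrom}{$pos} += $count;
    }
    close $reader;
}
wait() for 1 .. $workers;

# Print the merged coverage, sorted by chromosome and position.
for my $chrom (sort keys %hash) {
    print "$chrom\t$_\t$hash{$chrom}{$_}\n"
        for sort { $a <=> $b } keys %{ $hash{$chrom} };
}
```

Parallel::ForkManager offers a comparable mechanism: a child can pass a data structure reference as the second argument to `finish`, and the parent receives it in a `run_on_finish` callback. Either way, forking once per chunk rather than once per line keeps the process overhead small.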