How do I write a Perl program to extract the % sequence identity, overlap, E-value, etc. from FASTA results?
The file is somewhat like this:
FastaSummary Table
SUBMISSION PARAMETERS
Title: Sequence                 Database: uniprot
Sequence length: 417            Sequence type: p
Program: fasta                  Version: 3.4t25 Sept 2, 2005
Expectation upper value: 10.0   Matrix: BL50
Sequence range: 1-              Number of scores: 50
Number of alignments: 50        Word size: 2
Open gap penalty: -10           Gap extension penalty: -2
Histogram: false

Alignment  DB:ID                 Source                         Length  Identity%  Similar%  Overlap  E()
1          UNIPROT:CBPA2_RAT     Carboxypeptidase A2 precursor  417     100.000    100.000   417      9.7e-176
2          UNIPROT:Q504N0_MOUSE  Carboxypeptidase A2, pancr     417     93.765     98.082    417      2.1e-166
3          UNIPROT:CBPA2_HUMAN   Carboxypeptidase A2 precurs    417     87.050     96.163    417      1.4e-154
4          UNIPROT:Q53XS1_HUMAN  Carboxypeptidase A2 (Pancr     417     86.811     95.923    417      5.5e-154