Parsing

  • chandniashar
    New Member
    • Sep 2006
    • 7

    Parsing

    How do I write a Perl program to extract the % sequence identity, overlap, e-value, etc. from FASTA results?
    The file looks something like this:

    Fasta Summary Table

    SUBMISSION PARAMETERS
    Title: Sequence                   Database: uniprot
    Sequence length: 417              Sequence type: p
    Program: fasta                    Version: 3.4t25 Sept 2, 2005
    Expectation upper value: 10.0     Matrix: BL50
    Sequence range: 1-                Number of scores: 50
    Number of alignments: 50          Word size: 2
    Open gap penalty: -10             Gap extension penalty: -2
    Histogram: false

    Alignment DB:ID Source Length Identity% Similar% Overlap E()
    1 UNIPROT:CBPA2_RAT Carboxypeptidase A2 precursor 417
    100.000 100.000 417 9.7e-176
    2 UNIPROT:Q504N0_MOUSE Carboxypeptidase A2, pancr 417
    93.765 98.082 417 2.1e-166
    3 UNIPROT:CBPA2_HUMAN Carboxypeptidase A2 precurs 417
    87.050 96.163 417 1.4e-154
    4 UNIPROT:Q53XS1_HUMAN Carboxypeptidase A2 (Pancr 417
    86.811 95.923 417 5.5e-154
  • deep022in
    New Member
    • Sep 2006
    • 23

    #2
    Could you please elaborate on exactly what you want to extract?

    • chandniashar
      New Member
      • Sep 2006
      • 7

      #3
      I want to extract this portion:
      Alignment DB:ID Source Length Identity% Similar% Overlap E()
      1 UNIPROT:CBPA2_RAT Carboxypeptidase A2 precursor 417
      100.000 100.000 417 9.7e-176
      2 UNIPROT:Q504N0_MOUSE Carboxypeptidase A2, pancr 417
      93.765 98.082 417 2.1e-166
      3 UNIPROT:CBPA2_HUMAN Carboxypeptidase A2 precurs 417
      87.050 96.163 417 1.4e-154
      4 UNIPROT:Q53XS1_HUMAN Carboxypeptidase A2 (Pancr 417
      86.811 95.923 417 5.5e-154
      ... etc.

      • deep022in
        New Member
        • Sep 2006
        • 23

        #4
        Hi,

        I have written the following code.

        Just change the name of the input file from demo.txt to your data file and run it again.
        Let me know the results.

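        #!/usr/bin/perl
        use strict;
        use warnings;

        # Change demo.txt to the name of your data file.
        my $file = 'demo.txt';
        open(my $fh, '<', $file) or die "Cannot open $file: $!";

        # @array will hold the extracted lines
        my @array;

        while (my $line = <$fh>) {
            # Header line of the alignment table
            if ($line =~ /^\s*Alignment\s*DB:/) {
                push @array, $line;
            }
            # First line of a hit: alignment number followed by the DB:ID
            elsif ($line =~ /^\s*\d+\s+[a-zA-Z]/) {
                push @array, $line;
            }
            # Second line of a hit: starts with the identity percentage
            elsif ($line =~ /^\s*\d+\.\d/) {
                push @array, $line;
            }
        }

        close($fh);

        # Finally, print the contents of the array
        print for @array;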
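
The script above keeps whole lines. If the individual values (identity, overlap, E-value) are needed, each hit can be split into fields. What follows is only a minimal sketch and was not part of the original reply. It assumes that every hit spans two lines as in the sample, and that the DB:ID contains a colon and no spaces. The name parse_hits and the hash keys are hypothetical, chosen for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sketch only: assumes each hit is spread over two lines, as in the sample.
# parse_hits() and the hash keys are illustrative names, not from the thread.
sub parse_hits {
    my (@lines) = @_;
    my (@hits, $current);
    for my $line (@lines) {
        # First line: alignment number, DB:ID, description, length
        if ($line =~ /^\s*(\d+)\s+(\S+:\S+)\s+(.*\S)\s+(\d+)\s*$/) {
            $current = { alignment => $1, id => $2, source => $3, length => $4 };
        }
        # Second line: identity%, similar%, overlap, E()
        elsif ($current
            && $line =~ /^\s*(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+)\s+(\S+)\s*$/) {
            @{$current}{qw(identity similar overlap evalue)} = ($1, $2, $3, $4);
            push @hits, $current;
            undef $current;
        }
    }
    return @hits;
}

# Two hits copied from the sample table
my @sample = (
    "1 UNIPROT:CBPA2_RAT Carboxypeptidase A2 precursor 417\n",
    "100.000 100.000 417 9.7e-176\n",
    "2 UNIPROT:Q504N0_MOUSE Carboxypeptidase A2, pancr 417\n",
    "93.765 98.082 417 2.1e-166\n",
);

# Print ID, identity, overlap and E-value as tab-separated columns
for my $hit (parse_hits(@sample)) {
    print join("\t", @{$hit}{qw(id identity overlap evalue)}), "\n";
}
```

To run this on a real file, read its lines into an array and pass them to parse_hits instead of @sample. The E-value is kept as a string so that it prints exactly as it appears in the file.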

        • chandniashar
          New Member
          • Sep 2006
          • 7

          #5
          kool thanks...

          How to write a calendar program displaying 3 months on each row?
