I'm trying to read data from a txt file, but I don't know how to do it. The data is in FASTA format (a format used in molecular biology to store protein/DNA sequences), which is very simple; an example is shown under "Code:" below.
Each record starts with a ">" character, which marks the identifier (header) line, where we usually write the name/id of the sequence that follows. The line or lines after the header hold the sequence itself, and sequences can differ in length.
So, my question is: which instructions should I use to read the file and store each sequence in its own list?
Any ideas? Thanks in advance.
Code:
>Header1
Sequence1
>Header2
Sequence2
.
.
.
>HeaderN
SequenceN
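A minimal sketch of one way to do the reading, assuming Python (the post does not name a language). The function name `read_fasta` and the sample data are made up for illustration. Each sequence ends up in its own list of characters, paired with its header, and sequences that span several lines are joined together.

```python
import io


def read_fasta(handle):
    """Return a list of (header, residues) pairs; residues is a list of characters."""
    records = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # New record: keep the header text without the leading ">"
            records.append((line[1:], []))
        elif records:
            # Sequence line: append its characters to the current record
            records[-1][1].extend(line)
    return records


# Illustrative in-memory data standing in for a real file
sample = io.StringIO(">Header1\nACGT\nTTGA\n>Header2\nMKV\n")
for header, residues in read_fasta(sample):
    print(header, "".join(residues))
```

For a real file, the same function can be given an open file object instead, for example `with open("sequences.txt") as fh: records = read_fasta(fh)`, where `sequences.txt` is a placeholder name.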