I'm programming in C on Linux and need to generate all possible combinations of a particular biological sequence. I don't know how to implement this, because the rule seems very complicated. I'm going to use those combinations to build different sequences for testing.
The sequence looks like the following:
info_table[0]: W S N Y N Y N N W S
info_table[1]: 2 2 4 2 4 2 4 4 2 2 : the number of replaceable letters
Each letter should be replaced with one of its corresponding letters.
For example,
W->A or T,
S->C or G,
Y->C or T,
N->A or C or G or T
The total number of combinations is up to 16,384 (= 2*2*4*2*4*2*4*4*2*2).
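The lookup and the product above can be sketched in C as follows. This is only a minimal sketch: the names `choices_for` and `count_combinations` are hypothetical, and only the four codes listed in this post (W, S, Y, N) are handled.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the letters a code may be replaced with,
   or NULL for any code other than the four used in this post. */
const char *choices_for(char code)
{
    switch (code) {
    case 'W': return "AT";
    case 'S': return "CG";
    case 'Y': return "CT";
    case 'N': return "ACGT";
    default:  return NULL;
    }
}

/* Multiplies the number of choices at every position.
   Returns 0 for an unknown code, or when the product does not fit
   in an unsigned long long. */
unsigned long long count_combinations(const char *pattern)
{
    unsigned long long total = 1;
    for (size_t i = 0; pattern[i] != '\0'; i++) {
        const char *alts = choices_for(pattern[i]);
        if (alts == NULL)
            return 0;
        size_t n = strlen(alts);
        if (total > ULLONG_MAX / n)
            return 0;               /* the product would overflow */
        total *= n;
    }
    return total;
}
```

For the sequence above, `count_combinations("WSNYNYNNWS")` gives 16384. For a sequence of about 100 letters the product is at least 2^100, so the function reports overflow by returning 0.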
So the final results look like this:
case 1: A C A C A C A A A C
case 2: T - - - - - - - - - (-: same as case 1)
case 3: - G - - - - - - - -
case 4: - - C - - - - - - -
case 5: - - G - - - - - - -
case 6: - - T - - - - - - -
case 7: - - - T - - - - - -
case 8: - - - - C - - - - -
case 9: - - - - G - - - - -
case10: - - - - T - - - - -
case11: - - - - - T - - - -
case12: - - - - - - C - - -
case13: - - - - - - G - - -
case14: - - - - - - T - - -
case15: - - - - - - - C - -
case16: - - - - - - - G - -
case17: - - - - - - - T - -
case18: - - - - - - - - T -
case19: - - - - - - - - - G
(replace two letters)
case20: T G - - - - - - - -
case21: T - C - - - - - - -
case22: T - G - - - - - - -
case23: T - T - - - - - - -
case24: T - - T - - - - - -
case25: T - - - C - - - - -
case26: T - - - G - - - - -
case27: T - - - T - - - - -
...
...
(replace three letters)
case n: T G C - - - - - - -
casen1: T G G - - - - - - -
casen2: T G T - - - - - - -
casen3: T G - T - - - - - -
casen4: T G - - C - - - - -
casen5: T G - - G - - - - -
casen6: T G - - T - - - - -
...
...
casem1: T - C T - - - - - -
casem2: T - C - C - - - - -
casem3: T - C - G - - - - -
casem4: T - C - T - - - - -
...
...
casel1: T - - T C - - - - -
casel2: T - - T G - - - - -
...
...
(replace four letters)
casek1: T G C T - - - - - -
...
What I'd like to know is how to generate all possible combinations from this sequence, so that each result differs from the original one. I think there are many patterns for picking letters from different positions. By the way, the sequence length is up to approximately 100.
For example, first I need to choose one letter to replace:
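One possible way to visit every combination without nested loops is an odometer-style counter: each position is a wheel that steps through its allowed letters and carries into the next position when it wraps. The sketch below is an assumption about how this could be done, not a definitive implementation. The names `choices_for`, `first_sequence` and `next_sequence` are hypothetical, and the leftmost position is made the fastest wheel so that the second result is `T - - - ...`, as in case 2.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: allowed letters for the four codes in this post. */
const char *choices_for(char code)
{
    switch (code) {
    case 'W': return "AT";
    case 'S': return "CG";
    case 'Y': return "CT";
    case 'N': return "ACGT";
    default:  return NULL;
    }
}

/* Writes the first choice of every position into out (case 1).
   out needs room for strlen(pattern) + 1 chars.
   Returns 0 if pattern contains an unknown code, 1 otherwise. */
int first_sequence(const char *pattern, char *out)
{
    size_t i;
    for (i = 0; pattern[i] != '\0'; i++) {
        const char *alts = choices_for(pattern[i]);
        if (alts == NULL)
            return 0;
        out[i] = alts[0];
    }
    out[i] = '\0';
    return 1;
}

/* Advances seq to the next combination; the leftmost wheel turns fastest.
   Returns 1 if seq now holds a new combination, 0 once every wheel has
   wrapped back to its first letter (or if seq does not match pattern). */
int next_sequence(const char *pattern, char *seq)
{
    for (size_t i = 0; pattern[i] != '\0'; i++) {
        const char *alts = choices_for(pattern[i]);
        const char *pos;
        if (alts == NULL || seq[i] == '\0')
            return 0;
        pos = strchr(alts, seq[i]);
        if (pos == NULL)
            return 0;
        if (pos[1] != '\0') {       /* this wheel can still advance */
            seq[i] = pos[1];
            return 1;
        }
        seq[i] = alts[0];           /* wrap and carry to the next position */
    }
    return 0;
}
```

A caller would fill a buffer with `first_sequence(pattern, seq)` and then repeat `next_sequence(pattern, seq)` until it returns 0, using `seq` after each step. This visits all 16,384 results for the 10-letter example. With a length near 100 there are at least 2^100 combinations, so listing all of them is not practical and the loop would need some limit.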
W
S
N
Y
N
W
S
Then I need to choose two letters:
WS WN WY WN WY WN WN WW WS: start with W at first position
SN SY SN SY SN SN SW SS: start with S at second position
NY NN NY NN NN NW NS: start with N at third position
....
....
WS: start with W at second position from last
Next, I need to choose three letters. There are lots of possible combinations:
WSN WSY WSN WSY WSN WSN WSW WSS: case n
WNY WNN WNY WNN WNN WNW WNS: case m
...
...
Next, four, five, six, and so on...
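The "choose one letter, then two, then three" grouping can be sketched with a short recursion: pick the next position to change, try each of its non-default letters, then recurse on the positions to its right. This is a sketch under assumptions, not a definitive implementation. All names (`choices_for`, `visit_fn`, `place`, `visit_k_replacements`, `count_visit`, `print_visit`) are hypothetical, and "replaced" here means "differs from case 1", i.e. from the first choice at each position.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef void (*visit_fn)(const char *seq, void *ctx);

/* Hypothetical helper: allowed letters for the four codes in this post. */
const char *choices_for(char code)
{
    switch (code) {
    case 'W': return "AT";
    case 'S': return "CG";
    case 'Y': return "CT";
    case 'N': return "ACGT";
    default:  return NULL;
    }
}

/* Places `remaining` replacements at positions start..len-1 of seq. */
void place(const char *pattern, char *seq, size_t start, size_t len,
           size_t remaining, visit_fn visit, void *ctx)
{
    if (remaining == 0) {
        visit(seq, ctx);
        return;
    }
    /* stop early when too few positions are left for the replacements */
    for (size_t i = start; i + remaining <= len; i++) {
        const char *alts = choices_for(pattern[i]);
        char saved = seq[i];
        for (size_t a = 1; alts[a] != '\0'; a++) {   /* skip the default letter */
            seq[i] = alts[a];
            place(pattern, seq, i + 1, len, remaining - 1, visit, ctx);
        }
        seq[i] = saved;                              /* restore the default */
    }
}

/* Calls visit() for every sequence that differs from case 1 at exactly
   k positions. seq needs room for strlen(pattern) + 1 chars.
   Returns 0 if pattern contains an unknown code, 1 otherwise. */
int visit_k_replacements(const char *pattern, size_t k, char *seq,
                         visit_fn visit, void *ctx)
{
    size_t len = strlen(pattern);
    for (size_t i = 0; i < len; i++) {
        const char *alts = choices_for(pattern[i]);
        if (alts == NULL)
            return 0;
        seq[i] = alts[0];
    }
    seq[len] = '\0';
    if (k <= len)
        place(pattern, seq, 0, len, k, visit, ctx);
    return 1;
}

/* Example callbacks: count the results, or print each one. */
void count_visit(const char *seq, void *ctx)
{
    (void)seq;
    ++*(unsigned long *)ctx;
}

void print_visit(const char *seq, void *ctx)
{
    (void)ctx;
    puts(seq);
}
```

Calling `visit_k_replacements("WSNYNYNNWS", 1, seq, print_visit, NULL)` would print the 18 one-letter variants in the same order as cases 2 to 19, and k = 2 starts with `T G - - ...` as in case 20. Running k from 0 up to the sequence length covers every combination exactly once. For sequences near 100 letters the counts grow very quickly with k, so in practice only small values of k are feasible.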
Thank you for your help in advance.