Here is my code. Please help: how do I proceed?
Here is the code that you suggested for creating dictionaries from a matrix file.
Now I need to do something like this:
Suppose the sequence entered is "ATATTA". A is in the first position, T in the second, and so on, so I want to compute A[01]*T[02]*A[03]*T[04]*T[05]*A[06] = ?
How do I do this? What changes should I make?
This is the file containing the matrices:
NA bap
PO A C G T
01 0.00 3.67 0.00 0.00
02 0.00 0.00 3.67 0.00
03 0.00 0.00 0.00 3.67
04 0.00 3.67 0.00 0.00
05 3.67 0.00 0.00 0.00
06 3.46 0.00 0.22 0.00
07 0.00 0.00 3.67 0.00
08 0.00 0.00 0.00 3.67
09 0.00 0.00 0.00 3.67
10 0.00 3.67 0.00 0.00
11 3.67 0.00 0.00 0.00
12 3.67 0.00 0.00 0.00
13 0.00 0.00 3.67 0.00
14 0.00 0.00 0.00 3.67
15 0.00 0.00 3.67 0.00
16 0.00 3.67 0.00 0.00
//
//
NA bcd
PO A C G T
01 42.55 8.75 145.86 8.14
02 0.14 0.53 204.64 0.00
03 126.83 78.02 0.11 0.34
04 0.21 0.17 0.00 204.92
05 0.00 12.38 0.43 192.50
06 174.48 0.95 1.32 28.56
07 79.53 4.70 100.44 20.64
//
//
NA bin
PO A C G T
01 0.45 8.27 0.00 11.39
02 0.00 0.00 10.02 10.09
03 5.80 1.39 0.00 12.93
04 12.33 5.18 2.60 0.00
05 12.43 0.00 0.00 7.68
06 18.55 0.00 1.57 0.00
07 0.05 0.58 0.00 19.48
08 20.11 0.00 0.00 0.00
09 20.06 0.05 0.00 0.00
10 20.11 0.00 0.00 0.00
11 0.00 15.33 0.00 4.78
12 20.06 0.05 0.00 0.00
13 14.99 0.35 4.78 0.00
14 13.64 2.42 3.37 0.68
15 5.03 0.00 15.08 0.00
16 7.23 0.45 10.94 1.49
//
//
Code:
f=open("weight_matrix.transfac.txt","r")
# skip ahead to the first 'PO' header line (only the first matrix is read)
line=f.next()
while not line.startswith('PO'):
    line=f.next()
headerlist=line.strip().split()[1:]
# collect the rows of the matrix until the '//' terminator
linelist=[]
line=f.next().strip()
while not line.startswith('/'):
    if line != '':
        linelist.append(line.strip().split())
    line=f.next().strip()
# map each position ('01', '02', ...) to its list of float values
keys=[i[0] for i in linelist]
values=[[float(s) for s in item] for item in [j[1:] for j in linelist]]
linedict=dict(zip(keys,values))
# regroup as datadict[base][position]
datadict={}
for i,item in enumerate(headerlist):
    datadict[item]={}
    for key in linedict:
        datadict[item][key]=linedict[key][i]
# add 1.0 to every value
for keymain in datadict:
    for keysub in datadict[keymain]:
        datadict[keymain][keysub]+=1.0
print datadict
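For reference, here is a minimal sketch of the calculation I am after, assuming the nested layout datadict[base][position] that the code above produces. The function name score_sequence is hypothetical, and the hard-coded datadict is only a stand-in: it holds the first six rows of the "bap" matrix with the same +1.0 already added to each value.

```python
# Stand-in for the datadict built by the code above (an assumption for
# illustration): first six rows of the "bap" matrix, each value + 1.0.
datadict = {
    'A': {'01': 1.00, '02': 1.00, '03': 1.00, '04': 1.00, '05': 4.67, '06': 4.46},
    'C': {'01': 4.67, '02': 1.00, '03': 1.00, '04': 4.67, '05': 1.00, '06': 1.00},
    'G': {'01': 1.00, '02': 4.67, '03': 1.00, '04': 1.00, '05': 1.00, '06': 1.22},
    'T': {'01': 1.00, '02': 1.00, '03': 4.67, '04': 1.00, '05': 1.00, '06': 1.00},
}


def score_sequence(seq, datadict):
    """Multiply datadict[base][position] over seq, with positions '01', '02', ..."""
    product = 1.0
    for index, base in enumerate(seq.upper()):
        # The matrix keys are zero-padded strings, so position 1 becomes '01'.
        position = '%02d' % (index + 1)
        product *= datadict[base][position]
    return product


# A[01]*T[02]*A[03]*T[04]*T[05]*A[06] for the sequence in the question
print(score_sequence("ATATTA", datadict))
```

With these values every factor except A[06] is 1.0, so "ATATTA" gives 4.46. A base other than A, C, G or T, or a sequence longer than the matrix, would raise a KeyError in this sketch.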