Looking for suggestions on improving numpy code

David Lees
#1

Feb 23 '08, 06:45 AM

I am starting to use numpy and have written some code for reading in a
large data set that has 8 columns and millions of rows. I want to read
and process a single column. I have written the very ugly hack below,
but I am sure there is a more efficient and Pythonic way to do this. The
file is too big to read in one go and then select a column, so it is
read in chunks and the column is selected from each chunk. Things I
don't like in the code:
1. Performing a transpose on a large array
2. Uncertainty about numpy append efficiency

Is there a way to directly read every n'th element from the file into an
array?

david

from numpy import *
from scipy.io.numpyio import fread

fd = open('testcase.bin', 'rb')
datatype = 'h'
byteswap = 0
M = 1000000
N = 8
size = M*N
shape = (M,N)
colNum = 2
sf = 1.645278e-04*10
z = array([])
for i in xrange(50):
    data = fread(fd, size, datatype, datatype, byteswap)
    data = data.reshape(shape)
    data = data.transpose()
    z = append(z, data[colNum]*sf)

print z.mean()

fd.close()
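
Both concerns listed in the post (the transpose and the repeated append calls) can probably be avoided. The following is only a minimal sketch, assuming a modern NumPy on Python 3, with np.fromfile standing in for fread. The name read_column and its parameters are hypothetical, and the file is assumed to hold native-byte-order 16-bit integers stored row by row. Slicing with [:, col] picks the column without a transpose, and joining the chunks once at the end avoids the full copy that append makes on every call.

```python
import numpy as np

def read_column(path, col, ncols=8, chunk_rows=1_000_000,
                dtype=np.int16, scale=1.0):
    """Read one column of a row-major binary table, chunk by chunk.

    Hypothetical helper: the names and defaults are assumptions,
    chosen to mirror the values used in the post.
    """
    pieces = []
    with open(path, "rb") as fd:
        while True:
            # Read up to chunk_rows complete rows from the current position.
            data = np.fromfile(fd, dtype=dtype, count=chunk_rows * ncols)
            if data.size == 0:
                break  # end of file
            # Ignore a trailing partial row, if the file has one.
            nrows = data.size // ncols
            block = data[: nrows * ncols].reshape(nrows, ncols)
            # Column slice instead of transpose; the multiplication
            # produces a new float array, so the chunk can be freed.
            pieces.append(block[:, col] * scale)
    # One concatenation at the end instead of append() in the loop.
    return np.concatenate(pieces) if pieces else np.empty(0)
```

With the values from the post, this might be called as read_column('testcase.bin', col=2, scale=1.645278e-04 * 10).mean(). Unlike the original loop, which reads exactly 50 chunks, the sketch reads until the end of the file.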
7stud
#2

Feb 23 '08, 11:45 AM

Re: Looking for suggestions on improving numpy code

On Feb 22, 11:37 pm, David Lees <debl2NoS...@verizon.net> wrote:

> I want to read
> and process a single column.

Then why won't a list suffice?
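
On the original question about reading every n'th element directly: one possible approach is a memory-mapped, strided view of the file. This is a minimal sketch, assuming a modern NumPy with np.memmap and a file of native-byte-order 16-bit integers stored row by row with 8 columns (matching the 'h' datatype and N = 8 in the post). The name column_view is hypothetical.

```python
import numpy as np

def column_view(path, col, ncols=8, dtype=np.int16):
    """Return a strided, read-only view of one column of a binary table.

    Hypothetical helper: nothing is copied here; the operating system
    pages data in from disk as the elements are accessed.
    """
    flat = np.memmap(path, dtype=dtype, mode="r")  # whole file as 1-D
    nrows = flat.size // ncols                     # complete rows only
    table = flat[: nrows * ncols].reshape(nrows, ncols)
    return table[:, col]                           # every ncols-th element
```

Because the mean is linear, the scale factor from the post can be applied once afterwards, for example column_view('testcase.bin', 2).mean() * sf, rather than to every element.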