Looking for suggestions on improving numpy code

  • David Lees

    I am starting to use numpy and have written a hack for reading in a
    large data set that has 8 columns and millions of rows. I want to read
    and process a single column. I have written the very ugly hack below,
    but am sure there is a more efficient and pythonic way to do this. The
    file is too big to read by brute force and select a column, so it is
    read in chunks and the column selected. Things I don't like in the code:
    1. Performing a transpose on a large array
    2. Uncertainty about numpy append efficiency

    Is there a way to directly read every n'th element from the file into an
    array?

    david


    from numpy import *
    from scipy.io.numpyio import fread

    fd = open('testcase.bin', 'rb')
    datatype = 'h'
    byteswap = 0
    M = 1000000
    N = 8
    size = M*N
    shape = (M,N)
    colNum = 2
    sf = 1.645278e-04*10
    z = array([])
    for i in xrange(50):
        data = fread(fd, size, datatype, datatype, byteswap)
        data = data.reshape(shape)
        data = data.transpose()
        z = append(z, data[colNum]*sf)

    print z.mean()

    fd.close()
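
On the two concerns listed in the post: the transpose itself only produces a view, so it is cheap, but it can be skipped entirely by slicing the column with `data[:, colNum]`. Calling `append` in a loop is the more expensive part, since it copies the whole accumulated array on every pass. A minimal sketch of the same chunked read follows, assuming the file is a flat, row-major sequence of native-endian 16-bit integers (as the `'h'` type code suggests) and using `numpy.fromfile` in place of `fread`. The function name `read_column_chunked` and its parameters are hypothetical, chosen only for illustration.

```python
import numpy as np

def read_column_chunked(path, col, n_cols=8, rows_per_chunk=1_000_000,
                        dtype=np.int16, scale=1.0):
    """Read one column of a row-major binary table, chunk by chunk.

    Hypothetical helper: assumes a headerless file of `dtype` values
    laid out as rows of `n_cols` items each.
    """
    pieces = []
    with open(path, "rb") as fd:
        while True:
            # fromfile returns fewer items (possibly none) at end of file
            flat = np.fromfile(fd, dtype=dtype, count=rows_per_chunk * n_cols)
            n_rows = flat.size // n_cols
            if n_rows == 0:
                break
            block = flat[:n_rows * n_cols].reshape(n_rows, n_cols)
            # block[:, col] is a strided view; the multiply makes a compact copy
            pieces.append(block[:, col] * scale)
    if not pieces:
        return np.empty(0, dtype=np.float64)
    # One concatenation at the end instead of one copy per chunk
    return np.concatenate(pieces)
```

With the values from the post this would be called as `z = read_column_chunked('testcase.bin', col=2, scale=1.645278e-04*10)`, followed by `z.mean()`.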
  • 7stud

    #2
    Re: Looking for suggestions on improving numpy code

    On Feb 22, 11:37 pm, David Lees <debl2NoS...@verizon.net> wrote:
    > I want to read
    > and process a single column.
    Then why won't a list suffice?
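
On the list question: a plain list would hold each of the tens of millions of values as a separate Python object, which is costly at this size, whereas a numpy array stores them as packed 16-bit integers. As for reading every n'th element directly, one option is `numpy.memmap`, which maps the file into memory so that a strided slice touches only the pages it needs. The sketch below is illustrative only and assumes a headerless, row-major file of native-endian 16-bit integers with 8 values per row. The name `column_view` is hypothetical.

```python
import numpy as np

def column_view(path, col, n_cols=8, dtype=np.int16):
    """Return a read-only strided view of one column of a row-major file.

    Hypothetical helper: assumes the file holds only `dtype` values,
    arranged as rows of `n_cols` items, with no header.
    """
    flat = np.memmap(path, dtype=dtype, mode="r")
    n_rows = flat.size // n_cols
    # Drop any trailing partial row, then view the data as a 2-D table
    table = flat[:n_rows * n_cols].reshape(n_rows, n_cols)
    # Selecting a column is "every n_cols-th element, starting at col"
    return table[:, col]
```

Since the mean of a scaled column equals the scale times the mean, the factor from the original post can be applied once at the end: `column_view('testcase.bin', 2).mean() * (1.645278e-04*10)`.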
