General Numerical Python question

  • 2mc

    General Numerical Python question

    Generally speaking, if one had a list (from regular Python) and an
    array (from Numerical Python) that contained the same number of
    elements, would a While loop or a For loop process them at the same
    speed? Or, would the array process faster?

    I'm new to Python, so my question may expose my ignorance. I
    appreciate anyone's effort to help me understand.

    Thanks. It is much appreciated.

    Matt
  • Alex Martelli

    #2
    Re: General Numerical Python question

    2mc wrote:
    > Generally speaking, if one had a list (from regular Python) and an
    > array (from Numerical Python) that contained the same number of
    > elements, would a While loop or a For loop process them at the same
    > speed? Or, would the array process faster?
    >
    > I'm new to Python, so my question may expose my ignorance. I
    > appreciate anyone's effort to help me understand.

    I don't know, I've never measured. Let's find out together.

    The best way to answer these performance questions, which may
    easily vary a little depending on your platform and exact versions
    involved, is to _measure_. Python 2.3's standard library comes with
    timeit.py, a little script that's made just for that. I've copied it to my
    ~/bin/ directory and done a chmod +x (it starts with a shebang line
    so that's sufficient), or in Windows you might set up a .bat or .cmd
    file to call Python on it. Anyway, it's easy to use: you specify zero
    or more -s 'blahblah' arguments to set things up, then the specific
    statement you want to time. Watch...:

    [alex@lancelot pop]$ timeit.py -s'import Numeric' -s'x=Numeric.arange(555)' 'for i in x: id(i)'
    1000 loops, best of 3: 296 usec per loop
    [alex@lancelot pop]$ timeit.py -s'import Numeric' -s'x=range(555)' 'for i in x: id(i)'
    1000 loops, best of 3: 212 usec per loop
    [alex@lancelot pop]$ timeit.py -s'import Numeric' -s'x=Numeric.arange(555)' 'for i in x: id(i)'
    1000 loops, best of 3: 296 usec per loop
    [alex@lancelot pop]$ timeit.py -s'import Numeric' -s'x=range(555)' 'for i in x: id(i)'
    1000 loops, best of 3: 207 usec per loop
    [alex@lancelot pop]$

    So, in this specific case, looping over a list of ints is a bit faster than
    looping over an otherwise equivalent Numeric.array -- about 210
    microseconds versus about 300.

    Similarly:

    [alex@lancelot pop]$ timeit.py -s'import Numeric' -s'x=range(555)' 'for i in range(len(x)): x[i]=id(x[i])'
    1000 loops, best of 3: 353 usec per loop
    [alex@lancelot pop]$ timeit.py -s'import Numeric' -s'x=range(555)' 'for i in range(len(x)): x[i]=id(x[i])'
    1000 loops, best of 3: 356 usec per loop
    [alex@lancelot pop]$ timeit.py -s'import Numeric' -s'x=Numeric.arange(555)' 'for i in range(len(x)): x[i]=id(x[i])'
    1000 loops, best of 3: 581 usec per loop
    [alex@lancelot pop]$ timeit.py -s'import Numeric' -s'x=Numeric.arange(555)' 'for i in range(len(x)): x[i]=id(x[i])'
    1000 loops, best of 3: 585 usec per loop

    Here we're accessing AND also modifying each element by index, and the
    list outperforms the array about 350 microseconds to 580.

    So, measure operations of your interest, on platforms of your interest,
    for roughly the kinds of list/array sizes you'll be using, and you'll KNOW
    what performance issues you may be facing, rather than guessing.
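
The same kind of measurement can also be scripted from inside Python with the standard library's timeit module. What follows is only a minimal sketch, assuming a modern Python 3 and plain lists; the helper name best_usec_per_loop is made up for illustration, and the numbers it prints will not match the ones quoted above.

```python
import timeit


def best_usec_per_loop(stmt, setup, number=1000, repeat=3):
    """Return the best-of-`repeat` time for one run of `stmt`, in microseconds.

    Hypothetical helper: it mirrors the "1000 loops, best of 3" report
    style of the timeit command line, nothing more.
    """
    totals = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(totals) / number * 1e6


# Plain iteration over a 555-element list.
iterate_time = best_usec_per_loop("for i in x: id(i)", "x = list(range(555))")

# Access and modify each element by index.
index_time = best_usec_per_loop(
    "for i in range(len(x)): x[i] = id(x[i])", "x = list(range(555))"
)

print(f"iterate:          {iterate_time:.1f} usec per loop")
print(f"index and assign: {index_time:.1f} usec per loop")
```

To time an array type instead, only the setup string needs to change, which keeps the two measurements directly comparable.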

    In most cases you'll conclude that the difference is not important enough --
    a factor of 1.5 or more may seem large, but here we're just doing a trivial
    operation on each item -- if we were doing more the looping overhead
    would matter less. AND some operations are available as ufuncs in
    Numeric, cutting down loop overhead dramatically. And in the end a
    100 to 200 microseconds' difference may just not matter much, depending
    on your application. But anyway, you do get the ability to measure just
    what you need to.


    Alex


    • Michael Ressler

      #3
      Re: General Numerical Python question

      In article <IGeib.271047$R32.8825532@news2.tin.it>, Alex Martelli wrote:
      > 2mc wrote:
      >
      >> Generally speaking, if one had a list (from regular Python) and an
      >> array (from Numerical Python) that contained the same number of
      >> elements, would a While loop or a For loop process them at the same
      >> speed? Or, would the array process faster?
      >>
      >> I'm new to Python, so my question may expose my ignorance. I
      >> appreciate anyone's effort to help me understand.
      >
      > I don't know, I've never measured. Let's find out together.

      The real question is - why do you want to run a loop over an array?
      The whole point of Numeric is that you want to eliminate loops
      entirely. Keeping things in the array domain is infinitely faster than
      running explicit loops. You may need to come up with some clever
      expressions to do it, but most loops can be gotten rid of with clever
      uses of put(), take(), and the like.
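
To make "keeping things in the array domain" concrete, here is a small sketch. It assumes NumPy as a modern stand-in for Numeric (the thread itself predates NumPy), and the arithmetic is made up purely for illustration.

```python
import numpy as np  # assumption: NumPy stands in for Numeric here

a = np.arange(10, dtype=float)

# Explicit loop: one Python-level iteration per element.
looped = np.empty_like(a)
for i in range(len(a)):
    looped[i] = a[i] * a[i] + 1.0

# Array domain: the same arithmetic written as one whole-array expression,
# so the per-element work happens inside the library instead of in Python.
vectorized = a * a + 1.0

print(np.allclose(looped, vectorized))
```

Both forms compute the same values; the second simply has no Python-level loop to pay for.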

      Loops are evil.

      Mike

      --
      Dr. Michael Ressler
      Research Scientist, Astrophysics Research Element, Jet Propulsion Laboratory
      Email: ressler@cheetah.jpl.nasa.gov Phone: (818)354-5576
      "A bad night at the telescope is still better than the best day at the office."



        • 2mc

          #5
          Re: General Numerical Python question

          Michael Ressler <ressler@cheetah.jpl.nasa.gov> wrote in message news:<slrnbolkk2.89o.ressler@cheetah.jpl.nasa.gov>...
          > The real question is - why do you want to run a loop over an array?
          > The whole point of Numeric is that you want to eliminate loops
          > entirely. Keeping things in the array domain is infinitely faster than
          > running explicit loops. You may need to come up with some clever
          > expressions to do it, but most loops can be gotten rid of with clever
          > uses of put(), take(), and the like.
          >
          > Loops are evil.
          >
          > Mike

          For me, the key thought in your post is "you may need to come up with
          some clever expressions to do it, but most loops can be gotten rid of
          with clever uses of put(), take(), and the like."

          This is what I'm looking for. I'm so used to explicitly declaring
          loops that it is hard for me to "speak" in Numerical Python.

          Suppose I have 2 very large arrays of serial data. To make it simple,
          let's assume each has 10s of thousands of rows with one column/field
          of data. Further assume at some point later in the program I am going
          to compare the data in the two arrays - the comparison being on chunks
          of 25 rows throughout the array. But, before I do that, I have to
          "normalize" the data in both arrays in order to make the comparisons
          valid.

          Assume the way I make the comparisons is to find the size of the range
          between the highest and the lowest value in each 25 row 'chunk' and
          normalize each data point as: (datapoint - lowestvalue) /
          (highestvalue - lowestvalue) * 100.

          Then assume I want to find the slope of the linear regression through
          each 25 row 'chunk.' It is this slope that I will ultimately be
          comparing later in the program.

          This is the kind of programming I was hoping I could use Numerical
          Python for. It is the syntax of such a program that I'm grappling
          with. If someone could help me with the above scenario, then I could
          write the program using the real comparisons I want (which are
          considerably more complicated than above).

          Thank you for your kind response. If you have any comments on the
          above I would appreciate hearing them.

          Matt


          • Tim Hochberg

            #6
            Re: General Numerical Python question

            2mc wrote:
            > Michael Ressler <ressler@cheetah.jpl.nasa.gov> wrote in message news:<slrnbolkk2.89o.ressler@cheetah.jpl.nasa.gov>...
            >
            >>The real question is - why do you want to run a loop over an array?
            >>The whole point of Numeric is that you want to eliminate loops
            >>entirely. Keeping things in the array domain is infinitely faster than
            >>running explicit loops. You may need to come up with some clever
            >>expressions to do it, but most loops can be gotten rid of with clever
            >>uses of put(), take(), and the like.
            >>
            >>Loops are evil.
            >>
            >>Mike
            >
            >
            > For me, the key thought in your post is "you may need to come up with
            > some clever expressions to do it, but most loops can be gotten rid of
            > with clever uses of put(), take(), and the like."
            >
            > This is what I'm looking for. I'm so used to explicitly declaring
            > loops that it is hard for me to "speak" in Numerical Python.
            >
            > Suppose I have 2 very large arrays of serial data. To make it simple,
            > let's assume each has 10s of thousands of rows with one column/field
            > of data. Further assume at some point later in the program I am going
            > to compare the data in the two arrays - the comparison being on chunks
            > of 25 rows throughout the array. But, before I do that, I have to
            > "normalize" the data in both arrays in order to make the comparisons
            > valid.
            >
            > Assume the way I make the comparisons is to find the size of the range
            > between the highest and the lowest value in each 25 row 'chunk' and
            > normalize each data point as: (datapoint - lowestvalue) /
            > (highestvalue - lowestvalue) * 100.

            Something like:

            import Numeric as np # Personal preference
            chunked = np.reshape(data, (-1, 25)) # Chunked is an n/25 x 25 array
            chunk_max = np.maximum.reduce(chunked, 1) # Reduce along axis 1
            chunk_min = np.minimum.reduce(chunked, 1)
            # NewAxis changes the relevant arrays from shape [n/25] to shape
            # [n/25,1]. 1 will get broadcast.
            normalized = ((chunked - chunk_min[:,np.NewAxis]) /
                          (chunk_max - chunk_min)[:,np.NewAxis] * 100)
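
For readers without Numeric installed, the same chunked normalization can be sketched in NumPy. This is an assumption-laden translation rather than the code from the post: NumPy replaces Numeric, keepdims=True plays the role of NewAxis, and the sample data is made up.

```python
import numpy as np  # assumption: NumPy stands in for Numeric here

# Made-up sample data: 100 strictly increasing values, so no chunk is constant.
data = np.arange(100, dtype=float) ** 2

# Split into non-overlapping chunks of 25 rows each: shape (4, 25).
chunked = data.reshape(-1, 25)

# Per-chunk extremes; keepdims=True keeps shape (4, 1) so they broadcast
# across the 25 values of each chunk.
chunk_max = chunked.max(axis=1, keepdims=True)
chunk_min = chunked.min(axis=1, keepdims=True)

# (datapoint - lowestvalue) / (highestvalue - lowestvalue) * 100
normalized = (chunked - chunk_min) / (chunk_max - chunk_min) * 100

print(normalized.shape)
```

A chunk whose values are all equal would divide by zero here, so real data may need a guard for that case.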

            > Then assume I want to find the slope of the linear regression through
            > each 25 row 'chunk.' It is this slope that I will ultimately be
            > comparing later in the program.

            For this you might want to use the LinearAlgebra module (it comes with
            Numeric). I'm not as familiar with the interface for this though, so
            you'll have to check the docs or hope someone else can help you.
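
One way to get the per-chunk slopes without a linear algebra module is the closed-form least-squares formula, slope = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2). The sketch below is not the LinearAlgebra approach mentioned above; it assumes NumPy in place of Numeric, assumes the x values are simply the row positions 0, 1, 2, ... within each chunk, and uses a hypothetical helper name chunk_slopes.

```python
import numpy as np  # assumption: NumPy stands in for Numeric here


def chunk_slopes(chunked):
    """Least-squares slope of each row of `chunked` against x = 0, 1, ..., width-1.

    Hypothetical helper for illustration; `chunked` is a 2-D array with
    one chunk per row.
    """
    width = chunked.shape[1]
    x = np.arange(width, dtype=float)
    x_centered = x - x.mean()
    y_centered = chunked - chunked.mean(axis=1, keepdims=True)
    # One dot product per row, computed for all rows at once.
    return (y_centered @ x_centered) / (x_centered @ x_centered)


# Two tiny made-up chunks: y = 1 + 2x and y = 10 - x.
chunked = np.array([[1.0, 3.0, 5.0, 7.0],
                    [10.0, 9.0, 8.0, 7.0]])
slopes = chunk_slopes(chunked)
print(slopes)
```

For the two exact lines in the sample, the slopes come out as 2 and -1 up to floating-point rounding.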


            Hope that gets you started.

            -tim


            > This is the kind of programming I was hoping I could use Numerical
            > Python for. It is the syntax of such a program that I'm grappling
            > with. If someone could help me with the above scenario, then I could
            > write the program using the real comparisons I want (which are
            > considerably more complicated than above).
            >
            > Thank you for your kind response. If you have any comments on the
            > above I would appreciate hearing them.
            >
            > Matt


            • 2mc

              #7
              Re: General Numerical Python question

              Tim Hochberg <tim.hochberg@ieee.org> wrote in message news:<dgMib.24925$gi2.17821@fed1read01>...
              > > Suppose I have 2 very large arrays of serial data. To make it simple,
              > > let's assume each has 10s of thousands of rows with one column/field
              > > of data. Further assume at some point later in the program I am going
              > > to compare the data in the two arrays - the comparison being on chunks
              > > of 25 rows throughout the array. But, before I do that, I have to
              > > "normalize" the data in both arrays in order to make the comparisons
              > > valid.
              > >
              > > Assume the way I make the comparisons is to find the size of the range
              > > between the highest and the lowest value in each 25 row 'chunk' and
              > > normalize each data point as: (datapoint - lowestvalue) /
              > > (highestvalue - lowestvalue) * 100.
              >
              > Something like:
              >
              > import Numeric as np # Personal preference
              > chunked = np.reshape(data, (-1, 25)) # Chunked is an n/25 x 25 array
              > chunk_max = np.maximum.reduce(chunked, 1) # Reduce along axis 1
              > chunk_min = np.minimum.reduce(chunked, 1)
              > # NewAxis changes the relevant arrays from shape [n/25] to shape
              > # [n/25,1]. 1 will get broadcast.
              > normalized = ((chunked - chunk_min[:,np.NewAxis]) /
              >               (chunk_max - chunk_min)[:,np.NewAxis] * 100)
              >
              > > Then assume I want to find the slope of the linear regression through
              > > each 25 row 'chunk.' It is this slope that I will ultimately be
              > > comparing later in the program.
              >
              > For this you might want to use the LinearAlgebra module (it comes with
              > Numeric). I'm not as familiar with the interface for this though, so
              > you'll have to check the docs or hope someone else can help you.
              >
              >
              > Hope that gets you started.
              >
              > -tim

              Thanks a million. I appreciate your kind and thoughtful response. I
              have found members of this board to be very prompt with help and very
              courteous. I hope that when I'm a little more savvy with the language
              that I may return the favor by posting help for someone else.

              May I ask you for a little more help? The example you gave was very
              good and it was something I hadn't thought of. However, I need the 25
              row "window" to move through the entire array one row at a time. In
              other words each 25 row 'chunk' of data will contain 24 rows of the
              previous 'chunk'. Unless I misunderstood your code, each 'chunk' has
              a unique set of rows - there is no overlapping.

              Do you have any ideas how I could do this without loops?

              Matt


              • Michael Ressler

                #8
                Re: General Numerical Python question

                In article <500a4565.0310140541.4f4fd35f@posting.google.com>, 2mc wrote:
                > May I ask you for a little more help? The example you gave was very
                > good and it was something I hadn't thought of. However, I need the 25
                > row "window" to move through the entire array one row at a time. In
                > other words each 25 row 'chunk' of data will contain 24 rows of the
                > previous 'chunk'. Unless I misunderstood your code, each 'chunk' has
                > a unique set of rows - there is no overlapping.
                >
                > Do you have any ideas how I could do this without loops?

                Okay, maybe you can't get rid of all loops as I implied in my previous
                "loops are evil" post, but the trick is to minimize them.

                This is "pseudo-code", so don't try to run it, but see if the ideas
                are useful. One way to approach "running" things, is extensive use of
                subarrays. Suppose you want to do a running average on an n-element
                run of a large array a with m elements (let's ignore the endpoints for
                now). This isn't the best way to do a running average, but it might
                help with things more complex than an average.

                m=len(a)
                avg=zeros(len(a))
                for i in range(n): # window size to smooth over
                    avg=avg+a[i:m-n+i]
                avg=avg/n

                I think I might have screwed up the syntax of the subarray statement
                (I'm still much better with the commercial language IDL than I am with
                Numeric, and I get them confused, but the thought process is the
                same). The idea is to pile up subarrays which have been shifted by
                one. Instead of a zillion loops through the array, you only have to
                deal with n (in your case 25) cycles.
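
For the overlapping 25-row windows asked about, one alternative to piling up shifted subarrays is to build every window at once with an index array. The sketch below is a different technique from the loop above, offered only as an illustration: it assumes NumPy in place of Numeric, and it uses a tiny made-up array with a window of 3 instead of 25.

```python
import numpy as np  # assumption: NumPy stands in for Numeric here

a = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])  # made-up data
n = 3  # window size (25 in the question)

# Row j of the index array is [j, j+1, ..., j+n-1], so indexing `a` with it
# yields one overlapping window per row: shape (len(a) - n + 1, n).
starts = np.arange(len(a) - n + 1)
windows = a[starts[:, None] + np.arange(n)]

# Normalize every window against its own minimum and maximum.
lo = windows.min(axis=1, keepdims=True)
hi = windows.max(axis=1, keepdims=True)
normalized = (windows - lo) / (hi - lo) * 100

print(windows.shape)
```

This trades memory for the absence of loops, since each element is copied into up to n windows, and a window with all-equal values would again divide by zero.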

                This is what I meant by "clever expressions" in my first response.
                Hope this stimulates more ideas.

                Mike

                --
                Dr. Michael Ressler
                Research Scientist, Astrophysics Research Element, Jet Propulsion Laboratory
                Email: ressler@cheetah.jpl.nasa.gov Phone: (818)354-5576
                "A bad night at the telescope is still better than the best day at the office."


                • Tim Hochberg

                  #9
                  Re: General Numerical Python question

                  Michael Ressler wrote:
                  > In article <500a4565.0310140541.4f4fd35f@posting.google.com>, 2mc wrote:
                  >
                  >>May I ask you for a little more help? The example you gave was very
                  >>good and it was something I hadn't thought of. However, I need the 25
                  >>row "window" to move through the entire array one row at a time. In
                  >>other words each 25 row 'chunk' of data will contain 24 rows of the
                  >>previous 'chunk'. Unless I misunderstood your code, each 'chunk' has
                  >>a unique set of rows - there is no overlapping.
                  >>
                  >>Do you have any ideas how I could do this without loops?
                  >
                  >
                  > Okay, maybe you can't get rid of all loops as I implied in my previous
                  > "loops are evil" post, but the trick is to minimize them.
                  >
                  > This is "pseudo-code", so don't try to run it, but see if the ideas
                  > are useful. One way to approach "running" things, is extensive use of
                  > subarrays. Suppose you want to do a running average on an n-element
                  > run of a large array a with m elements (let's ignore the endpoints for
                  > now). This isn't the best way to do a running average, but it might
                  > help with things more complex than an average.
                  >
                  > m=len(a)
                  > avg=zeros(len(a))
                  > for i in range(n): # window size to smooth over
                  >     avg=avg+a[i:m-n+i]
                  > avg=avg/n

                  I agree, you probably can't get rid of all the loops in this case, and
                  if you could, the resulting code would probably be horrible. I have a
                  couple of minor quibbles with the above code though. I think I'd write it as:

                  lenavg = len(a) - n + 1
                  avg = np.zeros(lenavg, np.Float)
                  for i in range(n): # window size to smooth over
                      avg += a[i:lenavg+i] # Using += reuses the same array every time
                                           # instead of creating a new one each time
                                           # through the loop.
                  avg /= n # Same here.

                  The important point being the use of += and /=. And, in order to make
                  that work, you need to set the type of avg appropriately, not let it
                  default to int.
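
The same in-place running average can be tried out today with NumPy. This is a sketch under the assumption that NumPy replaces Numeric (so dtype=float takes the place of np.Float); the input array and window size are made up.

```python
import numpy as np  # assumption: NumPy stands in for Numeric here

a = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # made-up data
n = 3  # window size to smooth over

lenavg = len(a) - n + 1
avg = np.zeros(lenavg, dtype=float)  # float accumulator, not int

for i in range(n):
    # Add the slice shifted by i; += updates avg in place rather than
    # allocating a new array on every pass.
    avg += a[i:lenavg + i]
avg /= n  # in-place division, which needs the float dtype chosen above

print(avg)
```

Only n passes are made over the data regardless of how long the array is, which is the point of the shifted-slice trick.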

                  -tim


                  >
                  > I think I might have screwed up the syntax of the subarray statement
                  > (I'm still much better with the commercial language IDL than I am with
                  > Numeric, and I get them confused, but the thought process is the
                  > same). The idea is to pile up subarrays which have been shifted by
                  > one. Instead of a zillion loops through the array, you only have to
                  > deal with n (in your case 25) cycles.
                  >
                  > This is what I meant by "clever expressions" in my first response.
                  > Hope this stimulates more ideas.
                  >
                  > Mike
                  >


                  • Fernando Perez

                    #10
                    Re: General Numerical Python question

                    John J. Lee wrote:
                    > Perhaps part of the trick is to know when to leave Numeric behind and
                    > use Pyrex.

                    Do you have a successful example of pyrex manipulating data which is in a
                    Numeric array? Last time I tried (a while back), the performance was
                    catastrophic (meaning, no better than python itself for explicit loops).
                    These days I either use weave.inline() or hand-written extensions. In both
                    cases I use Blitz++ for the arrays, so the C++ code retains much of the flavor
                    of the original Python.

                    But I'd really like to know if pyrex has caught up to handling Numeric arrays
                    efficiently (including complex ones).

                    Thanks in advance,

                    f


                    • 2mc

                      #11
                      Re: General Numerical Python question

                      Michael Ressler <ressler@cheetah.jpl.nasa.gov> wrote in message news:<slrnboqrh1.6mk.ressler@cheetah.jpl.nasa.gov>...
                      > Another example of thinking things differently is suppose you have a
                      > vector where the values are randomly positive or negative. Suppose for
                      > reasons known only to you, you want to replace the negative values
                      > with the sqrt of their absolute values. With Numeric, no loops are
                      > involved.
                      >
                      > from Numeric import *
                      > a=array([1.,2.,-3.,4.,-5.,6.,-7.,-8.,9.]) # make up an array
                      > idx=nonzero(a<0) # indexes of the negative values
                      > sqrs=sqrt(abs(take(a,idx))) # get the sqrts of neg elements
                      > put(a,idx,sqrs) # put them back into a
                      > print a # works!
                      >
                      > You can make the whole thing a one-liner if you want to get carried
                      > away with it. It's too bad "nonzero" isn't called "whereis" or
                      > something like that - it would make the idx= line more obvious.
                      >
                      > Mike

                      I think I'm finally getting a handle on this. So, my thanks to
                      everyone who has so graciously helped me out with their suggestions.

                      How would you handle the above if "a" were a 2d array since "nonzero"
                      only works on 1d arrays? Could you have used the "nonzero" function
                      on a "vertical" slice of the array (from the perspective of an array
                      of rows and columns - a vertical slice being the data in the column)?

                      I mostly deal with 2d arrays, but I have a few 3d arrays. So, I'm
                      curious how you would handle your example above with a multidimensional
                      array.

                      Thanks. And, thanks again to all.

                      Matt


                      • Michael Ressler

                        #12
                        Re: General Numerical Python question

                        In article <500a4565.0310162140.1d7c9c1b@posting.google.com>, 2mc wrote:
                        > How would you handle the above if "a" were a 2d array since "nonzero"
                        > only works on 1d arrays? Could you have used the "nonzero" function
                        > on a "vertical" slice of the array (from the perspective of an array
                        > of rows and columns - a vertical slice being the data in the column)?

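                        I don't have a lot of experience with this yet, but every array has an
                        attribute called flat (e.g. a.flat) which is a 1-D representation of
                        the array. So if a is a 2-D (or 3-D) array, you could do something
                        like:

                        idx=nonzero(a.flat<0) # flat indexes of the negative values
                        values=sqrt(abs(take(a.flat,idx))) # replacement values
                        put(a.flat,idx,values) # put() takes the indexes before the values
                        print a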

                        where a now has the appropriate values placed in their proper 2-D
                        positions.
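
The flat-index idea carries over to NumPy more or less directly. The following is a sketch that assumes NumPy in place of Numeric and a made-up 2-D array; np.flatnonzero, np.take and np.put all work on flattened positions here, so no explicit a.flat is needed.

```python
import numpy as np  # assumption: NumPy stands in for Numeric here

a = np.array([[1.0, -4.0, 9.0],
              [-16.0, 25.0, -36.0]])  # made-up 2-D data

# Flat (1-D) positions of the negative entries.
idx = np.flatnonzero(a < 0)

# take() without an axis reads from the flattened array.
values = np.sqrt(np.abs(np.take(a, idx)))

# put() writes back at the same flattened positions, modifying `a` in place.
np.put(a, idx, values)

print(a)
```

After the call, the negative entries have been replaced by the square roots of their absolute values while the array keeps its 2-D shape.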

                        As a side note, the numarray package (intended to be a Numeric
                        replacement) will provide better syntax for dealing with put and take,
                        maybe even handle 2-D issues like this transparently, but it's not
                        quite ready for prime time yet.

                        Mike

                        --
                        Dr. Michael Ressler
                        Research Scientist, Astrophysics Research Element, Jet Propulsion Laboratory
                        Email: ressler@cheetah.jpl.nasa.gov Phone: (818)354-5576
                        "A bad night at the telescope is still better than the best day at the office."


                        • Mark Jackson

                          #13
                          Re: General Numerical Python question

                          mcrider@bigfoot.com (2mc) writes:
                          > Michael Ressler <ressler@cheetah.jpl.nasa.gov> wrote in message news:<slrnboqrh1.6mk.ressler@cheetah.jpl.nasa.gov>...
                          > > Another example of thinking things differently is suppose you have a
                          > > vector where the values are randomly positive or negative. Suppose for
                          > > reasons known only to you, you want to replace the negative values
                          > > with the sqrt of their absolute values. With Numeric, no loops are
                          > > involved.
                          > >
                          > > from Numeric import *
                          > > a=array([1.,2.,-3.,4.,-5.,6.,-7.,-8.,9.]) # make up an array
                          > > idx=nonzero(a<0) # indexes of the negative values
                          > > sqrs=sqrt(abs(take(a,idx))) # get the sqrts of neg elements
                          > > put(a,idx,sqrs) # put them back into a
                          > > print a # works!
                          > >
                          > > You can make the whole thing a one-liner if you want to get carried
                          > > away with it. It's too bad "nonzero" isn't called "whereis" or
                          > > something like that - it would make the idx= line more obvious.
                          > >
                          > > Mike
                          >
                          > I think I'm finally getting a handle on this. So, my thanks to
                          > everyone who has so graciously helped me out with their suggestions.
                          >
                          > How would you handle the above if "a" were a 2d array since "nonzero"
                          > only works on 1d arrays? Could you have used the "nonzero" function
                          > on a "vertical" slice of the array (from the perspective of an array
                          > of rows and columns - a vertical slice being the data in the column)?

                          I'm very new at this myself (currently porting some Fortran code to
                          Numeric) but I believe that Numeric.putmask is your friend here:
                          >>> a=Numeric.array([i*(-1)**i for i in range(20)],Numeric.Float)
                          >>> b=a.resize((4,5))
                          >>> b
                          array([[  0.,  -1.,   2.,  -3.,   4.],
                                 [ -5.,   6.,  -7.,   8.,  -9.],
                                 [ 10., -11.,  12., -13.,  14.],
                                 [-15.,  16., -17.,  18., -19.]])
                          >>> mask = b<0
                          >>> mask
                          array([[0, 1, 0, 1, 0],
                                 [1, 0, 1, 0, 1],
                                 [0, 1, 0, 1, 0],
                                 [1, 0, 1, 0, 1]])
                          >>> Numeric.putmask(b, mask, Numeric.sqrt(abs(b)))
                          >>> b
                          array([[  0.        ,   1.        ,   2.        ,   1.73205081,   4.        ],
                                 [  2.23606798,   6.        ,   2.64575131,   8.        ,   3.        ],
                                 [ 10.        ,   3.31662479,  12.        ,   3.60555128,  14.        ],
                                 [  3.87298335,  16.        ,   4.12310563,  18.        ,   4.35889894]])

                          --
                          Mark Jackson - http://www.alumni.caltech.edu/~mjackson
                          There are two kinds of fool. One says, "This is old,
                          and therefore good." And one says, "This is new, and
                          therefore better." - Dean William Inge



                          • Scott Ransom

                            #14
                            Re: General Numerical Python question

                            mjackson@alumni.caltech.edu (Mark Jackson) wrote in message news:<bmp9us$oa8$1@news.wrc.xerox.com>...
                            > mcrider@bigfoot.com (2mc) writes:
                            > > Michael Ressler <ressler@cheetah.jpl.nasa.gov> wrote in message news:<slrnboqrh1.6mk.ressler@cheetah.jpl.nasa.gov>...
                            > > > Another example of thinking things differently is suppose you have a
                            > > > vector where the values are randomly positive or negative. Suppose for
                            > > > reasons known only to you, you want to replace the negative values
                            > > > with the sqrt of their absolute values. With Numeric, no loops are
                            > > > involved.
                            > > >
                            > > > from Numeric import *
                            > > > a=array([1.,2.,-3.,4.,-5.,6.,-7.,-8.,9.]) # make up an array
                            > > > idx=nonzero(a<0) # indexes of the negative values
                            > > > sqrs=sqrt(abs(take(a,idx))) # get the sqrts of neg elements
                            > > > put(a,idx,sqrs) # put them back into a
                            > > > print a # works!
                            > > >
                            > > > You can make the whole thing a one-liner if you want to get carried
                            > > > away with it. It's too bad "nonzero" isn't called "whereis" or
                            > > > something like that - it would make the idx= line more obvious.
                            > > >
                            > > > Mike
                            > >
                            > > I think I'm finally getting a handle on this. So, my thanks to
                            > > everyone who has so graciously helped me out with their suggestions.
                            > >
                            > > How would you handle the above if "a" were a 2d array since "nonzero"
                            > > only works on 1d arrays? Could you have used the "nonzero" function
                            > > on a "vertical" slice of the array (from the perspective of an array
                            > > of rows and columns - a vertical slice being the data in the column)?
                            >
                            > I'm very new at this myself (currently porting some Fortran code to
                            > Numeric) but I believe that Numeric.putmask is your friend here:
                            >
                            > >>> a=Numeric.array([i*(-1)**i for i in range(20)],Numeric.Float)
                            > >>> b=a.resize((4,5))
                            > >>> b
                            > array([[ 0., -1., 2., -3., 4.],
                            > [ -5., 6., -7., 8., -9.],
                            > [ 10., -11., 12., -13., 14.],
                            > [-15., 16., -17., 18., -19.]])
                            > >>> mask = b<0
                            > >>> mask
                            > array([[0, 1, 0, 1, 0],
                            > [1, 0, 1, 0, 1],
                            > [0, 1, 0, 1, 0],
                            > [1, 0, 1, 0, 1]])
                            > >>> Numeric.putmask(b, mask, Numeric.sqrt(abs(b)))
                            > >>> b
                            > array([[ 0. , 1. , 2. , 1.73205081, 4. ],
                            > [ 2.23606798, 6. , 2.64575131, 8. , 3. ],
                            > [ 10. , 3.31662479, 12. , 3.60555128, 14. ],
                            > [ 3.87298335, 16. , 4.12310563, 18. , 4.35889894]])

                            Once again, this can be done in a single (easy-to-read) line using:

                            b = where(b<0, sqrt(fabs(b)), b)

                            Here, where() does all the masking and putmasking for you.

                            Scott
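
For readers on NumPy, the successor to Numeric, here is a minimal sketch of the same idea. It assumes `numpy` is installed and imported as `np` (the posts in this thread use Numeric, so the exact names are an assumption), and it only checks that the one-line `where` form gives the same result as the mask-and-`putmask` route quoted above.

```python
import numpy as np  # assumption: NumPy stands in for the older Numeric package

# Same test data as the quoted example: alternating signs, reshaped to 4x5.
a = np.array([i * (-1) ** i for i in range(20)], dtype=float)
b = a.reshape(4, 5)

# One-step version: take sqrt(|b|) where b is negative, otherwise keep b.
# fabs keeps the sqrt argument non-negative, so no NaNs appear even though
# sqrt is evaluated for every element, not just the negative ones.
via_where = np.where(b < 0, np.sqrt(np.fabs(b)), b)

# Two-step version: build the mask, then overwrite the masked positions.
via_putmask = b.copy()
np.putmask(via_putmask, b < 0, np.sqrt(np.fabs(b)))

print(np.allclose(via_where, via_putmask))  # True
```

Note that `where` returns a new array, while `putmask` modifies its first argument in place, which is why the sketch copies `b` before calling it.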


                            • 2mc

                              #15
                              Re: General Numerical Python question

                              To all who have helped,

                              I am finding out all kinds of ways to do things. It's exciting.
                              Thanks to all who have replied.

                              I'm still having trouble with one thing. Let me set a scenario and
                              see if anyone has any ideas.

                              Assume a multidimensional (2d) array. This would be like a
                              spreadsheet of rows and columns. Further, assume many 'rows' and 3
                              columns. Suppose I want a running list of the highest value over the
                              last 20 'rows'. So, starting at 'row' 19, the answer would be the
                              highest value from 'row' 0 to 'row' 19. Then, at 'row' 20, the answer
                              would be the highest value from 'row' 1 to 'row' 20. And so on.
                              Further, suppose I want this value for each 'column'. The result would
                              be a 3 'column' array with 19 fewer rows than the source array,
                              containing the running list of highest values in the last 20.

                              How would this be done without loops?

                              Thanks a million.

                              Matt
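
One possible way to get the running maximum described above without an explicit Python loop is sketched below. It is written for NumPy rather than Numeric, and it assumes NumPy 1.20 or later for `sliding_window_view`; the function name `running_max`, the `window` parameter, and the random sample data are all made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # assumption: NumPy >= 1.20


def running_max(data, window=20):
    """Column-wise maximum over each run of `window` consecutive rows.

    Illustrative helper, not from the thread. The result has
    window - 1 fewer rows than `data`.
    """
    # Shape (rows - window + 1, columns, window): a read-only view, no copy.
    windows = sliding_window_view(data, window_shape=window, axis=0)
    # Reduce over the trailing window axis to get one maximum per window.
    return windows.max(axis=-1)


rng = np.random.default_rng(0)   # made-up sample data
data = rng.random((100, 3))      # 100 'rows', 3 'columns'
result = running_max(data)
print(result.shape)              # (81, 3)
```

Row 0 of `result` holds the per-column maximum of source rows 0 to 19, row 1 covers rows 1 to 20, and so on. Each window is reduced independently, so the work grows with rows times window size; for very long arrays a dedicated rolling-maximum routine may be faster.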
