More efficient array processing

  • John [H2O]

    #1
    More efficient array processing


    Hello,

    I'm trying to do the following:

    datagrid = numpy.zeros((360,180,3,73,20))

    But I get an error saying that the dimensions are too large? Is there a
    memory issue here?

    So, my workaround is this:

    numpoint = 73

    datagrid = numpy.zeros((360,180,3,73,1))

    for np in range(numpoint):
        datagrid[:,:,:,np,0] = datagrid[:,:,:,np,0] + concgrid[:,:,:,np,0]

    But this is SLOW... What can I do to increase efficiency here? Is there a
    way to create the larger array? The program actually loops through several
    days, filling the 5th dimension. Eventually I just sum over the 5th
    dimension anyway (as done in the loop of the workaround).

    Thanks!
    john
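
    Since the 5th dimension is only summed in the end, one way around the
    allocation entirely is to keep a 4-D running total and add each day's
    grid into it. A minimal sketch, assuming one concgrid arrives per day;
    read_day() is a hypothetical placeholder for whatever produces it:

    import numpy

    def read_day(day):
        # Hypothetical stand-in for the code that yields each day's grid.
        return numpy.ones((360, 180, 3, 73), dtype=numpy.float32)

    # Accumulate into a 4-D running total instead of storing every daily
    # slab along a 5th axis that is summed away at the end.
    datagrid = numpy.zeros((360, 180, 3, 73), dtype=numpy.float32)
    for day in range(20):
        datagrid += read_day(day)

    # Peak memory: one 4-D float32 array (~54 MiB) instead of ~2.1 GiB.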



  • Marc 'BlackJack' Rintsch

    #2
    Re: More efficient array processing

    On Thu, 23 Oct 2008 11:11:32 -0700, John [H2O] wrote:

    > I'm trying to do the following:
    >
    > datagrid = numpy.zeros((360,180,3,73,20))
    >
    > But I get an error saying that the dimensions are too large? Is there a
    > memory issue here?

    Let's see:

    You have: 360 * 180 * 3 * 73 * 20 * 8 bytes
    You want: GiB
        * 2.1146536
        / 0.47289069

    Do you have a 32 bit system? Then 2 GiB is too much for a process.

    Ciao,
    Marc 'BlackJack' Rintsch
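
    For reference, the same arithmetic as a quick Python check (8 bytes per
    default float64 element):

    nbytes = 360 * 180 * 3 * 73 * 20 * 8  # element count times 8 bytes
    print(nbytes / 2.0**30)               # -> ~2.11 GiB, more than a 32-bit
                                          #    process can typically map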


    • John [H2O]

      #3
      Re: More efficient array processing


      Thanks for the clarification.

      What is strange though, is that I have several Fortran programs that
      create the exact same array structure... wouldn't they be restricted
      to the 2 GiB limit as well?

      Thoughts on a more efficient workaround?


      Marc 'BlackJack' Rintsch wrote:

      > On Thu, 23 Oct 2008 11:11:32 -0700, John [H2O] wrote:
      >> I'm trying to do the following:
      >>
      >> datagrid = numpy.zeros((360,180,3,73,20))
      >>
      >> But I get an error saying that the dimensions are too large? Is
      >> there a memory issue here?
      >
      > Let's see:
      >
      > You have: 360 * 180 * 3 * 73 * 20 * 8 bytes
      > You want: GiB
      >     * 2.1146536
      >     / 0.47289069
      >
      > Do you have a 32 bit system? Then 2 GiB is too much for a process.
      >
      > Ciao,
      > Marc 'BlackJack' Rintsch


      • Marc 'BlackJack' Rintsch

        #4
        Re: More efficient array processing

        On Thu, 23 Oct 2008 11:44:04 -0700, John [H2O] wrote:

        > What is strange though, is that I have several Fortran programs
        > that create the exact same array structure... wouldn't they be
        > restricted to the 2 GiB limit as well?

        They should be. What about the data type of the elements? Any chance
        they are just 4 byte floats in your Fortran code, i.e. C floats
        instead of C doubles, which are the default in `numpy`?

        Ciao,
        Marc 'BlackJack' Rintsch


        • John [H2O]

          #5
          Re: More efficient array processing


          I'm using zeros with type np.float; is there a way to define the
          data type to be 4-byte floats?


          Marc 'BlackJack' Rintsch wrote:

          > On Thu, 23 Oct 2008 11:44:04 -0700, John [H2O] wrote:
          >
          >> What is strange though, is that I have several Fortran programs
          >> that create the exact same array structure... wouldn't they be
          >> restricted to the 2 GiB limit as well?
          >
          > They should be. What about the data type of the elements? Any
          > chance they are just 4 byte floats in your Fortran code, i.e.
          > C floats instead of C doubles, which are the default in `numpy`?
          >
          > Ciao,
          > Marc 'BlackJack' Rintsch


          • Marc 'BlackJack' Rintsch

            #6
            Re: More efficient array processing

            On Thu, 23 Oct 2008 13:56:22 -0700, John [H2O] wrote:

            > I'm using zeros with type np.float; is there a way to define
            > the data type to be 4-byte floats?

            Yes:

            In [13]: numpy.zeros(5, numpy.float32)
            Out[13]: array([ 0., 0., 0., 0., 0.], dtype=float32)

            Ciao,
            Marc 'BlackJack' Rintsch
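
            The dtype can also be given as a string code; an illustrative
            follow-up in the same style (not from the original post):

            In [14]: numpy.zeros(5, dtype='f4').dtype
            Out[14]: dtype('float32')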


            • Robert Kern

              #7
              Re: More efficient array processing

              John [H2O] wrote:

              > I'm using zeros with type np.float; is there a way to define
              > the data type to be 4-byte floats?

              np.float32. np.float is not part of the numpy API. It's just
              Python's builtin float type, which corresponds to C doubles.

              --
              Robert Kern

              "I have come to believe that the whole world is an enigma, a harmless enigma
              that is made terrible by our own mad attempt to interpret it as though it had
              an underlying truth."
              -- Umberto Eco
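
              A quick way to see the size difference Robert describes (a
              hedged aside, not from the thread; numpy.dtype accepts the
              builtin float directly):

              import numpy

              print(numpy.dtype(float).itemsize)          # 8 bytes: C double
              print(numpy.dtype(numpy.float32).itemsize)  # 4 bytes: C float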


              • Ivan Reborin

                #8
                Re: More efficient array processing

                On Thu, 23 Oct 2008 11:44:04 -0700 (PDT), "John [H2O]"
                <washakie@gmail.com> wrote:

                > Thanks for the clarification.
                >
                > What is strange though, is that I have several Fortran
                > programs that create the exact same array structure...
                > wouldn't they be restricted to the 2 GiB limit as well?
                >
                > Thoughts on a more efficient workaround?

                Depends on a lot of things, as Mark has hinted.

                But why are you rewriting Fortran subroutines in Python,
                especially for this kind of array processing?

                And, if it's no secret, would you mind telling us what you
                need an array of that size for? I cannot think of many uses
                that would require an array of that size and that many
                dimensions which couldn't be reorganized more efficiently
                into several smaller arrays.

                --
                Ivan


                • Ivan Reborin

                  #9
                  Re: More efficient array processing

                  On Fri, 24 Oct 2008 00:32:11 +0200, Ivan Reborin
                  <ireborin@delete.this.gmail.com> wrote:

                  > On Thu, 23 Oct 2008 11:44:04 -0700 (PDT), "John [H2O]"
                  > <washakie@gmail.com> wrote:
                  >
                  >> Thanks for the clarification.
                  >>
                  >> What is strange though, is that I have several Fortran
                  >> programs that create the exact same array structure...
                  >> wouldn't they be restricted to the 2 GiB limit as well?
                  >
                  > Depends on a lot of things, as Mark has hinted.
                  *Marc*

                  Apologies.



                  --
                  Ivan


                  • sturlamolden

                    #10
                    Re: More efficient array processing

                    On Oct 23, 8:11 pm, "John [H2O]" <washa...@gmail.com> wrote:

                    > datagrid = numpy.zeros((360,180,3,73,20))

                    On a 32 bit system, try this instead:

                    datagrid = numpy.zeros((360,180,3,73,20), dtype=numpy.float32)

                    (if you can use single precision, that is.)
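
                    For scale (a quick check, not from the original post):
                    float32 halves the full 5-D array, and the running-sum
                    sketch under the first post needs only the 4-D total:

                    elements = 360 * 180 * 3 * 73 * 20
                    print(elements * 4 / 2.0**30)       # 5-D float32: ~1.06 GiB
                    print(elements // 20 * 4 / 2.0**20) # 4-D float32: ~54 MiB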