valarray <vallaray<T> > efficiency

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • john

    valarray <vallaray<T> > efficiency

    Hi, in TC++PL 3 on pages 674-675 it is mentioned:


    "Maybe your first idea for a two-dimensional vector was something like this:

    class Matrix {
    valarray< valarray<double v;
    public:
    // ...
    };

    This would also work (22.9[10]). However, it is not easy to match
    efficiency and compatibility required by high performance computations
    without dropping to the lower and more conventional level represented by
    valarray plus slices".


    However since 1998 much time has passed, and I wonder if the current
    compiler implementations allow valarray<valarr ay<T to be equally
    efficient (or more) than using a valarray with slices/slice_arrays.
  • kwikius

    #2
    Re: valarray &lt;vallaray&lt ;T&gt; &gt; efficiency

    On Jan 3, 3:03 pm, john <j...@no.spamwr ote:
    Hi, in TC++PL 3 on pages 674-675 it is mentioned:
    >
    "Maybe your first idea for a two-dimensional vector was something like this:
    >
    class Matrix {
          valarray< valarray<double v;
    public:
           // ...
    >
    };
    >
    This would also work (22.9[10]). However, it is not easy to match
    efficiency and compatibility required by high performance computations
    without dropping to the lower and more conventional level represented by
    valarray plus slices".
    >
    However since 1998 much time has passed, and I wonder if the current
    compiler implementations allow valarray<valarr ay<T to be equally
    efficient (or more) than using a valarray with slices/slice_arrays.
    My 2p's worth.. FWIW

    Because a valarray is not a compile time fixed size it must? use heap
    allocation, hence a matrix as view of list approach will likely only
    require one allocation, whereas matrix as sequence of sequence will
    require multiple allocations.

    I believe even now that compilers find it difficult to do much
    optimisation on heap pointers( even wary of stack pointers and
    anything complicated in references), so it probably makes sense to use
    as few pointers as possible, then the compiler only needs to deal with
    a single offset.

    Anyway its an interesting issue :-)

    regards
    Andy Little

    Comment

    • Ioannis Vranos

      #3
      Re: valarray &lt;vallaray&lt ;T&gt; &gt; efficiency

      kwikius wrote:
      >
      My 2p's worth.. FWIW
      >
      Because a valarray is not a compile time fixed size it must? use heap
      allocation, hence a matrix as view of list approach will likely only
      require one allocation, whereas matrix as sequence of sequence will
      require multiple allocations.
      >
      I believe even now that compilers find it difficult to do much
      optimisation on heap pointers( even wary of stack pointers and
      anything complicated in references), so it probably makes sense to use
      as few pointers as possible, then the compiler only needs to deal with
      a single offset.
      >
      Anyway its an interesting issue :-)

      Given that TC++PL3 was written before the change in the standard that
      *requires* that vectors, valarrays, etc are implemented as continuous
      sequences,would n't the following be more efficient or at least the same
      efficient as the usage of valarrays and slices/slice_arrays to
      "simulate" a 2-dimensional matrix?


      #include <iostream>
      #include <valarray>



      int main()
      {
      using namespace std;

      valarray<intvi( 1, 10);

      int (*p)[5]= reinterpret_cas t<int (*)[5](&vi[0]);

      for (size_t i= 0; i< 2; ++i)
      for(size_t j=0; j<5; ++j)
      cout<< p[i][j]<<" ";

      cout<< endl;

      }


      Here p behaves a s a 2-dimensional matrix, that is a 2x5 matrix.

      Comment

      • Victor Bazarov

        #4
        Re: valarray &lt;vallaray&lt ;T&gt; &gt; efficiency

        Ioannis Vranos wrote:
        kwikius wrote:
        >>
        >My 2p's worth.. FWIW
        >>
        >Because a valarray is not a compile time fixed size it must? use heap
        >allocation, hence a matrix as view of list approach will likely only
        >require one allocation, whereas matrix as sequence of sequence will
        >require multiple allocations.
        >>
        >I believe even now that compilers find it difficult to do much
        >optimisation on heap pointers( even wary of stack pointers and
        >anything complicated in references), so it probably makes sense to
        >use as few pointers as possible, then the compiler only needs to
        >deal with a single offset.
        >>
        >Anyway its an interesting issue :-)
        >
        >
        Given that TC++PL3 was written before the change in the standard that
        *requires* that vectors, valarrays, etc are implemented as continuous
        sequences,would n't the following be more efficient or at least the
        same efficient as the usage of valarrays and slices/slice_arrays to
        "simulate" a 2-dimensional matrix?

        Given that 'reinterpret_ca st' is not supposed to be used that way,
        the following has undefined behaviour. Aside from that...
        >
        >
        #include <iostream>
        #include <valarray>
        >
        >
        >
        int main()
        {
        using namespace std;
        >
        valarray<intvi( 1, 10);
        >
        int (*p)[5]= reinterpret_cas t<int (*)[5](&vi[0]);
        >
        for (size_t i= 0; i< 2; ++i)
        for(size_t j=0; j<5; ++j)
        cout<< p[i][j]<<" ";
        >
        cout<< endl;
        >
        }
        >
        >
        Here p behaves a s a 2-dimensional matrix, that is a 2x5 matrix.
        V
        --
        Please remove capital 'A's when replying by e-mail
        I do not respond to top-posted replies, please don't ask


        Comment

        • Ioannis Vranos

          #5
          Re: valarray &lt;vallaray&lt ;T&gt; &gt; efficiency

          Victor Bazarov wrote:
          >
          Given that 'reinterpret_ca st' is not supposed to be used that way,

          Apart from the approach:

          static_cast<int (*)[5]( static_cast<voi d *(&vi[0]) );


          why reinterpret_cas t "is not supposed to be used that way"?

          Comment

          • Victor Bazarov

            #6
            Re: valarray &lt;vallaray&lt ;T&gt; &gt; efficiency

            Ioannis Vranos wrote:
            Victor Bazarov wrote:
            >>
            >Given that 'reinterpret_ca st' is not supposed to be used that way,
            >
            >
            Apart from the approach:
            >
            static_cast<int (*)[5]( static_cast<voi d *(&vi[0]) );
            >
            >
            why reinterpret_cas t "is not supposed to be used that way"?
            If you reinterpret_cas t one object pointer into another object
            pointer, the Standard only specifies that the resulting pointer
            can be converted back to the original type. Please see Standard,
            [expr.reinterpre t.cast]/7. Whatever you think you can do with
            the pointer, is not in the langauge specification, literally.

            As to the two consequitive static_cast operations, the result
            of those is also unspecified, since you're not converting the
            result of static_cast<voi d*back to the original type. See
            [expr.static.cas t].

            V
            --
            Please remove capital 'A's when replying by e-mail
            I do not respond to top-posted replies, please don't ask


            Comment

            • kwikius

              #7
              Re: valarray &lt;vallaray&lt ;T&gt; &gt; efficiency

              On Jan 3, 4:49 pm, Ioannis Vranos <j...@no.spamwr ote:
              kwikius wrote:
              >
              My 2p's worth.. FWIW
              >
              Because a valarray is not a compile time fixed size it must? use heap
              allocation, hence a matrix as view of list approach will likely only
              require one allocation, whereas matrix as sequence of sequence will
              require multiple allocations.
              >
              I believe even now that compilers find it difficult to do much
              optimisation on heap pointers( even wary of stack pointers and
              anything complicated in references), so it probably makes sense to use
              as few pointers as possible, then the compiler only needs to deal with
              a single offset.
              >
              Anyway its an interesting issue :-)
              >
              Given that TC++PL3 was written before the change in the standard that
              *requires* that vectors, valarrays, etc are implemented as continuous
              sequences,would n't the following be more efficient or at least the same
              efficient as the usage of valarrays and slices/slice_arrays to
              "simulate" a 2-dimensional matrix?
              The more interesting question to me is how necessary runtime
              resizability of matrices is. Because if they are not resizeable then
              there is no need to allocate from the heap and valarray is unnecessary
              (you can simple use a wrapped c-style array), in which case the
              compiler has much better opportunities for optimisation.

              And N.B some might call an e.g array of 3dpoints a matrix where I
              would see it as an array of matrices where 1 dimension is equal to 1
              dependent on the local convention.

              regards
              Andy Little

              Comment

              • =?UTF-8?B?RXJpayBXaWtzdHLDtm0=?=

                #8
                Re: valarray &lt;vallaray&lt ;T&gt; &gt; efficiency

                On 2008-01-03 16:03, john wrote:
                Hi, in TC++PL 3 on pages 674-675 it is mentioned:
                >
                >
                "Maybe your first idea for a two-dimensional vector was something like this:
                >
                class Matrix {
                valarray< valarray<double v;
                public:
                // ...
                };
                >
                This would also work (22.9[10]). However, it is not easy to match
                efficiency and compatibility required by high performance computations
                without dropping to the lower and more conventional level represented by
                valarray plus slices".
                >
                >
                However since 1998 much time has passed, and I wonder if the current
                compiler implementations allow valarray<valarr ay<T to be equally
                efficient (or more) than using a valarray with slices/slice_arrays.
                For various reasons valarray never became what it was meant to be, and
                few people use it. Since few use it the library writers have not
                bothered much with it so I would not be surprised if the code is more or
                less the same as it was back in 1998. And even if they had it would
                still not be able to match the efficiency of slices since a valarray of
                valarrays adds an additional layer of indirection.

                --
                Erik Wikström

                Comment

                • Ioannis Vranos

                  #9
                  Re: valarray &lt;vallaray&lt ;T&gt; &gt; efficiency

                  Victor Bazarov wrote:
                  Ioannis Vranos wrote:
                  >Victor Bazarov wrote:
                  >>Given that 'reinterpret_ca st' is not supposed to be used that way,
                  >>
                  >Apart from the approach:
                  >>
                  >static_cast<in t (*)[5]( static_cast<voi d *(&vi[0]) );
                  >>
                  >>
                  >why reinterpret_cas t "is not supposed to be used that way"?
                  >
                  If you reinterpret_cas t one object pointer into another object
                  pointer, the Standard only specifies that the resulting pointer
                  can be converted back to the original type. Please see Standard,
                  [expr.reinterpre t.cast]/7. Whatever you think you can do with
                  the pointer, is not in the langauge specification, literally.

                  There, it is mentioned:

                  "A pointer to an object can be explicitly converted to a pointer to an
                  object of different type. 65)".

                  "65) The types may have different cv-qualifiers, subject to the overall
                  restriction that a reinterpret_cas t cannot cast away const-
                  ness".


                  >
                  As to the two consequitive static_cast operations, the result
                  of those is also unspecified, since you're not converting the
                  result of static_cast<voi d*back to the original type. See
                  [expr.static.cas t].

                  Yes, I think in the standard, conversions from "pointer to object" to
                  "pointer to an array of n objects" is not explicitly mentioned, but I
                  think there isn't any decent implementation where it does not work
                  whenever the pointed object is member of an array (sequence).

                  Actually I have the feeling the guarantee exists in the standard and is
                  probably "buried" in some clause. :-)


                  So we have

                  int main()
                  {


                  int array[5]= {0};

                  int *p= &array[0];

                  int (*q)[5]= reinterpret_cas t<int (*)[5](p);
                  }



                  Shouldn't the last always work?

                  Comment

                  • Ioannis Vranos

                    #10
                    Re: valarray &lt;vallaray&lt ;T&gt; &gt; efficiency

                    kwikius wrote:
                    >
                    >Given that TC++PL3 was written before the change in the standard that
                    >*requires* that vectors, valarrays, etc are implemented as continuous
                    >sequences,woul dn't the following be more efficient or at least the same
                    >efficient as the usage of valarrays and slices/slice_arrays to
                    >"simulate" a 2-dimensional matrix?
                    >
                    The more interesting question to me is how necessary runtime
                    resizability of matrices is. Because if they are not resizeable then
                    there is no need to allocate from the heap and valarray is unnecessary
                    (you can simple use a wrapped c-style array), in which case the
                    compiler has much better opportunities for optimisation.

                    valarray is not a typical container in terms of implementation. It is
                    intended to be heavily optimised, and it can use raw storage from
                    anywhere or something else AFAIK.

                    Comment

                    • Ioannis Vranos

                      #11
                      Re: valarray &lt;vallaray&lt ;T&gt; &gt; efficiency

                      Erik Wikström wrote:
                      On 2008-01-03 16:03, john wrote:
                      >Hi, in TC++PL 3 on pages 674-675 it is mentioned:
                      >>
                      >>
                      >"Maybe your first idea for a two-dimensional vector was something like this:
                      >>
                      >class Matrix {
                      > valarray< valarray<double v;
                      >public:
                      > // ...
                      >};
                      >>
                      >This would also work (22.9[10]). However, it is not easy to match
                      >efficiency and compatibility required by high performance computations
                      >without dropping to the lower and more conventional level represented by
                      >valarray plus slices".
                      >>
                      >>
                      >However since 1998 much time has passed, and I wonder if the current
                      >compiler implementations allow valarray<valarr ay<T to be equally
                      >efficient (or more) than using a valarray with slices/slice_arrays.
                      >
                      For various reasons valarray never became what it was meant to be, and
                      few people use it. Since few use it the library writers have not
                      bothered much with it so I would not be surprised if the code is more or
                      less the same as it was back in 1998. And even if they had it would
                      still not be able to match the efficiency of slices since a valarray of
                      valarrays adds an additional layer of indirection.


                      How about using "pointer to array" pointing to a member of a valarray as
                      I mentioned in another post in the thread instead of slices, to
                      simulate a matrix?

                      Comment

                      • Victor Bazarov

                        #12
                        Re: valarray &lt;vallaray&lt ;T&gt; &gt; efficiency

                        Ioannis Vranos wrote:
                        Victor Bazarov wrote:
                        >Ioannis Vranos wrote:
                        >>Victor Bazarov wrote:
                        >>>Given that 'reinterpret_ca st' is not supposed to be used that way,
                        >>>
                        >>Apart from the approach:
                        >>>
                        >>static_cast<i nt (*)[5]( static_cast<voi d *(&vi[0]) );
                        >>>
                        >>>
                        >>why reinterpret_cas t "is not supposed to be used that way"?
                        >>
                        >If you reinterpret_cas t one object pointer into another object
                        >pointer, the Standard only specifies that the resulting pointer
                        >can be converted back to the original type. Please see Standard,
                        >[expr.reinterpre t.cast]/7. Whatever you think you can do with
                        >the pointer, is not in the langauge specification, literally.
                        >
                        >
                        There, it is mentioned:
                        >
                        "A pointer to an object can be explicitly converted to a pointer to an
                        object of different type. 65)".
                        >
                        "65) The types may have different cv-qualifiers, subject to the
                        overall restriction that a reinterpret_cas t cannot cast away const-
                        ness".
                        >
                        >
                        >
                        >>
                        >As to the two consequitive static_cast operations, the result
                        >of those is also unspecified, since you're not converting the
                        >result of static_cast<voi d*back to the original type. See
                        >[expr.static.cas t].
                        >
                        >
                        Yes, I think in the standard, conversions from "pointer to object" to
                        "pointer to an array of n objects" is not explicitly mentioned, but I
                        think there isn't any decent implementation where it does not work
                        whenever the pointed object is member of an array (sequence).
                        >
                        Actually I have the feeling the guarantee exists in the standard and
                        is probably "buried" in some clause. :-)
                        Find it; everybody will be grateful.
                        [..]
                        V
                        --
                        Please remove capital 'A's when replying by e-mail
                        I do not respond to top-posted replies, please don't ask


                        Comment

                        • =?UTF-8?B?RXJpayBXaWtzdHLDtm0=?=

                          #13
                          Re: valarray &lt;vallaray&lt ;T&gt; &gt; efficiency

                          On 2008-01-03 22:39, Ioannis Vranos wrote:
                          Erik Wikström wrote:
                          >On 2008-01-03 16:03, john wrote:
                          >>Hi, in TC++PL 3 on pages 674-675 it is mentioned:
                          >>>
                          >>>
                          >>"Maybe your first idea for a two-dimensional vector was something like this:
                          >>>
                          >>class Matrix {
                          >> valarray< valarray<double v;
                          >>public:
                          >> // ...
                          >>};
                          >>>
                          >>This would also work (22.9[10]). However, it is not easy to match
                          >>efficiency and compatibility required by high performance computations
                          >>without dropping to the lower and more conventional level represented by
                          >>valarray plus slices".
                          >>>
                          >>>
                          >>However since 1998 much time has passed, and I wonder if the current
                          >>compiler implementations allow valarray<valarr ay<T to be equally
                          >>efficient (or more) than using a valarray with slices/slice_arrays.
                          >>
                          >For various reasons valarray never became what it was meant to be, and
                          >few people use it. Since few use it the library writers have not
                          >bothered much with it so I would not be surprised if the code is more or
                          >less the same as it was back in 1998. And even if they had it would
                          >still not be able to match the efficiency of slices since a valarray of
                          >valarrays adds an additional layer of indirection.
                          >
                          >
                          >
                          How about using "pointer to array" pointing to a member of a valarray as
                          I mentioned in another post in the thread instead of slices, to
                          simulate a matrix?
                          If you want a matrix there are a number of libraries out there, or you
                          can write your own. In the end it usually ends up with a contiguous
                          piece of memory that is somehow divided into rown/columns and some
                          simple arithmetic is used to get the correct element. Besides, these
                          kinds of optimisations usually have a smaller impact than expected in
                          real world applications where operations are usually performed on the
                          whole matrix and where the real costs comes from creation of
                          temporaries. That is why expression templates are used by all libraries
                          aiming for high performance.

                          --
                          Erik Wikström

                          Comment

                          • =?UTF-8?B?RXJpayBXaWtzdHLDtm0=?=

                            #14
                            Re: valarray &lt;vallaray&lt ;T&gt; &gt; efficiency

                            On 2008-01-03 22:36, Ioannis Vranos wrote:
                            kwikius wrote:
                            >>
                            >>Given that TC++PL3 was written before the change in the standard that
                            >>*requires* that vectors, valarrays, etc are implemented as continuous
                            >>sequences,wou ldn't the following be more efficient or at least the same
                            >>efficient as the usage of valarrays and slices/slice_arrays to
                            >>"simulate" a 2-dimensional matrix?
                            >>
                            >The more interesting question to me is how necessary runtime
                            >resizability of matrices is. Because if they are not resizeable then
                            >there is no need to allocate from the heap and valarray is unnecessary
                            >(you can simple use a wrapped c-style array), in which case the
                            >compiler has much better opportunities for optimisation.
                            >
                            >
                            valarray is not a typical container in terms of implementation. It is
                            intended to be heavily optimised, and it can use raw storage from
                            anywhere or something else AFAIK.
                            That was the intention, however, as I mentioned elsethread, that is not
                            the case. I do not know of any high-performance applications (and I
                            doubt that they exist) where valarrays are used. My understanding is
                            that valarray is designed to allow for optimisations (such as utilising
                            SIMD instructions), but no implementation have made those optimisations.

                            --
                            Erik Wikström

                            Comment

                            • Ioannis Vranos

                              #15
                              Re: valarray &lt;vallaray&lt ;T&gt; &gt; efficiency

                              Erik Wikström wrote:
                              >
                              >valarray is not a typical container in terms of implementation. It is
                              >intended to be heavily optimised, and it can use raw storage from
                              >anywhere or something else AFAIK.
                              >
                              That was the intention, however, as I mentioned elsethread, that is not
                              the case. I do not know of any high-performance applications (and I
                              doubt that they exist) where valarrays are used. My understanding is
                              that valarray is designed to allow for optimisations (such as utilising
                              SIMD instructions), but no implementation have made those optimisations.

                              So, what kind of container is used for high performance C++ computing
                              instead? Those template math libraries on the web?

                              Comment

                              Working...