are arrays contiguous in memory?

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Brandon Bray [MSFT]

    #16
    Re: are arrays contiguous in memory?

    Willy Denoyette [MVP] wrote:[color=blue]
    > Agreed, all I want is to make sure people don't start to believe that
    > using pointer arithmetic has performance advantages when accessing
    > arrays (elements). I know a lot of C# (and some C++) developers believe
    > that there is such advantage when using unsafe constructs, but it's
    > most often not the case, worse, they start to pin arrays for a long
    > period of time, effectively disturbing the GC activity which results in
    > reduced overall performance.[/color]

    There is one performance benefit for using pointers over indexing for CLR
    arrays. When indexing a CLR array, there is a bounds check. That check is
    eliminated when using a pointer to iterate over the array.

    Of course, that's a trade off between safety and performance. One should be
    very careful when making that decision. Because the JIT compiler has the
    ability to eliminate bounds checks in some cases, switching to pointers is
    not always a worthwhile exercise. Though, it is definitely useful in some
    cases where bitmap manipulation is done. This is probably the only reason
    why interior pointers exist in the new C++. If it wasn't for this case, I
    probably would have succeeded at keeping them out of the language entirely.

    --
    Brandon Bray, Visual C++ Compiler http://blogs.msdn.com/branbray/
    Bugs? Suggestions? Feedback? http://msdn.microsoft.com/productfeedback/


    Comment

    • Arnaud Debaene

      #17
      Re: are arrays contiguous in memory?

      Willy Denoyette [MVP] wrote:[color=blue]
      > Do you guy's know any "arrays" that aren't contigious?[/color]

      A double-linked list?

      Arnaud
      MVP - VC


      Comment

      • Willy Denoyette [MVP]

        #18
        Re: are arrays contiguous in memory?


        "Arnaud Debaene" <adebaene@clu b-internet.fr> wrote in message
        news:Oak7qiNBGH A.2644@TK2MSFTN GP09.phx.gbl...[color=blue]
        > Willy Denoyette [MVP] wrote:[color=green]
        >> Do you guy's know any "arrays" that aren't contigious?[/color]
        >
        > A double-linked list?
        >
        > Arnaud
        > MVP - VC
        >
        >[/color]

        Why a double linked list and not a linked list?
        Anyway, it's not what I would call an array (or a vector).

        Willy.


        Comment

        • Willy Denoyette [MVP]

          #19
          Re: are arrays contiguous in memory?


          "Brandon Bray [MSFT]" <branbray@onlin e.microsoft.com > wrote in message
          news:%23PSDJINB GHA.2692@TK2MSF TNGP10.phx.gbl. ..[color=blue]
          > Willy Denoyette [MVP] wrote:[color=green]
          >> Agreed, all I want is to make sure people don't start to believe that
          >> using pointer arithmetic has performance advantages when accessing
          >> arrays (elements). I know a lot of C# (and some C++) developers believe
          >> that there is such advantage when using unsafe constructs, but it's
          >> most often not the case, worse, they start to pin arrays for a long
          >> period of time, effectively disturbing the GC activity which results in
          >> reduced overall performance.[/color]
          >
          > There is one performance benefit for using pointers over indexing for CLR
          > arrays. When indexing a CLR array, there is a bounds check. That check is
          > eliminated when using a pointer to iterate over the array.
          >[/color]
          That's true, though the overhead is real small (something like a cmp and a
          jmp) and only important to consider for some operations on arrays.
          [color=blue]
          > Of course, that's a trade off between safety and performance. One should
          > be very careful when making that decision. Because the JIT compiler has
          > the ability to eliminate bounds checks in some cases, switching to
          > pointers is not always a worthwhile exercise. Though, it is definitely
          > useful in some cases where bitmap manipulation is done. This is probably
          > the only reason why interior pointers exist in the new C++. If it wasn't
          > for this case, I probably would have succeeded at keeping them out of the
          > language entirely.
          >[/color]

          Well, as I said in another message, a Bitmap (assumed we talk about
          GDI/GDI+) is stored in unmanaged (GDI+ allocated) memory and accessing them
          directly using pointers is a big perf. win, but that's because accessing
          them using the GDI+ methods is extremely slow.
          [color=blue]
          > --
          > Brandon Bray, Visual C++ Compiler http://blogs.msdn.com/branbray/
          > Bugs? Suggestions? Feedback? http://msdn.microsoft.com/productfeedback/
          >[/color]

          Willy.


          Comment

          • Peter Oliphant

            #20
            Re: are arrays contiguous in memory?

            Wow, you ask a simple question..... : )

            As I understand it, the 'old' STL containers make a distinction in this
            regard. List's are linked lists, and so are not necessarily contiguous in
            memory. Vector's are the STL equivalent of contiguous lists. The reason for
            both is that vector's are better at lookup performance (randomly accessible
            and therefore a fixed length of lookup time); but suffer when being deleted
            from or resized, since this is often done by moving the entire end of the
            list and allocating a bigger vector somewhere else and copying over,
            respectively. List's are worse at lookup performance since they require
            serial lookup (linking thorugh the list from the header, which is takes a
            variable length of time based on 'distance' from header), but benefit when
            it comes to deletion and resizing since an element can be removed without
            moving anything else, and resizing just means finding a large enough memory
            space anywhere for the added element.

            My opinion is this: with a faster computer and more memory it really doesn't
            matter which method is used! : )

            [==P==]

            "Willy Denoyette [MVP]" <willy.denoyett e@telenet.be> wrote in message
            news:umXF6pNBGH A.3760@tk2msftn gp13.phx.gbl...[color=blue]
            >
            > "Arnaud Debaene" <adebaene@clu b-internet.fr> wrote in message
            > news:Oak7qiNBGH A.2644@TK2MSFTN GP09.phx.gbl...[color=green]
            >> Willy Denoyette [MVP] wrote:[color=darkred]
            >>> Do you guy's know any "arrays" that aren't contigious?[/color]
            >>
            >> A double-linked list?
            >>
            >> Arnaud
            >> MVP - VC
            >>
            >>[/color]
            >
            > Why a double linked list and not a linked list?
            > Anyway, it's not what I would call an array (or a vector).
            >
            > Willy.
            >
            >[/color]


            Comment

            • Arnaud Debaene

              #21
              Re: are arrays contiguous in memory?

              Peter Oliphant wrote:[color=blue]
              > Wow, you ask a simple question..... : )
              >
              > As I understand it, the 'old' STL containers make a distinction in
              > this regard. List's are linked lists, and so are not necessarily
              > contiguous in memory. Vector's are the STL equivalent of contiguous
              > lists. The reason for both is that vector's are better at lookup
              > performance (randomly accessible and therefore a fixed length of
              > lookup time); but suffer when being deleted from or resized, since
              > this is often done by moving the entire end of the list and
              > allocating a bigger vector somewhere else and copying over,
              > respectively. List's are worse at lookup performance since they
              > require serial lookup (linking thorugh the list from the header,
              > which is takes a variable length of time based on 'distance' from
              > header), but benefit when it comes to deletion and resizing since an
              > element can be removed without moving anything else, and resizing
              > just means finding a large enough memory space anywhere for the added
              > element.[/color]
              And there is also in intermediate "mix" between the two : the deque.
              [color=blue]
              >
              > My opinion is this: with a faster computer and more memory it really
              > doesn't matter which method is used! : )[/color]
              This is of course basically wrong : processor speed has never changed
              anything to algorithm complexity. When manipulating great number of elements
              (several tens of thousands), the difference between logarithmic and linear
              complexity is as important today as it was 10 years ago.

              Arnaud
              MVP - VC


              Comment

              • Peter Oliphant

                #22
                Re: are arrays contiguous in memory?

                > This is of course basically wrong : processor speed has never changed[color=blue]
                > anything to algorithm complexity. When manipulating great number of
                > elements (several tens of thousands), the difference between logarithmic
                > and linear complexity is as important today as it was 10 years ago.[/color]

                Since in my days I've had to write 'clever' code to combat slow computer
                speed, I have to disagree with any blanketing statement along the lines of
                "processor speed has never changed anything to algorithm complexity".

                For example, this is very algorithmically simple:

                int Square( int X )
                {
                return X * X ;
                }

                This however, is not:

                int Square( int X )
                {
                if ( X < 0 ) { X = -X ; }
                int Y = 1 ;
                int DY = 3 ;
                for ( int i = 1 ; i < X ; i++ )
                {
                Y += DY ;
                DY += 2 ;
                }
                return Y ;
                }

                These methods both do the same thing (if you don't believe me, test them!).
                But, in the 'old days' the former method would have been the wrong way to go
                for small values of X since in those the days people counted clock cycles,
                and the multiply instruction took about 50-100 longers to do than the add
                instruction.

                But since computers, through piping and other methods, have gotten adds and
                mutliplies down to the same 1-cycle speed, nowadays the algorithmically
                simpler method is the way to go. That's because today this algorithmically
                simpler method executes faster than the 'clever' one (and as a side benefit
                saves memory, is more intuitive, is less prone to error, and can be
                programmed in less time).

                So, theoretically you are correct. In practice, you're not-so-correct... : )

                [==P==]

                PS - A third method would be to create a lookup table by pre-calculating all
                the squares and look them up, and works for even large values of X. This is
                even more algorithmically complex as it requires a setup phase. But look-up
                tables were the norm until a few years ago when computers got fast enough to
                do things 'on the fly'. This is why we can entertain 'procedural textures'
                in 3D worlds, something that would have been algorithmic suicide to attempt
                when computers were slower and more inefficient when dealing with
                graphics...


                "Arnaud Debaene" <adebaene@clu b-internet.fr> wrote in message
                news:OM3rtROBGH A.3408@TK2MSFTN GP12.phx.gbl...[color=blue]
                > Peter Oliphant wrote:[color=green]
                >> Wow, you ask a simple question..... : )
                >>
                >> As I understand it, the 'old' STL containers make a distinction in
                >> this regard. List's are linked lists, and so are not necessarily
                >> contiguous in memory. Vector's are the STL equivalent of contiguous
                >> lists. The reason for both is that vector's are better at lookup
                >> performance (randomly accessible and therefore a fixed length of
                >> lookup time); but suffer when being deleted from or resized, since
                >> this is often done by moving the entire end of the list and
                >> allocating a bigger vector somewhere else and copying over,
                >> respectively. List's are worse at lookup performance since they
                >> require serial lookup (linking thorugh the list from the header,
                >> which is takes a variable length of time based on 'distance' from
                >> header), but benefit when it comes to deletion and resizing since an
                >> element can be removed without moving anything else, and resizing
                >> just means finding a large enough memory space anywhere for the added
                >> element.[/color]
                > And there is also in intermediate "mix" between the two : the deque.
                >[color=green]
                >>
                >> My opinion is this: with a faster computer and more memory it really
                >> doesn't matter which method is used! : )[/color]
                > This is of course basically wrong : processor speed has never changed
                > anything to algorithm complexity. When manipulating great number of
                > elements (several tens of thousands), the difference between logarithmic
                > and linear complexity is as important today as it was 10 years ago.
                >
                > Arnaud
                > MVP - VC
                >
                >[/color]


                Comment

                • alextha@microsoft.com

                  #23
                  Re: are arrays contiguous in memory?

                  One thing that is missing from this discussion is the importance of cache effects when iterating. If your vector accesses are distributed all over memory like they might be with a linked list that has been sorted, then your cache miss rate will be significantly higher than if you iterate over a contiguous block of memory. This effect is drastic because L1 cache can be well over 20 times faster than RAM.

                  Another thing here is the fact that modern compilers have extremely good optimizations for looping over contiguous arrays because the compiler knows that it is working with contiguous memory.

                  That said, depending on what you are doing, it is not always worth it to sacrifice an order of complexity on an algorithm just to get these benefits. The two things I describe above are only constant time improvements, while dispersed memory vectors (such as linked lists) can sometimes give you order N or greater improvements over contiguous arrays.

                  We kind of got off topic from the original question but this is a good discussion!

                  -Alex

                  -----Original Message-----
                  From: Arnaud Debaene
                  Posted At: Monday, December 19, 2005 1:38 PM
                  Posted To: microsoft.publi c.dotnet.langua ges.vc
                  Conversation: are arrays contiguous in memory?
                  Subject: Re: are arrays contiguous in memory?


                  Peter Oliphant wrote:[color=blue]
                  > Wow, you ask a simple question..... : )
                  >
                  > As I understand it, the 'old' STL containers make a distinction in
                  > this regard. List's are linked lists, and so are not necessarily
                  > contiguous in memory. Vector's are the STL equivalent of contiguous
                  > lists. The reason for both is that vector's are better at lookup
                  > performance (randomly accessible and therefore a fixed length of
                  > lookup time); but suffer when being deleted from or resized, since
                  > this is often done by moving the entire end of the list and
                  > allocating a bigger vector somewhere else and copying over,
                  > respectively. List's are worse at lookup performance since they
                  > require serial lookup (linking thorugh the list from the header,
                  > which is takes a variable length of time based on 'distance' from
                  > header), but benefit when it comes to deletion and resizing since an
                  > element can be removed without moving anything else, and resizing
                  > just means finding a large enough memory space anywhere for the added
                  > element.[/color]
                  And there is also in intermediate "mix" between the two : the deque.
                  [color=blue]
                  >
                  > My opinion is this: with a faster computer and more memory it really
                  > doesn't matter which method is used! : )[/color]
                  This is of course basically wrong : processor speed has never changed
                  anything to algorithm complexity. When manipulating great number of elements
                  (several tens of thousands), the difference between logarithmic and linear
                  complexity is as important today as it was 10 years ago.

                  Arnaud
                  MVP - VC

                  Comment

                  • Brian Muth

                    #24
                    Re: are arrays contiguous in memory?

                    > These methods both do the same thing (if you don't believe me, test[color=blue]
                    > them!). But, in the 'old days' the former method would have been the wrong
                    > way to go for small values of X since in those the days people counted
                    > clock cycles, and the multiply instruction took about 50-100 longers to
                    > do than the add instruction.
                    >
                    > But since computers, through piping and other methods, have gotten adds
                    > and mutliplies down to the same 1-cycle speed, nowadays the
                    > algorithmically simpler method is the way to go. That's because today this
                    > algorithmically simpler method executes faster than the 'clever' one (and
                    > as a side benefit saves memory, is more intuitive, is less prone to error,
                    > and can be programmed in less time).
                    >
                    > So, theoretically you are correct. In practice, you're not-so-correct...
                    > : )[/color]

                    This argument is entirely irrelevant.

                    Arnaud has correctly pointed out that lookups with lists are of O(n) whereas
                    lookups with vectors are of constant order. Advances in processor speeds
                    don't change this. Advances in floating point arithmetic don't change this.
                    He has correctly challenged your statement that "with a faster computer and
                    more memory it really doesn't matter which method is used!". Indeed, for
                    large amounts of data, it matters a lot.

                    Best regards,

                    Brian


                    Comment

                    • Holger Grund

                      #25
                      Re: are arrays contiguous in memory?

                      "Brandon Bray [MSFT]" <branbray@onlin e.microsoft.com > wrote
                      [color=blue]
                      > Of course, that's a trade off between safety and performance. One should
                      > be very careful when making that decision. Because the JIT compiler has
                      > the ability to eliminate bounds checks in some cases, switching to
                      > pointers is not always a worthwhile exercise. Though, it is definitely
                      > useful in some cases where bitmap manipulation is done. This is probably
                      > the only reason why interior pointers exist in the new C++. If it wasn't
                      > for this case, I probably would have succeeded at keeping them out of the
                      > language entirely.
                      >[/color]
                      Ok, maybe I got too many islays today. But you can't be serious about that,
                      can you?

                      Am I missing something or do you say (plain) pointers are (and have always
                      been)
                      useless? What about by-ref params in managed signatures?

                      -hg


                      Comment

                      • Tamas Demjen

                        #26
                        Re: are arrays contiguous in memory?

                        Willy Denoyette [MVP] wrote:
                        [color=blue]
                        > That's true, though the overhead is real small (something like a cmp and a
                        > jmp) and only important to consider for some operations on arrays.[/color]

                        In most cases it is small, but if you have 200 million bytes to iterate
                        through in a large scanned image (let's say to apply a lookup table to
                        each pixel), it's better to go back to unmanaged. It's not that hard to
                        make such operations pretty safe. So it's good to know that a pinned
                        array is contiguous. In an average algorithm with 100 items in an array
                        the performance most likely doesn't matter and it's better to be on the
                        safe side.

                        Tom

                        Comment

                        • David Wilkinson

                          #27
                          Re: are arrays contiguous in memory?

                          Peter Oliphant wrote:[color=blue][color=green]
                          >>This is of course basically wrong : processor speed has never changed
                          >>anything to algorithm complexity. When manipulating great number of
                          >>elements (several tens of thousands), the difference between logarithmic
                          >>and linear complexity is as important today as it was 10 years ago.[/color]
                          >
                          >
                          > Since in my days I've had to write 'clever' code to combat slow computer
                          > speed, I have to disagree with any blanketing statement along the lines of
                          > "processor speed has never changed anything to algorithm complexity".
                          >
                          > For example, this is very algorithmically simple:
                          >
                          > int Square( int X )
                          > {
                          > return X * X ;
                          > }
                          >
                          > This however, is not:
                          >
                          > int Square( int X )
                          > {
                          > if ( X < 0 ) { X = -X ; }
                          > int Y = 1 ;
                          > int DY = 3 ;
                          > for ( int i = 1 ; i < X ; i++ )
                          > {
                          > Y += DY ;
                          > DY += 2 ;
                          > }
                          > return Y ;
                          > }
                          >
                          > These methods both do the same thing (if you don't believe me, test them!).
                          > But, in the 'old days' the former method would have been the wrong way to go
                          > for small values of X since in those the days people counted clock cycles,
                          > and the multiply instruction took about 50-100 longers to do than the add
                          > instruction.
                          >[/color]
                          [snip]

                          Peter:

                          Cute, but actually your function does not work for X=0. A correct (and
                          likely more efficient) version of the same idea is:

                          int Square( int X )
                          {
                          if ( X < 0 ) { X = -X ; }
                          int Y = 0 ;
                          for ( int i = 1 ; i < X ; i++ )
                          {
                          Y += i ;
                          }
                          return Y + Y + X ;
                          }

                          A simple way to see that this is correct is to observe that Y is a sum
                          of (X-1) terms whose average is X/2 (average of first and last term).

                          David Wilkinson

                          Comment

                          • Peteroid

                            #28
                            Re: are arrays contiguous in memory?

                            > He has correctly challenged your statement that "with a faster computer[color=blue]
                            > and more memory it really doesn't matter which method is used!". Indeed,
                            > for large amounts of data, it matters a lot.[/color]

                            Except this statement is taken out of context. My ORIGINAL statement was
                            made in an obviously jovial manner (note smiley at end that has been omitted
                            since), it was about the SPECIFIC problem regarding contiguous memory (the
                            topic of this thread), and the 'challenge' was then made in the GENERAL
                            case. That's not fair... : )

                            To be serious here, we are talking about two different things here. I'm
                            talking about how processor speed makes a big difference in algorithmic
                            design choice in most real world applications. You guys are talking about
                            computer science. Apples and ibms - er - oranges...

                            [==P==]


                            "Brian Muth" <bmuth@mvps.org > wrote in message
                            news:ulmr4dPBGH A.2040@TK2MSFTN GP14.phx.gbl...[color=blue][color=green]
                            >> These methods both do the same thing (if you don't believe me, test
                            >> them!). But, in the 'old days' the former method would have been the
                            >> wrong way to go for small values of X since in those the days people
                            >> counted clock cycles, and the multiply instruction took about 50-100
                            >> longers to do than the add instruction.
                            >>
                            >> But since computers, through piping and other methods, have gotten adds
                            >> and mutliplies down to the same 1-cycle speed, nowadays the
                            >> algorithmically simpler method is the way to go. That's because today
                            >> this algorithmically simpler method executes faster than the 'clever' one
                            >> (and as a side benefit saves memory, is more intuitive, is less prone to
                            >> error, and can be programmed in less time).
                            >>
                            >> So, theoretically you are correct. In practice, you're not-so-correct...
                            >> : )[/color]
                            >
                            > This argument is entirely irrelevant.
                            >
                            > Arnaud has correctly pointed out that lookups with lists are of O(n)
                            > whereas lookups with vectors are of constant order. Advances in processor
                            > speeds don't change this. Advances in floating point arithmetic don't
                            > change this. He has correctly challenged your statement that "with a
                            > faster computer and more memory it really doesn't matter which method is
                            > used!". Indeed, for large amounts of data, it matters a lot.
                            >
                            > Best regards,
                            >
                            > Brian
                            >
                            >[/color]


                            Comment

                            • Rodrigo Corral [MVP]

                              #29
                              Re: are arrays contiguous in memory?

                              I agree that pin is not needed.
                              The point in the article is that you get better performance using pointer
                              arithmetic, so in some scenarios to use pointer arithmetic is valuable.

                              Carl Daniel wrote:

                              "Not of much (any?) value unless you're writing unmanaged code."

                              I wanted to show that pointer arithmetic is valuable even in unmanaged code.

                              --
                              Un saludo
                              Rodrigo Corral González [MVP]

                              FAQ de microsoft.publi c.es.vc++



                              Comment

                              • Peter Oliphant

                                #30
                                Re: are arrays contiguous in memory?

                                Dang, you're right! I needed to add the line at the top:

                                if ( X == 0 ) { return 0 ; }

                                And your Square( ) method is cool too! : )

                                [==P==]

                                "David Wilkinson" <no-reply@effisols. com> wrote in message
                                news:O2nkFERBGH A.532@TK2MSFTNG P15.phx.gbl...[color=blue]
                                > Peter Oliphant wrote:[color=green][color=darkred]
                                >>>This is of course basically wrong : processor speed has never changed
                                >>>anything to algorithm complexity. When manipulating great number of
                                >>>elements (several tens of thousands), the difference between logarithmic
                                >>>and linear complexity is as important today as it was 10 years ago.[/color]
                                >>
                                >>
                                >> Since in my days I've had to write 'clever' code to combat slow computer
                                >> speed, I have to disagree with any blanketing statement along the lines
                                >> of "processor speed has never changed anything to algorithm complexity".
                                >>
                                >> For example, this is very algorithmically simple:
                                >>
                                >> int Square( int X )
                                >> {
                                >> return X * X ;
                                >> }
                                >>
                                >> This however, is not:
                                >>
                                >> int Square( int X )
                                >> {
                                >> if ( X < 0 ) { X = -X ; }
                                >> int Y = 1 ;
                                >> int DY = 3 ;
                                >> for ( int i = 1 ; i < X ; i++ )
                                >> {
                                >> Y += DY ;
                                >> DY += 2 ;
                                >> }
                                >> return Y ;
                                >> }
                                >>
                                >> These methods both do the same thing (if you don't believe me, test
                                >> them!). But, in the 'old days' the former method would have been the
                                >> wrong way to go for small values of X since in those the days people
                                >> counted clock cycles, and the multiply instruction took about 50-100
                                >> longers to do than the add instruction.
                                >>[/color]
                                > [snip]
                                >
                                > Peter:
                                >
                                > Cute, but actually your function does not work for X=0. A correct (and
                                > likely more efficient) version of the same idea is:
                                >
                                > int Square( int X )
                                > {
                                > if ( X < 0 ) { X = -X ; }
                                > int Y = 0 ;
                                > for ( int i = 1 ; i < X ; i++ )
                                > {
                                > Y += i ;
                                > }
                                > return Y + Y + X ;
                                > }
                                >
                                > A simple way to see that this is correct is to observe that Y is a sum of
                                > (X-1) terms whose average is X/2 (average of first and last term).
                                >
                                > David Wilkinson[/color]


                                Comment

                                Working...