Array Performance: fix-size vs dynamic

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • iammilind
    New Member
    • Sep 2006
    • 14

    Array Performance: fix-size vs dynamic

    In following codes, both the places 'p' is a global variable. In 1st case it's an array and 2nd case it's a pointer to an array. Do we get any performance benefit, while accessing 'p[ i ]' in the fixed size case ?

    Fixed size array:
    ============
    int p[1000];
    void main ( )
    {
    for(int i = 0; i < 1000; i++)
    p[ i ] = i;
    }

    Dynamic array:
    ===========
    int *p = new int[1000];
    void main ( )
    {
    for(int i = 0; i < 1000; i++)
    p[ i ] = i;
    }
  • code green
    Recognized Expert Top Contributor
    • Mar 2007
    • 1726

    #2
    The first would allocate memoery on the stack,
    The second would allocate memory on the heap.
    Stack access should be faster.
    Why not run a benchmark test
    Code:
    int main() 
    { 
        clock_t start = clock(); 
        int p[10000];
        for(int i = 0; i < 10000; i++)
            p[ i ] = i;
    
        clock_t duration = clock() - start; 
        cout << "stack time " << duration << "\n"; 
        
        start = clock(); 
        int *p = new int[10000];
        for(int i = 0; i < 10000; i++)
            p[ i ] = i;
     
        duration = clock() - start; 
        cout << "heap time " << duration << "\n"; 
    }

    Comment

    • Banfa
      Recognized Expert Expert
      • Feb 2006
      • 9067

      #3
      Originally posted by code green
      Stack access should be faster.
      Complete rubbish. Heap and stack access times are platform dependent, since platforms are not even required to have them.

      However the majority of platforms that have a heap and a stack use the same type of memory to store them and therefore the access times will be the same.

      It may take marginally longer to allocate data from the heap.

      All as verified by running you code on my PC.

      Comment

      • code green
        Recognized Expert Top Contributor
        • Mar 2007
        • 1726

        #4
        Complete rubbish.
        It may take marginally longer to allocate data from the heap.
        Yes I didn't read the post properly and was thinking more about memory allocation instead os access

        Comment

        • iammilind
          New Member
          • Sep 2006
          • 14

          #5
          @Banfa
          Please note that here the comparison is not happening between stack and heap. But the data segment and the heap.
          For this case, assume that we are not accounting the original heap allocation.
          'p' in the 1st case is at data segment; thus its address would be known by compiler.
          'p' in the 2nd case at heap segment is also known; but it's pointing to some location.
          I was just wondering at assembly level, is it possible that the data segment 'p' might become a little faster ?
          I am unsure about it.

          Comment

          • Banfa
            Recognized Expert Expert
            • Feb 2006
            • 9067

            #6
            Originally posted by iammilind
            @Banfa
            Please note that here the comparison is not happening between stack and heap. But the data segment and the heap.
            Not really relevant or if anything even less relevant. The "data segment" can mean a number of things but most commonly either
            1. The area of memory containing heap segment and uninitialised program global/static variables (which will be initialised to 0) called bss and initialised program global/static variables called data.
            2. Initialised program global/static variables.


            Sometimes if using definition 1 the stack segment is included as part of the data segment. Regardless all of those segments generally (and for a PC are specifically) stored in the same type of memory. Once you have the pointer to that memory the method used to allocate that pointer, be it stack, heap or program data is irrelevance to the speed at which the memory is accessed.

            Comment

            • Banfa
              Recognized Expert Expert
              • Feb 2006
              • 9067

              #7
              Take this program

              Code:
              int ugarray[5];
              int igarray[5] = {10,9,8,7,6};
              
              int get1(int *array);
              
              int main()
              {
              	int sarray[5];
              	int* harray = new int[5];
              	int ix;
              
              	ugarray[1] = 4;
              	igarray[1] = 4;
              	sarray[1] = 4;
              	harray[1] = 4;
              
              	ix = ugarray[1];
              	ix = igarray[1];
              	ix = sarray[1];
              	ix = harray[1];
              
              	get1(ugarray);
              	get1(igarray);
              	get1(sarray);
              	get1(harray);
              
              	delete harray;
              
                return 0;
              }
              
              int get1(int *array)
              {
              	return array[1];
              }
              It creates all types of arrays and accesses them for both read and write.

              Take a look at the assembler
              Code:
              int main()
              {
              0042D660  push        ebp  
              0042D661  mov         ebp,esp 
              0042D663  sub         esp,10Ch 
              0042D669  push        ebx  
              0042D66A  push        esi  
              0042D66B  push        edi  
              0042D66C  lea         edi,[ebp-10Ch] 
              0042D672  mov         ecx,43h 
              0042D677  mov         eax,0CCCCCCCCh 
              0042D67C  rep stos    dword ptr es:[edi] 
              	int sarray[5];
              	int* harray = new int[5];
              0042D67E  push        14h  
              0042D680  call        operator new (42C072h) 
              0042D685  add         esp,4 
              0042D688  mov         dword ptr [ebp-108h],eax 
              0042D68E  mov         eax,dword ptr [ebp-108h] 
              0042D694  mov         dword ptr [harray],eax 
              	int ix;
              
              	ugarray[1] = 4;
              0042D697  mov         dword ptr [ugarray+4 (495504h)],4 
              	igarray[1] = 4;
              0042D6A1  mov         dword ptr [igarray+4 (494004h)],4 
              	sarray[1] = 4;
              0042D6AB  mov         dword ptr [ebp-14h],4 
              	harray[1] = 4;
              0042D6B2  mov         eax,dword ptr [harray] 
              0042D6B5  mov         dword ptr [eax+4],4 
              
              	ix = ugarray[1];
              0042D6BC  mov         eax,dword ptr [ugarray+4 (495504h)] 
              0042D6C1  mov         dword ptr [ix],eax 
              	ix = igarray[1];
              0042D6C4  mov         eax,dword ptr [igarray+4 (494004h)] 
              0042D6C9  mov         dword ptr [ix],eax 
              	ix = sarray[1];
              0042D6CC  mov         eax,dword ptr [ebp-14h] 
              0042D6CF  mov         dword ptr [ix],eax 
              	ix = harray[1];
              0042D6D2  mov         eax,dword ptr [harray] 
              0042D6D5  mov         ecx,dword ptr [eax+4] 
              0042D6D8  mov         dword ptr [ix],ecx 
              
              	get1(ugarray);
              0042D6DB  push        offset ugarray (495500h) 
              0042D6E0  call        get1 (42C32Eh) 
              0042D6E5  add         esp,4 
              	get1(igarray);
              0042D6E8  push        offset igarray (494000h) 
              0042D6ED  call        get1 (42C32Eh) 
              0042D6F2  add         esp,4 
              	get1(sarray);
              0042D6F5  lea         eax,[sarray] 
              0042D6F8  push        eax  
              0042D6F9  call        get1 (42C32Eh) 
              0042D6FE  add         esp,4 
              	get1(harray);
              0042D701  mov         eax,dword ptr [harray] 
              0042D704  push        eax  
              0042D705  call        get1 (42C32Eh) 
              0042D70A  add         esp,4 
              
              	delete harray;
              0042D70D  mov         eax,dword ptr [harray] 
              0042D710  mov         dword ptr [ebp-0FCh],eax 
              0042D716  mov         ecx,dword ptr [ebp-0FCh] 
              0042D71C  push        ecx  
              0042D71D  call        operator delete (42B5EBh) 
              0042D722  add         esp,4 
              
                return 0;
              0042D725  xor         eax,eax 
              }
              0042D727  push        edx  
              0042D728  mov         ecx,ebp 
              0042D72A  push        eax  
              0042D72B  lea         edx,[ (42D74Ch)] 
              0042D731  call        @ILT+1470(@_RTC_CheckStackVars@8) (42B5C3h) 
              0042D736  pop         eax  
              0042D737  pop         edx  
              0042D738  pop         edi  
              0042D739  pop         esi  
              0042D73A  pop         ebx  
              0042D73B  add         esp,10Ch 
              0042D741  cmp         ebp,esp 
              0042D743  call        @ILT+3420(__RTC_CheckEsp) (42BD61h) 
              0042D748  mov         esp,ebp 
              0042D74A  pop         ebp  
              0042D74B  ret
              And for get1
              Code:
              int get1(int *array)
              {
              0042D860  push        ebp  
              0042D861  mov         ebp,esp 
              0042D863  sub         esp,0C0h 
              0042D869  push        ebx  
              0042D86A  push        esi  
              0042D86B  push        edi  
              0042D86C  lea         edi,[ebp-0C0h] 
              0042D872  mov         ecx,30h 
              0042D877  mov         eax,0CCCCCCCCh 
              0042D87C  rep stos    dword ptr es:[edi] 
              	return array[1];
              0042D87E  mov         eax,dword ptr [array] 
              0042D881  mov         eax,dword ptr [eax+4] 
              }
              0042D884  pop         edi  
              0042D885  pop         esi  
              0042D886  pop         ebx  
              0042D887  mov         esp,ebp 
              0042D889  pop         ebp  
              0042D88A  ret
              A number of things are obvious (but note this only holds for the platform I am using here, Windows XP Visual Studio 2008).

              uninitialised global, initialised global and stack variables all use the same instructions for access. A heap array using these instructions too but has to start by loading the pointer to the array. That is natural the heap array is accessed via an intermediary variable.

              This might suggest that heap access will be a lot slower, it appears that it take 2 instructions to read where everything else takes 1. However a good optimising compiler is going to notice the constant loading of the pointer in harray and will endeavour, and normally succeed to optimise away most of them just leaving the read/write instruction making access to the heap about the same as anywhere else.

              Next look at get1. Everyone can pass their pointer to get1 and it executes exactly the same set of instructions. My point here is that you rarely operate directly on data objects, you most often are passing them to other functions via a pointer and once you have done that you really have completely levelled the playing field.

              Comment

              • iammilind
                New Member
                • Sep 2006
                • 14

                #8
                Hi Banfa,
                This was the exact point i was trying to make.
                In my particular requirement, I have the ability to use the data segment directly. If I use heap segment with global pointer, then as your assembly suggests, it will create one extra indirection.
                Thus the heap access will be slower compare to the direct data segment access.

                get1() example is naturally same for all segments, because we deal with a simple pointer. But I don't intend to do that.

                Thanks.

                Comment

                • Banfa
                  Recognized Expert Expert
                  • Feb 2006
                  • 9067

                  #9
                  My point is your trying to decided which is the "best" memory to use is mainly a waste of time unless you actually have 2 different banks of memory with different access times. I have worked with processors that had a small amount of on-board RAM you could use to hold some variables. In that situation there was a significant difference in access speed to the different memory banks.

                  Efficiency of the algorithm used is much more likely to have an effect than choosing the "right" memory allocation technique.

                  Accessing an array directly rather than through a pointer may be ever so marginally faster but you would be needing to do an extremely large number of accesses. By definition a heap array is always accessed through a pointer but as I said most of the time a stack or static data array is also accessed through a pointer too eliminating difference.

                  Comment

                  • iammilind
                    New Member
                    • Sep 2006
                    • 14

                    #10
                    No, you got me wrong. I am not deciding which is the better memory to use by this single example.

                    My requirement was something like following:
                    I need an array which will keep record of some info.
                    The array length is decided before the program executes, so naturally the choice will go for static array on data segment.
                    At the same time, the array length has to be configurable by user before the program starts, so naturally the choice will go for heap segment (otherwise we need to always go and edit the file and recompile it);

                    The code needs to be highly optimized. So I was wondering if heap doesn't have any performance difference compared to data segment, then I will straight away choose heap segment itself.
                    Hope I made my 'complex' requirement clear. :)

                    Comment

                    Working...