static char arrays size

  • daniel

    static char arrays size

    Hello,


    I always had the feeling that it is better to have char arrays with a
    size equal to a power of two.

    For example:
    char a_str[128]; // feels ok
    char b_str[100]; // feels not ok.

    Is that true, and if it is true, then what is the reason behind this? (I
    suppose it is something related to memory alignment, but it's not very
    clear in my mind.)

    A detailed explanation would be very useful.

    Thx,
    Daniel.
  • Bartc

    #2
    Re: static char arrays size


    "daniel" <daniel.baluta@ gmail.comwrote in message
    news:691361da-a884-4bde-b260-7ec77c61bbda@79 g2000hsk.google groups.com...
    Hello ,
    >
    >
    I always had the feeling that is better to have char arrays with the
    size equal to a power of two.
    >
    For example:
    char a_str[128]; // feels ok
    char b_str[100]; //feels not ok.
    >
    Is that true , and if is true then what is the reason behind this? ( I
    suppose that is something related to memory alignment but it's not
    very clear in my mind ).
    You're thinking about array element sizes, where the calculation of the
    address of a[i] (or the offset from a[0]) can be simpler when the size of
    each element of a is a power of two.

    (Although alignment can also come into it, when the element is a struct for
    example. But this should be taken care of for you.)
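
    A minimal sketch of the difference, assuming 4-byte int and no struct
    padding (the element types here are made up purely for illustration):

    #include <stddef.h>

    struct pow2_elem { int a, b, c, d; };   /* 16 bytes per element */
    struct odd_elem  { int a, b, c; };      /* 12 bytes per element */

    int get_pow2(const struct pow2_elem *arr, size_t i)
    {
        /* the byte offset of arr[i] is i*16, which the compiler can
           compute with a shift: i << 4 */
        return arr[i].a;
    }

    int get_odd(const struct odd_elem *arr, size_t i)
    {
        /* the byte offset of arr[i] is i*12, which needs a multiply
           (or a shift-and-add sequence) */
        return arr[i].a;
    }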

    --
    Bartc


    • James Kuyper

      #3
      Re: static char arrays size

      daniel wrote:
      > Hello,
      >
      > I always had the feeling that it is better to have char arrays with a
      > size equal to a power of two.
      >
      > For example:
      > char a_str[128]; // feels ok
      > char b_str[100]; // feels not ok.
      >
      > Is that true, and if it is true, then what is the reason behind this? (I
      > suppose it is something related to memory alignment, but it's not very
      > clear in my mind.)
      >
      > A detailed explanation would be very useful.
      An array should always be big enough to hold the biggest thing that you
      plan to put in it. Making it any bigger than that is, in general,
      wasteful. Therefore, you only have a choice to make if you're still at
      the point in the design process where you get to decide what you plan to
      put into that array, and where the answer to that question is not forced
      upon you by other considerations.

      In that case, a length that is a power of 2 does indeed provide a minor
      benefit. Many computer systems push around data more efficiently when it
      has a size which is a multiple of a certain number of bytes. There's no
      guarantees that any such sizes exist, and there's no guarantees that
      they will be powers of 2. However, as a practical matter the special
      sizes are almost always powers of 2.

      Since every power of 2 is a multiple of all smaller powers of 2,
      rounding an array size up to the next power of 2 is a choice that in
      general won't hurt, and on many systems will slightly improve the
      efficiency of your code. However, I would strongly recommend not making
      a big deal about this. Just about any other reason for choosing a
      particular size for the array is going to be more important than this
      one. Use this criterion only when all other constraints still leave you
      with a choice to make.

      Please note that this doesn't apply just to arrays of char; it applies
      to all arrays.
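
      For what it's worth, "round up to the next power of 2" amounts to the
      following rule (a trivial helper of my own, e.g. for sizing an
      allocated buffer; it assumes n > 0 and that the result fits in a
      size_t):

      #include <stddef.h>

      static size_t round_up_pow2(size_t n)
      {
          size_t p = 1;
          while (p < n)   /* keep doubling until p reaches or passes n */
              p *= 2;
          return p;       /* round_up_pow2(100) == 128, round_up_pow2(128) == 128 */
      }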


      • Bartc

        #4
        Re: static char arrays size


        "James Kuyper" <jameskuyper@ve rizon.netwrote in message
        news:rxVnk.739$ 7N1.719@trnddc0 6...
        daniel wrote:
        >Hello ,
        >>
        >>
        >I always had the feeling that is better to have char arrays with the
        >size equal to a power of two.
        >>
        >For example:
        > char a_str[128]; // feels ok
        > char b_str[100]; //feels not ok.
        >>
        >Is that true , and if is true then what is the reason behind this? ( I
        >suppose that is something related to memory alignment but it's not
        >very clear in my mind ).
        >>
        >A detailed explanation would be very useful.
        In that case, a length that is a power of 2 does indeed provide a minor
        benefit. Many computer systems push around data more efficiently when it
        has a size which is a multiple of a certain number of bytes.
        I don't quite understand how having an unused 28 bytes on the end of a
        100-byte array is going to help.

        If a machine can do 8-byte moves then there might be a case for a 104-byte
        array size instead of 100, because of the fiddly 4 bytes at the end, but no
        need for 128. That's assuming C can move around entire arrays, which it
        can't directly. And anyway this is a matter for the compiler to worry about.
        > Since every power of 2 is a multiple of all smaller powers of 2, rounding
        > an array size up to the next power of 2 is a choice that in general won't
        > hurt, and on many systems will slightly improve the efficiency of your
        > code.
        Rounding a 1048577-byte array up to 2097152 bytes I think would definitely
        hurt.

        --
        Bartc


        • James Kuyper

          #5
          Re: static char arrays size

          Bartc wrote:
          >
          > "James Kuyper" <jameskuyper@verizon.net> wrote in message
          > news:rxVnk.739$7N1.719@trnddc06...
          >> daniel wrote:
          >>> Hello,
          >>>
          >>> I always had the feeling that it is better to have char arrays with a
          >>> size equal to a power of two.
          >>>
          >>> For example:
          >>> char a_str[128]; // feels ok
          >>> char b_str[100]; // feels not ok.
          >>>
          >>> Is that true, and if it is true, then what is the reason behind this? (I
          >>> suppose it is something related to memory alignment, but it's not very
          >>> clear in my mind.)
          >>>
          >>> A detailed explanation would be very useful.
          >>
          >> In that case, a length that is a power of 2 does indeed provide a
          >> minor benefit. Many computer systems push around data more efficiently
          >> when it has a size which is a multiple of a certain number of bytes.
          >
          > I don't quite understand how having an unused 28 bytes on the end of a
          > 100-byte array is going to help.
          It won't. The only case where you should ever use this criterion for
          deciding the length of the array is if you're still at the point of
          deciding whether to design the program to work with a maximum of 100
          bytes or a maximum of 128 bytes. I'm assuming that, regardless of which
          decision is made, there will be cases which use the remaining 28 bytes;
          cases which, at that point in the design cycle, you're still free to
          decide whether the program should handle normally, or treat as
          exceptions to be handled by some other method.

          Prime example: fields meant to hold people's names. There's no
          reasonable fixed length that can deal with every possible name. If
          variable-length fields are not acceptable (which is often the case),
          then the exact length you should use is fairly arbitrary, and might as
          well be chosen based in part upon the minor efficiencies of power-of-2.
          > If a machine can do 8-byte moves then there might be a case for a
          > 104-byte array size instead of 100, because of the fiddly 4 bytes at the
          > end, but no need for 128.
          You're assuming 8 bytes is the special number, and for some systems, for
          some purposes, it is. However, there are many different sizes that carry
          special advantages, for different purposes. I've known the technical
          details of systems where sizes of 16 bytes, 128 bytes, 256 bytes and
          1024 bytes have been relevant, for a variety of different reasons. When
          I consider the small number of machines which I'm personally familiar
          with, compared to the huge variety of real-world machines for which C
          code is written, I wouldn't recommend ignoring the possibility that any
          particular power of 2 might be relevant on some system, somewhere. On
          the flip side, I wouldn't recommend attaching any great significance to
          that possibility, either.
          > That's assuming C can move around entire
          > arrays, which it can't directly.
          I'm not sure what you mean by that comment. I use memcpy() frequently
          for that purpose. Some versions of memcpy() do work more efficiently on
          objects which have particular alignments, and a size which is a multiple
          of that alignment, which is usually a power of 2.
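
          For example, I mean nothing more exotic than this (the size and the
          names are chosen arbitrarily):

          #include <string.h>

          #define N 128

          void copy_buf(char dst[N], const char src[N])
          {
              /* copies the whole array in one call; a good memcpy can use
                 wide, aligned moves when N is a multiple of the machine's
                 preferred transfer size */
              memcpy(dst, src, N);
          }
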
          > ... And anyway this is a matter for the
          > compiler to worry about.
          The issue I'm thinking about cannot be decided by the compiler; it is
          inherently exclusively in the domain of the designer of a program. You
          may be thinking of a different issue than I am.
          >> Since every power of 2 is a multiple of all smaller powers of 2,
          >> rounding an array size up to the next power of 2 is a choice that in
          >> general won't hurt, and on many systems will slightly improve the
          >> efficiency of your code.
          >
          > Rounding a 1048577-byte array up to 2097152 bytes I think would
          > definitely hurt.
          If you know that 1048577 bytes is guaranteed to be sufficient, that's
          the size you should use. The requirements for reasonable use of this
          reason for selecting the size of an array to be pow(2,N) are as follows:
          the chance that a size greater than 'n' will be needed is unacceptably
          high for n==pow(2,N-1), and is acceptably low at n==pow(2,N+1), and the
          transition point between "unacceptably high" and "acceptably low" is
          sufficiently poorly known to justify estimating it as pow(2,N).

          To use my example above; if I had a complete list of every name that
          might ever be entered in my database, or at least a statistical summary
          of the characteristics of such a list, and sufficient knowledge of how
          those names might be used, I could put together a mathematical model
          that calculates the costs and benefits of using different lengths for
          the name field, and solve that model to determine the optimal length. If
          the issue was sufficiently important, that is precisely what I would do,
          and I would set the field length to that optimum. However, it's
          relatively rare to have that kind of information available for free-form
          text fields, and it's not unusual to have a similar lack of information
          for other kinds of arrays. Even if it is technically feasible to collect
          the relevant information, it might be unacceptably expensive. Under such
          circumstances, rounding a rough estimate of the space needed to the next
          power of 2 is entirely reasonable.


          • Flash Gordon

            #6
            Re: static char arrays size

            Bartc wrote, On 11/08/08 11:57:
            >
            > "daniel" <daniel.baluta@gmail.com> wrote in message
            > news:691361da-a884-4bde-b260-7ec77c61bbda@79g2000hsk.googlegroups.com...
            >> Hello,
            >>
            >> I always had the feeling that it is better to have char arrays with a
            >> size equal to a power of two.
            >>
            >> For example:
            >> char a_str[128]; // feels ok
            >> char b_str[100]; // feels not ok.
            >>
            >> Is that true, and if it is true, then what is the reason behind this? (I
            >> suppose it is something related to memory alignment, but it's not very
            >> clear in my mind.)
            >
            > You're thinking about array element sizes, where the calculation of the
            > address of a[i] (or the offset from a[0]) can be simpler when the size of
            > each element of a is a power of two.
            I've not come across that being an issue.
            > (Although alignment can also come into it, when the element is a struct
            > for example. But this should be taken care of for you.)
            Indeed.

            The one (and only) time where I seriously go for a power of two size is
            when I'm implementing a circular buffer. Then I know that my "modulus
            buffer_size" operation can be reduced to applying a simple mask. I still
            use the modulus operator and leave it up to the compiler to optimise it
            so that if someone changes the buffer size it all continues to work even
            if the efficiency is reduced.
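
            A stripped-down sketch of what I mean (the names and the size are
            mine, and the full/empty handling is left out):

            #include <stddef.h>

            #define BUFFER_SIZE 256   /* power of two */

            static unsigned char buffer[BUFFER_SIZE];
            static size_t head, tail;

            void buf_put(unsigned char c)
            {
                buffer[head] = c;
                /* written as a modulus; with a power-of-two size the compiler
                   can reduce this to head = (head + 1) & (BUFFER_SIZE - 1),
                   yet the code stays correct if someone later changes
                   BUFFER_SIZE to, say, 100 */
                head = (head + 1) % BUFFER_SIZE;
            }

            unsigned char buf_get(void)
            {
                unsigned char c = buffer[tail];
                tail = (tail + 1) % BUFFER_SIZE;
                return c;
            }
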
            --
            Flash Gordon


            • Bartc

              #7
              Re: static char arrays size

              "Flash Gordon" <spam@flash-gordon.me.ukwro te in message
              news:bgf6n5x4ak .ln2@news.flash-gordon.me.uk...
              Bartc wrote, On 11/08/08 11:57:
              >>
              >"daniel" <daniel.baluta@ gmail.comwrote in message
              >news:691361d a-a884-4bde-b260-7ec77c61bbda@79 g2000hsk.google groups.com...
              >>Hello ,
              >>>
              >>>
              >>I always had the feeling that is better to have char arrays with the
              >>size equal to a power of two.
              >>>
              >>For example:
              >> char a_str[128]; // feels ok
              >> char b_str[100]; //feels not ok.
              >>>
              >>Is that true , and if is true then what is the reason behind this? ( I
              >>suppose that is something related to memory alignment but it's not
              >>very clear in my mind ).
              >>
              >You're thinking about array element sizes, where the calculation of the
              >address of a[i] (or the offset from a[0]) can be simpler when the size of
              >each element of a is a power of two.
              >
              I've not come across that being an issue.
              It can mean the difference between having to multiply to get at an array
              element, and simply shifting (or, sometimes, just using a scale factor in
              the address mode of the output code).
              > The one (and only) time where I seriously go for a power of two size is
              > when I'm implementing a circular buffer. Then I know that my "modulus
              > buffer_size" operation can be reduced to applying a simple mask. I still
              > use the modulus operator and leave it up to the compiler to optimise it so
              > that if someone changes the buffer size it all continues to work even if
              > the efficiency is reduced.
              I sometimes do the same thing for hashtables.
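
              A sketch of the same idea for a table lookup (the table size and the
              hash type are just placeholders):

              #include <stddef.h>

              #define TABLE_SIZE 1024   /* power of two */

              static size_t slot_for(unsigned long hash)
              {
                  /* with a power-of-two table size, hash % TABLE_SIZE is
                     just a mask */
                  return hash & (TABLE_SIZE - 1);
              }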

              But unless there's a good reason for it, there's no particular need for
              power-of-two array sizes.

              In the past I did do this a lot, for good reasons (for example an array of
              4096 bytes would exactly fit into one page of my pseudo-virtual memory
              system, or I had exactly N bits available for indexing so might as well use
              a full 2**N elements).


              --
              Bartc


              • christian.bau

                #8
                Re: static char arrays size

                On Aug 11, 9:44 pm, "Bartc" <b...@freeuk.com> wrote:
                > It can mean the difference between having to multiply to get at an array
                > element, and simply shifting (or, sometimes, just using a scale factor in
                > the address mode of the output code).
                The size of the array wouldn't make a difference to the code needed to
                access an array element. Where it makes a difference is with
                two-dimensional arrays, or an array of structs where you can by design
                influence whether the struct size is a power of two or not.

                HOWEVER, there are processors around that really dislike consecutive
                memory accesses where the distance between consecutive accesses is a
                large power of two. An example that I found was one platform where I
                measured the time for a straightforward matrix multiplication, and
                multiplying two matrices of size 128 x 128 took seven times longer
                than the case 127 x 127 or 129 x 129.
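
                It was essentially the straightforward triple loop below (a sketch,
                not the exact code I timed). With N a power of two, the b[k][j]
                accesses step through memory with a stride of N * sizeof(double),
                which is the kind of power-of-two distance that can keep hitting the
                same cache sets:

                #define N 128   /* try 127 or 129 for comparison */

                static double a[N][N], b[N][N], c[N][N];

                void matmul(void)
                {
                    for (int i = 0; i < N; i++)
                        for (int j = 0; j < N; j++) {
                            double sum = 0.0;
                            /* b[k][j] is read with a stride of N * sizeof(double) */
                            for (int k = 0; k < N; k++)
                                sum += a[i][k] * b[k][j];
                            c[i][j] = sum;
                        }
                }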
