Value Bits Vs Object Bits

  • Tomás

    Value Bits Vs Object Bits


    The quantity of bits used by an unsigned integer type in memory can be
    determined by:


    typedef unsigned long long UIntType;

    CHAR_BIT * sizeof(UIntType)


    However, what would be a good portable way to determine how many of these
    bits are value bits? If possible, it would be nice to have it as a
    compile time constant.


    Here's something I played around with:


    typedef unsigned long long UIntType;


    unsigned GetQuantityValueBits(void)
    {
        /* We know it's at least 8 in any case */
        unsigned quantity = 8;

        UIntType val = 128;

        while (val <<= 1) ++quantity;

        return quantity;
    }


    Also, we could determine the number of non-value bits (padding bits,
    perhaps) from:

    sizeof(UIntType) * CHAR_BIT - GetQuantityValueBits()
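    Assembled into one compilable unit (my paraphrase of the approach above;
    the padding-bit helper name is mine, not from the post):

    ```c
    #include <limits.h>

    typedef unsigned long long UIntType;

    /* Count value bits by shifting a set bit leftward until it falls off
     * the top; starts at 8 because every unsigned type has at least that
     * many value bits. */
    unsigned GetQuantityValueBits(void)
    {
        unsigned quantity = 8;
        UIntType val = 128;

        while (val <<= 1)
            ++quantity;

        return quantity;
    }

    /* Non-value ("padding") bits, as suggested above. */
    unsigned GetQuantityPaddingBits(void)
    {
        return (unsigned)(sizeof(UIntType) * CHAR_BIT)
               - GetQuantityValueBits();
    }
    ```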


    -Tomás
  • Eric Sosman

    #2
    Re: Value Bits Vs Object Bits

    Tomás wrote:

    > The quantity of bits used by an unsigned integer type in memory can be
    > determined by:
    >
    >
    > typedef unsigned long long UIntType;
    >
    > CHAR_BIT * sizeof(UIntType)
    >
    >
    > However, what would be a good portable way to determine how many of these
    > bits are value bits? If possible, it would be nice to have it as a
    > compile time constant.

    Speaking for myself, I've never been able to discover a
    way to write a constant expression for this. However, it's
    easy enough to determine at run-time by starting with a value
    of all-ones and counting how many one-bit shifts are required
    before the value becomes zero. You could also try something
    like ceil(log((UIntType)-1)/log(2)).
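    The shift-counting method can be sketched as follows (the function name is
    mine; I stick to the integer loop because the floating-point
    ceil(log(...)) variant risks rounding error for wide types):

    ```c
    typedef unsigned long long UIntType;

    /* Start from all-ones -- every value bit set -- and count one-bit
     * shifts until the value becomes zero. */
    unsigned CountValueBits(void)
    {
        UIntType v = (UIntType)-1;
        unsigned bits = 0;

        while (v != 0) {
            v >>= 1;
            ++bits;
        }
        return bits;
    }
    ```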

    It is occasionally annoying that this information is not
    easily available. For many uses an upper bound is all that's
    needed and CHAR_BIT*sizeof(UIntType) is good enough. For a
    different class of uses a range bound is wanted, and you can
    get that from (UIntType)-1. But for things like managing
    "arrays" of single-bit flags packed into larger units, it is
    annoying that the value bits can't be counted at compile time.

    Confession: I usually ignore the possibility of P.B.s (padding bits) and
    just pretend CHAR_BIT*sizeof is the Right Answer. The nasal
    demons haven't punished me yet (for that sin, anyway), but of
    course the Standard makes no guarantee about the timeliness
    of retribution.

    --
    Eric Sosman
    esosman@acm-dot-org.invalid


    • Hallvard B Furuseth

      #3
      Re: Value Bits Vs Object Bits

      Eric Sosman writes:

      > Tomás wrote:
      >
      >> The quantity of bits used by an unsigned integer type in memory can be
      >> determined by:
      >> typedef unsigned long long UIntType;
      >> CHAR_BIT * sizeof(UIntType)
      >> However, what would be a good portable way to determine how many of
      >> these bits are value bits? If possible, it would be nice to have it as
      >> a compile time constant.
      >
      > Speaking for myself, I've never been able to discover a
      > way to write a constant expression for this.

      I have. Posted it before.

      /* Number of bits in inttype_MAX, or in any (1<<k)-1 where 0 <= k < 3.2E+10 */
      #define IMAX_BITS(m) ((m)/((m)%0x3fffffffL+1)/0x3fffffffL%0x3fffffffL*30 \
                          + (m)%0x3fffffffL/((m)%31+1)/31%31*5 + 4-12/((m)%31+3))

      So IMAX_BITS((unsigned_type)-1) computes the number of bits in an
      unsigned integer type. IMAX_BITS(INT_MAX) computes the number of bits
      in an int. Until someone implements a 4-gigabyte integer type:-)

      Or if you are less paranoid about how large UINTMAX_MAX can get:

      /* Number of bits in inttype_MAX, or in any (1<<k)-1 where 0 <= k < 2040 */
      #define IMAX_BITS(m) ((m)/((m)%255+1) / 255%255*8 + 7-86/((m)%255+12))

      > Confession: I usually ignore the possibility of P.B.s and
      > just pretend CHAR_BIT*sizeof is the Right Answer. The nasal
      > demons haven't punished me yet (for that sin, anyway), but of
      > course the Standard makes no guarantee about the timeliness
      > of retribution.

      As long as that just wastes a bit of memory when it's wrong, that's my
      preference too. IMAX_BITS is a fun little hack, but not exactly
      readable.

      Explanation, where 'x**y' means x raised to the power of y:
      Line 1 computes (number of whole 30-bit chunks) * 30:
      For m = (2**(K*n+r))-1 and P = (2**K)-1 with K=30, P=0x3fffffff,
      m = (P+1)**n * 2**r - 1,
      m % P + 1 = 1 * 2**r - 1 + 1 = 2**r when 2**r-1<P so r<K,
      m /(m%P+1) = (P+1)**n - 1
      = ((P+1)-1) * ((P+1)**0 +...+ (P+1)**(n-1)),
      .../P%P*K = ( 1**0 +...+ 1**(n-1)) % P * K
      = n*K when n < P.
      Part 2 does the same to the remainder (m%0x3fffffff) with K=5, P=31.
      Part 3 "happens" to count the final 0-4 bits in m%31=[0/1/3/7/15].
      m % 31 is short for m % 0x3fffffff % 31, because 0x3fffffff % 31 = 0.
      0x3fffffffL is the largest portable 2**x-1 with such a 2**y-1 factor.
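      Some spot checks on (1<<k)-1 arguments (my own sanity checks, not part
      of the original post); note the macro really is a constant expression,
      so it can even size an array:

      ```c
      #include <limits.h>

      /* Number of bits in inttype_MAX, or in any (1<<k)-1 where 0 <= k < 3.2E+10 */
      #define IMAX_BITS(m) ((m)/((m)%0x3fffffffL+1)/0x3fffffffL%0x3fffffffL*30 \
                          + (m)%0x3fffffffL/((m)%31+1)/31%31*5 + 4-12/((m)%31+3))

      /* Constant expression: usable as an array size.  unsigned char has
       * exactly CHAR_BIT value bits, so this array has CHAR_BIT elements. */
      char uchar_bits_proof[IMAX_BITS((unsigned char)-1)];
      ```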

      --
      Hallvard


      • Thad Smith

        #4
        Re: Value Bits Vs Object Bits

        Hallvard B Furuseth wrote:

        > /* Number of bits in inttype_MAX, or in any (1<<k)-1 where 0 <= k < 3.2E+10 */
        > #define IMAX_BITS(m) ((m)/((m)%0x3fffffffL+1)/0x3fffffffL%0x3fffffffL*30 \
        >                     + (m)%0x3fffffffL/((m)%31+1)/31%31*5 + 4-12/((m)%31+3))
        >
        > So IMAX_BITS((unsigned_type)-1) computes the number of bits in an
        > unsigned integer type.

        Great job. Thanks!

        --
        Thad


        • Eric Sosman

          #5
          Re: Value Bits Vs Object Bits

          Hallvard B Furuseth wrote:

          > Eric Sosman writes:
          >
          >> Tomás wrote:
          >>
          >>> The quantity of bits used by an unsigned integer type in memory can be
          >>> determined by:
          >>> typedef unsigned long long UIntType;
          >>> CHAR_BIT * sizeof(UIntType)
          >>> However, what would be a good portable way to determine how many of
          >>> these bits are value bits? If possible, it would be nice to have it as
          >>> a compile time constant.
          >>
          >> Speaking for myself, I've never been able to discover a
          >> way to write a constant expression for this.
          >
          > I have. Posted it before.
          >
          > /* Number of bits in inttype_MAX, or in any (1<<k)-1 where 0 <= k < 3.2E+10 */
          > #define IMAX_BITS(m) ((m)/((m)%0x3fffffffL+1)/0x3fffffffL%0x3fffffffL*30 \
          >                     + (m)%0x3fffffffL/((m)%31+1)/31%31*5 + 4-12/((m)%31+3))

          Yikes! It looks like an IOCCC snippet, but ... I am in awe.

          --
          Eric Sosman
          esosman@acm-dot-org.invalid


          • Hallvard B Furuseth

            #6
            Re: Value Bits Vs Object Bits

            Some time ago I wrote:

            > /* Number of bits in inttype_MAX, or in any (1<<k)-1 where 0 <= k < 3.2E+10 */
            > #define IMAX_BITS(m) ((m)/((m)%0x3fffffffL+1)/0x3fffffffL%0x3fffffffL*30 \
            >                     + (m)%0x3fffffffL/((m)%31+1)/31%31*5 + 4-12/((m)%31+3))
            >
            > So IMAX_BITS((unsigned_type)-1) computes the number of bits in an
            > unsigned integer type. IMAX_BITS(INT_MAX) computes the number of bits
            > in an int. Until someone implements a 4-gigabyte integer type:-)

            Eh. An int has IMAX_BITS(INT_MAX)+1 bits - with the sign bit.

            --
            Hallvard


            • Frederick Gotham

              #7
              Re: Value Bits Vs Object Bits

              Hallvard B Furuseth posted:

              > #define IMAX_BITS(m) ((m)/((m)%0x3fffffffL+1)/0x3fffffffL%0x3fffffffL*30 \
              >                     + (m)%0x3fffffffL/((m)%31+1)/31%31*5 + 4-12/((m)%31+3))


              Are you sure that works? It seems to work a lot of the time... but
              sometimes I get inaccurate values.


              If I enter 2048 into the following program, I get 112 bits:


              #define IMAX_BITS(m) \
              ((m)/((m)%0x3fffffffL+1)/0x3fffffffL%0x3fffffffL*30 \
               + (m)%0x3fffffffL/((m)%31+1)/31%31*5 + 4-12/((m)%31+3))


              #include <stdio.h>
              #include <stdlib.h>

              int main(void)
              {
                  unsigned long val;

                  for (;;)
                  {
                      printf("\nEnter a number: ");

                      scanf("%lu", &val);

                      if (999 == val) break;

                      printf("\n\nIt consists of: %lu bits\n\n", IMAX_BITS(val));

                      system("PAUSE");
                  }
                  return 0;
              }



              --

              Frederick Gotham


              • pete

                #8
                Re: Value Bits Vs Object Bits

                Frederick Gotham wrote:

                > Hallvard B Furuseth posted:
                >
                >> #define IMAX_BITS(m) ((m)/((m)%0x3fffffffL+1)/0x3fffffffL%0x3fffffffL*30 \
                >>                     + (m)%0x3fffffffL/((m)%31+1)/31%31*5 + 4-12/((m)%31+3))
                >
                > Are you sure that works? It seems to work a lot of the time... but
                > sometimes I get inaccurate values.
                >
                > If I enter 2048 into the following program, I get 112 bits:

                Me too.
                There are also problems with higher numbers.

                --
                pete


                • Eric Sosman

                  #9
                  Re: Value Bits Vs Object Bits

                  Frederick Gotham wrote:

                  > Hallvard B Furuseth posted:
                  >
                  >> #define IMAX_BITS(m) ((m)/((m)%0x3fffffffL+1)/0x3fffffffL%0x3fffffffL*30 \
                  >>                     + (m)%0x3fffffffL/((m)%31+1)/31%31*5 + 4-12/((m)%31+3))
                  >
                  > Are you sure that works? It seems to work a lot of the time... but
                  > sometimes I get inaccurate values.
                  >
                  > If I enter 2048 into the following program, I get 112 bits:
                  > [code snipped; it just passes input numbers to the macro]

                  Somewhere along the line, somebody has snipped away the
                  comment that preceded the macro when originally posted:

                  > /* Number of bits in inttype_MAX, or in any (1<<k)-1
                  > where 0 <= k < 3.2E+10 */

                  That is, the macro is only advertised to work for arguments
                  that are one less than a power of two. It is not a general-
                  purpose compile-time log() function!
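                  To illustrate the contract (a quick check of my own): feeding
                  the macro an argument of the required (1<<k)-1 form works,
                  while Frederick's 2048 does not:

                  ```c
                  /* Number of bits in inttype_MAX, or in any (1<<k)-1 where 0 <= k < 3.2E+10 */
                  #define IMAX_BITS(m) ((m)/((m)%0x3fffffffL+1)/0x3fffffffL%0x3fffffffL*30 \
                                      + (m)%0x3fffffffL/((m)%31+1)/31%31*5 + 4-12/((m)%31+3))

                  /* In contract: 2047 == (1<<11)-1, so the answer 11 is right.
                   * Out of contract: 2048 is not of the form (1<<k)-1, and the
                   * macro yields 112 -- exactly the "inaccurate value" reported
                   * above. */
                  ```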

                  --
                  Eric Sosman
                  esosman@acm-dot-org.invalid


                  • Frederick Gotham

                    #10
                    Re: Value Bits Vs Object Bits

                    Eric Sosman posted:

                    > Somewhere along the line, somebody has snipped away the
                    > comment that preceded the macro when originally posted:
                    >
                    >> /* Number of bits in inttype_MAX, or in any (1<<k)-1
                    >> where 0 <= k < 3.2E+10 */
                    >
                    > That is, the macro is only advertised to work for arguments
                    > that are one less than a power of two. It is not a general-
                    > purpose compile-time log() function!


                    Hehe... I found the code so confusing that I didn't pay attention to the
                    comments.

                    "One less than a power of two" would refer to numbers where every bit is
                    one, e.g.

                    111

                    11111111

                    11111

                    111111111111


                    Anyhow, the macro is a stroke of genius. I wonder what other gems
                    Hallvard B Furuseth has come up with... ?


                    A compile time "Log base 2" macro would be brilliant...


                    --

                    Frederick Gotham


                    • Hallvard B Furuseth

                      #11
                      Re: Value Bits Vs Object Bits

                      Frederick Gotham writes:

                      > Eric Sosman posted:
                      >
                      >> Somewhere along the line, somebody has snipped away the
                      >> comment that preceded the macro when originally posted:
                      >> (...)
                      >
                      > Hehe... I found the code so confusing that I didn't pay attention to
                      > the comments.

                      You know, some of us do it the other way around:-)

                      > Anyhow, the macro is a stroke of genius. I wonder what other gems
                      > Hallvard B Furuseth has come up with... ?

                      Not much if you mean C hacks. I just got so annoyed about the lack
                      of a log2 feature that I beat at it until something gave. For some
                      reason my most significant "hack" seems to be that one can use
                          struct align_TYPE_s { char c; TYPE align; };
                          ... offsetof(struct align_TYPE_s, align) ...
                      instead of
                          sizeof(TYPE)
                      for the alignment of TYPE. I've sent that to the maintainers of
                      several programs where they cared about data size. I have no idea
                      why it's apparently not obvious.
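                      Spelled out for one concrete type (the struct tag and the
                      enum name below are mine): any padding the compiler
                      inserts between the lone char and the following member
                      reveals that member type's alignment requirement.

                      ```c
                      #include <stddef.h>

                      /* Hallvard's trick, instantiated for double. */
                      struct align_double_s { char c; double align; };

                      /* offsetof is an integer constant expression, so the
                       * alignment is usable as a compile-time constant: */
                      enum { DOUBLE_ALIGNMENT = offsetof(struct align_double_s, align) };
                      ```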

                      > A compile time "Log base 2" macro would be brilliant...

                      I gave that up since IMAX_BITS was hairy enough even when restricted to
                      (1<<k)-1 arguments. Silly of me. With a bunch of helper macros, it
                      works fine. Check me on this though, I've been staring at it too long:

                      /*
                       * Return (v ? floor(log2(v)) : 0) when 0 <= v < 1<<[8, 16, 32, 64].
                       * Inefficient algorithm, intended for compile-time constants.
                       */
                      #define LOG2_8BIT(v)  (8 - 90/(((v)/4+14)|1) - 2/((v)/2+1))
                      #define LOG2_16BIT(v) (8*((v)>255) + LOG2_8BIT((v)>>8*((v)>255)))
                      #define LOG2_32BIT(v) \
                       (16*((v)>65535L) + LOG2_16BIT((v)*1L>>16*((v)>65535L)))
                      #define LOG2_64BIT(v)\
                       (32*((v)/2L>>31 > 0) \
                        + LOG2_32BIT((v)*1L >>16*((v)/2L>>31 > 0) \
                                            >>16*((v)/2L>>31 > 0)))

                      Above this, shifting int/long by at most 15/31 bits per operation
                      gets cumbersome. Also the parentheses nesting level is already
                      rather high. But if the compiler doesn't mind approx 14K macro
                      expansions and lots of parentheses, even for simple macro arguments:

                      /*
                       * Return (v ? floor(log2(v)) : 0) when 0 <= v < 1<<512.
                       * Huge computation, intended for compile-time constants.
                       */
                      #define LOG2_512BIT(val)\
                       LOG2_TMP_FINISH(val,\
                        LOG2_TMP_CHUNK( 0, val,\
                        LOG2_TMP_CHUNK( 1, val,\
                        LOG2_TMP_CHUNK( 3, val,\
                        LOG2_TMP_CHUNK( 7, val,\
                        LOG2_TMP_CHUNK(15, val,\
                        LOG2_TMP_CHUNK(31, val, 0)))))))
                      #define LOG2_TMP_FINISH(val, known_bits)\
                       (LOG2_8BIT((val) >> known_bits) + known_bits)
                      #define LOG2_TMP_CHUNK(ones, val, known_bits)\
                       (((val)*1L >>known_bits \
                                  >>8>>ones>>ones>>ones>>ones>>ones>>ones>>ones>>ones \
                         > 0) * 8*(ones+1) + known_bits)

                      Still doesn't scale to "practically infinite" like IMAX_BITS does, but
                      I haven't seen a 1024-bit integer yet. (E.g. 4096 bits would need more
                      terms in LOG2_8BIT, 64 bit long long shifts, and 4 lines with >>ones.)

                      On the other hand - the computation doesn't have to be an _expression_.
                      By unpacking the above algorithm into intermediate enum constants, the
                      macro expansion is reduced from around 14K to around 0.8K. Also with
                      declarations we can get an error check by violating constraints if the
                      value is out of range:

                      /*
                       * Define Name = (val ? floor(log2(val)) : 0) as an enum constant.
                       * Expects 0 <= val < 1<<512. Tries to force a compile error otherwise.
                       * Note: Uses some <Name>__<letter/digit> identifiers.
                       */
                      #define DEFINE_LOG2(Name, val)\
                      enum {\
                       Name##__5= LOG2_TMP_CHUNK(31, val, 0),\
                       Name##__4= LOG2_TMP_CHUNK(15, val, Name##__5),\
                       Name##__3= LOG2_TMP_CHUNK( 7, val, Name##__4),\
                       Name##__2= LOG2_TMP_CHUNK( 3, val, Name##__3),\
                       Name##__1= LOG2_TMP_CHUNK( 1, val, Name##__2),\
                       Name##__0= LOG2_TMP_CHUNK( 0, val, Name##__1),\
                       Name = LOG2_TMP_FINISH(val, Name##__0),\
                       Name##__c= (val)>>Name == !!(val) ? 1 : -999 \
                      };\
                      typedef struct{ int Name##__b:Name##__c; } Name##__t[Name##__c]
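                      A small check of the ladder's lower rungs (macros restated
                      cleanly so this stands alone; the enum names and expected
                      values are mine):

                      ```c
                      /* Hallvard's compile-time floor(log2()) ladder. */
                      #define LOG2_8BIT(v)  (8 - 90/(((v)/4+14)|1) - 2/((v)/2+1))
                      #define LOG2_16BIT(v) (8*((v)>255) + LOG2_8BIT((v)>>8*((v)>255)))
                      #define LOG2_32BIT(v) \
                       (16*((v)>65535L) + LOG2_16BIT((v)*1L>>16*((v)>65535L)))

                      /* Constant expressions, so they work as enum initializers: */
                      enum {
                          LOG2_OF_1       = LOG2_8BIT(1),        /* 0 */
                          LOG2_OF_1000000 = LOG2_32BIT(1000000L) /* 19: 2^19 <= 1000000 < 2^20 */
                      };
                      ```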

                      --
                      Hallvard


                      • Frederick Gotham

                        #12
                        Re: Value Bits Vs Object Bits

                        Hallvard B Furuseth posted:

                        > For some reason my most significant "hack" seems to be that one can
                        > use
                        >     struct align_TYPE_s { char c; TYPE align; };
                        >     ... offsetof(struct align_TYPE_s, align) ...
                        > instead of
                        >     sizeof(TYPE)
                        > for the alignment of TYPE. I've sent that to the maintainers of
                        > several programs where they cared about data size. I have no idea
                        > why it's apparently not obvious.


                        Would the method you devised be preferable on a system where the
                        alignment requirements of a type are less than its actual size? (e.g. a
                        system where a double is 8 bytes, but need only be aligned on a 4-byte
                        boundary?)

                        Are you sure it's fully-portable? Could the implementation potentially
                        place the char directly behind a byte boundary, thus making your method
                        give an inaccurate alignment value of zero?


                        > With a bunch of helper macros, it works fine. Check me on this
                        > though, I've been staring at it too long:


                        Give me a few days to mull it over...

                        Have you been at this sort of stuff for many years? I thought I was
                        pretty handy at playing around with things like this, but your own
                        expertise astounds me.

                        I've been taking so long to reply to your posts because I'm staring at
                        the macros trying to get the faintest idea of how they work, and I still
                        haven't begun to comprehend!

                        Give me time though, and I might figure them out...


                        --

                        Frederick Gotham


                        • Michael Mair

                          #13
                          Re: Value Bits Vs Object Bits

                          Frederick Gotham schrieb:

                          > Hallvard B Furuseth posted:
                          >
                          >> For some reason my most significant "hack" seems to be that one can
                          >> use
                          >>     struct align_TYPE_s { char c; TYPE align; };
                          >>     ... offsetof(struct align_TYPE_s, align) ...
                          >> instead of
                          >>     sizeof(TYPE)
                          >> for the alignment of TYPE. I've sent that to the maintainers of
                          >> several programs where they cared about data size. I have no idea
                          >> why it's apparently not obvious.
                          >
                          > Would the method you devised be preferable on a system where the
                          > alignment requirements of a type are less than its actual size? (e.g. a
                          > system where a double is 8 bytes, but need only be aligned on a 4-byte
                          > boundary?)

                          Yes. This method _might_ then give you the minimal _sensible_
                          alignment. It does so for all compilers I know -- "sensible"
                          because there are systems without hard alignment requirements but
                          with preferred alignments for certain types, which give you
                          advantages such as faster access.

                          > Are you sure it's fully-portable? Could the implementation potentially
                          > place the char directly behind a byte boundary, thus making your method
                          > give an inaccurate alignment value of zero?

                          Read again. offsetof will give you zero for c, and for align some
                          value >= (offsetof(..., c) + sizeof align_TYPE_instance.c), which
                          is 1. Of course, the implementation may give you a value greater
                          than sizeof align_TYPE_instance.align, but this is improbable.

                          Cheers
                          Michael
                          --
                          E-Mail: Mine is an /at/ gmx /dot/ de address.


                          • Hallvard B Furuseth

                            #14
                            Re: Value Bits Vs Object Bits

                            Frederick Gotham writes:

                            > Hallvard B Furuseth posted:
                            >
                            >> With a bunch of helper macros, it works fine. Check me on this
                            >> though, I've been staring at it too long:

                            Oh dear, I _was_ tired. "Check me on this" was not the best thing
                            to say about this one:

                            #define LOG2_8BIT(v) (8 - 90/(((v)/4+14)|1) - 2/((v)/2+1))

                            There is nothing to understand about that except to notice that it
                            works (up to v=255): I just ran a search over expressions of the
                            forms A-B/(v*C+D|E) and A-B/(v/C+D|E), looking for ones that give
                            the correct floor(log2(n)) for n=1..v.

                            I only found variants correct for 6 bits and wanted 8, so I needed two
                            such terms. Quite possibly there are better types of expressions, or
                            some of the above form which I skipped. (Didn't bother to reduce the
                            search space to be small enough that I could check all possibilities.)
                            > Have you been at this sort of stuff for many years? I thought I was
                            > pretty handy at playing around with things like this, but your own
                            > expertise astounds me.

                            Yes, I like playing around with stuff like that. Haven't done so much
                            for a while though.

                            > I've been taking so long to reply to your posts because I'm staring at
                            > the macros trying to get the faintest idea of how they work, and I
                            > still haven't begun to comprehend!
                            >
                            > Give me time though, and I might figure them out...
                            #define LOG2_16BIT(v) (8*((v)>255) + LOG2_8BIT((v)>>8*((v)>255)))

                            8*((v)>255) === ((v)>255 ? 8 : 0). Since I'm using that twice, the
                            macro does the same as

                            #define LOG2_16BIT(v) \
                            ((v) > 255 ? 8 + LOG2_8BIT((v)>>8) : 0 + LOG2_8BIT((v)>>0))

                            Not sure if there's really any need to use such tricks to reduce
                            the size of the macro expansion. The expanded size just made me a
                            bit nervous.

                            This does the same given a log2() valid for 32 bits, but more
                            cumbersome - to ensure that ints are shifted at most 15 bits and
                            longs at most 31 bits. (v>>N > 0) === (v>>N >= 1) === (v >= 1<<N)
                            === log2(v) >= N (if you ignore possible overflow in 1<<N):

                            #define LOG2_64BIT(v)\
                             (32*((v)/2L>>31 > 0) \
                              + LOG2_32BIT((v)*1L >>16*((v)/2L>>31 > 0) \
                                                  >>16*((v)/2L>>31 > 0)))
                            LOG2_512BIT() also does the same: 512-bit log2() based on 256-bit log2()
                            based on...based on 8-bit log2(), except it breaks up the shifts to stay
                            below 32-bit shifts. Note that (ones+1) is a power of 2.
                            The enum variant is equivalent, maybe it is easier to read.

                            Finally, "Name##__c= (val)>>Name == !!(val) ? 1 : -999" in the enum is 1
                            if the result is correct according to the spec and -999 otherwise. Used
                            to force a constraint violation in the typedef if the result is wrong.

                            --
                            Hallvard
