vector and bool

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • daniel

    vector and bool

    1)
    is C++ smart enough to automatically use "bits" for bool or will
    a bool have the size of a charcter (byte).
    2) The index of a vector is it an integer (4 byte) or a "long long"
    with 8 bytes or something else?

    I ask because I want to use

    vector<bool> with a few billion entries exceeding the int32 range for
    indexing and I want to use as few memory as possible for the whole
    vector<bool>

    e.g.
    vector<bool> B;
    B.resize(2^34);

    thanks, daniel
  • Ivan Vecerina

    #2
    Re: vector and bool

    "daniel" <danielheiserer @netscape.net> wrote in message
    news:b385c098.0 410112357.408a0 c8e@posting.goo gle.com...[color=blue]
    > 1)
    > is C++ smart enough to automatically use "bits" for bool or will
    > a bool have the size of a charcter (byte).[/color]
    The standard C++ library is actually required to specialize the
    std::vector for bool, and to provide 'compact' storage.
    [NB: for many uses, this is actually seen as an inconvenience]
    [color=blue]
    > 2) The index of a vector is it an integer (4 byte) or a "long long"
    > with 8 bytes or something else?[/color]
    This depends on the implementation, but it may well be limited
    to size_t, or a 4-byte unsigned integer on many platforms.
    You can check the return value of vector<bool>::m ax_size() to
    find out.
    [color=blue]
    > I ask because I want to use
    >
    > vector<bool> with a few billion entries exceeding the int32 range for
    > indexing and I want to use as few memory as possible for the whole
    > vector<bool>[/color]
    May we ask for what type of application or what kind of data?
    In many cases, using some form of compression might be better
    that allocating gigabytes of RAM. Does your system have enough
    memory?


    Regards,
    Ivan
    --
    http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form


    Comment

    • JKop

      #3
      Re: vector and bool

      daniel posted:
      [color=blue]
      > 1)
      > is C++ smart enough to automatically use "bits" for bool or will
      > a bool have the size of a charcter (byte).[/color]

      I believe that sizeof(bool) must be >= 1, which implies that a "bool" is of
      one byte. But that doesn't mean that your particular system might do
      something different in the background.

      It's to do with the actual CPU of the system, what is the smallest unit of
      memory that can be accessed, which is called a "byte", which 9 times out of
      10 is equal to "8 bits". Code which would store 8 bools in 1 byte would be
      slower and (ironically) more memory consumptive as the code which
      manipulates the byte would have to be kept in memory.
      [color=blue]
      > 2) The index of a vector is it an integer (4 byte) or a "long long"
      > with 8 bytes or something else?[/color]

      There is no such intrinsic type "long long".

      There is a type "long int", which is guaranteed to be atleast 32-bit on all
      systems, but that doesn't guarantee that it'll be 4 bytes long. For
      instance:

      If "char" were 9-Bit, then although "long" would be atleast 32-Bit, it
      wouldn't necessarily be 4 bytes long, as 4 bytes = 36 bits on that system.
      [color=blue]
      > I ask because I want to use
      >
      > vector<bool> with a few billion entries exceeding the int32 range for
      > indexing and I want to use as few memory as possible for the whole
      > vector<bool>
      >
      > e.g.
      > vector<bool> B;
      > B.resize(2^34);
      >
      > thanks, daniel[/color]

      I'm sure there's people here that've done this before. . .

      I'm not one of them!


      -JKop

      Comment

      • Rolf Magnus

        #4
        Re: vector and bool

        JKop wrote:
        [color=blue]
        > daniel posted:
        >[color=green]
        >> 1)
        >> is C++ smart enough to automatically use "bits" for bool or will
        >> a bool have the size of a charcter (byte).[/color]
        >
        > I believe that sizeof(bool) must be >= 1, which implies that a "bool" is
        > of one byte. But that doesn't mean that your particular system might do
        > something different in the background.[/color]

        AFAIK, some systems use even 4 bytes for a bool, because they can access it
        faster.
        [color=blue]
        > It's to do with the actual CPU of the system, what is the smallest unit of
        > memory that can be accessed, which is called a "byte", which 9 times out
        > of 10 is equal to "8 bits".[/color]

        However, a CPU byte might be different from a C++ byte. I have heared there
        are signal processors that have a smallest memory unit of 24 bits and still
        the C++ implementation gives you a char with 8 bit. And there are 4 bit
        CPUS which would need to combine two bytes to meet the minimum requirement
        of 8 bits for char. I guess there might be C implelemntation s, but probably
        no C++ ones on such CPUs, but C has the same requirements in that case.
        [color=blue]
        > Code which would store 8 bools in 1 byte would be slower and (ironically)
        > more memory consumptive as the code which manipulates the byte would have
        > to be kept in memory.
        >[color=green]
        >> 2) The index of a vector is it an integer (4 byte) or a "long long"
        >> with 8 bytes or something else?[/color]
        >
        > There is no such intrinsic type "long long".
        >
        > There is a type "long int", which is guaranteed to be atleast 32-bit on
        > all systems, but that doesn't guarantee that it'll be 4 bytes long.[/color]

        That's right.
        [color=blue]
        > For instance:
        >
        > If "char" were 9-Bit, then although "long" would be atleast 32-Bit, it
        > wouldn't necessarily be 4 bytes long, as 4 bytes = 36 bits on that system.[/color]

        That example isn't really chosen very well. 3 bytes would only be 27 bits,
        so you would still need 4 bytes to meet the requirement of at least 32
        bits.

        Comment

        • assaarpa

          #5
          Re: vector and bool

          > Code which would store 8 bools in 1 byte would be[color=blue]
          > slower and (ironically) more memory consumptive as the code which
          > manipulates the byte would have to be kept in memory.[/color]

          Who gives a damn if the code takes 10 or 30 bytes, when the guy wants to
          store billions of bool's and the choises are 128 MB vs. 1024 MB for storing
          one billion flags. And you cannot be so sure about the performance either,
          very often smaller data translates to better performance in the real world
          applications.


          Comment

          • Ioannis Vranos

            #6
            Re: vector and bool

            daniel wrote:[color=blue]
            > 1)
            > is C++ smart enough to automatically use "bits" for bool or will
            > a bool have the size of a charcter (byte).[/color]


            In C++ it is

            1 <= sizeof(bool) <= sizeof(long)


            However especially for vector<bool> the standard says:


            "To optimize space allocation, a specialization of vector for bool
            elements is provided:

            ....


            reference is a class that simulates the behaviour of references of a
            single bit in vector<bool>."




            "The C++ Programming Language 3rd Edition" may help here:


            "16.3.11 Vector<bool>

            The specialization (§13.5) vector<bool> is provided as a compact vector
            of bool. A bool variable is addressable, so it takes up at least one
            byte. However, it is easy to implement vector<bool> so that each element
            takes up only a bit.
            The usual vector operations work for vector<bool> and retain their usual
            meanings. In particular, subscripting and iteration work as expected.

            For example:
            void f(vector<bool>& v)
            {
            // iterate using subscripting
            for (int i = 0; i<v.size() ; ++i) cin >> v[i] ;

            typedef vector<bool>: :const_ iterator VI;

            // iterate using iterators
            for (VI p = v.begin() ; p!=v.end() ; ++p) cout<<*p;
            }

            To achieve this, an implementation must simulate addressing of a single
            bit. Since a pointer cannot address a unit of memory smaller than a
            byte, vector<bool>: :iterator cannot be a pointer. In particular,
            one cannot rely on bool* as an iterator for a vector<bool>:

            void f(vector<bool>& v)
            {
            bool* p = v.begin() ; // error: type mismatch
            // ...
            }

            A technique for addressing a single bit is outlined in §17.5.3.

            The library also provides bitset as a set of Boolean values with Boolean
            set operations (§17.5.3)."









            [color=blue]
            > 2) The index of a vector is it an integer (4 byte) or a "long long"
            > with 8 bytes or something else?[/color]


            It is vector<T>::size _type which is some unsigned integer type.


            Example:


            vector<int> someVec(100);

            for(vector<int> ::size_type i=0; i<someVec.size( ); ++i)
            // ...








            [color=blue]
            >
            > I ask because I want to use
            >
            > vector<bool> with a few billion entries exceeding the int32 range for
            > indexing and I want to use as few memory as possible for the whole
            > vector<bool>
            >
            > e.g.
            > vector<bool> B;
            > B.resize(2^34);[/color]



            Well, it is not guaranteed that bits will be used, although in most
            cases this is what should happen.

            You may also want to check bitset which uses bits for sure.



            --
            Ioannis Vranos


            Comment

            • Ioannis Vranos

              #7
              Re: vector and bool

              Ioannis Vranos wrote:
              [color=blue]
              > Well, it is not guaranteed that bits will be used, although in most
              > cases this is what should happen.
              >
              > You may also want to check bitset which uses bits for sure.[/color]


              My mistake! It is *guaranteed* that bits are used:


              From the standard:

              "23.2.5 Class vector<bool>

              To optimize space allocation, a specialization of vector for bool
              elements is provided:

              ....

              // bit reference:
              class reference {
              friend class vector;
              reference();
              public:
              ˜reference();
              operator bool() const;
              reference& operator=(const bool x);
              reference& operator=(const reference& x);
              void flip(); // flips the bit
              };

              ....

              void flip(); // flips all bits

              ....

              reference is a class that simulates the behavior of references of a
              single bit in vector<bool>."



              --
              Ioannis Vranos


              Comment

              • Ron Natalie

                #8
                Re: vector and bool

                JKop wrote:
                [color=blue]
                > I believe that sizeof(bool) must be >= 1, which implies that a "bool" is of
                > one byte.[/color]

                Actually it implies bool is at least one byte. Many systems
                sizeof(bool) == sizeof(int).
                [color=blue]
                > But that doesn't mean that your particular system might do
                > something different in the background.[/color]

                Actually, for vector it is REQUIRED to do something different. I think
                it's stupid but the standard requires vector<bool> to be specialized to
                act differently than vectors of anything else. It should have been a
                different class.
                [color=blue]
                >
                > It's to do with the actual CPU of the system, what is the smallest unit of
                > memory that can be accessed, which is called a "byte", which 9 times out of
                > 10 is equal to "8 bits".[/color]

                Actually it has nothing to do with the actual CPU. C++ could just as
                easily have bit types. There are machines where the smallest
                addressable unit of storage is NOT a byte. A byte is carved up out of
                a larger word.

                However, getting back on topic, a byte (and it's related C++ type char)
                is the smallest unit of memory C++ will address. This unfornuately is
                another C/C++ stupidity. There really shouldn't be a hard relationship
                between "character" and "atomic memory unit." Of course, this is 30
                some years of hindsight speaking.

                Comment

                • Richard Herring

                  #9
                  Re: vector and bool

                  In message <1097586912.771 988@athnrd02>, Ioannis Vranos
                  <ivr@guesswh.at .grad.com> writes[color=blue]
                  >Ioannis Vranos wrote:
                  >[color=green]
                  >> Well, it is not guaranteed that bits will be used, although in most
                  >>cases this is what should happen.
                  >> You may also want to check bitset which uses bits for sure.[/color]
                  >
                  >My mistake! It is *guaranteed* that bits are used:[/color]

                  I don't think so.[color=blue]
                  >
                  >From the standard:
                  >
                  >"23.2.5 Class vector<bool>
                  >
                  >To optimize space allocation, a specialization of vector for bool
                  >elements is provided:
                  >
                  >...
                  >
                  >// bit reference:
                  >class reference {
                  >friend class vector;
                  >reference();
                  >public:
                  >Ëœreference( );
                  >operator bool() const;
                  >reference& operator=(const bool x);
                  >reference& operator=(const reference& x);
                  >void flip(); // flips the bit
                  >};
                  >
                  >...
                  >
                  >void flip(); // flips all bits
                  >
                  >...
                  >
                  >reference is a class that simulates the behavior of references of a
                  >single bit in vector<bool>."
                  >[/color]
                  That just redefines the semantics of operator[] etc. to return a proxy
                  instead of an actual bool &, thus removing the requirement for elements
                  to have distinct addresses. Nothing there says that it _must_ use only
                  use one bit per element.

                  --
                  Richard Herring

                  Comment

                  • Jeff Flinn

                    #10
                    Re: vector and bool


                    "daniel" <danielheiserer @netscape.net> wrote in message
                    news:b385c098.0 410112357.408a0 c8e@posting.goo gle.com...[color=blue]
                    > 1)
                    > is C++ smart enough to automatically use "bits" for bool or will
                    > a bool have the size of a charcter (byte).
                    > 2) The index of a vector is it an integer (4 byte) or a "long long"
                    > with 8 bytes or something else?
                    >
                    > I ask because I want to use
                    >
                    > vector<bool> with a few billion entries exceeding the int32 range for
                    > indexing and I want to use as few memory as possible for the whole
                    > vector<bool>[/color]

                    Use either std::bitset or boost::dynamic_ bitset. Both "pack" the bits for
                    better memory usage. You'll most likely have to modify these classes to
                    account for indices > 32 bit.

                    Jeff F


                    Comment

                    • Ioannis Vranos

                      #11
                      Re: vector and bool

                      Richard Herring wrote:
                      [color=blue][color=green]
                      >> reference is a class that simulates the behavior of references of a
                      >> single bit in vector<bool>."[/color]
                      >
                      > That just redefines the semantics of operator[] etc. to return a proxy
                      > instead of an actual bool &, thus removing the requirement for elements
                      > to have distinct addresses. Nothing there says that it _must_ use only
                      > use one bit per element.[/color]



                      Well since it says that

                      "reference is a class that simulates the behavior of references of a
                      single bit in vector<bool>"

                      I think this implies that vector<bool> should be implemented using bits.



                      --
                      Ioannis Vranos


                      Comment

                      • Richard Herring

                        #12
                        Re: vector and bool

                        In message <1097598729.868 638@athnrd02>, Ioannis Vranos
                        <ivr@guesswh.at .grad.com> writes[color=blue]
                        >Richard Herring wrote:
                        >[color=green][color=darkred]
                        >>> reference is a class that simulates the behavior of references of a
                        >>>single bit in vector<bool>."[/color]
                        >>
                        >> That just redefines the semantics of operator[] etc. to return a
                        >>proxy instead of an actual bool &, thus removing the requirement for
                        >>elements to have distinct addresses. Nothing there says that it _must_
                        >>use only use one bit per element.[/color]
                        >
                        >Well since it says that
                        >
                        >"reference is a class that simulates the behavior of references of a
                        >single bit in vector<bool>"
                        >
                        >I think this implies that vector<bool> should be implemented using bits.
                        >[/color]
                        For a weak enough meaning of "should", I agree that that's the
                        intention. But I don't think anything in the Standard rules out a naive
                        implementation that makes no attempt to optimise, and simply implements
                        vector<bool> in terms of an underlying vector<some_int _type>.

                        --
                        Richard Herring

                        Comment

                        • Ioannis Vranos

                          #13
                          Re: vector and bool

                          Richard Herring wrote:
                          [color=blue]
                          > For a weak enough meaning of "should", I agree that that's the
                          > intention. But I don't think anything in the Standard rules out a naive
                          > implementation that makes no attempt to optimise, and simply implements
                          > vector<bool> in terms of an underlying vector<some_int _type>.[/color]


                          There is not an explicit prohibition on this, however explicit
                          prohibitions are rare in the standard.

                          Also the implementer has to implement this particular reference:


                          // bit reference:
                          class reference {
                          friend class vector;
                          reference();
                          public:
                          ˜reference();
                          operator bool() const;
                          reference& operator=(const bool x);
                          reference& operator=(const reference& x);
                          void flip(); // flips the bit
                          };


                          with the note:

                          reference is a class that simulates the behavior of references of a
                          single bit in vector<bool>.



                          and this particular member function:

                          void flip(); // flips all bits




                          Also vector<bool> is mentioned as a separate case in the standard, and
                          thus I think the legislator intended this to be implemented only with
                          bits. :-)



                          --
                          Ioannis Vranos


                          Comment

                          • Nonenix (astalavista.net)

                            #14
                            Re: vector and bool


                            hmm not sure but i think somethink lieke that can be done with structures i
                            mena this:


                            struct item {

                            unsigned int IsTrue:1;
                            unsigned int watever:1;
                            unsigned int number:14;

                            }

                            some cpu dosen't support that or the compiler doesen't use it because of
                            speed loss

                            but if it work is should looks like this in the ram:

                            0 | 1 | 2.............1 5 |
                            IsTrue | waterver | number |


                            you could use it like an int but IMPORTANT

                            if you forgot the unsignet the int will always be 0

                            gnu says this:

                            ...cpp: In function 'int main()':
                            ....cpp:13: warning: comparison is always 0 due to width of bitfield


                            here is also the name they#re called bitfields

                            hope this can help ;-)







                            --
                            This message has been sent using the astalavista.net newsreader
                            webinterface.

                            Comment

                            • daniel

                              #15
                              Re: vector and bool

                              Well I found something:
                              http://www.sgi.com/tech/stl/bit_vector.html states
                              ----------------
                              A bit_vector is essentially a vector<bool>: it is a Sequence that has
                              the same interface as vector. The main difference is that bit_vector
                              is optimized for space efficiency. A vector always requires at least
                              one byte per element, but a bit_vector only requires one bit per
                              element.

                              Warning: The name bit_vector will be removed in a future release of
                              the STL. The only reason that bit_vector is a separate class, instead
                              of a template specialization of vector<bool>, is that this would
                              require partial specialization of templates. On compilers that support
                              partial specialization, bit_vector is a specialization of
                              vector<bool>. The name bit_vector is a typedef. This typedef is not
                              defined in the C++ standard, and is retained only for backward
                              compatibility.
                              ----------------
                              The std::bitset seems not to be useful because I cannot change its
                              size during runtime (no ->resize()).

                              So that should answer my first question.

                              Thanks for the help, daniel



                              "Jeff Flinn" <NONONE@nowhere .com> wrote in message news:<ckgnot$q0 h$1@bluegill.ad i.com>...[color=blue]
                              > "daniel" <danielheiserer @netscape.net> wrote in message
                              > news:b385c098.0 410112357.408a0 c8e@posting.goo gle.com...[color=green]
                              > > 1)
                              > > is C++ smart enough to automatically use "bits" for bool or will
                              > > a bool have the size of a charcter (byte).
                              > > 2) The index of a vector is it an integer (4 byte) or a "long long"
                              > > with 8 bytes or something else?
                              > >
                              > > I ask because I want to use
                              > >
                              > > vector<bool> with a few billion entries exceeding the int32 range for
                              > > indexing and I want to use as few memory as possible for the whole
                              > > vector<bool>[/color]
                              >
                              > Use either std::bitset or boost::dynamic_ bitset. Both "pack" the bits for
                              > better memory usage. You'll most likely have to modify these classes to
                              > account for indices > 32 bit.
                              >
                              > Jeff F[/color]

                              Comment

                              Working...