std::string as data array

  • Jason Heyes

    std::string as data array

    If s is a std::string, does &s[0] refer to the contiguous block of
    characters representing s?


  • Alan Shi

    #2
    Re: std::string as data array

    >If s is a std::string, does &s[0] refer to the contiguous block of
    >characters representing s?
    The return type of operator[] is a reference to char, so &s[0] is a
    pointer to the first character. For a non-const string, that reference
    may be invalidated when the string reallocates or is otherwise modified.
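    
    To make the invalidation concrete, here is a minimal sketch. The helper
    name append_and_refetch is hypothetical; the only point is that a pointer
    taken from &s[0] should be fetched again after any operation that may
    reallocate the string.

    ```cpp
    #include <cassert>
    #include <string>

    // Hypothetical helper: appends `extra` to `s` and returns a pointer to
    // the first character that is valid *after* the modification.
    char* append_and_refetch(std::string& s, const std::string& extra)
    {
        assert(!s.empty());
        char* stale = &s[0];   // valid only until s is modified
        (void)stale;           // deliberately not used after the append
        s += extra;            // may reallocate; `stale` may now dangle
        return &s[0];          // re-fetch the pointer after the change
    }
    ```

    The returned pointer is itself only good until the next modification of s.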

    • Peter_Julian

      #3
      Re: std::string as data array


      "Jason Heyes" <jasonheyes@optusnet.com.au> wrote in message
      news:435da8f6$0$15138$afc38c87@news.optusnet.com.au...
      | If s is a std::string, does &s[0] refer to the contiguous block of
      | characters representing s?
      |

      Of course not, s[0] is a single char. If what you seek is a constant
      pointer to a legacy char array, try s.c_str(). C++ is not in the habit
      of decaying containers into pointers (although it can due to backward
      compatibility with certain ancient aspects of C).

      In fact, C++ replaces pointer-manipulation with iterators where the end
      iterator(s) having the value(s) of null are deemed to be part of the
      container, not part of the dataset. And it's customary to refer to an
      instance of a std::string by reference, not by pointer.

      Hence, a std::string initialized like so...

      std::string s("abcde");

      ...has 5 values in its dataset. While an old type char array initialized
      like so...

      char str[] = {'a', 'b', 'c', 'd', 'e', '0x0'};

      ...has 6 values in it. Many of the ancient C functions rely on the
      presence of that 0x0 terminator. Standard C++ does not. A pointer to a
      char is a pointer to a char, nothing more, nothing less.

      Instead of passing a pointer to the first element, you pass the entire
      string by reference. Since you can't break the relationship between a
      reference and its object, the transfer is nuke proof, and no more
      debugging for hours due to invalid pointers either.

      void foo(std::string & r_s) { ... } // is evident in its purpose

      Consider an empty array of chars...

      char str[]; // this decays to a pointer with an undetermined value at pointee
                  // and arrays have a fixed compile-time size

      ...not so with a std::string...

      std::string s;

      ...where the iterators at...

      std::string::iterator s_iter = s.begin();
      s_iter = s.end();

      ...have a guaranteed value of null. And unlike an array, you can modify
      its size at runtime on the fly...

      s += "a short phrase"; // no need to preset the std::string's size

      And that's but a glimpse of a std::string's deep capabilities. Combine
      all of the above and add its own algorithms + member functions +
      overloaded operators and the std::string repays the effort of learning
      it by the first day. You'll freak out over the simplicity of the concept
      too.


      • yepp

        #4
        Re: std::string as data array

        Hmm, I don't think Peter's reply solves Jason's puzzle, though he gives
        a lot of comparison between C strings and C++ strings.
        Jason wants to know whether &s[0] is the start address of the
        contiguous block of characters that s holds.
        I think the answer to this question depends on the implementation
        details of std::string.
        If std::string is implemented via a character array, the answer is
        true; however, if std::string is implemented via some other linked list
        method, the answer is definitely false.
        You can test the following program on vc6.

        #include <string>
        #include <iostream>

        using namespace std;

        int main()
        {
            string s("abcde");

            char *pstr = &s[0];  // address of the first character

            cout << pstr << endl;
            // the int() casts assume 32-bit pointers, as on vc6
            cout << hex << int(pstr) << endl;
            cout << hex << int(s.c_str()) << endl;

            return 0;
        }

        Output:
        abcde
        481cf1
        481cf1

        The output shows that std::string in vc6 is implemented with the first
        method.

        Certainly, just as Peter and Alan say, we should seldom manipulate s
        through the pstr pointer, because std::string is a wrapper around the
        conventional C string, and users can shrink and stretch a std::string
        freely without the annoying memory management problems that you always
        suffered with old C strings.
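        
        A portable variant of the same test might look like the sketch below.
        It assumes C++11 or later, where c_str() is specified to point at the
        same storage that operator[] indexes. The function name
        same_start_address is made up for illustration. Casting to const void*
        avoids the pointer-to-int conversion, which many compilers reject when
        pointers are wider than int.

        ```cpp
        #include <cassert>
        #include <iostream>
        #include <string>

        // Prints both addresses and returns true when they are equal.
        bool same_start_address(std::string& s)
        {
            assert(!s.empty());
            const void* first = static_cast<const void*>(&s[0]);
            const void* cstr  = static_cast<const void*>(s.c_str());
            std::cout << first << '\n' << cstr << '\n';
            return first == cstr;
        }
        ```

        The printed format of the addresses is implementation-defined, so only
        the equality is meaningful.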

        • Richard Herring

          #5
          Re: std::string as data array

          In message <435da8f6$0$15138$afc38c87@news.optusnet.com.au>, Jason Heyes
          <jasonheyes@optusnet.com.au> writes
          >If s is a std::string, does &s[0] refer to the contiguous block of
          >characters representing s?
          >
          What makes you think it's contiguous?

          --
          Richard Herring

          • Greg

            #6
            Re: std::string as data array

            Richard Herring wrote:
            > In message <435da8f6$0$15138$afc38c87@news.optusnet.com.au>, Jason Heyes
            > <jasonheyes@optusnet.com.au> writes
            > >If s is a std::string, does &s[0] refer to the contiguous block of
            > >characters representing s?
            > >
            > What makes you think it's contiguous?

            Since the Standard specifies that std::string's operator[] provides
            indexed access to std::string::data() - and since the Standard
            specifies that std::string::data() points to a character array (that is,
            a block of contiguously allocated memory), it must be the case by
            transitive logic that &s[0] refers to the first character in a block of
            contiguously-allocated character data.

            Greg
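            
            The contiguity claim can be checked directly with a small sketch.
            The function names are hypothetical, and the check is only guaranteed
            to pass under C++11 or later, where the Standard explicitly requires
            contiguous storage.

            ```cpp
            #include <cassert>
            #include <cstddef>
            #include <string>

            // Returns true if every character of s sits at &s[0] + i.
            bool is_contiguous(const std::string& s)
            {
                if (s.empty()) return true;
                const char* base = &s[0];
                for (std::size_t i = 0; i < s.size(); ++i) {
                    if (&s[i] != base + i) return false;
                }
                return true;
            }

            // Usage: aborts (in a debug build) if the layout is not contiguous.
            void check_contiguous(const std::string& s)
            {
                assert(is_contiguous(s));
            }
            ```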

            • Greg

              #7
              Re: std::string as data array

              yepp wrote:
              > Hmm, I don't think Peter's reply solves Jason's puzzle, though he gives
              > a lot of comparison between C and C++'s string.
              > Jason wants to know whether &s[0] is the start address of the
              > contiguous block of characters that s holds.

              It's the starting address of a block of characters identical to those
              that comprise s, but it is not (necessarily) a pointer to the character
              data of s itself.
              
              > I think the answer to this question depends on the implementation
              > details of std::string.

              No, the implementation details of std::string make no difference.
              
              > If std::string is implemented via a character array, the answer is
              > true; however, if std::string is implemented via some other linked list
              > method, the answer is definitely false.

              No, the answer is always true.

              The client has no access to std::string's internal data representation
              (that is why it is called "internal"). So however std::string stores
              its character data is of interest only to itself.
              
              > You can test the following program on vc6.
              >
              > #include <string>
              > #include <iostream>
              >
              > using namespace std;
              >
              > int main()
              > {
              > string s("abcde");
              >
              > char *pstr = &s[0];
              >
              > cout<<pstr<<endl;
              > cout<<hex<<int(pstr)<<endl;
              > cout<<hex<<int(s.c_str())<<endl;
              >
              > return 0;
              > }
              >
              > Output:
              > abcde
              > 481cf1
              > 481cf1
              >
              > The output shows that std::string in vc6 is implemented with the first
              > method.

              No, it does not show how vc6's std::string is implemented internally.
              After all, internal implementations are not observable from the outside
              by definition. It does show &s[0] == data() but that relationship is a
              requirement.

              Greg

              • Richard Herring

                #8
                Re: std::string as data array

                In message <1130248848.515155.135300@g14g2000cwa.googlegroups.com>, Greg
                <greghe@pacbell.net> writes
                >Richard Herring wrote:
                >> In message <435da8f6$0$15138$afc38c87@news.optusnet.com.au>, Jason Heyes
                >> <jasonheyes@optusnet.com.au> writes
                >> >If s is a std::string, does &s[0] refer to the contiguous block of
                >> >characters representing s?
                >> >
                >> What makes you think it's contiguous?
                >
                >Since the Standard specifies that std::string's operator[] provides
                >indexed access to std::string::data() - and since the Standard
                >specifies that std::string::data() point to a character array (that is,
                >a block of contiguously allocated memory), it must be the case by
                >transitive logic that &s[0] refers to the first character in a block of
                >contiguously-allocated character data.

                Is that something that's been corrected in the latest version of the
                Standard, then? The 1998 version appears to be inconsistent:

                21.3.4:
                const_reference operator[] (size_type pos) const;
                reference operator[](size_type pos);

                Returns: If pos < size(), returns data()[pos]. Otherwise [...]

                But:

                21.3.6

                const charT* data() const;

                Returns: If size() is nonzero, the member returns a pointer to the
                initial element of an array whose first size() elements equal the
                corresponding elements of the string controlled by *this [...]
                Requires: The program shall not alter any of the values stored in the
                character array [...]

                So data() returns a const pointer to something which the standard takes
                pains not to say is "the" string, but may be a copy of it, yet
                operator[] magically converts it to a non-const reference which can
                modify the string.

                Hmmm.

                --
                Richard Herring
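                
                Later revisions of the Standard tidy this up: C++11 requires the
                characters to be stored contiguously and has data() and c_str()
                point into the string's own storage, and C++17 adds a non-const
                data(). Under that assumption (C++11 or later), writing through
                &s[0] is well defined within [0, size()). The sketch below uses a
                hypothetical helper, copy_via_pointer, to fill a string through a
                C-style interface.

                ```cpp
                #include <cassert>
                #include <cstddef>
                #include <cstring>
                #include <string>

                // Fills a std::string through a C-style interface (std::memcpy here).
                std::string copy_via_pointer(const char* src)
                {
                    assert(src != nullptr);
                    const std::size_t n = std::strlen(src);
                    std::string s(n, '\0');            // allocate n characters up front
                    if (n != 0) {
                        std::memcpy(&s[0], src, n);    // write through the non-const pointer
                    }
                    return s;
                }
                ```

                Under the 1998 wording quoted above, the same code relies on
                behaviour that implementations provided in practice but the text
                did not clearly promise.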

                • Old Wolf

                  #9
                  Re: std::string as data array

                  Peter_Julian wrote:
                  > "Jason Heyes" wrote:
                  > | If s is a std::string, does &s[0] refer to the contiguous block of
                  > | characters representing s?
                  > |
                  >
                  > Of course not, s[0] is a single char.

                  Actually s[0] is a reference to a single char.
                  It may or may not have the other chars following it.
                  
                  > If what you seek is a constant pointer to a legacy char array,
                  > try s.c_str().

                  Better would be s.data(), which is not required to null-terminate
                  the array.
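                  
                  Whichever accessor is used, pairing the pointer with size()
                  avoids depending on a terminator at all. A minimal sketch with
                  hypothetical helper names to_bytes and c_length (since C++11,
                  data() is null-terminated as well, so the distinction matters
                  mainly for older libraries):

                  ```cpp
                  #include <cassert>
                  #include <cstddef>
                  #include <cstring>
                  #include <string>
                  #include <vector>

                  // Copies the characters of s using data() and size(), so
                  // embedded '\0' characters are preserved.
                  std::vector<char> to_bytes(const std::string& s)
                  {
                      return std::vector<char>(s.data(), s.data() + s.size());
                  }

                  // Length as a C function sees it: stops at the first '\0'.
                  std::size_t c_length(const std::string& s)
                  {
                      const std::size_t n = std::strlen(s.c_str());
                      assert(n <= s.size());
                      return n;
                  }
                  ```

                  For a string holding "ab", a '\0', then "cd", to_bytes keeps
                  all five characters while c_length stops after two.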
                  
                  > In fact, C++ replaces pointer-manipulation with iterators where the
                  > end iterator(s) having the value(s) of null are deemed to be part
                  > of the container, not part of the dataset.

                  Most iterators cannot have a value of NULL.
                  Try writing:
                  std::string::iterator it = NULL;
                  and see how far you get.
                  (Whether or not this works is implementation-specific.)
                  
                  > Hence, a std::string initialized like so...
                  >
                  > std::string s("abcde");
                  >
                  > ...has 5 values in its dataset. While an old type char array
                  > initialized like so...
                  >
                  > char str[] = {'a', 'b', 'c', 'd', 'e', '0x0'};
                  >
                  > ...has 6 values in it. Many of the ancient C functions rely on the
                  > presence of that 0x0 terminator.

                  Note that '0x0' is not an end-of-string marker (it's a
                  multi-character constant). You might be thinking of 0, or '\0'.
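                  
                  With the terminator written as '\0', the element counts come
                  out as described: six array elements, five of them visible to
                  strlen and to std::string. A small sketch (the function name
                  terminated_array_size is made up for illustration):

                  ```cpp
                  #include <cassert>
                  #include <cstddef>
                  #include <cstring>
                  #include <string>

                  // Returns the number of elements in the explicitly terminated array.
                  std::size_t terminated_array_size()
                  {
                      const char str[] = {'a', 'b', 'c', 'd', 'e', '\0'};
                      assert(std::strlen(str) == 5);         // terminator is not counted
                      assert(std::string(str).size() == 5);  // nor stored as a value
                      return sizeof str;                     // six elements in the array
                  }
                  ```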
                  
                  > Standard C++ does not. A pointer to a char is a pointer to
                  > a char, nothing more, nothing less.

                  Standard C++ has the same rules and expectations about char
                  arrays as Standard C does. (Except that string literals
                  are const in C++).
                  
                  > Consider an empty array of chars...
                  >
                  > char str[]; // this decays to a pointer with an undetermined
                  > // value at pointee and arrays have a fixed compile-time size

                  Actually this is a syntax error; arrays must have a specified size.
                  
                  > ...not so with a std::string...
                  > std::string s;
                  > ... where the iterators at...
                  >
                  > std::string::iterator s_iter = s.begin();
                  > s_iter = s.end();
                  >
                  > ...have a guaranteed value of null.

                  No, they don't. In fact you are not guaranteed to be able to
                  compare these iterators to anything except for other iterators
                  of the same type. Dereferencing any of these iterators
                  causes undefined behaviour.
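                  
                  What the iterators do guarantee is that begin() == end() for
                  an empty string, and that the range [begin, end) can be walked
                  safely. A minimal sketch, with count_chars as a hypothetical
                  name:

                  ```cpp
                  #include <cassert>
                  #include <cstddef>
                  #include <string>

                  // Counts characters by walking the iterator range [begin, end).
                  std::size_t count_chars(const std::string& s)
                  {
                      std::size_t n = 0;
                      for (std::string::const_iterator it = s.begin(); it != s.end(); ++it) {
                          ++n;   // *it would be safe here because it != s.end()
                      }
                      assert(n == s.size());
                      return n;
                  }
                  ```

                  For an empty string the loop body never runs, so nothing is
                  ever dereferenced.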
