<string> class with support of Null-Bytes?

  • Karl Ebener

    <string> class with support of Null-Bytes?

    Hi!

    I asked a similar question before but then changed everything to using
    char-Arrays instead of the string class, but I would rather not do this
    again.

    So, does anyone know of a string-Class similar to the STL-<string> that
    supports null-bytes?

    I tried with standard <string> but this definitely does not support
    them... :(

    Tnx
    Karl
  • Alf P. Steinbach

    #2
    Re: <string> class with support of Null-Bytes?

    * Karl Ebener:
    >
    > I asked a similar question before but then changed everything to using
    > char-Arrays instead of the string class, but I would rather not do this
    > again.
    >
    > So, does anyone know of a string-Class similar to the STL-<string> that
    > supports null-bytes?
    >
    > I tried with standard <string> but this definitely does not support
    > them... :(

    Depends what you mean by "support", but with usual definitions that's
    not correct.

    Perhaps post a simple program that shows what you mean by "not support"?

    Then we can see whether the problem is in the code or with std::string,
    and give better suggestions on how to proceed.

    --
    A: Because it messes up the order in which people normally read text.
    Q: Why is it such a bad thing?
    A: Top-posting.
    Q: What is the most annoying thing on usenet and in e-mail?

    • Karl Ebener

      #3
      Re: <string> class with support of Null-Bytes?

      Little change:

      > I tried with standard <string> but this definitely does not support
      > them... :(

      -> I tried using length()-method which stops at null-bytes and c_str()
      of course extracts only part till null-byte.
      Have I only not seen any possibility to extract the content as char* ?

      Tnx
      Karl

      • Alf P. Steinbach

        #4
        Re: <string> class with support of Null-Bytes?

        * Karl Ebener:
        > Little change:
        >
        > > I tried with standard <string> but this definitely does not support
        > > them... :(
        >
        > -> I tried using length()-method which stops at null-bytes

        It doesn't.

        > and c_str() of course extracts only part till null-byte.

        It doesn't, see §21.3.6/1.

        > Have I only not seen any possibility to extract the content as char* ?

        Post some code.

        --
        A: Because it messes up the order in which people normally read text.
        Q: Why is it such a bad thing?
        A: Top-posting.
        Q: What is the most annoying thing on usenet and in e-mail?

        • Karl Ebener

          #5
          Re: <string> class with support of Null-Bytes?

          Alf P. Steinbach wrote:
          > Depends what you mean by "support", but with usual definitions that's
          > not correct.
          >
          > Perhaps post a simple program that shows what you mean by "not support"?
          >
          > Then we can see whether the problem is in the code or with std::string,
          > and give better suggestions on how to proceed.
          >
          Okay, this is my test program.
          What I want to do finally, is read a complete (binary) file into a
          string and then send this via using socket to/from server.
          I am using socket-routines that use strings because it is much easier
          this way and I would love to leave it at that and not recode everything...

          Tnx
          Karl

          #include <cstdio>
          #include <string>
          #include <iostream>

          using namespace std;

          int main()
          {
              string abc = "abc\0abc\0 "; // string contains Null-bytes
              cout << abc << ":" << abc.length() << endl; // output is: 3
              FILE* fp;

              fp = fopen("ABC", "w");
              fwrite(abc.c_str(), 8, 1, fp); // file will contain: "abc" and Garbage
              fclose(fp);
          }

          • Rolf Magnus

            #6
            Re: <string> class with support of Null-Bytes?

            Karl Ebener wrote:

            > Alf P. Steinbach wrote:
            >> Depends what you mean by "support", but with usual definitions that's
            >> not correct.
            >>
            >> Perhaps post a simple program that shows what you mean by "not support"?
            >>
            >> Then we can see whether the problem is in the code or with std::string,
            >> and give better suggestions on how to proceed.
            >>
            > Okay, this is my test program.
            > What I want to do finally, is read a complete (binary) file into a
            > string and then send this via using socket to/from server.
            > I am using socket-routines that use strings because it is much easier
            > this way and I would love to leave it at that and not recode everything...
            >
            > Tnx
            > Karl
            >
            > #include <string>
            > #include <iostream>
            >
            > using namespace std;
            >
            > int main()
            > {
            > string abc = "abc\0abc\0 "; // string contains Null-bytes

            No. Your literal contains 0-bytes. The conversion constructor from C style
            strings to std::string of course has to stop at \0, since that's the value
            that marks the end of a C style string. Try:

            const char c[] = "abc\0abc\0 ";

            string abc(c, sizeof(c));

            This tells the constructor to not stop at \0, but read the specified number
            of characters.
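
A minimal sketch of the difference between the two constructors, using only standard `std::string` calls. The helper names `from_c_string` and `from_array` are hypothetical and not from this thread. Unlike `sizeof(c)` above, `from_array` drops the literal's implicit terminating `\0`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: uses the const char* constructor, which has no
// length information and therefore stops at the first '\0'.
inline std::string from_c_string(const char* p) {
    assert(p != nullptr);
    return std::string(p);
}

// Hypothetical helper: uses the (pointer, count) constructor.
// N includes the literal's implicit terminator, so N - 1 leaves it out
// while keeping every embedded '\0'.
template <std::size_t N>
std::string from_array(const char (&a)[N]) {
    return std::string(a, N - 1);
}
```

With the literal `"abc\0abc"`, the first helper yields a 3-character string and the second a 7-character string whose fourth character is `'\0'`.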

            > cout << abc << ":" << abc.length() << endl; // output is: 3

            That's because only the first 3 characters were actually copied into the
            string.

            > FILE* fp;
            >
            > fp = fopen("ABC", "w");
            > fwrite(abc.c_str(), 8, 1, fp); // file will contain: "abc" and Garbage

            Again, that's because the string only contains the first 3 characters.

            > fclose(fp);
            > }

            • Alf P. Steinbach

              #7
              Re: <string> class with support of Null-Bytes?

              * Karl Ebener:
              >
              > #include <string>
              > #include <iostream>
              >
              > using namespace std;
              >
              > int main()
              > {
              > string abc = "abc\0abc\0 "; // string contains Null-bytes
              > cout << abc << ":" << abc.length() << endl; // output is: 3
              > FILE* fp;
              >
              > fp = fopen("ABC", "w");
              > fwrite(abc.c_str(), 8, 1, fp); // file will contain: "abc" and Garbage
              > fclose(fp);
              > }

              The problem in the abc declaration is that you invoke the constructor
              that takes a C string as argument, and by definition that C string ends
              at the first nullbyte.

              Try


              #include <string>
              #include <iostream>

              #define ELEMCOUNT( array ) (sizeof(array)/sizeof(*array))

              int main()
              {
                  static char const abc_data[] = "abc\0abc\0 ";
                  std::string abc( abc_data, ELEMCOUNT( abc_data ) );

                  std::cout << abc << ":" << abc.length() << std::endl;
              }

              But you might instead (for efficiency) want to use std::vector<char>.

              Also, the file should be opened in binary mode.
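
A rough sketch of binary-mode file I/O with a `std::string` as the byte buffer, which is what the original question was after. The helper names `write_bytes` and `read_bytes` and the 4096-byte chunk size are assumptions for illustration. The only library calls used are the standard `fopen`, `fread`, `fwrite` and `fclose`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: writes every byte of `bytes` to `path`.
// "wb" opens in binary mode, so no newline translation takes place.
inline bool write_bytes(const char* path, const std::string& bytes) {
    assert(path != nullptr);
    std::FILE* fp = std::fopen(path, "wb");
    if (!fp) return false;
    // Pass the string's own size; strlen() is never involved.
    std::size_t written = std::fwrite(bytes.data(), 1, bytes.size(), fp);
    bool ok = (written == bytes.size());
    return (std::fclose(fp) == 0) && ok;
}

// Hypothetical helper: reads a whole file into a std::string.
// Returns an empty string if the file cannot be opened.
inline std::string read_bytes(const char* path) {
    assert(path != nullptr);
    std::string out;
    std::FILE* fp = std::fopen(path, "rb");
    if (!fp) return out;
    char buf[4096];
    std::size_t n;
    while ((n = std::fread(buf, 1, sizeof buf, fp)) > 0) {
        out.append(buf, n);  // append(pointer, count) copies '\0' bytes too
    }
    std::fclose(fp);
    return out;
}
```

Writing `std::string("abc\0abc\0", 8)` with `write_bytes` and reading it back with `read_bytes` should give an equal 8-byte string.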

              --
              A: Because it messes up the order in which people normally read text.
              Q: Why is it such a bad thing?
              A: Top-posting.
              Q: What is the most annoying thing on usenet and in e-mail?

              • Dimitris Kamenopoulos

                #8
                Re: <string> class with support of Null-Bytes?

                Karl Ebener wrote:

                > Okay, this is my test program.

                My guess is that std::string's functions (including constructors) that take
                a C-Style string as an argument, *do* treat it as a C-style (i.e.
                null-terminated) string.

                Makes sense, doesn't it? You don't want

                char s[15] = "sth";
                string s1(s);

                to allocate 11 extra null characters in s1 for no reason :-)

                If, OTOH, you put a '\0' in an std::string, it will not be treated as a
                terminating character.

                Check out this example to see what I mean:

                #include <iostream>
                #include <string>

                int main(){
                    std::string s("abc\0abc\0");
                    std::cout<<s.length()<<std::endl; //prints 3, not 9
                    std::string s2;
                    s2.push_back('a');
                    s2.push_back('\0');
                    s2.push_back('b');
                    std::cout<<s2.length()<<std::endl; //prints 3, not 1
                }


                Note: c_str() will return a const char *, which means that the string
                returned will always stop at the first null byte, for any code that cares
                about it (e.g. strlen or strcpy). Better use a vector<char> if you want
                byte semantics.
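
A small sketch of that note: the same string looks shorter to C-style code than it does to `std::string` itself. The helper names `c_view_length` and `real_length` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Length as seen by C-style code: strlen() counts up to the first '\0'.
inline std::size_t c_view_length(const std::string& s) {
    std::size_t n = std::strlen(s.c_str());
    assert(n <= s.size());  // strlen can only ever see a prefix
    return n;
}

// Length as tracked by std::string: embedded '\0' bytes are counted.
inline std::size_t real_length(const std::string& s) {
    return s.size();
}
```

For the `s2` built above with `push_back`, `c_view_length(s2)` is 1 while `real_length(s2)` is 3.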


                • Dave O'Hearn

                  #9
                  Re: <string> class with support of Null-Bytes?

                  Karl Ebener wrote:
                  > fwrite(abc.c_str(), 8, 1, fp); // file will contain: "abc"
                  > // and Garbage

                  As a separate issue, data() would be better than c_str() here. c_str()
                  may expand the string's internal buffer, to make room for an extra null
                  character past the end. You don't need a null-terminated C-string to
                  call fwrite, so you can just use data().
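
The same idea applies to any (buffer, length) style API, such as a socket send routine. The sketch below is only an illustration: `FakeWire`, `fake_send` and `send_string` are hypothetical stand-ins and do not correspond to any real socket library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a socket: it just collects what is "sent".
struct FakeWire {
    std::vector<char> bytes;
};

// Hypothetical send routine in the usual (buffer, length) shape.
inline std::size_t fake_send(FakeWire& wire, const char* buf, std::size_t len) {
    assert(buf != nullptr);
    wire.bytes.insert(wire.bytes.end(), buf, buf + len);
    return len;
}

// Passes the string's own length along with data(), so embedded '\0'
// bytes are transferred like any other byte.
inline std::size_t send_string(FakeWire& wire, const std::string& s) {
    return fake_send(wire, s.data(), s.size());
}
```

Since C++11, `data()` and `c_str()` return the same null-terminated buffer, so the choice between them matters mainly for older compilers.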

                  --
                  Dave O'Hearn

                  • Rolf Magnus

                    #10
                    Re: <string> class with support of Null-Bytes?

                    Dimitris Kamenopoulos wrote:

                    > Karl Ebener wrote:
                    >
                    >> Okay, this is my test program.
                    >
                    > My guess is that std::string's functions (including constructors) that
                    > take a C-Style string as an argument, *do* treat it as a C-style (i.e.
                    > null-terminated) string.
                    >
                    > Makes sense, doesn't it? You don't want
                    >
                    > char s[15] = "sth";
                    > string s1(s);
                    >
                    > to allocate 11 extra null characters in s1 for no reason :-)

                    That's not the main point. The constructor takes a pointer, which doesn't
                    contain any information about the size of the array pointed to. So the \0
                    is the _only_ way at all to know where a C style string ends.

                    • Paul

                      #11
                      Re: <string> class with support of Null-Bytes?


                      "Karl Ebener" <myonlyb@vollbio.de> wrote in message
                      news:41bec2d2$0$29843$9b4e6d93@newsread2.arcor-online.net...
                      > Little change:
                      >
                      > > I tried with standard <string> but this definitely does not support
                      > > them... :(
                      >
                      > -> I tried using length()-method which stops at null-bytes and c_str()
                      > of course extracts only part till null-byte.

                      What you are saying is totally false. std::string fully supports strings
                      with embedded NULLs. You just need to know the functions to use.

                      First, use the right constructor. The std::string has a few constructors --
                      a good C++ book that goes into the standard library will show you the
                      various constructors. The proper constructor is the one that takes a const
                      char * and an integer denoting the number of characters.

                      #include <string>
                      std::string s("abc\0123", 7);

                      Second, use the std::string::data() member function instead of
                      std::string::c_str(). This respects the length of the string and does not
                      terminate on the first NULL.

                      Third, if you need to add binary data to a std::string, use the append()
                      function. If you need to reassign binary data, use the
                      std::string::append() on an empty string, or the std::string::assign()
                      member function.
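
A short sketch of the third point, using the `(pointer, count)` overloads of `assign` and `append`. The wrapper names `set_bytes` and `add_bytes` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: replaces the contents of `s` with n raw bytes.
inline void set_bytes(std::string& s, const char* buf, std::size_t n) {
    assert(buf != nullptr);
    s.assign(buf, n);  // copies exactly n bytes, '\0' included
}

// Hypothetical helper: adds n raw bytes to the end of `s`.
inline void add_bytes(std::string& s, const char* buf, std::size_t n) {
    assert(buf != nullptr);
    s.append(buf, n);  // copies exactly n bytes, '\0' included
}
```

Calling `set_bytes` and then `add_bytes` with the 3-byte chunk `{'a', '\0', 'b'}` should leave a 6-byte string.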

                      Paul


                      • Ron Natalie

                        #12
                        Re: <string> class with support of Null-Bytes?

                        Karl Ebener wrote:
                        > Little change:
                        >
                        >> I tried with standard <string> but this definitely does not support
                        >> them... :(
                        >
                        >
                        > -> I tried using length()-method which stops at null-bytes and c_str()
                        > of course extracts only part till null-byte.
                        > Have I only not seen any possibility to extract the content as char* ?

                        Multibyte does not contain nulls. I'm confused as what you are asking.
                        Neither c_str() nor length() cares anything about embedded nulls.

                        Now that being said, there is NO real multibyte handling in std::string
                        either.

                        • Ron Natalie

                          #13
                          Re: <string> class with support of Null-Bytes?

                          Karl Ebener wrote:

                          > So, does anyone know of a string-Class similar to the STL-<string> that
                          > supports null-bytes?

                          std::string handles null bytes just fine. The only thing that you have to
                          be careful with is that if you use the conversions to/from char*, you need
                          to pass/retrieve the actual length because the default strlen() calculations
                          won't work.

                          std::string s;
                          s.push_back('a');
                          s.push_back('\0');
                          s.push_back('b');

                          cout << s.size(); // prints 3
                          const char* cp = s.c_str();

                          cout << cp[0] << cp[2]; // prints ab

                          • Old Wolf

                            #14
                            Re: <string> class with support of Null-Bytes?

                            Paul wrote:
                            > "Karl Ebener" <myonlyb@vollbio.de> wrote:
                            >
                            > #include <string>
                            > std::string s("abc\0123", 7);

                            Undefined behaviour. "abc\0123" is an array of 6 chars:
                            {'a', 'b', 'c', '\012', '3', '\0'}

                            > Second, use the std::string::data() member function instead of
                            > std::string::c_str(). This respects the length of the string
                            > and does not terminate on the first NULL.

                            std::string::c_str() does not terminate on the first null
                            character. The only difference between c_str() and data()
                            is that c_str() appends a null character.

                            std::string s("abc\0def", 7);
                            std::cout << (s.c_str() + 4) << std::endl;

                            will output "def".
                            BTW, the macro NULL is not really relevant to null characters.
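
A minimal sketch of the octal-escape pitfall and one way around it, assuming the intent was "abc", a null byte, then "123". The function names are hypothetical. Splitting the literal works because each piece has its escape sequences resolved before adjacent literals are concatenated.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// "\012" is consumed as ONE octal escape, so this literal is
// {'a','b','c','\012','3','\0'}: 6 chars including the terminator.
inline std::size_t greedy_literal_size() {
    return sizeof("abc\0123");
}

// Ending the first literal right after "\0" stops the escape there:
// {'a','b','c','\0','1','2','3','\0'}, 8 chars including the terminator.
inline std::size_t split_literal_size() {
    return sizeof("abc\0" "123");
}

// Builds the intended 7-byte string without reading past the array.
inline std::string safe_binary_string() {
    static const char data[] = "abc\0" "123";
    assert(sizeof(data) == 8);
    return std::string(data, sizeof(data) - 1);  // drop only the terminator
}
```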
