making an istream from a char array

  • John Salmon

    making an istream from a char array


    I'm working with two libraries. One, written
    in old-school C, returns a very large
    chunk of data in the form of a C-style,
    NUL-terminated string.

    The other, written in more modern C++,
    is a parser for the chunk of bytes returned by
    the first. It expects a reference to a
    std::istream as its argument.

    The chunk of data is very large.
    I'd like to feed the output of the first to
    the second WITHOUT MAKING AN EXTRA IN-MEMORY COPY.

    My attempts to create an istringstream from the
    chunk of data all seem to at least double the
    amount of VM used. Here's a short program demonstrating
    what I've tried. Is there any way to get "inside"
    the istringstream and tell it to use the 'chunk'
    directly, rather than insisting on making a copy?

    Thanks,
    John Salmon

    [jsalmon@river c++]$ cat chararraytostream.cpp
    #include <string>
    #include <sstream>
    #include <cstdlib>
    #include <cstring>
    #include <cstdio>
    #include <unistd.h>  // getpid()
    using namespace std;

    char *getLotsOfBytes();
    istream& streamParser(istream &s);
    void linuxChkMem(const char *msg);

    void withImplicitString(){
        linuxChkMem("Before getLotsOfBytes: ");
        char *chunk = getLotsOfBytes();
        linuxChkMem("After getLotsOfBytes():");
        {
            istringstream iss(chunk);
            linuxChkMem("After iss(p): ");
            streamParser(iss);
            linuxChkMem("After streamParser(iss): ");
        }
        linuxChkMem("After iss goes out of scope: ");
        free(chunk);
        linuxChkMem("After free(p): ");
    }

    void withExplicitString(){
        linuxChkMem("Before getLotsOfBytes: ");
        char *chunk = getLotsOfBytes();
        linuxChkMem("After getLotsOfBytes():");
        {
            string s(chunk);
            linuxChkMem("After s(chunk): ");
            free(chunk);
            linuxChkMem("After free(p): ");
            istringstream iss(s);
            linuxChkMem("After iss(s): ");
            streamParser(iss);
            linuxChkMem("After streamParser(iss): ");
        }
        linuxChkMem("After iss goes out of scope: ");
    }

    int main(int argc, char **argv){
        printf("with an implicit string constructor\n");
        withImplicitString();
        printf("\nwith an explicit string constructor\n");
        withExplicitString();
        return 0;
    }

    // On linux, tell us how much data space we're using
    // in the VM.
    void linuxChkMem(const char *msg){
        printf("%s", msg);
        fflush(stdout);
        char cmd[50];
        sprintf(cmd, "grep VmData /proc/%d/status", (int)getpid());
        system(cmd);
    }

    static const int SZ = 100*1024*1024;
    // A rough approximation to getLotsOfBytes. In the
    // real application, getLotsOfBytes has these characteristics:
    // - it returns a malloced pointer to a NUL-terminated array of chars.
    // - it is out of my control. E.g., I can't rewrite it in a way
    //   that might be more friendly to C++ streams.
    char *getLotsOfBytes(){
        char *p = (char *)malloc(SZ);
        memset(p, ' ', SZ);
        strcpy(p+SZ-50, "3.1415 2.718 1.414");
        return p;
    }

    // A rough approximation to streamParser. In the real
    // application, streamParser takes a ref to an istream
    // and does what it does. Again, I can't easily redefine
    // the interface.
    istream& streamParser(istream& s){
        double x, y, z;
        s >> x >> y >> z;
        printf("x: %f y: %f z: %f\n", x, y, z);
        return s;
    }

    [jsalmon@river c++]$ g++ -O3 chararraytostream.cpp
    [jsalmon@river c++]$ a.out
    with an implicit string constructor
    Before getLotsOfBytes: VmData: 40 kB
    After getLotsOfBytes():VmData: 102444 kB
    After iss(p): VmData: 204848 kB
    x: 3.141500 y: 2.718000 z: 1.414000
    After streamParser(iss): VmData: 204980 kB
    After iss goes out of scope: VmData: 102576 kB
    After free(p): VmData: 172 kB

    with an explicit string constructor
    Before getLotsOfBytes: VmData: 172 kB
    After getLotsOfBytes():VmData: 102576 kB
    After s(chunk): VmData: 204980 kB
    After free(p): VmData: 102576 kB
    After iss(s): VmData: 204980 kB
    x: 3.141500 y: 2.718000 z: 1.414000
    After streamParser(iss): VmData: 204980 kB
    After iss goes out of scope: VmData: 172 kB
    [jsalmon@river c++]$

  • Denise Kleingeist

    #2
    Re: making an istream from a char array

    Hello John!
    John Salmon wrote:
    > My attempts to create an istringstream from the
    > chunk of data all seem to at least double the
    > amount of VM used.

    std::istringstream takes a std::string. To create this
    std::string from a char array, a copy is made. This copy
    is then copied into the std::istringstream. For this purpose,
    you probably don't want to use a std::istringstream. Instead,
    you could use a simple homegrown stream buffer (see the code
    below).

    Good luck, Denise!
    --- CUT HERE ---
    #include <algorithm>
    #include <istream>
    #include <iostream>
    #include <streambuf>
    #include <string>
    #include <string.h>

    // Read-only stream buffer over an existing char range; no copy is made.
    struct membuf:
        std::streambuf
    {
        membuf(char* b, char* e) { this->setg(b, b, e); }
    };

    int main()
    {
        char* buffer = get_huge_buffer_with_data();
        membuf sbuf(buffer, std::find(buffer, buffer + strlen(buffer), 0));
        std::istream in(&sbuf);
        for (std::string line; std::getline(in, line); )
            std::cout << "line: " << line << "\n";
    }


    • Gianni Mariani

      #3
      Re: making an istream from a char array

      John Salmon wrote:
      > I'm working with two libraries, one written
      > in old school C, that returns a very large
      > chunk of data in the form of a C-style,
      > NUL-terminated string.
      >
      > The other written in a more modern C++
      > is a parser for the chunk of bytes returned by
      > the first. It expects a reference to a
      > std::istream as its argument.
      >
      > The chunk of data is very large.
      > I'd like to feed the output of the first to
      > the second WITHOUT MAKING AN EXTRA IN-MEMORY COPY.

      The "without making a copy" might be a little tricky with istringstream.

      I'm no expert on c++ streams but something like this might work.

      #include <istream>
      #include <streambuf>

      // Stream and stream buffer in one object, reading the range in place.
      class Xistream
          : public std::istream,
            public std::streambuf
      {
      public:
          Xistream( const char * begin, const char * end )
              : std::istream( this )
          {
              // The get area is only read, so casting away const is safe here.
              setg( const_cast<char *>(begin), const_cast<char *>(begin),
                    const_cast<char *>(end) );
          }
      };

      #include <iostream>

      int main()
      {
          const char xx[] = "1 22 33";

          Xistream xi( xx, xx + sizeof(xx) - 1 );

          int i;
          xi >> i;

          std::cout << i << "\n";

          xi >> i;

          std::cout << i << "\n";
      }


      • John Salmon

        #4
        Re: making an istream from a char array

        >>>>> "Denise" == Denise Kleingeist <denise.kleingeist@googlemail.com> writes:

        Denise> Hello John!
        Denise> John Salmon wrote:
        >> My attempts to create an istringstream from the
        >> chunk of data all seem to at least double the
        >> amount of VM used.
        Denise> std::istringstream takes a std::string. For creating this
        Denise> std::string from a char array, a copy is created. This copy
        Denise> is then copied into the std::istringstream. For this purpose,
        Denise> you probably don't want to use an std::istringstream. Instead,
        Denise> you could use a simple homegrown stream buffer (code see
        Denise> below).

        Denise> Good luck, Denise!
        Denise> --- CUT HERE ---
        Denise> #include <istream>
        Denise> #include <iostream>
        Denise> #include <streambuf>
        Denise> #include <string>
        Denise> #include <string.h>

        Denise> struct membuf:
        Denise>     std::streambuf
        Denise> {
        Denise>     membuf(char* b, char* e) { this->setg(b, b, e); }
        Denise> };

        Denise> int main()
        Denise> {
        Denise>     char* buffer = get_huge_buffer_with_data();
        Denise>     membuf sbuf(buffer, std::find(buffer, buffer + strlen(buffer), 0));
        Denise>     std::istream in(&sbuf);
        Denise>     for (std::string line; std::getline(in, line); )
        Denise>         std::cout << "line: " << line << "\n";
        Denise> }

        Thanks! This is exactly what I needed.

        One question - what's the point of the std::find()?

        I don't see how std::find(buffer, buffer+strlen(buffer), 0);
        could ever be different from buffer+strlen(buffer)??

        Cheers,
        John Salmon


        • Denise Kleingeist

          #5
          Re: making an istream from a char array

          Hello John!
          John Salmon wrote:
          > >>>>> "Denise" == Denise Kleingeist <denise.kleingeist@googlemail.com> writes:
          > Denise> membuf sbuf(buffer, std::find(buffer, buffer + strlen(buffer), 0));
          > One question - what's the point of the std::find()?
          >
          > I don't see how std::find(buffer, buffer+strlen(buffer), 0);
          > could ever be different from buffer+strlen(buffer)??

          You are right: it is a leftover from a discarded attempt to use
          std::find() instead of strlen()! Just use buffer + strlen(buffer)
          instead.

          Sorry for any confusion caused, Denise!


          • P.J. Plauger

            #6
            Re: making an istream from a char array

            "John Salmon" <jsalmon@thesalmons.org> wrote in message
            news:m3psa2i1va.fsf@river.fishnet...
            > I'm working with two libraries, one written
            > in old school C, that returns a very large
            > chunk of data in the form of a C-style,
            > NUL-terminated string.
            >
            > The other written in a more modern C++
            > is a parser for the chunk of bytes returned by
            > the first. It expects a reference to a
            > std::istream as its argument.
            >
            > The chunk of data is very large.
            > I'd like to feed the output of the first to
            > the second WITHOUT MAKING AN EXTRA IN-MEMORY COPY.
            >
            > My attempts to create an istringstream from the
            > chunk of data all seem to at least double the
            > amount of VM used. Here's a short program demonstrating
            > what I've tried. Is there any way to get "inside"
            > the istringstream and tell it to use the 'chunk'
            > directly, rather than insisting on making a copy?

            See the header <strstream>. It does exactly what you want,
            and it's part of the C++ Standard (albeit a bit old
            fashioned).

            P.J. Plauger
            Dinkumware, Ltd.




            • John Salmon

              #7
              Re: making an istream from a char array

              >>>>> "PJ" == P J Plauger <pjp@dinkumware.com> writes:

              PJ> "John Salmon" <jsalmon@thesalmons.org> wrote in message
              PJ> news:m3psa2i1va.fsf@river.fishnet...
              >> I'm working with two libraries, one written
              >> in old school C, that returns a very large
              >> chunk of data in the form of a C-style,
              >> NUL-terminated string.
              >>
              >> The other written in a more modern C++
              >> is a parser for the chunk of bytes returned by
              >> the first. It expects a reference to a
              >> std::istream as its argument.
              >>
              >> The chunk of data is very large.
              >> I'd like to feed the output of the first to
              >> the second WITHOUT MAKING AN EXTRA IN-MEMORY COPY.
              >>
              >> My attempts to create an istringstream from the
              >> chunk of data all seem to at least double the
              >> amount of VM used. Here's a short program demonstrating
              >> what I've tried. Is there any way to get "inside"
              >> the istringstream and tell it to use the 'chunk'
              >> directly, rather than insisting on making a copy?
              PJ> See the header <strstream>. It does exactly what you want,
              PJ> and it's part of the C++ Standard (albeit a bit old
              PJ> fashioned).

              Thanks to Usenet, I now have two workable solutions.

              Googling for strstream turns up lots of warnings that "strstream is
              deprecated", with dire warnings that it may be removed from future
              versions of the standard. OTOH, an istrstream does exactly what I
              want, without any extra custom machinery ( struct membuf : public
              streambuf ).

              Other than simplicity and possible compatibility with future
              standards, is there any reason to prefer one approach over the
              other?

              Cheers,
              John Salmon



              • P.J. Plauger

                #8
                Re: making an istream from a char array

                "John Salmon" <jsalmon@thesalmons.org> wrote in message
                news:m3ejqhifa9.fsf@river.fishnet...
                > >>>>> "PJ" == P J Plauger <pjp@dinkumware.com> writes:
                >
                > PJ> "John Salmon" <jsalmon@thesalmons.org> wrote in message
                > PJ> news:m3psa2i1va.fsf@river.fishnet...
                >
                > >> I'm working with two libraries, one written
                > >> in old school C, that returns a very large
                > >> chunk of data in the form of a C-style,
                > >> NUL-terminated string.
                > >>
                > >> The other written in a more modern C++
                > >> is a parser for the chunk of bytes returned by
                > >> the first. It expects a reference to a
                > >> std::istream as its argument.
                > >>
                > >> The chunk of data is very large.
                > >> I'd like to feed the output of the first to
                > >> the second WITHOUT MAKING AN EXTRA IN-MEMORY COPY.
                > >>
                > >> My attempts to create an istringstream from the
                > >> chunk of data all seem to at least double the
                > >> amount of VM used. Here's a short program demonstrating
                > >> what I've tried. Is there any way to get "inside"
                > >> the istringstream and tell it to use the 'chunk'
                > >> directly, rather than insisting on making a copy?
                >
                > PJ> See the header <strstream>. It does exactly what you want,
                > PJ> and it's part of the C++ Standard (albeit a bit old
                > PJ> fashioned).
                >
                > Thanks to Usenet, I now have two workable solutions.
                >
                > Googling for strstream turns up lots of warnings that "strstream is
                > deprecated", with dire warnings that it may be removed from future
                > versions of the standard. OTOH, an istrstream does exactly what I
                > want, without any extra custom machinery ( struct membuf : public
                > streambuf ).
                >
                > Other than simplicity and possible compatibility with future
                > standards, is there any reason to prefer one approach over the
                > other?

                You should prefer strstream because:

                1) it's exactly what you need

                2) it's still part of the C++ Standard

                3) there's no reason to believe it'll become nonstandard anytime
                soon, despite the dire warnings

                4) even if it does officially go away, there's not a sane vendor
                who'll stop supporting it for the next decade

                So what the hell.

                P.J. Plauger
                Dinkumware, Ltd.


