fileToString

  • Phlip

    fileToString

    Newsgroupies:

    Everyone (who is anyone) has written a 'fileToString()' function
    before:

    string fileToString(string fileName)
    {
        string result;
        ifstream in(fileName.c_str());
        char ch;
        while (in.get(ch)) result += ch;
        return result;
    }

    It takes a file name and returns a string full of its contents. Text
    contents assumed.

    Here's the question: What's the fastest, or smallest, or smarmiest
    possible Standard Library implementation? Could something like
    std::copy() apply?

    I wouldn't be surprised if mine were acceptably fast; if both ifstream
    and std::string are buffered and delta-ed, respectively.

    Please specify your library, if your solution pushes the envelope of
    common C++ Standard compliance levels.

    --
    Phlip

    -- "I wasn't using my civil liberties, anyway..." --
  • Mike Wahler

    #2
    Re: fileToString


    "Phlip" <phlip_cpp@yahoo.com> wrote in message
    news:63604d2.0312011510.6ef48007@posting.google.com...
    > Newsgroupies:
    >
    > Everyone (who is anyone) has written a 'fileToString()' function
    > before:
    >
    > string fileToString(string fileName)
    > {
    >     string result;
    >     ifstream in(fileName.c_str());
    >     char ch;
    >     while (in.get(ch)) result += ch;
    >     return result;
    > }
    >
    > It takes a file name and returns a string full of its contents. Text
    > contents assumed.
    >
    > Here's the question: What's the fastest,

    Can't tell without measuring.

    > or smallest,

    Can't tell without measuring.

    > or smarmiest

    Subjective term. :-)

    > possible Standard Library implementation? Could something like
    > std::copy() apply?

    std::copy could be used, sure.
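
    A minimal sketch of how std::copy() could be applied here, assuming an
    istreambuf_iterator source and a back_inserter target. The name
    fileToStringCopy is made up for illustration, and nothing is claimed
    about its speed relative to the other versions in this thread:

    ```cpp
    #include <algorithm>
    #include <fstream>
    #include <iterator>
    #include <string>

    // Copies every character of the file into a string with std::copy.
    // istreambuf_iterator reads raw characters, so whitespace is preserved.
    // Returns an empty string if the file cannot be opened.
    std::string fileToStringCopy(const std::string& name)
    {
        std::ifstream in(name.c_str());
        std::string result;
        std::copy(std::istreambuf_iterator<char>(in),
                  std::istreambuf_iterator<char>(),
                  std::back_inserter(result));
        return result;
    }
    ```

    Each character goes through push_back(), so the string may reallocate
    several times as it grows.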

    >
    > I wouldn't be surprised if mine were acceptably fast;

    Nor I. Remember your compiler usually gets a crack at
    optimizing, and many OSes have clever file buffering
    mechanisms. Measure, measure, measure. :-)
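
    One rough way to do that measuring, as a sketch only. It assumes a
    C++11 compiler for <chrono>; averageMicroseconds is a hypothetical
    helper, and the candidate shown is the character-by-character version
    from the original post. Any other candidate with the same signature
    can be passed in its place:

    ```cpp
    #include <chrono>
    #include <cstddef>
    #include <fstream>
    #include <string>

    // Candidate under test: the character-by-character version from the
    // original post.
    std::string fileToString(const std::string& fileName)
    {
        std::string result;
        std::ifstream in(fileName.c_str());
        char ch;
        while (in.get(ch)) result += ch;
        return result;
    }

    // Hypothetical helper: calls reader(name) `runs` times and returns the
    // average wall-clock time per call in microseconds (0 if runs <= 0).
    template <typename Reader>
    double averageMicroseconds(Reader reader, const std::string& name, int runs)
    {
        if (runs <= 0)
            return 0.0;

        typedef std::chrono::steady_clock Clock;
        volatile std::size_t sink = 0; // keeps the calls from being optimized away
        const Clock::time_point start = Clock::now();
        for (int i = 0; i < runs; ++i)
            sink = reader(name).size();
        const Clock::time_point stop = Clock::now();
        (void)sink;

        const std::chrono::duration<double, std::micro> elapsed = stop - start;
        return elapsed.count() / runs;
    }
    ```

    Something like averageMicroseconds(fileToString, "file.txt", 100) then
    gives a number to compare between candidates. The first run usually
    includes the cost of pulling the file into the OS cache, so repeated
    runs on the same file mostly measure the library, not the disk.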

    > if both ifstream
    > and std::string are buffered and delta-ed, respectively.
    >
    > Please specify your library, if your solution pushes the envelope of
    > common C++ Standard compliance levels.

    Here's what I'd probably write, if presented with your
    specification (error checking omitted):

    #include <fstream>
    #include <iostream>
    #include <sstream>
    #include <string>

    std::string fileToString(const std::string& name)
    {
        std::ifstream in(name.c_str());
        std::ostringstream oss;
        oss << in.rdbuf();
        return oss.str();
    }

    int main()
    {
        std::cout << fileToString("file.txt") << '\n';
        return 0;
    }


    Josuttis calls this way "probably the fastest way to
    copy files with C++ IOStreams". (p 683).
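
    For the error checking omitted above, one possible shape is sketched
    here. Throwing std::runtime_error and the name fileToStringChecked are
    assumptions for illustration, not part of the specification:

    ```cpp
    #include <fstream>
    #include <sstream>
    #include <stdexcept>
    #include <string>

    // Same stream-buffer copy as above, but reports a file that cannot be
    // opened by throwing instead of silently returning an empty string.
    std::string fileToStringChecked(const std::string& name)
    {
        std::ifstream in(name.c_str());
        if (!in)
            throw std::runtime_error("cannot open file: " + name);

        std::ostringstream oss;
        oss << in.rdbuf();
        // Note: copying from an empty file sets failbit on `oss` because no
        // characters were inserted; that is not treated as an error here.
        return oss.str();
    }
    ```

    The failbit quirk means the state of `oss` alone cannot tell an empty
    file from a failed read, which is why only the open is checked.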

    -Mike



    • red floyd

      #3
      Re: fileToString

      Phlip wrote:
      > Newsgroupies:
      >
      > Everyone (who is anyone) has written a 'fileToString()' function
      > before:
      >
      > string fileToString(string fileName)
      > {
      >     string result;
      >     ifstream in(fileName.c_str());
      >     char ch;
      >     while (in.get(ch)) result += ch;
      >     return result;
      > }
      >
      > It takes a file name and returns a string full of its contents. Text
      > contents assumed.
      >
      > Here's the question: What's the fastest, or smallest, or smarmiest
      > possible Standard Library implementation? Could something like
      > std::copy() apply?
      >
      > I wouldn't be surprised if mine were acceptably fast; if both ifstream
      > and std::string are buffered and delta-ed, respectively.
      >
      > Please specify your library, if your solution pushes the envelope of
      > common C++ Standard compliance levels.
      >

      #include <fstream>   // needed for ifstream
      #include <iostream>
      #include <string>
      #include <iterator>
      using namespace std;
      string fileToString(string fileName)
      {
          ifstream in(fileName.c_str());
          string s;
          if (in)
              s.assign(istream_iterator<char>(in), istream_iterator<char>());
          return s;
      }
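
      One thing to check with a snippet like this: istream_iterator<char>
      reads through operator>>, which skips whitespace by default, while
      istreambuf_iterator<char> reads raw characters from the stream buffer.
      A small sketch that shows the difference on an in-memory stream (the
      function names are made up for illustration):

      ```cpp
      #include <istream>
      #include <iterator>
      #include <sstream>
      #include <string>

      // Reads through istream_iterator<char>, which uses operator>> and
      // therefore drops whitespace unless skipws is cleared on the stream.
      std::string readFormatted(std::istream& in)
      {
          return std::string(std::istream_iterator<char>(in),
                             std::istream_iterator<char>());
      }

      // Reads through istreambuf_iterator<char>, which pulls raw characters
      // from the stream buffer and keeps whitespace.
      std::string readUnformatted(std::istream& in)
      {
          return std::string(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
      }
      ```

      Given a stream holding "a b\n c", readFormatted() yields "abc" and
      readUnformatted() yields the text unchanged. Applying std::noskipws to
      the stream before using istream_iterator also keeps the whitespace.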




      • Dietmar Kuehl

        #4
        Re: fileToString

        phlip_cpp@yahoo.com (Phlip) wrote:
        > Here's the question: What's the fastest, or smallest, or smarmiest
        > possible Standard Library implementation? Could something like
        > std::copy() apply?

        Here is what should be the fastest approach:

        std::string fileToString(std::string const& name) {
            std::ifstream in(name.c_str());
            return std::string(std::istreambuf_iterator<char>(in),
                               std::istreambuf_iterator<char>());
        }

        Since this is a pretty specialized call and a few optimizations are
        necessary for this to be fast, it is likely that the constructor of the
        string internally actually does something like

        std::copy(begin, end, std::back_inserter(*this));

        ... and the actual optimizations are applied to 'std::copy()'. I'm, however,
        not aware of any standard C++ library which currently optimizes stuff like
        this to an extreme level: for this to be fast, it would be necessary to
        'reserve()' memory before going ahead and actually reading the string. To
        figure out the size, in turn, it would be necessary to have a difference
        function which is specialized for 'std::istreambuf_iterator<>()' which
        takes the underlying code conversion facet into account and which is indeed
        used: 'std::istreambuf_iterator' is an input iterator and thus using
        'std::distance()' would consume the sequence. On the other hand, it is
        definitely doable. Of course, the copies are not necessarily that expensive
        and it may be viable to dump the file blockwise into a string.
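
        A rough sketch of what dumping the file blockwise into a string could
        look like. The 4096-byte block size and the name fileToStringBlockwise
        are arbitrary choices for illustration, not anything a particular
        library does internally:

        ```cpp
        #include <fstream>
        #include <string>

        // Reads the file in fixed-size blocks and appends each block to the
        // result. Returns an empty string if the file cannot be opened.
        std::string fileToStringBlockwise(const std::string& name)
        {
            std::ifstream in(name.c_str());
            std::string result;
            char block[4096]; // block size is an arbitrary choice for this sketch
            while (in) {
                in.read(block, sizeof block);
                // gcount() is the number of characters the last read() stored;
                // it is smaller than the block size on the final, partial block.
                result.append(block,
                              static_cast<std::string::size_type>(in.gcount()));
            }
            return result;
        }
        ```

        The string still grows as blocks arrive, so this does not avoid
        reallocation; it only replaces per-character appends with per-block
        appends.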

        My guess is that the fastest approach to reading a file into a string using
        current standard C++ library implementations involves using an
        'std::ostringstream':

        std::string fileToString(std::string const& name) {
            std::ifstream in(name.c_str());
            std::ostringstream out;
            out << in.rdbuf();
            return out.str();
        }

        Stream buffers operate block-oriented internally anyway while the iterator
        approach requires that the "segmented sequence" optimization is implemented.
        --
        <mailto:dietmar_kuehl@yahoo.com> <http://www.dietmar-kuehl.de/>
        Phaidros eaSE - Easy Software Engineering: <http://www.phaidros.com/>


        • tom_usenet

          #5
          Re: fileToString

          On Tue, 02 Dec 2003 00:20:42 GMT, red floyd <no.spam@here.dude> wrote:

          >#include <iostream>
          >#include <string>
          >#include <iterator>
          >using namespace std;
          >string fileToString(string fileName)
          >{
          > ifstream in(fileName.c_str());
          > string s;
          > if (in)
          > s.assign(istream_iterator<char>(in), istream_iterator<char>());

          That code skips whitespace, and is very slow. Much better would be:

          if (in)
              s.assign(istreambuf_iterator<char>(in), istreambuf_iterator<char>());

          But this relies on s.assign being efficient for input iterators (and
          it sometimes isn't). Better would be this semi-standard version:

          string fileToString(string fileName)
          {
              ifstream in(fileName.c_str());
              in.seekg(0, ios_base::end);
              int pos = in.tellg();
              in.seekg(0, ios_base::beg);
              if (in)
              {
                  std::vector<char> v(pos); //string might not be contiguous :(
                  in.read(&v[0], v.size());
                  return string(&v[0], v.size());
              }
              else
                  return string();
          }
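
          Since C++11, std::string storage is guaranteed to be contiguous, so
          the intermediate vector is no longer needed on a newer compiler. A
          sketch under that assumption: fileToStringSized is a made-up name,
          and the file is opened in binary mode (my choice, not part of the
          version above) so that the size reported by tellg() matches the
          bytes read on common implementations, which also means line endings
          are not translated:

          ```cpp
          #include <fstream>
          #include <string>

          // Assumes C++11 or later, where std::string storage is contiguous.
          // Returns an empty string if the file cannot be opened or sized.
          std::string fileToStringSized(const std::string& name)
          {
              std::ifstream in(name.c_str(),
                               std::ios_base::in | std::ios_base::binary);
              if (!in)
                  return std::string();

              in.seekg(0, std::ios_base::end);
              const std::streamoff size = in.tellg();
              in.seekg(0, std::ios_base::beg);
              if (!in || size <= 0)
                  return std::string();

              // Allocate once, then read straight into the string's buffer.
              std::string result(static_cast<std::string::size_type>(size), '\0');
              in.read(&result[0], static_cast<std::streamsize>(size));
              // Trim in case fewer characters arrived than tellg() reported.
              result.resize(static_cast<std::string::size_type>(in.gcount()));
              return result;
          }
          ```

          The size check also avoids taking &result[0] of an empty buffer for
          a zero-length file.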

          Tom

          C++ FAQ: http://www.parashift.com/c++-faq-lite/
          C FAQ: http://www.eskimo.com/~scs/C-faq/top.html


          • Alex Vinokur

            #6
            Re: fileToString


            "Dietmar Kuehl" <dietmar_kuehl@yahoo.com> wrote in message news:5b15f8fd.0312020347.637121b2@posting.google.com...
            [snip]
            >
            > My guess is that the fastest approach to reading a file into a string using
            > current standard C++ library implementations involves using an
            > 'std::ostringstream':
            >
            > std::string fileToString(std::string const& name) {
            >     std::ifstream in(name.c_str());
            >     std::ostringstream out;
            >     out << in.rdbuf();
            >     return out.str();
            > }
            >
            > Stream buffer operate block-oriented internally anyway while the iterator
            > approach requires that the "segmented sequence" optimization is implemented.
            [snip]


            Comparative performance tests can be seen at:
            http://article.gmane.org/gmane.comp.....perfometer/22
            The "ostringstream out; out << in.rdbuf()" method is the fastest if the standard C++ library is used (Dietmar is right).
            Outside C++: mmap (http://www.opengroup.org/onlinepubs/...ions/mmap.html) is the fastest method.
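
            For readers who have not used it, a POSIX-only sketch of the mmap
            route. The name fileToStringMmap and the choice to return an empty
            string on any failure are assumptions for illustration; this is
            not the code used in the tests mentioned above:

            ```cpp
            #include <fcntl.h>
            #include <sys/mman.h>
            #include <sys/stat.h>
            #include <unistd.h>

            #include <cstddef>
            #include <string>

            // POSIX-only: maps the file read-only and copies the bytes into a
            // string. Returns an empty string if the file cannot be opened,
            // sized, or mapped (mapping a zero-length file is not attempted).
            std::string fileToStringMmap(const std::string& name)
            {
                const int fd = open(name.c_str(), O_RDONLY);
                if (fd == -1)
                    return std::string();

                std::string result;
                struct stat info;
                if (fstat(fd, &info) == 0 && info.st_size > 0) {
                    const std::size_t size = static_cast<std::size_t>(info.st_size);
                    void* data = mmap(0, size, PROT_READ, MAP_PRIVATE, fd, 0);
                    if (data != MAP_FAILED) {
                        result.assign(static_cast<const char*>(data), size);
                        munmap(data, size);
                    }
                }
                close(fd);
                return result;
            }
            ```

            Copying into a std::string gives up part of the benefit; code that
            can work on the mapped bytes directly avoids that extra copy.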


            =====================================
            Alex Vinokur
            mailto:alexvn@connect.to
            The site contains algorithms, programs and newsgroup postings: C++ Program Perfometer, C++ Stream Compatible TCP/IP Sockets, Huffman coding, Fibonacci numbers, Turing Machine, Post Machine, Flexible Vector/Matrix, C++ Simulators, C++ Wrappers, etc.

            =====================================




            • Phlip

              #7
              Re: fileToString

              Dietmar Kuehl wrote:

              > phlip_cpp@yahoo.com (Phlip) wrote:
              > > Here's the question: What's the fastest, or smallest, or smarmiest
              > > possible Standard Library implementation? Could something like
              > > std::copy() apply?
              >
              > Here is what should be the fastest approach:
              >
              > std::string fileToString(std::string const& name) {
              >     std::ifstream in(name.c_str());
              >     return std::string(std::istreambuf_iterator<char>(in),
              >                        std::istreambuf_iterator<char>());
              > }

              I like it because it's the smarmiest. But VC++6 dislikes it, and
              emits the usual "screaming at a template" error message:

              error C2665: 'basic_string<char,struct std::char_traits<char>,class
              std::allocator<char> >::basic_string<char,struct
              std::char_traits<char>,class std::allocator<char> >' : none of the 7
              overloads can convert parameter 1 from type 'class
              std::istreambuf_iterator<char,struct std::char_traits<char> >'

              > My guess is that the fastest approach to reading a file into a string using
              > current standard C++ library implementations involves using an
              > 'std::ostringstream':
              >
              > std::string fileToString(std::string const& name) {
              >     std::ifstream in(name.c_str());
              >     std::ostringstream out;
              >     out << in.rdbuf();
              >     return out.str();
              > }
              >
              > Stream buffer operate block-oriented internally anyway while the iterator
              > approach requires that the "segmented sequence" optimization is implemented.

              Ding! VC++ liked that one.

              Thanks, to you and red floyd, for the suggestions!

              --
              Phlip


              • Phlip

                #8
                Re: fileToString

                Mike Wahler wrote:

                > Can't tell without measuring.

                Josuttis disagrees with you - see below. And feel free to Google for
                any of my lectures about premature optimization, or guessing.

                In this case, we all agree that the C++ Standard Library enforces
                certain performance profiles...

                > std::string fileToString(const std::string& name)
                > {
                >     std::ifstream in(name.c_str());
                >     std::ostringstream oss;
                >     oss << in.rdbuf();
                >     return oss.str();
                > }

                > Josuttis calls this way "probably the fastest way to
                > copy files with C++ IOStreams". (p 683).

                I humbly thank the newsgroup for reading Josuttis for me. ;-)

                --
                Phlip


                • tom_usenet

                  #9
                  Re: fileToString

                  On Tue, 2 Dec 2003 20:24:10 +0200, "Alex Vinokur" <alexvn@bigfoot.com>
                  wrote:

                  >
                  >"Dietmar Kuehl" <dietmar_kuehl@yahoo.com> wrote in message news:5b15f8fd.0312020347.637121b2@posting.google.com...
                  >[snip]
                  >>
                  >> My guess is that the fastest approach to reading a file into a string using
                  >> current standard C++ library implementations involves using an
                  >> 'std::ostringstream':
                  >>
                  >> std::string fileToString(std::string const& name) {
                  >>     std::ifstream in(name.c_str());
                  >>     std::ostringstream out;
                  >>     out << in.rdbuf();
                  >>     return out.str();
                  >> }
                  >>
                  >> Stream buffer operate block-oriented internally anyway while the iterator
                  >> approach requires that the "segmented sequence" optimization is implemented.
                  >[snip]
                  >
                  >
                  >Comparative performance tests can be seen at :
                  >http://article.gmane.org/gmane.comp.....perfometer/22
                  >The "ostringstream out; out << in.rdbuf()" method is the fastest if standard C++ library is used (Dietmar is right).
                  >Outside C++ : mmap (http://www.opengroup.org/onlinepubs/...ions/mmap.html) is the fastest method

                  Repeated calls to istream::get should be avoided, for performance
                  reasons (each call constructs a sentry object, checks error states,
                  etc.). Instead use rdbuf() and streambuf::sbumpc, which returns the
                  current character and advances (sgetc only peeks). e.g.

                  std::streambuf* rdbuf = in.rdbuf();
                  int c;
                  while ((c = rdbuf->sbumpc()) != EOF)
                      s[i++] = static_cast<unsigned char>(c); // assumes s is already sized
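
                  A fuller sketch of that loop, under a few assumptions of my
                  own: the name fileToStringSbumpc is made up, the string
                  grows by appending and is not sized in advance, and
                  end-of-file is tested through the traits class where the
                  fragment used EOF. It calls sbumpc(), since sgetc() returns
                  the current character without advancing past it:

                  ```cpp
                  #include <fstream>
                  #include <streambuf>
                  #include <string>

                  // Reads characters straight from the stream buffer, bypassing
                  // the per-call sentry construction of istream::get().
                  // Returns an empty string if the file cannot be opened.
                  std::string fileToStringSbumpc(const std::string& name)
                  {
                      typedef std::streambuf::traits_type Traits;

                      std::ifstream in(name.c_str());
                      std::streambuf* buf = in.rdbuf();
                      std::string s;
                      // sbumpc() returns the current character and advances by one.
                      for (Traits::int_type c = buf->sbumpc();
                           !Traits::eq_int_type(c, Traits::eof());
                           c = buf->sbumpc())
                          s += Traits::to_char_type(c);
                      return s;
                  }
                  ```

                  Because the stream object is bypassed, its eofbit and
                  failbit are not updated by this loop.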

                  Tom

                  C++ FAQ: http://www.parashift.com/c++-faq-lite/
                  C FAQ: http://www.eskimo.com/~scs/C-faq/top.html


                  • Mike Wahler

                    #10
                    Re: fileToString

                    "Phlip" <phlip_cpp@yahoo.com> wrote in message
                    news:63604d2.0312021116.bb3e6bb@posting.google.com...
                    > Mike Wahler wrote:
                    >
                    > > Can't tell without measuring.
                    >
                    > Josuttis disagrees with you - see below.

                    I don't see anything below which indicates such a disagreement.
                    As a matter of fact, there cannot be a disagreement, since what
                    I presented was from his book, and not my own idea.

                    > And feel free to Google for
                    > any of my lectures about premature optimization, or guessing.

                    What did I guess? I simply repeated what I'd read. Also,
                    my several admonitions to 'measure' don't indicate advocacy
                    of 'guessing'. Wasn't it you who first asked about 'fastest'
                    and 'smallest'?

                    Or have I completely misunderstood you? Clarifications welcome.

                    >
                    > In this case, we all agree that the C++ Standard Library enforces
                    > certain performance profiles...
                    >
                    > > std::string fileToString(const std::string& name)
                    > > {
                    > >     std::ifstream in(name.c_str());
                    > >     std::ostringstream oss;
                    > >     oss << in.rdbuf();
                    > >     return oss.str();
                    > > }
                    >
                    > > Josuttis calls this way "probably the fastest way to
                    > > copy files with C++ IOStreams". (p 683).
                    >
                    > I humbly thank the newsgroup for reading Josuttis for me. ;-)

                    You're welcome. :-)

                    -Mike

