filecopy with std::copy()

  • Thomas J. Clancy

    filecopy with std::copy()

    I was wondering if anyone knew of a way to use std::copy() and
    istream_iterator<>/ostream_iterator<> to write a file copy function that is
    quick and efficient.

    Doing this messes up the file because it seems to ignore '\n'

    ifstream in("somefile");
    ofstream out("someOtherFile");

    std::copy(std::istream_iterator<unsigned char>(in),
              std::istream_iterator<unsigned char>(),
              std::ostream_iterator<unsigned char>(out));

    Now, I figured out how to do it correctly but it is dog slow. I was
    wondering if anyone knew how to do this in an elegant manner?

    thomas j. clancy
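
On the '\n' problem in the first post: istream_iterator extracts with operator>>, which skips whitespace by default, so newlines, spaces and tabs never reach the output. Below is a minimal sketch of one way around that while still using std::copy(). It assumes std::istreambuf_iterator and std::ostreambuf_iterator (neither is mentioned in the thread) and binary-mode streams. The function name copyFileWithIterators is made up for illustration.

```cpp
#include <algorithm>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: copies every byte from `source` to `dest` with std::copy().
// istreambuf_iterator reads raw characters straight from the stream buffer, so
// it does not skip whitespace the way istream_iterator (operator>>) does.
// Returns true if both files opened and the output stream is still good.
bool copyFileWithIterators(const std::string& source, const std::string& dest)
{
    std::ifstream in(source, std::ios_base::binary);
    std::ofstream out(dest, std::ios_base::binary);
    if (!in || !out)
        return false;

    std::copy(std::istreambuf_iterator<char>(in),
              std::istreambuf_iterator<char>(),
              std::ostreambuf_iterator<char>(out));
    out.flush();
    return out.good();
}
```

Keeping istream_iterator and turning off whitespace skipping on the input stream (in >> std::noskipws) should also preserve the newlines, though it still goes through formatted extraction one character at a time.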


  • Ivan Vecerina

    #2
    Re: filecopy with std::copy()

    Hi Thomas,
    "Thomas J. Clancy" <tjclancy@comcast.net> wrote in message
    news:Rq-cnUGo6qy0PMeiXTWJhg@comcast.com...
    > I was wondering if anyone knew of a way to use std::copy() and
    > istream_iterator<>/ostream_iterator<> write a file copy function that is
    > quick and efficient.
    ...
    > Now, I figured out how to do it correctly but it is dog slow. I was
    > wondering if anyone knew how to do this in an elegant manner?

    Unless you insist on using std::copy, the elegant and efficient
    manner to copy an entire file (or stream):
    dstStream << srcStream.rdbuf();
    A C++ implementation should be able to ultimately optimize this
    operation (but performance may vary...).

    hth,
    Ivan
    --
    http://www.post1.com/~ivec <> Ivan Vecerina
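
A minimal sketch of Ivan's rdbuf() one-liner wrapped in a function with basic error checks. The name copyFileWithRdbuf is hypothetical. The special case for an empty source is there because operator<<(streambuf*) sets failbit on the destination when it inserts no characters.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper around `out << in.rdbuf()`.
// Returns true if both files opened and the whole source was written.
bool copyFileWithRdbuf(const std::string& source, const std::string& dest)
{
    std::ifstream in(source, std::ios_base::binary);
    std::ofstream out(dest, std::ios_base::binary);
    if (!in || !out)
        return false;

    // An empty source would make `out << in.rdbuf()` set failbit on `out`,
    // so treat it as a successful copy of zero bytes.
    if (in.peek() == std::char_traits<char>::eof())
        return true;

    out << in.rdbuf();
    out.flush();
    return out.good();
}
```

Opening both streams in binary mode matters on platforms that translate line endings in text mode.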



    • Thomas J. Clancy

      #3
      Re: filecopy with std::copy()

      "Ivan Vecerina" <ivecATmyrealboxDOTcom> wrote in message
      news:3f5ad280$1@news.swissonline.ch...
      > Hi Thomas,
      > "Thomas J. Clancy" <tjclancy@comcast.net> wrote in message
      > news:Rq-cnUGo6qy0PMeiXTWJhg@comcast.com...
      > > I was wondering if anyone knew of a way to use std::copy() and
      > > istream_iterator<>/ostream_iterator<> write a file copy function that is
      > > quick and efficient.
      > ...
      > > Now, I figured out how to do it correctly but it is dog slow. I was
      > > wondering if anyone knew how to do this in an elegant manner?
      >
      > Unless you insist on using std::copy, the elegant and efficient
      > manner to copy an entire file (or stream):
      > dstStream << srcStream.rdbuf();
      > A C++ implementation should be able to ultimately optimize this
      > operation (but performance may vary...).


      Elegant, yes... this I already knew about, but boy is it
      slooooooooooooooooowwwwww.... I came up with a different solution using the
      std::copy and a type (class) that contains a buffer of chars and uses
      stream::read() and stream::write() within the input stream operator (>>) and
      the output stream operator (<<), respectively. And man does it scream.

      Anyway, I was just wondering if there were alternatives to creating this
      sort of thing using or extending the stream stuff.





      • Josh Sebastian

        #4
        Re: filecopy with std::copy()

        On Sun, 07 Sep 2003 09:44:26 -0400, Thomas J. Clancy wrote:
        > "Ivan Vecerina" <ivecATmyrealboxDOTcom> wrote in message
        > news:3f5ad280$1@news.swissonline.ch...
        >
        >> Unless you insist on using std::copy, the elegant and efficient
        >> manner to copy an entire file (or stream):
        >> dstStream << srcStream.rdbuf();
        >> A C++ implementation should be able to ultimately optimize this
        >> operation (but performance may vary...).
        >
        >
        > Elegant, yes... this I already knew about, but boy is it
        > slooooooooooooooooowwwwww....

        Nothing using IOStreams is going to be faster. File copies are best left
        to OS routines.

        Josh


        • Josh Sebastian

          #5
          Re: filecopy with std::copy()

          On Sun, 07 Sep 2003 10:11:13 -0400, Thomas J. Clancy wrote:
          > Ummm... the rest of my previous reply talks about what I did to make it
          > much, much faster than the solution you mentioned, so what do you mean by
          > your statement above?

          It was actually faster using a copy than rdbuf? That's a messed-up
          IOStreams implementation. :-}


          • Josh Sebastian

            #6
            Re: filecopy with std::copy()

            On Sun, 07 Sep 2003 11:05:28 -0400, Thomas J. Clancy wrote:
            > Not at all... when you use the output iterator of rdbuf(), I believe that it
            > is doing it byte by byte and not in chunks. At least this is the behaviour
            > I am seeing with VC7.1's implementation, which they get from Dinkumware, I
            > believe. Now I could try this using STLPort.

            It shouldn't be; there should be buffering done both by your OS and by
            IOStreams. For example:

            curien@balar:~/prog$ uname -a
            Linux balar 2.4.18 #1 Sun Aug 10 12:24:29 EDT 2003 i686 GNU/Linux
            curien@balar:~/prog$ cat blah.cpp
            #include <fstream>
            #include <ios>

            int main() {
                std::ifstream infile("test.dat", std::ios_base::binary);
                std::ofstream outfile("test~.dat", std::ios_base::binary);

                outfile << infile.rdbuf();
            }
            curien@balar:~/prog$ dd if=/dev/zero of=test.dat bs=1024 count=50K
            51200+0 records in
            51200+0 records out
            52428800 bytes transferred in 0.700340 seconds (74861922 bytes/sec)
            curien@balar:~/prog$ g++ -v
            Reading specs from /usr/lib/gcc-lib/i486-linux/3.3.2/specs
            Configured with: ../src/configure -v --enable-languages=c,c++,java,f77,pascal,objc,ada,treelang --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --enable-__cxa_atexit --enable-clocale=gnu --enable-debug --enable-java-gc=boehm --enable-java-awt=xlib --enable-objc-gc i486-linux
            Thread model: posix
            gcc version 3.3.2 20030812 (Debian prerelease)
            curien@balar:~/prog$ g++ -ansi -pedantic -W -Wall -O2 blah.cpp
            curien@balar:~/prog$ time ./a.out

            real 0m0.619s
            user 0m0.030s
            sys 0m0.540s

            Josh
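
The figures above come from the Unix shell's time command. For comparing copy strategies on a platform without it, a rough in-program equivalent can be sketched with std::chrono (C++11, so newer than the compilers discussed here). The helper name elapsedMilliseconds is made up, and it reports wall-clock time only, not the user/sys split shown above.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs `work` once and returns the elapsed wall-clock
// time in milliseconds, measured with a monotonic clock.
template <typename Work>
double elapsedMilliseconds(Work&& work)
{
    const auto start = std::chrono::steady_clock::now();
    std::forward<Work>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

It could be called with a lambda that opens the two streams and runs outfile << infile.rdbuf(). Measuring a release (optimized) build and repeating the run a few times makes the numbers more meaningful, since OS file caching affects the first run.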


            • Thomas J. Clancy

              #7
              Re: filecopy with std::copy()


              "Josh Sebastian" <curien@cox.net> wrote in message
              news:pan.2003.09.07.16.12.24.353080@cox.net...
              > On Sun, 07 Sep 2003 11:05:28 -0400, Thomas J. Clancy wrote:
              >
              > > Not at all... when you use the output iterator of rdbuf(), I believe that it
              > > is doing it byte by byte and not in chunks. At least this is the behaviour
              > > I am seeing with VC7.1's implementation, which they get from Dinkumware, I
              > > believe. Now I could try this using STLPort.
              >
              > It shouldn't be, there should be buffering done both by your OS and by
              > IOStreams. For example
              >
              > curien@balar:~/prog$ uname -a
              > Linux balar 2.4.18 #1 Sun Aug 10 12:24:29 EDT 2003 i686 GNU/Linux
              > curien@balar:~/prog$ cat blah.cpp
              > #include <fstream>
              > #include <ios>
              >
              > int main() {
              >     std::ifstream infile("test.dat", std::ios_base::binary);
              >     std::ofstream outfile("test~.dat", std::ios_base::binary);
              >
              >     outfile << infile.rdbuf();
              > }
              > curien@balar:~/prog$ dd if=/dev/zero of=test.dat bs=1024 count=50K
              > 51200+0 records in
              > 51200+0 records out
              > 52428800 bytes transferred in 0.700340 seconds (74861922 bytes/sec)

              Hey man, I have the numbers, too, and believe me, they suck. I wonder if
              Microsoft is pulling a fast one? :-)
              > curien@balar:~/prog$ g++ -v
              > Reading specs from /usr/lib/gcc-lib/i486-linux/3.3.2/specs
              > Configured with: ../src/configure -v --enable-languages=c,c++,java,f77,pascal,objc,ada,treelang --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --enable-__cxa_atexit --enable-clocale=gnu --enable-debug --enable-java-gc=boehm --enable-java-awt=xlib --enable-objc-gc i486-linux
              > Thread model: posix
              > gcc version 3.3.2 20030812 (Debian prerelease)
              > curien@balar:~/prog$ g++ -ansi -pedantic -W -Wall -O2 blah.cpp
              > curien@balar:~/prog$ time ./a.out
              >
              > real 0m0.619s
              > user 0m0.030s
              > sys 0m0.540s
              >
              > Josh



              • Thomas J. Clancy

                #8
                Re: filecopy with std::copy()

                > It shouldn't be, there should be buffering done both by your OS and by
                > IOStreams. For example

                My bad, you're right. Under Microsoft, if you build this little application
                in debug, it is dog slow. I thought I had been building in release mode.
                Once I set it to release mode and rebuilt, the thing flew! Thanks for the
                information on this.

                Tom



                • Kevin Goodsell

                  #9
                  Re: filecopy with std::copy()

                  Thomas J. Clancy wrote:
                  > "Ivan Vecerina" <ivecATmyrealboxDOTcom> wrote in message
                  > news:3f5ad280$1@news.swissonline.ch...
                  >>
                  >> Unless you insist on using std::copy, the elegant and efficient
                  >> manner to copy an entire file (or stream):
                  >> dstStream << srcStream.rdbuf();
                  >> A C++ implementation should be able to ultimately optimize this
                  >> operation (but performance may vary...).
                  >
                  >
                  > Elegant, yes... this I already knew about, but boy is it
                  > slooooooooooooooooowwwwww....

                  Are you using Visual C++ 5 or 6 by chance? There's a known bug in the
                  iostream library that causes buffering to be wrongly disabled in file
                  streams that are opened by name. That could account for this being slow,
                  I think. Check here:


                  > I came up with a different solution using the
                  > std::copy and a type (class) that contains a buffer of chars and uses
                  > stream::read() and stream::write() within the input stream operator (>>) and
                  > the output stream operator (<<), respectively. And man does it scream.

                  Buffering should be automatic, making this unnecessary. I guess that it
                  should be possible to make the solution using standard stream classes
                  perform just as well or better than this solution, but I wouldn't know
                  exactly how to do it.

                  -Kevin
                  --
                  My email address is valid, but changes periodically.
                  To contact me please use the address from a recent posting.
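
One way to get block-sized transfers with only the standard stream classes, as Kevin suggests should be possible, is a plain read()/write() loop. This is a sketch under assumptions, not a measured improvement over rdbuf(). The name copyFileInChunks and the 64 KiB default chunk size are made up for illustration.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: copies `source` to `dest` in fixed-size chunks using
// istream::read() and ostream::write().
// Returns true if the whole source was read to end-of-file and written out.
bool copyFileInChunks(const std::string& source, const std::string& dest,
                      std::size_t chunkSize = 64 * 1024)
{
    std::ifstream in(source, std::ios_base::binary);
    std::ofstream out(dest, std::ios_base::binary);
    if (!in || !out || chunkSize == 0)
        return false;

    std::vector<char> buffer(chunkSize);
    // read() sets failbit when it hits end-of-file on a short final chunk,
    // so the loop also continues while gcount() reports bytes to write.
    while (in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()))
           || in.gcount() > 0)
    {
        out.write(buffer.data(), in.gcount());
        if (!out)
            return false;
    }
    out.flush();
    // eof() distinguishes a normal end of input from a read error.
    return out.good() && in.eof();
}
```

Unlike the ByteBlock approach described in the thread, this needs no custom type or stream operators, and it does not have to seek to find the file size.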


                  • Thomas J. Clancy

                    #10
                    Re: filecopy with std::copy()


                    "Kevin Goodsell" <usenet1.spamfree.fusion@neverbox.com> wrote in message
                    news:B0L6b.2685$PE6.2267@newsread3.news.pas.earthlink.net...
                    > Thomas J. Clancy wrote:
                    >
                    > > "Ivan Vecerina" <ivecATmyrealboxDOTcom> wrote in message
                    > > news:3f5ad280$1@news.swissonline.ch...
                    > >>
                    > Buffering should be automatic, making this unnecessary. I guess that it
                    > should be possible to make the solution using standard stream classes
                    > perform just as well or better than this solution, but I wouldn't know
                    > exactly how to do it.
                    >

                    Here was my solution before I realized that using stream::rdbuf() worked
                    well while NOT in debug mode using VC++7.1 (.NET 2003).

                    #include <algorithm>
                    #include <cstddef>
                    #include <fstream>
                    #include <iterator>

                    // Namespace implied by the tjc_std::ByteBlock references in blockCopyFile().
                    namespace tjc_std
                    {

                    /**
                     * A block buffer type that can be used with std::copy() and
                     * istream_iterators without having to write a special form of copy
                     * or an istream_iterator.
                     */
                    class ByteBlock
                    {
                    public:
                        ByteBlock()
                            : m_bytesRead(0),
                              m_fileSize(-1),
                              m_totalRead(0)
                        {
                        }

                    private:
                        unsigned char m_block[10240];
                        int m_bytesRead;
                        long m_fileSize;   // -1 until the first read measures the stream
                        long m_totalRead;
                        friend std::istream& operator>>(std::istream& stream, ByteBlock& block);
                        friend std::ostream& operator<<(std::ostream& stream, const ByteBlock& block);
                    };

                    std::istream& operator>>(std::istream& stream, ByteBlock& block)
                    {
                        // On the first call, seek to the end to learn the stream's size.
                        if (block.m_fileSize == -1)
                        {
                            stream.seekg(0, std::ios::end);
                            block.m_fileSize = stream.tellg();
                            stream.seekg(0, std::ios::beg);
                        }
                        std::size_t leftToRead = block.m_fileSize - block.m_totalRead;
                        if (leftToRead)
                        {
                            // Never ask for more than remains, so read() does not hit
                            // end-of-file and fail on the final, shorter block.
                            stream.read((char*)block.m_block,
                                        std::min(sizeof(block.m_block), leftToRead));
                            block.m_bytesRead = stream.gcount();
                            block.m_totalRead += block.m_bytesRead;
                        }
                        else
                        {
                            // Nothing left: put the stream into a failed state so the
                            // istream_iterator compares equal to the end iterator.
                            stream.setstate(std::ios_base::eofbit | std::ios_base::badbit);
                        }
                        return stream;
                    }

                    std::ostream& operator<<(std::ostream& stream, const ByteBlock& block)
                    {
                        // Write only the bytes that the last read actually filled.
                        stream.write((char*)block.m_block, block.m_bytesRead);
                        return stream;
                    }

                    } // namespace tjc_std

                    void blockCopyFile(const char* source, const char* dest)
                    {
                        std::ifstream in(source, std::ios::in | std::ios::binary);
                        std::ofstream out(dest, std::ios::out | std::ios::binary);
                        std::copy(std::istream_iterator<tjc_std::ByteBlock>(in),
                                  std::istream_iterator<tjc_std::ByteBlock>(),
                                  std::ostream_iterator<tjc_std::ByteBlock>(out));
                    }

                    Yes, this was a naive approach, but it worked quickly, and in fact for some
                    reason this approach still seemed to work slightly faster than:

                    out << in.rdbuf();

                    I don't know why that would be, especially given what I've recently read and
                    what I've been told by others here in this newsgroup. But hey, I just need
                    a way to copy files without relying on the OS, so both of these ideas seem
                    to work just fine.

                    Tom

