Faster way to read text from datafile

  • Scholar
    New Member
    • Aug 2008
    • 9

    Faster way to read text from datafile

    Hi friends, again I need your help.
    I have a text datafile which consists of one word per line, like this:

    program
    byte
    this
    example


    and I am using the following code to read the words from the datafile:



    fstream f1;
    f1.open("file.txt", ios::in | ios::out);
    f1.getline(variable, 10, '\n');
    f1.close();

    The datafile consists of more than 75000 words and it takes a long time to read the whole file. Can anybody suggest better code to speed up reading the file?
  • Lintwurm
    New Member
    • Sep 2008
    • 7

    #2
    Hello, I am also pretty new to this so if it isn't more efficient, I am sorry...

    Maybe try this...

    int wordcount = 0;
    string word[100000]; // maybe just make the array a really large number?
    while( infile >> word[wordcount] ) // infile is your open ifstream; reads one word at a time
    {
    wordcount++;
    }

    I don't know how this will go but something along these lines should work fine =)

    I hope this helps...

    Comment

    • Banfa
      Recognized Expert Expert
      • Feb 2006
      • 9067

      #3
      I would certainly suggest that you do not open and close the file for each line read (especially as that is likely to cause you to always read the first line only, and so only ever get the first word).

      Additionally, way back when I used to do disk access (not much call for it in the embedded projects I tend to work on now), popular wisdom was that it was slow to read small bits of data from the disk because the minimum amount of data a disk can read is 512 bytes (1 sector), so each word read actually read 512 bytes. You ended up reading all the bytes several times.

      It was quick to read the entire file into RAM and then parse it there (or at least read the file in large chunks, say 4 KB).

      However, since modern disks have read (and write) caches I am not sure that this would still be true (retrieving the data from the caches would be much, much faster than having to re-read it). But you could experiment with large reads to prove that.

      Comment

      • Jondlar
        New Member
        • Sep 2008
        • 9

        #4
        You should use a while or do-while loop for reading the file until the end is reached.

        When you close and open a file again and again, it takes a lot of time.

        Open the file.

        Loop till end of file is reached
        read a line each time.
        End Loop

        Close the file.

        Comment

        • JosAH
          Recognized Expert MVP
          • Mar 2007
          • 11453

          #5
          If you use Unix/Linux a far faster way is to directly mmap()/munmap() the file into virtual
          memory; maybe a similar functionality is available in Windows as well. That
          little technique bypasses a couple of buffers and caches. If you copy the content
          of the mapped buffer to another buffer you can substitute the \n character (or the
          first character of the pair \r\n) with \0, effectively creating C strings on the fly. Copying
          the file content is necessary because you don't want those changes to show up
          in the file itself.

          kind regards,

          Jos

          Comment
