word count problem

  • waynejr25
    New Member
    • Nov 2007
    • 3

    word count problem

    Can anyone help me add a function that will count the occurrences of each word in an input file? Here's the code I have so far. It counts the number of characters, words, and lines, but I need the number of occurrences of each word.

    [CODE=cpp]#include <fstream>
    #include <iostream>
    #include <string>
    #include <cstdlib>

    using namespace std;



    string getInputFileName(); // a function to prompt for the complete file name

    int numCharsInFile( ifstream &in, int &numLines ); // a function to count the
    // number of characters and
    // lines in a text file

    int numWordsInFile( ifstream &in, int &numWords ); // a function to count words in file



    int main()
    {

    char c;
    int nLines, // number of lines in the text file
    nChars, // number of characters in the text file
    avgCharsPerLine, // average number of characters per line
    nWords; // number of words in the text file


    ifstream inFile; // handle for the input text file

    string fileName; // complete file name including the path

    fileName = getInputFileName(); // prompt and obtain the full file name

    inFile.open(fileName.c_str()); // try to open the file

    if( !inFile.is_open() ) // test for unsuccessful file opening
    {
    cerr << "Cannot open file: " << fileName << endl << endl;
    exit(EXIT_FAILURE); // report failure; exit(0) would signal success
    }


    nChars = numCharsInFile( inFile, nLines ); // determine the number of lines
    // and characters in the file
    nWords = numWordsInFile( inFile, nWords); // determine the number of words

    avgCharsPerLine = nChars / nLines;


    cout << "The number of characters in the file: " << fileName
    << " is = " << nChars << endl << endl;

    cout << "The number of lines in the file: " << fileName
    << " is = " << nLines << endl << endl;


    cout << "The number of Words in the file: " << fileName
    << " is = " << nWords << endl << endl;

    cout << "The average number of characters per line in the text file: "
    << fileName << " is: " << avgCharsPerLine << endl << endl;
    cin>>c;
    inFile.close(); // close the input file

    }



    string getInputFileName()
    {
    string fName; // fully qualified name of the file

    cout << "Please enter the fully qualified name of the " << endl
    << "input text file (i.e. including the path): ";
    cin >> fName; // cannot handle blanks in a file name or path
    cout << endl;

    return fName;
    }





    int numCharsInFile( ifstream &in, int &numLines )
    {
    int numChars = 0;

    char ch; // character holder;

    numLines = 0; // initialize the number of lines to zero

    while ( in.get(ch) ) // get the next character from the file
    // the function get will also get whitespace
    // i.e. blanks, tabs and end of line characters
    {
    if (ch != ' ' )
    {
    if(ch != '\n')
    numChars++;// increase the count of characters by one if ch is NOT '\n' AND NOT a blank space
    else
    {
    numLines++; // increase the count of lines by one if ch IS '\n'
    }
    }
    }
    numLines += 1; // count the last line, assuming the file does not end with '\n'
    return numChars;
    }




    int numWordsInFile( ifstream &in, int &nWords)
    {
    in.clear();

    in.seekg(0, ios_base::beg);

    int numWords = 0 ;

    char ch;


    while (in.get(ch))
    {

    if ( ch == ' ' || ch == '\n' || ch == '\t' )
    numWords++;


    }

    return numWords+1;
    }[/CODE]
    Last edited by Ganon11; Nov 6 '07, 06:35 PM. Reason: Please use the [CODE] tags provided.
  • scruggsy
    New Member
    • Mar 2007
    • 147

    #2
    Originally posted by waynejr25
    Can anyone help me add a function that will count the occurrences of each word in an input file? Here's the code I have so far. It counts the number of characters, words, and lines, but I need the number of occurrences of each word.
    I'm not going to write the code, but think about it: If you're going to count the occurrence of each distinct word, you'll need to remember those words. So as you read words in, you'll need to store them so that subsequent words can be compared to them. How you store them is up to you, as is how you compare them. STL containers can be a big help there. Take a look at std::set if you're not familiar with it; it's a container which can't hold duplicate elements, which lets you easily determine if a word occurs more than once in the file. Another good way to do this might be to just store each word as it is read, then sort the words in alphabetical order: recurring words will appear next to each other, making it easy to count them.
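The store-then-sort idea described above could look roughly like the sketch below. It is only an illustration, not code from the thread: the function name `countBySorting` is hypothetical, words are taken to be whitespace-separated tokens, and the comparison is case-sensitive with punctuation left attached.

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not from the original post): reads whitespace-separated
// words, sorts them, and counts runs of identical neighbours.
std::vector<std::pair<std::string, int> > countBySorting(std::istream &in)
{
    std::vector<std::string> words;
    std::string word;
    while (in >> word)              // operator>> skips whitespace between words
        words.push_back(word);

    std::sort(words.begin(), words.end()); // equal words become adjacent

    std::vector<std::pair<std::string, int> > counts;
    for (std::size_t i = 0; i < words.size(); ++i)
    {
        if (!counts.empty() && counts.back().first == words[i])
            ++counts.back().second;                        // same word as before
        else
            counts.push_back(std::make_pair(words[i], 1)); // first occurrence
    }
    return counts;
}
```

Because it takes a `std::istream &`, the same function should work with the `ifstream` from the original post or with a `std::istringstream` when trying it out on a short string.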


    • Laharl
      Recognized Expert Contributor
      • Sep 2007
      • 849

      #3
      std::map would probably be better than std::set, since that way you can map strings (words) to integers (frequency counts).
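A minimal sketch of the std::map suggestion, under some assumptions: the function name `countWordOccurrences` is made up for illustration, a "word" is any whitespace-separated token, and no case folding or punctuation stripping is done.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper (not from the original post): maps each word to the
// number of times it appears in the stream.
std::map<std::string, int> countWordOccurrences(std::istream &in)
{
    std::map<std::string, int> counts;
    std::string word;
    while (in >> word)
        ++counts[word]; // operator[] inserts the word with count 0 if it is new
    return counts;
}
```

Iterating over the returned map visits the words in alphabetical order, so printing each `first` and `second` member gives a sorted frequency table without a separate sort step.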


      • weaknessforcats
        Recognized Expert Expert
        • Mar 2007
        • 9214

        #4
        Also keep in mind that the >> operator stops on whitespace. You can fetch one word by:
        [code=cpp]
        string str;
        inFile >> str; // inFile is the ifstream; fileName is only the path string
        [/code]
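As a sketch of how the >> operator could replace the character-by-character loop in numWordsInFile, the function below counts words by extracting them one at a time. The name `countWordsWithExtraction` is hypothetical. Since >> skips any run of whitespace, consecutive blanks or a trailing newline should not inflate the count the way counting separator characters can.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not from the original post): counts whitespace-separated
// words by repeatedly extracting them with operator>>.
int countWordsWithExtraction(std::istream &in)
{
    int numWords = 0;
    std::string word;
    while (in >> word) // stops when no further word can be read
        ++numWords;
    return numWords;
}
```

If this is used after numCharsInFile has already read the file to the end, the stream still needs the `in.clear()` and `in.seekg(0, ios_base::beg)` calls from the original numWordsInFile before the first extraction.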

        Also, you are not required to declare your variables at the beginning of each function. It looks like you have a C background and are just starting out on C++.
