word count

  • agent mike

    word count

    I am trying to count words in a text file. I am using the following code:

    in_stream.get(c );
    if(c == ' ' || c == '.' || c == ',')
    word_count++;

    and the word count is too low. If I add "|| c == '\n'" to the condition,
    the word count is too high, because the newlines of blank lines are
    counted as words.

    I am at my wits' end. I'm in a beginning C++ class and this assignment
    is nearly due.

    HELP

    me

    over 'n' out
    agent mike
  • red floyd

    #2
    Re: word count

    agent mike wrote:
    > I am trying to count words in a text file. I am using the following code:
    >
    > in_stream.get(c );
    > if(c == ' ' || c == '.' || c == ',')
    > word_count++;
    >
    > and the word count is too low. If I include " .... || c == '\n'
    > the word count is too high as it counts returns of blank lines
    > as a word.

    You also have the problem that a run of consecutive delimiters (blanks, commas and periods)
    bumps the word count too high, because each delimiter in the run is counted as a word.

    Hint:

    Once you find a delimiter ('.', ',', ' ', '\t', '\n'), skip all the delimiters that immediately follow it ("while" is your friend).

    No code, you should be able to figure it out.


    • Karl Heinz Buchegger

      #3
      Re: word count



      agent mike wrote:
      >
      > I am trying to count words in a text file. I am using the following code:
      >
      > in_stream.get(c );
      > if(c == ' ' || c == '.' || c == ',')
      > word_count++;
      >
      > and the word count is too low. If I include " .... || c == '\n'
      > the word count is too high as it counts returns of blank lines
      > as a word.
      >
      > I am at my wits end. I'm in a beginning c++ class and this assignment
      > is nearly due.

      Hint: Often it is a good idea to include additional output statements to figure
      out why a program behaves the way it does. So I suggest:

      in_stream.get(c);
      cout << "Read '" << c << "'\n";       // show every character as it is read
      if (c == ' ' || c == '.' || c == ',') {
          cout << "Found new word\n";       // show when the counter is bumped
          word_count++;
      }

      Run your program again on a faulty input that is not too large
      and use the additional output to analyse why your program doesn't
      count the way you think it should.

      That's one of the simplest debugging techniques, and believe me: you will
      spend lots of time debugging. So figuring out where your thinking
      was flawed is something you need to learn early.
      And of course it's fun: you think of a system (an algorithm) and
      the machine shows you where your system breaks down.

      --
      Karl Heinz Buchegger
      kbuchegg@gascad .at


      • Gianni Mariani

        #4
        Re: word count

        red floyd wrote:
        > agent mike wrote:
        >
        >> I am trying to count words in a text file. I am using the following
        >> code:
        >>
        >> in_stream.get(c );
        >> if(c == ' ' || c == '.' || c == ',')
        >> word_count++;
        >>
        >> and the word count is too low. If I include " .... || c == '\n'
        >> the word count is too high as it counts returns of blank lines
        >> as a word.
        >
        >
        > You have the problem that a stream of delimiters (blanks, commas and
        > periods) will also
        > bump the word count too high.
        >
        > Hint:
        >
        > Once you find a delimiter ('.', ',', ' ', '\t', '\n'), then flush all
        > consecutive delimiters ("while" is your friend).
        >
        > No code, you should be able to figure it out.
        >

        Or you can create a state machine and count the pertinent state changes.

        Here is some pseudocode:

        state = start_state    // any state other than in_word_state

        while ( get( c ) )
        {
            target_state = is_in_word( c ) ? in_word_state : out_word_state;

            // leaving a word: one more word has ended
            if ( state == in_word_state ) {
                if ( target_state == out_word_state ) {
                    ++ word_count;
                }
            }

            state = target_state;
        }

        // the input ended in the middle of a word
        if ( state == in_word_state ) {
            ++ word_count;
        }

