Counting Punctuation Marks (in a text file)

  • sv5perl
    New Member
    • Aug 2007
    • 3

    Counting Punctuation Marks (in a text file)

    I want to ask for some advice on a script that will count all the punctuation marks in a text file. I know it's probably quite a basic thing, but I am new to Perl and would really appreciate the help. The output would also need to display how many occurrences of each there are (such as ?, ., ,, ..., !, -, ;, : etc.).

    Thanks in advance,

    Gary
  • numberwhun
    Recognized Expert Moderator Specialist
    • May 2007
    • 3467

    #2
    Hi Gary! Welcome!

    Please know that when you post to the forum, we really need to see the code that you have tried thus far. That way we can assist you with the issue(s)/problem(s) you are having and can help get your code working.

    Also, I don't know about everyone else, but this sounds distinctly like it is a homework assignment. If we wrote it for you, you wouldn't learn a thing. Why don't you give it a shot and see what you come up with and we will guide you from there.

    Personally, I would cycle through each line of the file using a while loop and, inside the loop, test each line for the punctuation marks (with a regex). It would also be a good idea to keep a counter going. If you need to count each individual punctuation mark, then you may want multiple counters, but that is just me.

    Perl is such a wonderful language that this exercise should show you a few different possibilities.

    Regards,

    Jeff

    • sv5perl
      New Member
      • Aug 2007
      • 3

      #3
      Thanks Jeff,

      You are right in thinking this looks like a homework assignment.

      I have been stuck on this one question for nearly four months now and it is really bugging me.

      I have read the Perl folder of my course nearly nine times and it is the counter that I am having the problem with.

      I know how to search for any characters, including punctuation marks, but I am not sure how to return the count for each one. I was thinking of assigning them to an array and then printing the contents of the array at the end, but it's the code for generating a count of each character that is troubling me.

      When I get home tonight I will post the code I have created so far. Your quick response and help so far is very, very much appreciated. I need to know this because it's the only part of the course I am having problems with, but if I can't do this I can't finish the course, and so it's £2,500 down the drain (and hundreds and hundreds of hours of study time I have put in). I don't feel that posting this is cheating, though: considering I have read this section of the course nine times, my opinion is that it was not sufficiently covered in my study material (I can only know what I have learnt!).

      Very best regards,

      Gary Colman (sv5perl)

      • numberwhun
        Recognized Expert Moderator Specialist
        • May 2007
        • 3467

        #4
        Well, once you feed the file into the while loop, you can cycle through with a bunch of if statements for each one. Here is an example. Feel free to use this and come up with something complete.

        [CODE=perl]
        #!/usr/bin/perl

        use strict;
        use warnings;

        ####################################
        # Set up the counters
        ####################################
        my $qm_counter     = 0;    # question mark counter
        my $period_counter = 0;    # period counter
        my $comma_counter  = 0;    # comma counter

        ####################################
        # Open a file
        ####################################
        open(my $fh, '<', './myFile.txt') or die "Cannot open ./myFile.txt: $!";

        ####################################
        # Process the file
        ####################################
        # Note: each test adds at most one per line, however many
        # of that mark the line contains.
        while (my $line = <$fh>)
        {
            if ($line =~ m/\?/)
            {
                $qm_counter++;
            }

            if ($line =~ m/,/)
            {
                $comma_counter++;
            }

            if ($line =~ m/\./)
            {
                $period_counter++;
            }
        }
        close($fh);

        print("Number of question marks: $qm_counter\n");
        print("Number of commas: $comma_counter\n");
        print("Number of periods: $period_counter\n");

        [/CODE]

        That should get you started. Granted, as always, TIMTOWTDI, but this isn't that bad and should get you to where you want to go. That will cycle through each line, checking the line for each punctuation listed. If you have others to check for, then just add other instances inside of the while loop.
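
        As a rough sketch of where this could go next (not the course's expected answer), the separate counters can be folded into one hash keyed by the mark, and a global match in a `while` loop counts every occurrence rather than one per line. The helper name `count_marks` and the sample sentence are made up for illustration, and the list of marks is only the one from the original question, with "..." treated as a single mark.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# count_marks is a hypothetical helper name, not something from the thread.
# It returns a hash reference mapping each punctuation mark to its count.
sub count_marks {
    my ($text) = @_;
    my %count;

    # Alternatives are tried left to right, so "..." is matched as one
    # ellipsis before the character class can match a single period.
    # The hyphen is last in the class so it is taken literally.
    $count{$1}++ while $text =~ /(\.\.\.|[?.,!;:-])/g;

    return \%count;
}

# Illustrative input only.
my $counts = count_marks("Well... is it done? Yes, it is. Really!");

for my $mark (sort keys %$counts) {
    print "$mark: $counts->{$mark}\n";
}
```

        To use it on a file, the same sub could be called on the whole file contents, or the `while` match could be moved inside the line-by-line loop shown above so that one hash accumulates across lines.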

        Regards,

        Jeff

        • KevinADC
          Recognized Expert Specialist
          • Jan 2007
          • 4092

          #5
          You are basically counting sub strings:

          how do I count sub strings
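
          For anyone who does not follow the link, here is a small sketch of two common Perl idioms for counting within a string: `tr///` for a single character, and a global match in list context for a pattern. The sample line is invented for illustration.

```perl
use strict;
use warnings;

# Illustrative input only.
my $line = "Hello, world. Is this right, or not? Yes.";

# tr/// returns the number of characters it matched. With an empty
# replacement list and no /d flag, the string itself is left unchanged.
my $commas = ($line =~ tr/,//);

# A global match in list context returns every match. Assigning that
# list to () and then to a scalar yields the number of matches.
my $periods = () = $line =~ /\./g;

print "commas: $commas, periods: $periods\n";
```

          `tr///` only handles individual characters, so a multi-character mark such as "..." needs the regex form.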

          • miller
            Recognized Expert Top Contributor
            • Oct 2006
            • 1086

            #6
            Only because ... well ... you know:

            Code:
            >perl -MData::Dumper -ne "END {print Dumper(\%c)}; $c{$1}++ while (/([^\w\s])/g);" scratch.txt
            $VAR1 = {
            		  ':' => 1,
            		  ',' => 10,
            		  '?' => 1,
            		  '-' => 1,
            		  ')' => 1,
            		  '.' => 6,
            		  '\'' => 1,
            		  '(' => 1,
            		  ';' => 1,
            		  '!' => 1
            		};
            - Miller
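
            Spelled out as a script, the one-liner above might look roughly like the following. This is only a sketch: the sample text and the in-memory filehandle are assumptions so that the example runs on its own, and the output is sorted by count instead of being dumped with Data::Dumper.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative sample text, read through an in-memory filehandle so the
# script is self-contained. For a real file, open a path instead,
# for example: open(my $fh, '<', 'scratch.txt') or die ...
my $sample = "Hi there! How are you? Fine, thanks; and you?\n(Good.)\n";
open(my $fh, '<', \$sample) or die "Cannot open in-memory sample: $!";

my %count;
while (my $line = <$fh>) {
    # [^\w\s] matches any character that is neither a word character
    # nor whitespace, the same test the one-liner uses.
    $count{$1}++ while $line =~ /([^\w\s])/g;
}
close($fh);

# Print the marks from most to least frequent, ties in character order.
for my $mark (sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count) {
    print "$mark\t$count{$mark}\n";
}
```

            Note that `[^\w\s]` counts every non-word, non-space character, so symbols such as `£` or `&` would be tallied as well as ordinary punctuation.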

            • sv5perl
              New Member
              • Aug 2007
              • 3

              #7
              Thanks kindly for your helpful posting.

              Regards,

              Gary.
