Help using Regular Expressions?

  • gumby21
    New Member
    • Oct 2006
    • 1

    Help using Regular Expressions?

    Is there a metacharacter that I can use to search for consonants? I need to search a file for words that have four consecutive consonants in them.

    The file I am searching through is a story two paragraphs long. I also need to know how to extract the file word by word.
  • geek491
    New Member
    • Oct 2006
    • 21

    #2
    Using regular expressions would complicate things here, I guess. What you can do is use the substr() and length() functions to get the job done.

    To read a file word by word there are lots of options, but this one is simple:

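    # Open the input file and stop with a message if that fails
    open(F1, "file.txt") or die "open file.txt: $!";
    @lines = <F1>;
    foreach $line (@lines)
    {
    # Split on any run of whitespace; this also drops the trailing newline
    @words = split(' ', $line);
    foreach $word (@words)
    {
    print "$word\n";
    }
    }

    close(F1);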

    Let me know how it helped at geek491@yahoo.co.in
    Last edited by geek491; Oct 13 '06, 06:00 AM. Reason: forgot extra code


    • miller
      Recognized Expert Top Contributor
      • Oct 2006
      • 1086

      #3
      I agree with geek491: using regular expressions alone would simply complicate matters. This is one of those problems that you could solve with a masterful single regular expression, but you'd be wasting your time writing it and wasting computer resources running it. Instead, working off of geek491's code, you can solve it with the following code:

      Code:
      #!/usr/bin/perl
      
      my $inFile = $ARGV[0] or die "no file specified";
      
      open(IN, '<', $inFile) or die "open $inFile: $!";
      
      my $wordCount = 0;
      my $count = 0;
      
      while (my $line = <IN>) {
      	while ($line =~ m{([a-z]+)}ig) {
      		my $word = $1;
      		$wordCount++;
      		if ($word =~ m{[^aeiou]{4}}i) {
      			print "$word\n";
      			$count++;
      		}
      	}
      }
      
      print "$count of $wordCount have 4 consonants\n";
      
      close(IN) or die "close $inFile: $!";
      
      1;
      
      __END__
      You'll notice two important changes in the code that I provided. First, using split to determine the words in a line does not work completely, because the resulting words will include punctuation characters. Second, I've included the code to test for 4 consonants. As there is no character class in Perl for consonants, I simply used "not a vowel"; this works because each word has already been reduced to letters only. I've also included a little counter to verify that this code actually ran, just in case your text doesn't include any such words.
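
      To make the first point concrete, here is a minimal sketch comparing the two ways of pulling words out of a line. The sample sentence is made up for illustration and is not from the original poster's file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample line, not taken from the original poster's file.
my $line = "Well, the strengths of twelfths!\n";

# Splitting on whitespace keeps punctuation attached to the words.
my @by_split = split ' ', $line;

# Matching runs of letters (the approach in the script above)
# leaves the punctuation out.
my @by_match = $line =~ m{([a-z]+)}ig;

print "split: @by_split\n";
print "match: @by_match\n";
```

      With split, "Well," and "twelfths!" keep their punctuation, so a "not a vowel" test would count the comma or the exclamation mark as a consonant.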

      Finally, you'll have to decide for yourself if "y" is a vowel. You can include the word "demonstration" in your text if it doesn't have any long words and you want to verify that this script works.
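
      If you want to see how the "y" decision changes the results, one possible sketch is below. The helper name has_four_consonants and its second argument are my own invention for illustration; the test itself is the same "not a vowel" check as in the script above, and it assumes the word contains letters only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $word has four consecutive consonants,
# 0 otherwise. Pass a true second argument to treat "y" as a vowel.
sub has_four_consonants {
    my ($word, $y_is_vowel) = @_;
    my $vowels = $y_is_vowel ? 'aeiouy' : 'aeiou';
    # Assumes $word is letters only, so "not a vowel" means "consonant".
    return $word =~ m{[^${vowels}]{4}}i ? 1 : 0;
}

# Compare both interpretations of "y" on a few test words.
for my $word (qw(demonstration rhythm strength)) {
    printf "%-14s y=consonant: %d  y=vowel: %d\n",
        $word,
        has_four_consonants($word, 0),
        has_four_consonants($word, 1);
}
```

      A word like "rhythm" only qualifies when "y" counts as a consonant, while "demonstration" qualifies either way because of "nstr".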
