Regular Expression Question

  • crystal2005
    New Member
    • Apr 2007
    • 44

    Regular Expression Question

    Hi all,

    My short program below reads directory names from the command line. Each name must start with an alphanumeric character and may otherwise contain only alphanumerics, spaces, underscores and dashes.

    So far, the code below accepts only alphanumeric characters and underscores. My question is: how do I allow additional characters, instead of relying on "\W", which only lets alphanumerics and underscores through?

    Thanks for any help.

    [CODE=perl]
    #!/usr/bin/perl -w
    use strict;

    my $path1=$ARGV[0];
    my $path2=$ARGV[1];
    my @dir1;
    my @dir2;

    # Read the first directory
    if ($path1 =~ /\W/) {
        print "Invalid directory!\n";
        exit;
    }
    else {
        # "die exit" would exit silently before die runs; report the error instead
        opendir DIR1, $path1 or die "Cannot open $path1: $!\n";
        @dir1 = sort readdir DIR1;
        closedir DIR1;
    }

    # Read the second directory
    if ($path2 =~ /\W/) {
        print "Invalid directory!\n";
        exit;
    }
    else {
        opendir DIR2, $path2 or die "Cannot open $path2: $!\n";
        @dir2 = sort readdir DIR2;
        closedir DIR2;
    }
    [/CODE]
  • nithinpes
    Recognized Expert Contributor
    • Dec 2007
    • 410

    #2
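    Apart from not allowing spaces and dashes, your check also does not verify that the name begins with an alphanumeric character.
    \W matches spaces and dashes as well (in other words, \w does not include them), so any name containing one is rejected. To allow them, spell the character class out explicitly.

    Replace:

    Code:
    if($path1 =~ /\W/)
    with
    Code:
    if($path1 =~ /[^A-Za-z0-9_ -]/ || $path1 =~ /^[^A-Za-z0-9]/)
    The pattern /^[^A-Za-z0-9]/ matches names that begin with a non-alphanumeric character. Note that both tests must be bound to $path1 with =~; a bare /.../ would be matched against $_ instead. The dash goes last in the character class so that it is taken literally rather than as a range.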
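
The two negative checks above can also be folded into a single positive match. The following is only a sketch under the rules stated in the question; the helper name is_valid_dir_name and the sample names are made up for illustration and do not come from the original program.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns 1 when $name starts
# with an alphanumeric character and otherwise contains only
# alphanumerics, spaces, underscores and dashes; returns 0 otherwise.
sub is_valid_dir_name {
    my ($name) = @_;
    return 0 unless defined $name;
    # \A and \z anchor to the very start and end of the string.
    # The dash is placed last in the class so it is taken literally.
    return $name =~ /\A[A-Za-z0-9][A-Za-z0-9 _-]*\z/ ? 1 : 0;
}

# Made-up sample names, including an empty string.
for my $name ('photos 2008', 'my_dir-1', '_hidden', 'a/b', '') {
    printf "%-14s %s\n", "'$name'",
        is_valid_dir_name($name) ? 'valid' : 'invalid';
}
```

With a positive pattern, anything not explicitly listed is rejected, so characters such as / or : need no separate test. The empty string is rejected too, because the first character is mandatory.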


    • crystal2005
      New Member
      • Apr 2007
      • 44

      #3
      Hi, thanks for the help.

      Does the pattern /^[^A-Za-z0-9]/ mean that it will accept all special characters?
      From what I have read about directory naming, the characters \ / : * ? " < > | are not allowed.


      • eWish
        Recognized Expert Contributor
        • Jul 2007
        • 973

        #4
        Check out the article titled "Character Classes and Special Characters" that was posted in the Howto's section; maybe that will help clear things up.

        --Kevin


        • nithinpes
          Recognized Expert Contributor
          • Dec 2007
          • 410

          #5
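          In the pattern, the caret (^) denotes the beginning of the line/string. But when it is the first character inside a character class ([ ]), the caret stands for negation. In this case [^A-Za-z0-9] matches any single character other than A-Z, a-z and 0-9.
          Hence, /^[^A-Za-z0-9]/ matches every string that begins with a non-alphanumeric character. It does not accept special characters: a match sends the program into the "Invalid directory!" branch, so such names are rejected. Characters like \ / : * ? " < > | elsewhere in the name are caught by the first pattern, since they are not in its allowed set.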

          For better understanding of character class and special characters, go through the link that Kevin has posted.
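
To make the two roles of the caret concrete, here is a small sketch. The sub names bad_start and bad_char and the sample names are assumptions chosen for illustration; they are not part of the original program.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: the first ^ anchors at the start of the string,
# the second ^ (first character inside the brackets) negates the class.
# True when the name begins with a non-alphanumeric character.
sub bad_start {
    my ($name) = @_;
    return $name =~ /^[^A-Za-z0-9]/ ? 1 : 0;
}

# Hypothetical helper: no anchor here, only a negated class.
# True when a disallowed character appears anywhere in the name.
sub bad_char {
    my ($name) = @_;
    return $name =~ /[^A-Za-z0-9_ -]/ ? 1 : 0;
}

# Made-up sample names.
for my $name ('docs', '-docs', 'my docs', 'a:b') {
    printf "%-10s bad_start=%d bad_char=%d\n",
        "'$name'", bad_start($name), bad_char($name);
}
```

A name is valid only when both subs return 0, which mirrors the combined condition in the if statement from post #2.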


          • crystal2005
            New Member
            • Apr 2007
            • 44

            #6
            Thanks a lot, I found it useful.
