Trying to create this type of regular expression...

  • T12
    New Member
    • Jun 2007
    • 2

    I am trying to create a regular expression that matches selected keywords when they appear either before or after certain phrases. For example:

    I have three keywords, word1|word2|word3, that I want matched together with three different phrases, phrase1|phrase2|phrase3.

    So I'd like the following to "hit"

    word1 phrase1
    phrase2 word1
    etc.

    But I do not want any of the keywords or any of the phrases to match on their own, and I'd like to do all of this in a single regular expression. Hopefully this makes sense to some of you out there. Thanks!
  • miller
    Recognized Expert Top Contributor
    • Oct 2006
    • 1086

    #2
    Greetings and Welcome,

    Why not just keep this simple and use two regular expressions?

    [CODE=perl]
    while (<DATA>) {
        # A line must contain at least one keyword AND at least one phrase, in any order.
        if ($_ =~ /word1|word2|word3/ && $_ =~ /phrase1|phrase2|phrase3/) {
            print "Match found: $_";
        }
    }

    __DATA__
    word1 no phrace
    no phrase word2
    word1 phrase1
    phrase2 word3
    phrase3 phrase2 phrase1
    word1 word3 word2
    phrase2 phrase3 word1
    [/CODE]

    Output:

    [CODE]
    >perl scratch.pl
    Match found: word1 phrase1
    Match found: phrase2 word3
    Match found: phrase2 phrase3 word1
    [/CODE]

    Yes, it is possible to do this in a single regular expression, but why complicate things? Note that the code above has one small bug: if a keyword also occurs inside one of the phrases, that phrase on its own satisfies both tests, so the line matches even when no separate keyword is present.

    One way to do the equivalent of the above in a single regex is to use zero-width positive lookahead assertions. You can read about them here:

    perldoc perlretut

    - Miller
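
The overlap bug mentioned above is easy to reproduce once a phrase contains one of the keywords. The sketch below uses hypothetical keywords (red, blue) and phrases (red alert, blue moon) that are not from the thread, and the helper names naive_match and strict_match are made up for illustration. One possible workaround, shown here only as an assumption about what "matched alone" should mean, is to delete the phrases first and then look for a keyword in what remains.

```perl
use strict;
use warnings;

# Hypothetical keywords and phrases, chosen so that each phrase contains a keyword.
my $word_re   = qr/red|blue/;
my $phrase_re = qr/red alert|blue moon/;

# The two-regex test from the post above: true when both patterns match anywhere.
sub naive_match {
    my ($line) = @_;
    return ($line =~ $word_re && $line =~ $phrase_re) ? 1 : 0;
}

# Stricter variant: delete every phrase first, then look for a keyword in the rest.
sub strict_match {
    my ($line) = @_;
    (my $rest = $line) =~ s/$phrase_re//g;
    return 0 if $rest eq $line;    # nothing was removed, so no phrase was present
    return $rest =~ $word_re ? 1 : 0;
}

for my $line ('red alert', 'blue red alert', 'blue sky') {
    printf "%-16s naive=%d strict=%d\n", $line, naive_match($line), strict_match($line);
}
```

With these inputs, the phrase "red alert" on its own passes the naive test but fails the strict one, while "blue red alert" passes both.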

    • T12
      New Member
      • Jun 2007
      • 2

      #3
      Thank you for the information. I've had a look at the site, but unfortunately I don't really understand how to compose the single regex I need to get this type of match. I'm still curious about how to accomplish this in one line. Would you be able to post the complete expression? I'll continue to research and hopefully figure it out at some point.

      • miller
        Recognized Expert Top Contributor
        • Oct 2006
        • 1086

        #4
        Yes, I'll give you the method that I was talking about. However, I discourage you from using this until you actually know how it works. You'll even notice that it actually takes more code to do this in one regex. Nevertheless, here's the example:

        [CODE=perl]
        while (<DATA>) {
            # if ($_ =~ /word1|word2|word3/ && $_ =~ /phrase1|phrase2|phrase3/) {
            # Both lookaheads start at the beginning of the line, so each one scans the whole line independently.
            if ($_ =~ /^(?=.*?(?:word1|word2|word3))(?=.*?(?:phrase1|phrase2|phrase3))/) {
                print "Match found: $_";
            }
        }

        __DATA__
        word1 no phrace
        no phrase word2
        word1 phrase1
        phrase2 word3
        phrase3 phrase2 phrase1
        word1 word3 word2
        phrase2 phrase3 word1
        [/CODE]

        - Miller
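
For longer keyword lists, the same lookahead pattern can be assembled from arrays instead of being typed by hand. This is a minimal sketch rather than part of the original answer: the sub name has_word_and_phrase and the variable names are assumptions, and the lists are the placeholders used in the thread.

```perl
use strict;
use warnings;

# Placeholder lists from the thread; swap in real keywords and phrases.
my @words   = qw(word1 word2 word3);
my @phrases = qw(phrase1 phrase2 phrase3);

# quotemeta escapes regex metacharacters so every entry is matched literally.
my $word_alt   = join '|', map { quotemeta($_) } @words;
my $phrase_alt = join '|', map { quotemeta($_) } @phrases;

# Each (?=...) is tested from the start of the line and consumes no text,
# so the two conditions are independent and their order on the line is irrelevant.
my $both_re = qr/^(?=.*?(?:$word_alt))(?=.*?(?:$phrase_alt))/;

sub has_word_and_phrase {
    my ($line) = @_;
    return $line =~ $both_re ? 1 : 0;
}

for my $line ('word1 phrase1', 'phrase2 word3', 'word1 word3 word2', 'phrase3 phrase2 phrase1') {
    printf "%-24s %s\n", $line, has_word_and_phrase($line) ? 'match' : 'no match';
}
```

Because a lookahead never advances the match position, the second assertion starts from the same place as the first. That is what lets a single pattern accept both "word1 phrase1" and "phrase2 word3" while rejecting lines that contain only keywords or only phrases.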
