REGEX: Quotes still captured, don't want them

  • Robert Oschler

    REGEX: Quotes still captured, don't want them

    I am trying to strip out the contents of all double-quoted phrases in a
    string. I tried the following:

    preg_match_all( "/(?:\").*?(?:\")/i", $theString, $matches,
    PREG_PATTERN_ORDER);

    Given the following input string, whose quotes are escaped (slashed):

    dogs \"bean bags\" cats \"in quotes\"

    I get these two matches:

    "bean bags\"
    "in quotes\"

    As you can see, I'm still capturing the quotes and one, but curiously not
    both, of the slashes.

    I want to get just the contents of the string inside the slashed
    double-quotes, so here are my questions:

    1) How can I get just the contents of the string inside the slashed
    double-quotes?

    2) Why are the double-quotes still being captured despite my use of
    non-capturing parentheses?

    3) Why did only one of the slashes get captured?

    Thanks



  • Janwillem Borleffs

    #2
    Re: REGEX: Quotes still captured, don't want them

    Robert Oschler wrote:
    > 1) How can I get just the contents of the string inside the slashed
    > double-quotes?
    >
    > 2) Why are the double-quotes still being captured despite my use of
    > non-capturing parentheses?

    The (?:...) syntax only stops the group from being stored separately; the
    characters it matches are still part of the overall match. What you are
    really after are assertions:

    '/(?<=\\\")([^\"]+)[^\s](?=\\\")/i'
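A quick way to check the assertion-based pattern is to run it against the sample string from the question. The sketch below assumes the input really contains a literal backslash before each quote, as described; the variable names are only for illustration. Note that the full match lands in $matches[0], while group 1 comes out one character short, because the last character is matched by [^\s] outside the parentheses.

```php
<?php
// Sample input from the question: every double quote is preceded by a
// literal backslash (a single-quoted PHP string leaves \" untouched).
$theString = 'dogs \"bean bags\" cats \"in quotes\"';

// The lookbehind and lookahead require \" around the text but do not
// add those characters to the match.
$pattern = '/(?<=\\\")([^\"]+)[^\s](?=\\\")/i';

preg_match_all($pattern, $theString, $matches, PREG_PATTERN_ORDER);

// Whole matches: the quoted phrases without the slashed quotes.
print_r($matches[0]); // bean bags, in quotes

// Group 1 lacks the final character, which [^\s] matched instead.
print_r($matches[1]); // bean bag, in quote
```

With this pattern the result to use is $matches[0], not $matches[1].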

    Mind the [^\s]; I have put it there to distinguish between \"bean bags\" and
    \" cats \", which is also captured as a quoted substring otherwise.

    > 3) Why did only one of the slashes get captured?

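    Because the pattern never asked for them: inside a double-quoted PHP string,
    \" is just a quote, so the match starts at the first quote and runs to the
    next one, picking up only the backslash in between. Mark the backslashes
    as literal characters, as in: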

    '/(?:\\\").*?(?:\\\")/i' (Also note the single quotes enclosing the pattern)
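To see where the single backslash came from, it may help to compare what the regex engine actually receives in each case. This is a minimal sketch, assuming the sample input from the question; $original, $escaped, $before and $after are names made up here.

```php
<?php
$theString = 'dogs \"bean bags\" cats \"in quotes\"';

// Double-quoted PHP string: \" collapses to a bare ", so the regex engine
// receives /(?:").*?(?:")/i and never looks for a backslash.
$original = "/(?:\").*?(?:\")/i";
preg_match_all($original, $theString, $before);

// Single-quoted PHP string: \\ becomes one backslash and \" is left alone,
// so the regex engine receives /(?:\\").*?(?:\\")/i.
$escaped = '/(?:\\\").*?(?:\\\")/i';
preg_match_all($escaped, $theString, $after);

// Each match starts at a quote and ends at the next quote, so only the
// backslash in front of the closing quote is included.
print_r($before[0]); // "bean bags\"  and  "in quotes\"

// Both backslashes are part of the match now.
print_r($after[0]);  // \"bean bags\"  and  \"in quotes\"
```

Either way the delimiters remain in the overall match; the non-capturing groups only keep them out of numbered groups.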


    JW




    • Robert Oschler

      #3
      Re: REGEX: Quotes still captured, don't want them


      Janwillem,

      Thanks! That does it. Another chapter in the never-ending battle caused by
      the characters we program in being the same characters we have to
      manipulate in text strings. :)

      Robert



      • Alexey Kulentsov

        #4
        Re: REGEX: Quotes still captured, don't want them

        Robert Oschler wrote:
        > I am trying to strip out the contents of all double-quoted phrases in a
        > string. I tried the following:
        >
        > preg_match_all( "/(?:\").*?(?:\")/i", $theString, $matches,
        > PREG_PATTERN_ORDER);
        >
        > Given the following input string, whose quotes are escaped (slashed):
        >
        > dogs \"bean bags\" cats \"in quotes\"
        ....

        > I want to get just the contents of the string inside the slashed
        > double-quotes, so here are my questions:

        preg_match_all( '/(?:\\\\")(.*?)(?:\\\\")/i', $theString, $matches,
        PREG_PATTERN_ORDER);

        Use $matches[1];
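For reference, a small runnable sketch of this approach, assuming the same sample input as in the question ($phrase is just an illustrative name). The slashed quotes stay in the overall match, but only the parenthesized part ends up in $matches[1].

```php
<?php
// Sample input: every double quote is preceded by a literal backslash.
$theString = 'dogs \"bean bags\" cats \"in quotes\"';

// The delimiters are matched by non-capturing groups; only (.*?) fills group 1.
preg_match_all('/(?:\\\\")(.*?)(?:\\\\")/i', $theString, $matches, PREG_PATTERN_ORDER);

// With PREG_PATTERN_ORDER, $matches[0] holds the full matches and
// $matches[1] holds what the first capturing group matched.
foreach ($matches[1] as $phrase) {
    echo $phrase, PHP_EOL;
}
```

Because each match consumes its closing \" as well, the text between two quoted phrases (" cats " here) is not picked up as a phrase of its own.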
