Problem in multiline regular expression

  • rizwan6feb
    New Member
    • Jul 2007
    • 108

    Hi everyone! I am having a problem getting the correct regular expression. I have the following string, and I want to get everything between [htmlcode] and [/htmlcode]:

    [htmlcode]
    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>Untitled Document</title>
    <style>
    [/htmlcode]


    I am using the following regular expression, but it is not working, maybe due to a multiline issue:

    $regex="/\[htmlcode\](.+)\[\/htmlcode\]/";
  • Markus
    Recognized Expert Expert
    • Jun 2007
    • 6092

    #2
    I don't know if ' .+ ' is the correct syntax; I normally use ' .* '

    Therefore:
    [php]
    $regex="/\[htmlcode\](.*)\[\/htmlcode\]/"
    [/php]
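
A quick check of this suggestion, as a sketch only: the shortened sample string below is made up for illustration. It suggests that both .+ and .* are valid syntax, and that swapping one for the other does not help here, because the dot still stops at the first newline.

```php
<?php
// Hypothetical, shortened version of the string from the first post.
$input = "[htmlcode]\n<html>\n<head>\n[/htmlcode]";

// Both quantifiers fail on multi-line input: . does not cross the newline
// that directly follows the opening tag.
$plus = preg_match('/\[htmlcode\](.+)\[\/htmlcode\]/', $input);
$star = preg_match('/\[htmlcode\](.*)\[\/htmlcode\]/', $input);

// On a single line the original pattern matches without trouble.
$single = preg_match('/\[htmlcode\](.+)\[\/htmlcode\]/', "[htmlcode]<b>hi</b>[/htmlcode]");

var_dump($plus, $star, $single); // int(0), int(0), int(1)
```

The only practical difference between the two is that .* also accepts an empty body, while .+ needs at least one character.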

    • Atli
      Recognized Expert Expert
      • Nov 2006
      • 5062

      #3
      Try
      Code:
      /\[htmlcode\](.|\s)*\[\/htmlcode\]/
      If memory serves, the . does not match new-line characters by default (it does match other white-space, such as \t), so it would stop at new-lines.
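
For reference, in PHP's PCRE functions the dot skips only newlines by default; tabs and spaces still match. A minimal sketch with made-up test strings follows. The s modifier shown here is not mentioned in the thread; it is a standard PCRE option that lets the dot match newlines too.

```php
<?php
// . matches a tab, but by default not a newline.
$tab     = preg_match('/a.b/', "a\tb");
$newline = preg_match('/a.b/', "a\nb");

// With the s modifier, . matches newlines as well.
$dotall  = preg_match('/a.b/s', "a\nb");

// The alternation from this post also crosses newlines, because \s covers \n.
$alt     = preg_match('/a(.|\s)*b/', "a\n\nb");

var_dump($tab, $newline, $dotall, $alt); // int(1), int(0), int(1), int(1)
```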

      • rizwan6feb
        New Member
        • Jul 2007
        • 108

        #4
        Thanks! It works, but it also returns the part that contains [htmlcode], i.e. it returns the pattern as well; the second array contains nothing. I am using preg_match (PHP function), passing only the first 2 parameters.
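
The behaviour described here can be reproduced with a small sketch. It assumes a third argument, $matches, is passed to preg_match so the groups can be inspected, and it uses a shortened, made-up sample string. When a capturing group is repeated by *, it keeps only its last iteration, which for this input is the newline just before the closing tag.

```php
<?php
// Hypothetical, shortened version of the string from the first post.
$input = "[htmlcode]\n<html>\n<head>\n[/htmlcode]";

// The * sits outside the parentheses, so the group matches one character
// at a time and is overwritten on every repetition.
preg_match('/\[htmlcode\](.|\s)*\[\/htmlcode\]/', $input, $matches);

$whole = $matches[0]; // the entire match, tags included
$last  = $matches[1]; // only the final character matched by the group

var_dump($whole === $input); // bool(true)
var_dump($last === "\n");    // bool(true): a lone newline, which prints as "nothing"
```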

        • Atli
          Recognized Expert Expert
          • Nov 2006
          • 5062

          #5
          Ahh, the problem is probably that the * is outside the parenthesized subpattern.

          Maybe this will work:
          Code:
          /\[htmlcode\]([\w\W]*)\[\/htmlcode\]/
          It basically creates a class with all the word characters (\w: letters, digits and underscore)... and all the non-word characters (\W), so it matches any character, new-lines included.

          The entire match should be in the first element of the match array (index 0), and the sub-pattern in the second (at index 1).
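
Put together, the pattern from this post could be used roughly as follows. This is a sketch under assumptions: the sample strings and the variable names are invented for illustration. The last part shows an alternative that does not come from the thread, the s modifier with a lazy quantifier, which may be preferable when a string holds more than one [htmlcode] block.

```php
<?php
// Hypothetical, shortened version of the string from the first post.
$input = "[htmlcode]\n<html>\n<head>\n[/htmlcode]";

if (preg_match('/\[htmlcode\]([\w\W]*)\[\/htmlcode\]/', $input, $matches)) {
    $whole = $matches[0]; // full match, including both tags
    $inner = $matches[1]; // only the text between the tags
}

// Two blocks in one string: the greedy class runs on to the LAST closing tag.
$two = "[htmlcode]a[/htmlcode] x [htmlcode]b[/htmlcode]";
preg_match('/\[htmlcode\]([\w\W]*)\[\/htmlcode\]/', $two, $greedy);

// Alternative (not from the thread): s lets . match newlines, and the lazy
// quantifier .*? stops at the FIRST closing tag.
preg_match('/\[htmlcode\](.*?)\[\/htmlcode\]/s', $two, $lazy);

var_dump($inner);     // the three-line body, without the tags
var_dump($greedy[1]); // "a[/htmlcode] x [htmlcode]b"
var_dump($lazy[1]);   // "a"
```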

          • rizwan6feb
            New Member
            • Jul 2007
            • 108

            #6
            Yes, the * was outside the parentheses. This time it's working perfectly.
            Thank you very much.
