Need help with Regex

  • Danny Ni

    Need help with Regex

    Hi,

    The following code snippet is causing the CPU to max out on my local machine
    and production servers. It looks fine in Expresso, though.

    Regex rgxVideo = new Regex(
        @"<embed(\s+[a-z]+\s*=\s*(""[^""]*""|'[^']*'|[^\s]*))*\s+src=\s*(""|')?http://www.g4tv.com/i?sv3?/(?<videokey>\d+)(""|')?(\s+[a-z]+\s*=\s*(""[^""]*""|'[^']*'|[^\s]*))*\s*(/\s*>|>\s*</embed>)",
        RegexOptions.IgnoreCase);
    string strBody =
        "<embed name=\"VideoPlayer\" src=\"http://localhost/lv3/26757\" width=\"480\" height=\"418\" " +
        "scale=\"ShowAll\" loop=\"loop\" menu=\"menu\" wmode=\"Window\" quality=\"1\" " +
        "type=\"application/x-shockwave-flash\"></embed>" +
        "<embed name=\"VideoPlayer\" src=\"http://localhost/lv3/19251\" width=\"480\" height=\"418\" " +
        "scale=\"ShowAll\" loop=\"loop\" menu=\"menu\" wmode=\"Window\" quality=\"1\" " +
        "type=\"application/x-shockwave-flash\"></embed>" +
        "<embed name=\"VideoPlayer\" src=\"http://localhost/lv3/20202\" width=\"480\" height=\"418\" " +
        "scale=\"ShowAll\" loop=\"loop\" menu=\"menu\" wmode=\"Window\" quality=\"1\" " +
        "type=\"application/x-shockwave-flash\"></embed>" +
        "<embed name=\"VideoPlayer\" src=\"http://localhost/lv3/16549\" width=\"480\" height=\"418\" " +
        "scale=\"ShowAll\" loop=\"loop\" menu=\"menu\" wmode=\"Window\" quality=\"1\" " +
        "type=\"application/x-shockwave-flash\"></embed>";
    foreach (Match objMatch in rgxVideo.Matches(strBody)) // loops indefinitely here
    {
    }

    TIA





  • The Colorado Kid

    #2
    Re: Need help with Regex

    Hello Danny,

    I've found Expresso doesn't work well enough for .NET regex. Use the regex
    designer at http://www.radsoftware.com.au/ (it's free) and check your pattern with that.


      • Kottekoe

        #4
        RE: Need help with Regex

        Danny,

        I tried this in Expresso and it predicts the same behavior you should see in
        code, namely that the execution time of your regex grows exponentially with
        the size of the input string. I'm guessing that when you tested it in
        Expresso, you used a shorter input string or one that easily found a match,
        and therefore it terminated quickly. The example in your code does not have a
        match (for example, "g4tv" will never match). The regex engine has to try
        every possible permutation of your regex hunting for a match. The number of
        permutations grows exponentially with the size of the string, so your
        application hangs while it continues to try new permutations. There are
        dangerous things in your regex design that cause this. Be very careful with
        nested quantifiers, especially when applied to wildcards, like (.*)*. Things
        like this can cause the execution time to double every time a single
        character is added to the input text. It may work fine for 100 characters,
        but add 10 more and the execution time goes up by a factor of about 1000
        (2^10 = 1024), or add 20 characters (a 20% increase in length) and the time
        goes up by a factor of about one million (2^20).
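
A minimal sketch of one way to take the ambiguity out of the posted pattern, offered as an illustration rather than a drop-in fix. In that pattern a quoted value such as "VideoPlayer" can be matched either by the quoted branch or by the catch-all [^\s]* branch, so each attribute gives the engine two choices to retry when the match fails later on. The sketch assumes two changes: the unquoted branch excludes quotes and '>', and each attribute is wrapped in an atomic group (?>...). The names EmbedRegexSketch and BuildBody and the sample URLs are made up for the example, and the constructor overload with a match timeout assumes .NET Framework 4.5 or later.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class EmbedRegexSketch
{
    // One attribute inside an atomic group (?>...): once it has matched, the
    // engine does not go back into it to try a different alternative.
    // The unquoted branch excludes quotes and '>', so it can no longer match
    // text that the two quoted branches already cover.
    private const string Attr =
        @"(?>\s+[a-z]+\s*=\s*(?:""[^""]*""|'[^']*'|[^\s""'>]+))";

    // Same overall shape as the posted pattern, with the literal dots escaped.
    // The timeout is a safety net: a match that runs longer throws
    // RegexMatchTimeoutException instead of pinning the CPU.
    public static readonly Regex VideoRegex = new Regex(
        "<embed" + Attr + "*" +
        @"\s+src=\s*[""']?http://www\.g4tv\.com/i?sv3?/(?<videokey>\d+)[""']?" +
        Attr + "*" +
        @"\s*(?:/\s*>|>\s*</embed>)",
        RegexOptions.IgnoreCase,
        TimeSpan.FromSeconds(1));

    // Hypothetical helper: builds concatenated <embed> tags shaped like the
    // ones in the question, one per key, with src = baseUrl + key.
    public static string BuildBody(string baseUrl, int[] keys)
    {
        var sb = new StringBuilder();
        foreach (int key in keys)
        {
            sb.Append("<embed name=\"VideoPlayer\" src=\"" + baseUrl + key + "\"");
            sb.Append(" width=\"480\" height=\"418\" scale=\"ShowAll\" loop=\"loop\"");
            sb.Append(" menu=\"menu\" wmode=\"Window\" quality=\"1\"");
            sb.Append(" type=\"application/x-shockwave-flash\"></embed>");
        }
        return sb.ToString();
    }

    public static void Main()
    {
        int[] keys = { 26757, 19251, 20202, 16549 };

        // Non-matching input like the one in the question (localhost, not g4tv).
        string noMatch = BuildBody("http://localhost/lv3/", keys);
        Console.WriteLine(VideoRegex.Matches(noMatch).Count); // prints 0

        // Made-up matching input: prints the four keys, one per line.
        string hit = BuildBody("http://www.g4tv.com/sv3/", keys);
        foreach (Match m in VideoRegex.Matches(hit))
        {
            Console.WriteLine(m.Groups["videokey"].Value);
        }
    }
}
```

With each attribute atomic, the engine can still give attributes back one at a time to find src=, but it never re-matches the same attribute a second way, so a failed search should stay roughly proportional to the number of attributes instead of doubling with each one.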

        JWT

        P.S. I don't know what the Colorado Kid is talking about. Expresso is
        specifically designed to work with .NET regex.

        • The Colorado Kid

          #5
          RE: Need help with Regex

          Kottekoe,

          I used to use Expresso for regex testing in my .NET programs, but one day
          something worked in Expresso and didn't in actual .NET, so I ditched it for
          the Rad Regex Designer, which is a great tool. I liked Expresso, but I find
          Rad's better.