Regex, matching with required and some optional

  • jagged
    New Member
    • Feb 2008
    • 23

    I'm new to C#, so I'm not completely familiar with its regular expression flavor, and I don't understand why I can't get this to work. I set the RegexOptions.Singleline | RegexOptions.IgnoreCase options in the Regex constructor.

    Given the following example data:

    Code:
    <tr>test1:567.98  </tr>
    <tr>test1:567.98    test2:999  </tr>
    <tr>test1:267.98    test2:959  </tr>
    <tr>test1:547.98    test2:699  </tr>
    <tr>test1:567.98  </tr>
    I want a regex that will match each <tr></tr> block (which I'll call a line from now on) and capture, in named groups, the number after test1 and the number after test2 when test2 is present. In other words, I want:
    Code:
    Debug.Print(Match.Groups["t1"].Value + "--" + Match.Groups["t2"].Value); // should equal 567.98-- for line #1
    Debug.Print(Match.Groups["t1"].Value + "--" + Match.Groups["t2"].Value); // should equal 567.98--999 for line #2
    Debug.Print(Match.Groups["t1"].Value + "--" + Match.Groups["t2"].Value); // should equal 267.98--959 for line #3 ... etc
    So I attempted:

    Code:
    <tr>.*?test1:(?<t1>[\d.,]+).*?(?:test2:(?<t2>[\d]+))?.*?</tr>
    But Match.Groups["t2"] is empty, even though that particular match covers the entire line 2 (including test2:999).
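    A minimal console repro of the attempt above, assuming the five sample rows are joined with newlines into a single string. The class name OptionalGroupRepro and the helper Extract are made up for this sketch, and Console.WriteLine stands in for Debug.Print.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class OptionalGroupRepro
{
    // Sample rows from the post, joined with newlines (an assumption about the real input).
    public const string Data =
        "<tr>test1:567.98  </tr>\n" +
        "<tr>test1:567.98    test2:999  </tr>\n" +
        "<tr>test1:267.98    test2:959  </tr>\n" +
        "<tr>test1:547.98    test2:699  </tr>\n" +
        "<tr>test1:567.98  </tr>";

    // The first attempted pattern, copied unchanged.
    public const string Pattern =
        @"<tr>.*?test1:(?<t1>[\d.,]+).*?(?:test2:(?<t2>[\d]+))?.*?</tr>";

    // Returns one "t1--t2" string per match found in the input.
    public static List<string> Extract(string input, string pattern)
    {
        var regex = new Regex(pattern, RegexOptions.Singleline | RegexOptions.IgnoreCase);
        var results = new List<string>();
        foreach (Match m in regex.Matches(input))
        {
            // Value is the empty string when a group did not take part in the match.
            results.Add(m.Groups["t1"].Value + "--" + m.Groups["t2"].Value);
        }
        return results;
    }

    public static void Main()
    {
        foreach (string line in Extract(Data, Pattern))
        {
            Console.WriteLine(line);
        }
    }
}
```

    Run against the sample data, this finds five matches, one per row, and each one prints with nothing after the dashes (for example 567.98-- for line 2).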

    So then I tried an alternation:

    Code:
    <tr>.*?test1:(?<t1>[\d.,]+).*?(?:(?:test2:(?<t2>[\d]+))|)?.*?</tr>
    Same result.

    I tried a few other patterns and they would either return only the first and last line, or only those lines where test2 was present. I tried many permutations (various groupings with named or unnamed), lookbehinds, etc... I just don't get it.

    I have watch statements on all possible groups (i.e., Debug.Print(m.Groups[0]) ... Debug.Print(m.Groups[10])) and I can *never* match anything related to test2, even if I'm specifically looking for "test2:999".

    Please help.
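
    One likely explanation, offered as a sketch and not a confirmed answer: the lazy .*? in front of the optional group first matches zero characters, so the optional group is tried immediately after the test1 number, fails there, and is allowed to match nothing. The final .*?</tr> then consumes the rest of the row, the overall match succeeds, and the engine never backtracks to give test2 a second chance. The sketch below moves the lazy skip inside the optional group and replaces .*? with (?:(?!</tr>).)*? so that, under Singleline, the skip cannot run past the closing tag into the next row. The names OptionalGroupFix, Skip, and Extract are hypothetical, and the sample rows are assumed to be joined with newlines.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class OptionalGroupFix
{
    // Sample rows from the post, joined with newlines (an assumption about the real input).
    public const string Data =
        "<tr>test1:567.98  </tr>\n" +
        "<tr>test1:567.98    test2:999  </tr>\n" +
        "<tr>test1:267.98    test2:959  </tr>\n" +
        "<tr>test1:547.98    test2:699  </tr>\n" +
        "<tr>test1:567.98  </tr>";

    // Lazily skips characters but refuses to step over a closing </tr>.
    public const string Skip = @"(?:(?!</tr>).)*?";

    // The skip now sits inside the optional group, so the group as a whole
    // is attempted (skip + test2) before the engine settles for "no test2".
    public const string Pattern =
        "<tr>" + Skip + @"test1:(?<t1>[\d.,]+)" +
        "(?:" + Skip + @"test2:(?<t2>\d+))?" +
        Skip + "</tr>";

    // Returns one "t1--t2" string per match found in the input.
    public static List<string> Extract(string input)
    {
        var regex = new Regex(Pattern, RegexOptions.Singleline | RegexOptions.IgnoreCase);
        var results = new List<string>();
        foreach (Match m in regex.Matches(input))
        {
            // t2 is the empty string on rows that have no test2 entry.
            results.Add(m.Groups["t1"].Value + "--" + m.Groups["t2"].Value);
        }
        return results;
    }

    public static void Main()
    {
        foreach (string line in Extract(Data))
        {
            Console.WriteLine(line);
        }
    }
}
```

    On the sample data this prints 567.98--, 567.98--999, 267.98--959, 547.98--699 and 567.98--, which is the output asked for above. For real HTML with nested tags or attributes, an HTML parser would probably be more robust than a single pattern.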