Regex help on a nested table.

This topic is closed.
  • Matt T.

    Regex help on a nested table.

    I am trying to replace the nested table tag in the following string [1]
    using a regular expression, but I am not having any success. I am new
    to regular expressions, so I am sure I am just overlooking
    something simple.

    I thought <table.*>.*<table.*>.*</table> or some variation of that
    would create a match, but it does not. Does anyone have an idea as
    to what would work?

    [1]
    <table *deleted attributes*>
    <tr>
    <td>
    <table *deleted attributes*>
    <tr>
    <td><a>Nov</a></td>
    <td>December 2003</td>
    <td><a>Jan</a></td>
    </tr>
    </table>
    </table>
  • Jerry Negrelli

    #2
    Regex help on a nested table.

    You need to use a lazy (non-greedy) quantifier in place of
    the greedy .* in your pattern.

    *? will match as few characters as possible, so you
    should be able to do this:

    <table[^>]*>.*?<table[^>]*>.*?</table>

    I have a feeling that this operator can cause performance
    lag, but it should work fine in your situation.

    I haven't tested the code, so give it a try -- did that
    work?
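
A minimal sketch of the greedy versus lazy difference described above. It uses Python's `re` module as a stand-in for the regex engine in the thread, and the one-line `html` string is a hypothetical sample, not the poster's real markup.

```python
import re

# Hypothetical single-line sample: an outer table wrapping an inner one.
html = "<table a><tr><td><table b><tr><td>x</td></tr></table></td></tr></table>"

# Greedy: each .* grabs as much as it can, so the match runs to the LAST </table>.
greedy = re.search(r"<table.*>.*<table.*>.*</table>", html)

# Lazy: each .*? grabs as little as it can, so the match stops at the FIRST </table>.
lazy = re.search(r"<table[^>]*>.*?<table[^>]*>.*?</table>", html)

print(greedy.group(0))
print(lazy.group(0))
```

Note that the lazy match still begins at the outer opening tag; the quantifier only changes where the match ends.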

    JER


    • Brian Davis

      #3
      Re: Regex help on a nested table.


      When you need the '.' to span multiple lines, use singleline mode. Try some
      variation of this expression:

      (?s)(?<=<table[^>]*>.*?)<table[^>]*>.*?</table>

      The (?s) turns on the singleline option, which you could also do in code
      like this:

      Regex r = new Regex(@"(?<=<table[^>]*>.*?)<table[^>]*>.*?</table>",
                          RegexOptions.Singleline);
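
A small sketch of why singleline mode matters for the multi-line sample from the original post. It uses Python's `re` module as a stand-in, where `re.DOTALL` is the counterpart of `RegexOptions.Singleline`; the elided attributes are simply left out of the sample.

```python
import re

# Sample from the original post, with the elided attributes left out.
html = """<table>
<tr>
<td>
<table>
<tr>
<td><a>Nov</a></td>
<td>December 2003</td>
<td><a>Jan</a></td>
</tr>
</table>
</table>"""

pattern = r"<table[^>]*>.*?<table[^>]*>.*?</table>"

# Without the flag, '.' stops at every newline, so nothing matches.
without_flag = re.search(pattern, html)

# With the flag (or the inline (?s) form), '.' also matches newlines.
with_flag = re.search(pattern, html, re.DOTALL)
inline_flag = re.search("(?s)" + pattern, html)

print(without_flag)
print(with_flag.group(0))
```

The flag and the inline `(?s)` prefix are interchangeable here; which one to use is a matter of taste.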

      The (?<=...) construct is a zero-width positive look-behind assertion,
      which means "require this text just before what comes next, but don't
      include it in the resulting match". As the previous poster said, you also
      need lazy quantifiers (*?) and negated character classes ([^>]). That is a
      lot of regex jargon, but once you get a handle on it, it offers a lot of
      power.
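
A hedged sketch of the same idea in Python, for readers not on .NET. Python's `re` module only accepts fixed-width look-behinds, so this swaps the look-behind for a capturing group that is put back during replacement. The shortened `html` sample and the `"[calendar]"` replacement text are hypothetical.

```python
import re

# Shortened version of the sample from the original post.
html = (
    "<table>\n<tr>\n<td>\n"
    "<table>\n<tr>\n<td>December 2003</td>\n</tr>\n</table>\n"
    "</table>"
)

# Group 1 plays the role of the look-behind: required context, kept as-is.
# Group 2 is the nested table itself.
nested = re.compile(r"(<table[^>]*>.*?)(<table[^>]*>.*?</table>)", re.DOTALL)

match = nested.search(html)
inner = match.group(2)

# Replace only the nested table, re-emitting the text in front of it.
result = nested.sub(lambda m: m.group(1) + "[calendar]", html, count=1)

print(inner)
print(result)
```

With a true look-behind, as in the .NET expression, the match itself is only the nested table, so no group needs to be put back.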


      Brian Davis




