A (hopefully) simple regular expression question...

  • alexrussell101@gmail.com

    A (hopefully) simple regular expression question...

    For anyone who can't be bothered to read my code and examples, scroll
    to the bottom, the question's there. Thanks.

    I'm using php and regular expressions to convert bbcode style things to
    html. My code to convert something like this:

    [quote Bob]
    Hello there
    [/quote]

    to something like this:

    <fieldset>
    <legend>&nbsp;Bob&nbsp;</legend>
    <div class="quote">Hello there</div>
    </fieldset>

    goes something like this:

    // the s modifier lets . match the newlines inside the quote
    $ret = preg_replace("/\[quote (.+?)\](.+?)\[\/quote\]/is",
        "\n<fieldset>\n <legend>&nbsp;\\1&nbsp;</legend>\n <div class=\"quote\">\\2</div>\n </fieldset>\n",
        $ret);


    Now I have some code that cuts the text at a certain point (in order to
    get a general preview of a message) using this function:

    function cut_post($string, $length = DEFAULT_CUT_LENGTH) {
        if (strlen($string) > $length) { // if the string IS too long...
            $ret = substr($string, 0, $length); // cut it

            // going from the end to the beginning
            for ($i = strlen($ret) - 1; $i >= 0; $i--) {
                // if the current character is an html open tag (<)
                if (substr($ret, $i, 1) == "<") {
                    // if we found anything except "</" (i.e. an actual start tag)
                    if (substr($ret, $i + 1, 1) != "/") {
                        $ret = substr($ret, 0, $i); // return everything up to that point
                    }
                    // either way, set our pointer to before the string starts
                    // so we don't carry on looking
                    $i = -1;
                }
            }

            if (substr($ret, strlen($ret) - 1, 1) == " ")
                $ret = substr($ret, 0, strlen($ret) - 1);

            $ret .= "...";
        } else { // if it's not too long...
            $ret = $string; // don't cut it. amazing!
        }

        return $ret;
    }


    I have a problem when it happens to cut halfway through a long quote.
    The post cutter does its job and cuts everything from the last HTML tag
    it may have been halfway through (the <div> inside the <fieldset>), but
    unfortunately it doesn't account for the open <fieldset>, which never
    gets closed because the post was cut there. I end up with HTML similar
    to:

    <fieldset>
    <legend>&nbsp;Bob&nbsp;</legend>
    <div class="quote">...

    and then the page continues, messing up my nice formatting. So my
    question is:

    Is there a way to use a regular expression to search for an open
    <fieldset> and close it?
    I'm pretty sure there is, but I'm not so good at those negative
    regexps. I get that you'd have to search the output HTML for <fieldset>
    followed by any string of characters and NOT </fieldset>, and if that
    matches, add </fieldset> to the end. That still doesn't close the
    <div>, although I'm not sure how it manages not to close that (well,
    actually cut it out) itself.

    Hmm maybe I don't want to just close stuff at all... it might just be
    best to remove the quote altogether or something. Anyway, can anyone
    help? It's not a major problem, but it ruins my layout, and I like my
    layout.

    Well thanks to anyone who can help, and sorry I've rambled on. Just
    trying to give as much info as I can.
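
One way to close whatever was left open is to count opening and closing tags and append the difference. The following is only a minimal sketch: close_open_tags() is a hypothetical helper, not part of the original code, and it assumes the tags in the preview are simple and properly nested.

```php
<?php
// Hypothetical helper (not from the original post): appends one closing tag
// for every opening $tag that has no matching close.
function close_open_tags($html, $tag)
{
    $name = preg_quote($tag, '`');
    // An opening tag is "<tag>" or "<tag ...>", so "<fieldsets>" is not counted.
    $opened = preg_match_all('`<' . $name . '(\s[^>]*)?>`i', $html, $m);
    $closed = preg_match_all('`</' . $name . '\s*>`i', $html, $m);

    $missing = max(0, $opened - $closed);
    return $html . str_repeat("</$tag>", $missing);
}

$cut = "<fieldset>\n<legend>&nbsp;Bob&nbsp;</legend>\n<div class=\"quote\">...";
// The div has to be closed before the fieldset that contains it.
$fixed = close_open_tags(close_open_tags($cut, 'div'), 'fieldset');
echo $fixed, "\n";
```

Counting is enough here because the quote markup is generated by a single preg_replace and always has the same shape. It would not be a safe way to repair arbitrary HTML.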

  • Justin Koivisto

    #2
    Re: A (hopefully) simple regular expression question...

    alexrussell101@gmail.com wrote:

    [snip]
    > Is there a way to use a regular expression to search for an open
    > <fieldset> and close it?
    > I'm pretty sure there is, but I'm not so good at those negative
    > regexps. I get that you'd have to search the output HTML for <fieldset>
    > followed by any string of characters and NOT </fieldset>, and if that
    > matches, add </fieldset> to the end. That still doesn't close the
    > <div>, although I'm not sure how it manages not to close that (well,
    > actually cut it out) itself.
    >
    > Hmm maybe I don't want to just close stuff at all... it might just be
    > best to remove the quote altogether or something. Anyway, can anyone
    > help? It's not a major problem, but it ruins my layout, and I like my
    > layout.
    >
    > Well thanks to anyone who can help, and sorry I've rambled on. Just
    > trying to give as much info as I can.

    You can use all the negative look-aheads you want, but if you are just
    using it with preg_match in an if statement, you may be able to get away
    with something along these lines:

    $pattern='`<fieldset>.*</fieldset>`isU';
    if(preg_match($pattern,$str)){
        // fieldset is closed
    }else{
        // fieldset is *not* closed
    }

    --
    Justin Koivisto, ZCE - justin@koivi.com

    • Al

      #3
      Re: A (hopefully) simple regular expression question...

      I just tested this and I get fieldset not closed for all 3 cases
      (fieldset works properly, no fieldset was used at all, and fieldset
      not closed).

      I was mainly testing to see whether it caught no fieldset used at all
      as fieldset not closed.
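
A single preg_match with the pattern from post #2 cannot tell "no fieldset at all" apart from "fieldset not closed", because both simply fail to match. If that distinction matters, counting the tags is one option. This is a rough sketch only: fieldset_state() is a made-up name, and it assumes the opening tag is always the bare <fieldset> that the quote conversion emits.

```php
<?php
// Hypothetical helper: returns 'none', 'closed', or 'open' for an HTML fragment.
function fieldset_state($html)
{
    $lower  = strtolower($html);
    $opened = substr_count($lower, '<fieldset>');
    $closed = substr_count($lower, '</fieldset>');

    if ($opened === 0) {
        return 'none';   // no fieldset was used at all
    }
    // More opens than closes means at least one fieldset is still open.
    return ($opened > $closed) ? 'open' : 'closed';
}

echo fieldset_state("<fieldset><div>Hello there</div></fieldset>"), "\n"; // closed
echo fieldset_state("<fieldset><div>Hello"), "\n";                         // open
echo fieldset_state("<div>Hello there</div>"), "\n";                       // none
```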


      • Justin Koivisto

        #4
        Re: A (hopefully) simple regular expression question...

        Al wrote:
        > I just tested this and I get fieldset not closed for all 3 cases
        > (fieldset works properly, no fieldset was used at all, and fieldset
        > not closed).

        Really?? Here's what I get:

        <?php
        $str=array();
        $str[]=<<<EOS
        <fieldset>
        <legend>&nbsp;Bob&nbsp;</legend>
        <div class="quote">Hello there</div>
        </fieldset>

        EOS;

        $str[]=<<<EOS
        <fieldset>
        <legend>&nbsp;Bob&nbsp;</legend>
        <div class="quote">Hello there</div>

        EOS;

        $str[]=<<<EOS
        <legend>&nbsp;Bob&nbsp;</legend>
        <div class="quote">Hello there</div>

        EOS;

        $pattern='`<fieldset>.*</fieldset>`isU';

        foreach($str as $html){
            echo $html,"---\n";
            if(preg_match($pattern,$html)){
                // fieldset is closed
                echo 'fieldset closed',"\n\n";
            }else{
                // fieldset is *not* closed
                echo 'fieldset not closed',"\n\n";
            }
        }
        ?>


        Result:
        <fieldset>
        <legend>&nbsp;Bob&nbsp;</legend>
        <div class="quote">Hello there</div>
        </fieldset>
        ---
        fieldset closed

        <fieldset>
        <legend>&nbsp;Bob&nbsp;</legend>
        <div class="quote">Hello there</div>
        ---
        fieldset not closed

        <legend>&nbsp;Bob&nbsp;</legend>
        <div class="quote">Hello there</div>
        ---
        fieldset not closed


        --
        Justin Koivisto, ZCE - justin@koivi.com


        • Al

          #5
          Re: A (hopefully) simple regular expression question...

          I do apologise. It seems to work well, and I realised that my mistake
          was in copying your code directly. The string in my code is $ret and
          yours is $str. So basically I'm an idiot :)

          Nonetheless, thanks for the help. And sorry I've been so long getting
          back on the matter, I've been away for the weekend.

          I'll implement the code as soon as I can and work out how to go about
          fixing my output (whether I should cut the fieldset off entirely or
          leave it and just add trailing '...'s etc.)

          I've thought of a few problems I could encounter. If I just leave the
          '...'s and close the fieldset, I run the risk of one particular quote
          being just long enough to finish the internal <div> but not the
          fieldset, e.g.:

          <fieldset>
          <legend>Bob</legend>
          <div>Just long enough to get the end of this in, but not the rest</div>
          ...

          That'll be alright, but the '...'s won't be enclosed in the div. Not
          that that's a *major* problem; it's just that if I want to style the
          divs more than now, it won't style the '...' (although the same
          styling applied to fieldset, fieldset * {} will work).

          Anyway thanks again for helping me out.
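
If the '...' should end up inside whichever element is still open, one possible approach is to track open tags on a stack, append the '...', and then close the stack from the innermost element outwards. This is only a sketch under assumptions: finish_preview() is a hypothetical helper, it expects the cut text before any '...' has been added, and it assumes the markup is properly nested apart from being truncated.

```php
<?php
// Hypothetical helper: appends "..." and then closes every element still open.
function finish_preview($html)
{
    // Elements that never take a closing tag are ignored.
    $void = array('br', 'hr', 'img', 'input');
    $open = array();

    // Walk every complete tag in document order.
    preg_match_all('`<(/?)([a-z][a-z0-9]*)[^>]*>`i', $html, $tags, PREG_SET_ORDER);
    foreach ($tags as $t) {
        $name = strtolower($t[2]);
        if (in_array($name, $void)) {
            continue;
        }
        if ($t[1] === '/') {
            // Pop only when the close matches the innermost open element.
            if (end($open) === $name) {
                array_pop($open);
            }
        } else {
            $open[] = $name;
        }
    }

    $html .= '...';
    // Close the innermost element first.
    while ($open) {
        $html .= '</' . array_pop($open) . '>';
    }
    return $html;
}

echo finish_preview('<fieldset><legend>Bob</legend><div>Hello'), "\n";
```

When the cut lands just after the closing </div>, as in the example above, the '...' still falls between </div> and </fieldset>, so the fieldset-level styling would still be what applies to it.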


          • Justin Koivisto

            #6
            Re: A (hopefully) simple regular expression question...

            Al wrote:
            > I do apologise. It seems to work well, and I realised that my mistake
            > was in copying your code directly. The string in my code is $ret and
            > yours is $str. So basically I'm an idiot :)
            >
            > Nonetheless, thanks for the help. And sorry I've been so long getting
            > back on the matter, I've been away for the weekend.
            >
            > I'll implement the code as soon as I can and work out how to go about
            > fixing my output (whether I should cut the fieldset off entirely or
            > leave it and just add trailing '...'s etc.)
            >
            > I've thought of a few problems I could encounter. If I just leave the
            > '...'s and close the fieldset, I run the risk of one particular quote
            > being just long enough to finish the internal <div> but not the
            > fieldset, e.g.:
            >
            > <fieldset>
            > <legend>Bob</legend>
            > <div>Just long enough to get the end of this in, but not the rest</div>
            > ...
            >
            > That'll be alright, but the '...'s won't be enclosed in the div. Not
            > that that's a *major* problem; it's just that if I want to style the
            > divs more than now, it won't style the '...' (although the same
            > styling applied to fieldset, fieldset * {} will work).
            >
            > Anyway thanks again for helping me out.

            Rather than using a fieldset, why not just use blockquote? Those are
            meant for quoting others anyway.

            --
            Justin Koivisto, ZCE - justin@koivi.com


            • Al

              #7
              Re: A (hopefully) simple regular expression question...

              I know, but I wanted something that had a title. The quotes look a bit
              like this:

                 ____
              --| Al |---------------------
              |  ¯¯¯¯                      |
              | Hello there                |
              |____________________________|

              Currently a fieldset works quite nicely for this task, although I may
              eventually move on to a blockquote with a border and a relatively
              positioned (top + some pixels) span before it in HTML.

              But thanks for the info, and I know my method wasn't exactly
              stylistically sound.
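
For the blockquote-with-a-title idea, the conversion could emit a span followed by a blockquote instead of the fieldset and legend. The following is a sketch only: the class names quote-title and quote are invented for illustration, and the border and relative positioning would still need to be done in CSS.

```php
<?php
// Assumed markup: the class names "quote-title" and "quote" are illustrative only.
$ret = "[quote Al]\nHello there\n[/quote]";
$ret = preg_replace(
    '`\[quote (.+?)\](.+?)\[/quote\]`is',
    "<span class=\"quote-title\">\\1</span>\n<blockquote class=\"quote\">\\2</blockquote>",
    $ret
);
echo $ret, "\n";
```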
