reg ex expression - finding long character strings

  • lawrence

    reg ex expression - finding long character strings

    "Garp" <garp7@no7.blueyonder.co.uk> wrote in message news:<_vpuc.1424$j_3.13346038@news-text.cableinet.net>...
    > "lawrence" <lkrubner@geocities.com> wrote in message
    > news:da7e68e8.0405300907.3c8eabf7@posting.google.com...
    > > This reg ex will find me any block of strings 40 or more characters in
    > > length without a white space, yes?
    > >
    > > [^ ]{40}
    > >
    > >
    > > To get it to include tabs and newlines, do I do this?
    > >
    > > [^ \n\t]{40}
    >
    > \s is the whitespace token, if that's easier for you.

    Good, but now my question is how to insert the white space that I
    want. If I do this:

    $string = ereg_replace('[^\s]{40}', " ", $string);

    Then the text gets obliterated and replaced by a white space. That is
    not what I want. I simply want to break up long strings (mostly URLs)
    that threaten to destroy the format of a page. This is especially true
    in Internet Explorer, which tends to expand DIV tags to fit the
    contents (Netscape lets long URLs burst outside the boundaries of the
    DIV.)
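
The match disappears because the replacement string is only " ", so nothing of the matched text is written back. A minimal sketch of one way around that, assuming PCRE (preg_replace) is used instead of ereg_replace: capture the run and put it back with '$1', followed by a space. The function name break_long_runs and the 40-character default are made up for illustration.

```php
<?php
// Hypothetical helper: insert a space after every run of $max
// non-whitespace characters, keeping the matched text itself.
function break_long_runs(string $text, int $max = 40): string
{
    // (\S{40}) captures the run so '$1' can write it back.
    // (?=\S) only breaks when more non-whitespace follows, so a run of
    // exactly $max characters is left alone.
    $rx = '/(\S{' . $max . '})(?=\S)/';
    return preg_replace($rx, '$1 ', $text);
}

// An 85-character run becomes 40 + space + 40 + space + 5.
echo break_long_runs(str_repeat('a', 85)), "\n";
```

Text that already contains whitespace at least every 40 characters passes through unchanged.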

    Go look at this page using IE 5 or 6:

    [link to the page not preserved]


    You'll see a comment (right now it is the second one down) that looks
    like this:
    >>>>>>>>>>>>>>>
    Misty, I assume you're the one who came up with these interesting
    photos of vegetables? Are they from the ARE garden?
    http://www.publicdomainsoftware.org/...egetables2.JPG ...read
    more
    >>>>>>>>>>>>>>>

    That long url is distorting the whole page. I need to break it up.

    I suppose I could hit the whole string with explode() and break them
    on the white space and then loop through the array and test each entry
    for a length of more than 30 or 40 or so, and then stitch it all back
    together with implode, but I was assuming I could do it all more
    elegantly with regular expressions. I don't know much about regular
    expressions, but if someone does, please let me know.
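
For comparison, the explode()/implode() route described above could look roughly like this. It is only a sketch under some assumptions: the name break_long_words is hypothetical, the text is split on single spaces only (tabs and newlines are not treated as separators), and lengths are counted in bytes by strlen() and str_split().

```php
<?php
// Hypothetical helper: split on spaces, cut any over-long word into
// pieces of at most $max bytes, then stitch everything back together.
function break_long_words(string $text, int $max = 40): string
{
    $words = explode(' ', $text);
    foreach ($words as $i => $word) {
        if (strlen($word) > $max) {
            // str_split() returns the word in chunks of $max bytes.
            $words[$i] = implode(' ', str_split($word, $max));
        }
    }
    return implode(' ', $words);
}

echo break_long_words('see ' . str_repeat('x', 45) . ' end'), "\n";
```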
  • Pedro Graca

    #2
    Re: reg ex expression - finding long character strings

    lawrence wrote:
    > I suppose I could hit the whole string with explode() and break them
    > on the white space and then loop through the array and test each entry
    > for a length of more than 30 or 40 or so, and then stitch it all back
    > together with implode, but I was assuming I could do it all more
    > elegantly with regular expressions. I don't know much about regular
    > expressions, but if someone does, please let me know.

    Try this. Change as you see fit.


    <?php
    function compress_url($txt, $size=40) {
        $rx = '=(http://\S{' . ($size-7) . ',})=e';
        $compressed_txt = preg_replace($rx,
            "'[<a class=\"compressed\" href=\"$1\" title=\"$1\">'
            . substr('$1', 0, $size-10)
            . '...'
            . substr('$1', -7)
            . '</a>]'",
            $txt);
        return $compressed_txt;
    }


    $txt = '
    Misty, I assume you\'re the one who came up with these interesting
    photos of vegetables? Are they from the ARE garden?
    http://www.publicdomainsoftware.org/...egetables2.JPG ...read
    more';

    # # # # # # # # # # # # # # # # # # # # # # # #
    #
    # Remember to define a "compressed" class in your stylesheet
    #
    # # # # # # # # # # # # # # # # # # # # # # # #

    echo compress_url($txt);
    ?>
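
A note for anyone running this on a current PHP: the "e" modifier used in $rx was deprecated in PHP 5.5 and removed in PHP 7.0, so the function above only works on older versions. The same idea can probably be expressed with preg_replace_callback(). The sketch below is an adaptation, not the code from this post: the name compress_url_cb and the htmlspecialchars() calls are additions made for illustration.

```php
<?php
// Hypothetical rewrite of compress_url() without the removed "e" modifier.
function compress_url_cb(string $txt, int $size = 40): string
{
    // Same dynamic pattern as above, minus the trailing "e".
    $rx = '=(http://\S{' . ($size - 7) . ',})=';

    return preg_replace_callback($rx, function (array $m) use ($size): string {
        // $m[1] is the full matched URL.
        $url   = htmlspecialchars($m[1], ENT_QUOTES);
        // Shortened label: the first $size-10 characters, "...", the last 7.
        $label = htmlspecialchars(
            substr($m[1], 0, $size - 10) . '...' . substr($m[1], -7),
            ENT_QUOTES
        );
        return '[<a class="compressed" href="' . $url . '" title="' . $url . '">'
            . $label . '</a>]';
    }, $txt);
}

// Made-up URL, long enough to trigger the replacement.
$url = 'http://example.com/' . str_repeat('a', 30) . '.JPG';
echo compress_url_cb('see ' . $url . ' now'), "\n";
```

As in the original, a "compressed" class still has to be defined in the stylesheet.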


    Happy Coding :-)

    --
    USENET would be a better place if everybody read: : mail address :
    http://www.catb.org/~esr/faqs/smart-questions.html : is valid for :
    http://www.netmeister.org/news/learn2quote2.html : "text/plain" :
    http://www.expita.com/nomime.html : to 10K bytes :


    • lawrence

      #3
      Re: reg ex expression - finding long character strings

      That looks brilliant, though I have trouble reading it. When you write:

      http://\S{' . ($size-7) . ',

      are the dots saying "one or more of this white space"?


      • Pedro Graca

        #4
        Re: reg ex expression - finding long character strings

        lawrence wrote:
        > That looks brilliant, though I have trouble reading it. When you write:
        >
        > http://\S{' . ($size-7) . ',
        >
        > are the dots saying "one or more of this white space"?

        No. They are PHP's string concatenation operator; they are not part
        of the regular expression.

        If I want to find 40 or more non-whitespace characters in a regular
        expression I do

        \S{40,}

        In the function, I made the length a parameter, so that should be

        \S{$size,} *** DOES NOT WORK LIKE THIS!

        but, for that specific function I'm already using "http://" (7 chars),
        so, that part of the regexp is

        \S{$size-7,} *** DOES NOT WORK LIKE THIS!

        So, that $rx line concatenates these three strings:
        =(http://\S{
        $size - 7 *** the result of the subtraction
        ,})=e

        giving, for $size=40

        =(http://\S{33,})=e

        so it will match http URLs (and not https, ftp, mailto, ...) that are
        40 characters or longer.
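
The concatenation can be checked directly by printing the pattern. This is a small sketch; the trailing "e" is left off here because that modifier no longer exists in PHP 7 and later, and the test URLs are made up.

```php
<?php
$size = 40;

// Three pieces joined with the "." operator:
//   '=(http://\S{'   literal start of the pattern
//   ($size - 7)      the number 33, converted to a string
//   ',})='           literal end of the pattern
$rx = '=(http://\S{' . ($size - 7) . ',})=';

echo $rx, "\n"; // =(http://\S{33,})=

// "http://" plus 33 more characters is 40 in total, so it matches.
var_dump(preg_match($rx, 'http://' . str_repeat('x', 33))); // int(1)

// One character shorter and it does not match.
var_dump(preg_match($rx, 'http://' . str_repeat('x', 32))); // int(0)
```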


        HTH


        • lawrence

          #5
          Re: reg ex expression - finding long character strings

          So you can, so to speak, go in and out of "regex mode" by using a
          single quote:

          '

          I assume this is simply the way PHP is built. And when I wanted to use
          a real ' I suppose I would do this:

          \'


          • Pedro Graca

            #6
            Re: reg ex expression - finding long character strings

            lawrence wrote:
            > So you can, so to speak, go in and out of "regex mode" by using a
            > single quote:
            >
            > '

            No, not quite!


            This is all standard string management:


            I prefer to use single quotes most of the time.

            $x = 'abc';   // $x holds a three-character string
            $x = $x . 14; // PHP automagically transforms the number 14 into a
                          // two-character string; $x now holds a five-character
                          // string
            $x .= 'yz';   // add two more characters to $x,
                          // making it "abc14yz" (without the quotes)

            It's the exact same thing with the regexp above :)
            Instead of it being constant, it is a /dynamic/ regexp.

            > I assume this is simply the way PHP is built. And when I wanted to use
            > a real ' I suppose I would do this:
            >
            > \'

            If it's inside single quotes, yes.
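
A short illustration of the quoting rules, with made-up example strings. Inside single quotes only \' and \\ are treated as escapes, which is one reason single quotes are convenient for regular expressions.

```php
<?php
// Inside single quotes an apostrophe must be escaped as \'
$single = 'It\'s a "quoted" word';

// Inside double quotes the apostrophe is fine, but " must be escaped as \"
$double = "It's a \"quoted\" word";

var_dump($single === $double); // bool(true)

// Other backslash sequences stay literal inside single quotes,
// so \S reaches the regex engine as two characters: a backslash and an S.
$rx = '/\S{40,}/';
echo strlen($rx), "\n"; // 9
```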

