RegExp: Problems with matching a(ny) URI

  • Jesper Stocholm

    RegExp: Problems with matching a(ny) URI

    I need to be able to detect URIs in some text and then replace
    them with HTML anchors, that is

    http://www.tempuri.org/page.html


    should be replaced with

    <a href="http://www.tempuri.org/page.html">http://www.tempuri.org/page.html</a>

    I have made the following code:

    re = new RegExp('(((http|https|ftp):\/\/)?\w+[.\w]+([^\w]*[\s]+))');
    str = document.forms[0].longtext.value; //textarea with text to replace
    newstr = str.replace(re, '<a href="$1">$1</a>');
    document.write(newstr);

    However, it doesn't quite work as I would like it to. It seems to make
    only a single match, and it seems to ignore leading and trailing
    whitespace.

    Can you help me solve this?

    A test of the code I have so far can be found at


    Thanks,

    :o)

    --
    Jesper Stocholm - http://stocholm.dk
    Copenhagen, Denmark
  • Janwillem Borleffs

    #2
    Re: RegExp: Problems with matching a(ny) URI


    "Jesper Stocholm" <jespers@stocholm.invalid> wrote in message
    news:Xns93D7D79E735D5stocholmdk@130.226.1.34...
    > I need to be able to detect URIs in some text and after this replace
    > them with HTML-anchors, that is
    ....
    > I have made the following code:
    >
    > re = new RegExp('(((http|https|ftp):\/\/)?\w+[.\w]+([^\w]*[\s]+))');
    ....
    > However, it doesn't quite work as I would like it to. It seems to only
    > make a single match, and it seems to ignore leading and trailing
    > whitespaces.

    Try this regexp:

    re = /(http|https|ftp)([^ ]+)/ig;

    (The i at the end means ignore case, the g means global for multiple
    matches)

    This doesn't allow spaces in URLs (which aren't allowed anyway).
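
To make the effect of the g flag concrete, here is a minimal sketch. The sample text and the example hosts are invented for illustration, and the square brackets are only a stand-in for the real replacement. The last two lines also show a likely reason the original `new RegExp('...')` call misbehaves: inside a JavaScript string literal, `\w` collapses to a plain `w`, so the backslash has to be doubled when the pattern is passed as a string.

```javascript
// Sample input; the text and URLs are made up for illustration.
const text = 'See http://a.example/x and ftp://b.example/y for details.';

// Without the g flag, replace() stops after the first match.
const once = text.replace(/(http|https|ftp)([^ ]+)/i, '[$1$2]');

// With the g flag, every match is replaced.
const all = text.replace(/(http|https|ftp)([^ ]+)/ig, '[$1$2]');

// In a string literal '\w' is just 'w', so the RegExp constructor
// never sees the backslash unless it is doubled.
const fromString = new RegExp('\w+').source;  // "w+"
const escaped = new RegExp('\\w+').source;    // "\w+"

console.log(once);
console.log(all);
console.log(fromString, escaped);
```

Writing the pattern as a regex literal, as above, avoids the doubled backslashes altogether.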


    JW




    • Janwillem Borleffs

      #3
      Re: RegExp: Problems with matching a(ny) URI


      "Janwillem Borleffs" <jwb@jwbfoto.demon.nl> wrote in message
      news:3f3be68c$0$28912$1b62eedf@news.euronet.nl...
      >
      > Try this regexp:
      >
      > re = /(http|https|ftp)([^ ]+)/ig;
      >

      Forgot to mention that the replacement should be done as follows:

      newstr = str.replace(re, '<a href="$1$2">$1$2</a>');
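
Put together, the pattern and the replacement might be wrapped up as in the sketch below. The function name `linkify` is hypothetical and not from the thread; the sample sentence reuses the URL from the original question.

```javascript
// linkify is a hypothetical helper name used only for this illustration.
function linkify(text) {
  // $1 is the scheme, $2 is everything up to the next space.
  const re = /(http|https|ftp)([^ ]+)/ig;
  return text.replace(re, '<a href="$1$2">$1$2</a>');
}

const linked = linkify('Visit http://www.tempuri.org/page.html today');
console.log(linked);
// Visit <a href="http://www.tempuri.org/page.html">http://www.tempuri.org/page.html</a> today
```

Note that `[^ ]+` stops only at a space character, so a URL followed directly by a newline, a comma or a closing parenthesis will pull that character into the link.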


      JW




      • Lasse Reichstein Nielsen

        #4
        Re: RegExp: Problems with matching a(ny) URI

        Jesper Stocholm <jespers@stocholm.invalid> writes:

        > I need to be able to detect URIs in some text and after this replace
        > them with HTML-anchors,

        > I have made the following code:
        >
        > re = new RegExp('(((http|https|ftp):\/\/)?\w+[.\w]+([^\w]*[\s]+))');
        ....
        > However, it doesn't quite work as I would like it to.

        Step back from the technical part and answer this question:
        What is a URL?
        or more precisely:
        What will you accept as a URL?
        (The formal definition is in RFC2396
        <URL:http://rfc.sunsite.dk/rfc/rfc2396.html>)

        If you can answer it, in detail, then I bet it is easier to make a
        regular expression to match it (or get help doing it, because then
        the job is precisely specified).
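
As an illustration of that approach, here is one possible, deliberately narrow answer written down first and then turned into a pattern. The three rules are assumptions chosen for this sketch; they are not the RFC 2396 grammar, and `urlPattern` and `findUrls` are hypothetical names.

```javascript
// Assumed specification for this sketch (not RFC 2396):
//   1. a URL starts with http://, https:// or ftp:// (case-insensitive);
//   2. it runs up to the next whitespace, '<', '>' or '"';
//   3. it does not end in sentence punctuation such as . , ; : ! ? )
const urlPattern = /\b(?:https?|ftp):\/\/[^\s<>"]*[^\s<>".,;:!?)]/gi;

// Returns every substring of text that satisfies the rules above.
function findUrls(text) {
  return text.match(urlPattern) || [];
}

const found = findUrls(
  'Read http://rfc.sunsite.dk/rfc/rfc2396.html, then ftp://example.org/file.'
);
console.log(found);
// [ 'http://rfc.sunsite.dk/rfc/rfc2396.html', 'ftp://example.org/file' ]
```

Each rule maps to one piece of the pattern, so changing the answer to the question changes exactly one part of the expression.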

        /L
        --
        Lasse Reichstein Nielsen - lrn@hotpop.com
        Art D'HTML: <URL:http://www.infimum.dk/HTML/randomArtSplit.html>
        'Faith without judgement merely degrades the spirit divine.'
