Regular expression problem - rewriting of internal links

  • tallyce

    I've spent most of the day trying to solve this problem, and much
    searching has failed to turn up an answer. Can anyone suggest a
    solution?

    Basically I need to rewrite all *internal* links on a page to add a
    query string to each. For instance,

    <a href="/path/to/file.html" target="_blank" >
    should become
    <a href="/path/to/file.html?edit" target="_blank" >

    and

    <a href="" target="_blank" >
    should become
    <a href="?edit" target="_blank" >

    but

    <a href="http://example.com/path/to/file.html" target="_blank" >
    should stay unchanged.


    I've tried things like:

    $search = '(<a([^>]*) href="(?<!http://)([^"]*)"';
    $html = preg_replace ("~{$regexp}~i" , '\1?edit', $html);

    but the presence of the [^"]* seems to make the negative look-behind
    assertion fail.

    Any ideas gratefully received!
  • Sjoerd

    #2
    Re: Regular expression problem - rewriting of internal links

    tallyce wrote:
    Basically I need to rewrite all *internal* links on a page to add a
    query string to each. [...]
    $search = '(<a([^>]*) href="(?<!http://)([^"]*)"'; $html = preg_replace
    ("~{$regexp}~i" , '\1?edit', $html);

    but the presence of the [^"]* seems to make the negative look-behind
    assertion fail.

    Are you sure that you need negative look behind?

    Note that your regex does not match these links:
    <a title="..." href="...">
    <a href='...'>

    This seems to work:
    $page = preg_replace('| href=[\'"](?!http://)([^\'"]*)[\'"]|',
    ' href="\1?query=bla"', $page);

    This matches href, an opening quote, a value that does not start with
    http:// (the (?!http://) negative look-ahead), zero or more characters
    that are not a quote, and a closing quote, where a quote is either '
    or ".
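
    For reference, here is a minimal sketch of the look-ahead approach applied
    to the original ?edit requirement. A look-behind placed right after href="
    inspects the characters before that point (which are always href="), so it
    can never exclude anything; a look-ahead inspects the URL itself. The
    function name add_edit_query is hypothetical, and treating https:// as
    external is an assumption beyond the original question. It does not handle
    links that already carry a query string or fragment, nor mailto: or
    protocol-relative URLs.

```php
<?php
// Hypothetical helper: appends "?edit" to every <a href> whose value does
// not start with http:// or https:// (the https case is an assumption).
function add_edit_query(string $html): string
{
    // Group 1: the tag up to and including "href=".
    // Group 2: the opening quote (' or "); \2 requires the same closing quote.
    // (?!https?://) is a negative look-ahead checked at the start of the URL.
    // Group 3: the URL itself, which may be empty.
    $pattern = '~(<a\b[^>]*?\shref=)(["\'])(?!https?://)(.*?)\2~i';

    // Rebuild the attribute with ?edit appended, keeping the original quotes.
    // preg_replace returns null on error, so fall back to the input.
    return preg_replace($pattern, '$1$2$3?edit$2', $html) ?? $html;
}
```

    Capturing the quote and reusing it with \2 keeps single-quoted attributes
    single-quoted, and the [^>]*? before href allows other attributes such as
    title to come first.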
