parsing out links in HTML docs

  • marco

    parsing out links in HTML docs

    Hello to you all,

    A question from a PHP newbie who is disoriented by the overwhelming
    number of existing example scripts.


    -- What is the best/simplest way to parse out the links in an HTML
    document and put them in an array? --

    Some hints, functions or snippets would be highly appreciated.


    Thanks.

    Marco
  • Janwillem Borleffs

    #2
    Re: parsing out links in HTML docs

    marco wrote:
    > Hello to you all,
    >
    > A question from a PHP newbie who is disoriented by the overwhelming
    > number of existing example scripts.
    >
    >
    > -- What is the best/simplest way to parse out the links in an HTML
    > document and put them in an array? --
    >
    > Some hints, functions or snippets would be highly appreciated.
    >

    Here's a way:

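    <?php

    // Fetch the page; file_get_contents() returns false on failure
    // (fetching a URL requires allow_url_fopen to be enabled).
    $file = file_get_contents("http://www.php.net/");

    // Group 1 is the optional opening quote, group 2 the href value.
    preg_match_all("/<a[^>]+href\s*=\s*(\"|')?([^\"'\s>]+)/i", $file, $links);

    // $links[2] holds the captured href values.
    print "<pre>";
    print_r($links[2]);
    print "</pre>";

    ?>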


    JW
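
To see what ends up in the array without fetching a live page, here is a minimal sketch that wraps the pattern from the reply above in a function and runs it on a literal string. The function name `extract_links_regex` and the sample markup are made up for illustration; only the pattern comes from the thread.

```php
<?php

// Hypothetical helper (not from the thread): returns the href values
// that the pattern captures, in document order.
function extract_links_regex(string $html): array
{
    // Group 1: optional opening quote. Group 2: the href value itself.
    preg_match_all("/<a[^>]+href\s*=\s*(\"|')?([^\"'\s>]+)/i", $html, $links);

    return $links[2];
}

// Made-up sample markup covering double-quoted, single-quoted
// and unquoted href attributes.
$html = '<p><a href="http://www.php.net/">PHP</a> '
      . '<a class="x" href=\'/docs.php\'>Docs</a> '
      . '<a href=faq.php>FAQ</a></p>';

print_r(extract_links_regex($html));
```

For this sample the array holds `http://www.php.net/`, `/docs.php` and `faq.php`. Note that the pattern only requires the tag name to start with `a`, so it would presumably also pick up an `href` on tags such as `<area>`; whether that matters depends on the pages being parsed.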




    • marco

      #3
      Re: parsing out links in HTML docs

      Oh yeah, the one magic line...

      << preg_match_all("/<a[^>]+href\s*=\s*(\"|')?([^\"'\s>]+)/i", $file, $links); >>

      Sweet.

      Thanks a lot.

      marco
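
For comparison, the same task can also be done with PHP's DOM extension instead of a regular expression. This is a different technique from the one posted in the thread, shown only as a rough sketch: it assumes the `dom` extension is available, and the function name `extract_links_dom` is made up for illustration.

```php
<?php

// Hypothetical helper (not from the thread): collects the href attribute
// of every <a> element using DOMDocument rather than a regex.
function extract_links_dom(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is often malformed; collect parser warnings
    // internally instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        // Skip named anchors such as <a name="top"> that carry no href.
        if ($anchor->hasAttribute('href')) {
            $links[] = $anchor->getAttribute('href');
        }
    }

    return $links;
}

// Made-up sample markup.
$sample = '<p><a href="a.html">A</a> <a name="top">no href</a> '
        . '<a href=faq.php>FAQ</a></p>';

print_r(extract_links_dom($sample));
```

The trade-off is more lines for a parser that only looks at real `<a>` elements and handles quoting for you; for a quick one-off script the single `preg_match_all` line may be all that is needed.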
