parsing out links in HTML docs

**Janwillem Borleffs** · Jul 17 '05, 08:28 AM

Re: parsing out links in HTML docs

marco wrote:
> Hello to you all,
>
> A question from a PHP newbie who is disorientated by the overwhelming
> amount of existing example scripts.
>
> -- What is the best/simplest way to parse out the links in an HTML
> document and put them in an array? --
>
> Some hints, functions or snippets would be highly appreciated.
>

Here's a way:

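<?php

// Fetch the page source as a single string
$file = file_get_contents("http://www.php.net/");

// Capture the href value of every <a> tag; the URLs end up in $links[2]
preg_match_all("/<a[^>]+href\s*=\s*(\"|')?([^\"'\s>]+)/i", $file, $links);

print "<pre>";
print_r($links[2]);
print "</pre>";

?>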
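
To try the pattern without a network request, here is a minimal sketch that runs the same expression on an inline HTML string. The function name `extract_links` and the sample markup are illustrative assumptions, not part of the post above.

```php
<?php

// Hypothetical helper (not from the original post): returns the href
// value of every <a> tag found in the given HTML string.
function extract_links(string $html): array
{
    // Group 1 is the optional opening quote, group 2 is the URL itself.
    preg_match_all("/<a[^>]+href\s*=\s*(\"|')?([^\"'\s>]+)/i", $html, $links);
    return $links[2];
}

// Made-up sample markup: double-quoted, single-quoted and unquoted hrefs.
$html = '<p><a href="http://www.php.net/">PHP</a> and '
      . "<a class='x' href='/docs.php'>Docs</a> and "
      . '<A HREF=index.html>Home</A></p>';

print_r(extract_links($html));
```

Because the pattern starts with `<a` followed by any characters, it can probably also match other tags whose names begin with "a" and that carry an href (such as `<area>`), so treat the result as a quick list rather than a strict one.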

JW

**marco** · Jul 17 '05, 08:28 AM

Re: parsing out links in HTML docs

Oh yeah, the one magic line...

<< preg_match_all("/<a[^>]+href\s*=\s*(\"|')?([^\"'\s>]+)/i", $file, $links); >>

Sweet.

Thanks a lot.

marco
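
For markup that is too irregular for a single regular expression, a different technique is to let PHP's DOM extension parse the document and then read the `href` attributes. This is an alternative sketch, not the method posted above; it assumes the DOM extension is enabled, and the function name `extract_links_dom` and the sample markup are hypothetical.

```php
<?php

// Alternative sketch using DOMDocument instead of a regex.
// The function name is an illustrative assumption.
function extract_links_dom(string $html): array
{
    // Collect parser warnings internally; real-world HTML is rarely valid.
    libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        // Skip anchors without an href, e.g. <a name="top">.
        if ($anchor->hasAttribute('href')) {
            $links[] = $anchor->getAttribute('href');
        }
    }
    return $links;
}

// Made-up sample markup for illustration.
print_r(extract_links_dom(
    '<a href="http://www.php.net/">PHP</a> <a name="top">no link</a>'
));
```

The trade-off is a few more lines in exchange for matching only real `<a>` elements.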
