Parsing Html

  • Colum

    Parsing Html

    Does anyone have any ideas on how to parse an HTML document?

    I am trying to extract specific information from the page.
    Also, what do you do if the page is dynamic (e.g. a CGI-generated page)?
    How do you find it?

    Thanks
    Colum.


  • Pedro

    #2
    Re: Parsing Html

    Colum wrote:
    > I am trying to extract specific information from the page.
    > Also, what do you do if the page is dynamic (e.g. a CGI-generated page)?
    > How do you find it?

    It depends *very much* on what you're trying to extract.
    I once had my motd (message of the day) come from this script:

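    <?php
    // Fetch the page with the command-line curl client (-s hides the progress meter).
    $x = (string) `curl -s http://www.care2.com/`;
    // Locate the marker text that precedes the wanted snippet.
    $t = strpos($x, 'DAILY QUACK UP');
    if ($t === false) {
        exit; // marker not found: the page layout has probably changed
    }
    // Take a 300-byte window from the marker and cut it at the end of the table cell.
    $y = substr($x, $t, 300);
    $t = strpos($y, '</td>');
    $z = ($t === false) ? $y : substr($y, 0, $t);
    // Strip the leftover markup.
    $z = str_replace('</b></font></a><br>', '', $z);
    $z = str_replace('<BR>', '', $z);
    echo $z;
    ?>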

    Just retested this ... still works :)
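
The same string-search idea can be wrapped in a small reusable function, which makes it easier to test against a saved copy of the page. This is only a rough sketch: `extract_between()` and its parameters are hypothetical names, not part of the script above, and the sample HTML is made up for illustration. Unlike the script above, it starts the window after the marker and uses `strip_tags()` instead of hand-written `str_replace()` calls.

```php
<?php
// Hypothetical helper: returns the text found after $startMarker and before
// the next $endMarker, with tags stripped, or null if the marker is absent.
function extract_between(string $html, string $startMarker, string $endMarker, int $window = 300): ?string
{
    $start = strpos($html, $startMarker);
    if ($start === false) {
        return null; // marker missing: the page layout has probably changed
    }
    // Look only at a limited window after the marker.
    $chunk = substr($html, $start + strlen($startMarker), $window);
    $end = strpos($chunk, $endMarker);
    if ($end !== false) {
        $chunk = substr($chunk, 0, $end);
    }
    return trim(strip_tags($chunk));
}

// Made-up sample markup standing in for the downloaded page.
$sample = '<table><tr><td><b>DAILY QUACK UP</b><br>Why did the duck cross the road?</td></tr></table>';
echo extract_between($sample, 'DAILY QUACK UP', '</td>'), "\n";
```

With the sample above this prints "Why did the duck cross the road?". A dynamic (CGI-generated) page would be handled the same way, since the client only ever sees the HTML the server sends back for the URL.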

    --
    I have a spam filter working.
    To mail me include "urkxvq" (with or without the quotes)
    in the subject line, or your mail will be ruthlessly discarded.


    • Manuel Lemos

      #3
      Re: Parsing Html

      Hello,

      On 10/30/2003 07:46 PM, Colum wrote:
      > Does anyone have any ideas on how to parse an HTML document?
      >
      > I am trying to extract specific information from the page.
      > Also, what do you do if the page is dynamic (e.g. a CGI-generated page)?
      > How do you find it?

      You may want to try these classes:

      Class: HTMLparser


      Class: HTMLSax
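
The interfaces of the two classes are not shown in this thread, so the following sketch does not use them. As a stand-in for the parser-based approach, it uses PHP's bundled DOM extension (`DOMDocument` and `DOMXPath`), assuming that extension is enabled. The function name `select_text()`, the sample markup, and the `quote` class attribute are all made up for illustration.

```php
<?php
// Hypothetical helper: returns the trimmed text of every node matched by an
// XPath query against an HTML string.
function select_text(string $html, string $xpath): array
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $nodes = (new DOMXPath($doc))->query($xpath);
    if ($nodes === false) {
        return []; // the XPath expression itself was invalid
    }
    $results = [];
    foreach ($nodes as $node) {
        $results[] = trim($node->textContent);
    }
    return $results;
}

// Made-up sample markup standing in for a downloaded page.
$sample = '<html><body><table><tr><td class="quote">First quote</td>'
        . '<td class="other">skip me</td>'
        . '<td class="quote">Second <b>quote</b></td></tr></table></body></html>';
print_r(select_text($sample, '//td[@class="quote"]'));
```

Selecting by element and attribute tends to survive small layout changes better than fixed string offsets, at the cost of parsing the whole document.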



      --

      Regards,
      Manuel Lemos

      Free ready to use OOP components written in PHP

