How To Extract/Fetch HTML source code from another website?

  • zerodevice
    New Member
    • Jun 2007
    • 1

    Hi, I'm trying to write a PHP script that fetches the HTML source of another website. I'll then filter it myself to keep only the specific text I want and echo it directly on my page.

    For example, when you go to my page, it will display a list of Google's search results based on a fixed search string I code into the page.

    E.g. search "asdf".

    In Google it will show "http://www.google.com.my/search?hl=en&q=asdf&btnG=Google+Search&meta="

    On my page it will show:

    asdf
    www.asdf.com/ - 3k - Cached - Similar pages

    What is asdf?
    www.asdf.com/whatisasdf.html - 5k - Cached - Similar pages

    CLiki : asdf
    www.cliki.net/asdf - 17k - Cached - Similar pages

    CLiki : ASDF-Install
    www.cliki.net/ASDF-Install - 34k - Cached - Similar pages

    Association Of Synchronous Data Formats
    www.asdf.org/ - 4k - Cached - Similar pages

    Home row - Wikipedia, the free encyclopedia
    en.wikipedia.org/wiki/Home_row - 16k - Cached - Similar pages

    asdf Manual
    constantly.at/lisp/asdf/ - 11k - Cached - Similar pages

    ASDF - A Simple DVD Frontend for MPlayer
    asdf-mplayer.sourceforge.net/ - 4k - Cached - Similar pages

    asdf-jkl - Google Code
    code.google.com/p/asdf-jkl/ - 7k - Cached - Similar pages


    profile.myspace.com/index.cfm?fuseaction=user.viewprofile&friendid=31856324 - 138k - 21 Jun 2007 - Cached -

    This text and these hyperlinks are extracted the moment visitors go to my site.


    I know it's a dumb function, but I have my reasons.

    Please help me.

    Thanks.
  • henryrhenryr
    New Member
    • Jun 2007
    • 103

    #2
    I don't know if it will work on remote sites, but on my local server I indexed pages just by using file_get_contents('http://localhost/xyz').

    Try checking the functions in the PHP manual, e.g. http://www.php.net/manual/en/functio...t-contents.php
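
    For reference, here is a minimal sketch of the file_get_contents() approach. The helper name fetch_html(), the 10-second timeout, and the User-Agent string are assumptions made for illustration; they are not from this thread. Fetching a remote URL this way only works when allow_url_fopen is enabled in php.ini.

```php
<?php
// Hypothetical helper (not from the thread): fetch a page's HTML source as a string.
// Returns null instead of false when the fetch fails.
function fetch_html(string $url): ?string
{
    // A stream context lets us set a timeout and a User-Agent header for HTTP requests.
    // Both values are illustrative assumptions.
    $context = stream_context_create([
        'http' => [
            'timeout'    => 10,
            'user_agent' => 'ExampleFetcher/1.0',
        ],
    ]);

    // file_get_contents() returns false on failure; @ suppresses the warning
    // so the caller can decide how to handle a failed fetch.
    $html = @file_get_contents($url, false, $context);

    return $html === false ? null : $html;
}
```

    Calling fetch_html('http://localhost/xyz') should then give you the page source as a string, or null if the request failed.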

    Henry

    • pbmods
      Recognized Expert Expert
      • Apr 2007
      • 5821

      #3
      Heya, zerodevice. Welcome to TSDN!

      As an extension to henryrhenryr's suggestion, once you've loaded the HTML source from Google's results, parsing it isn't too difficult.

      All you have to do is examine the source code of Google's result page. Look for a common HTML tag sequence that precedes every search result, then explode() or preg_split() the page on that string. You can then harvest each result from the beginning of every array element after the 0th one, since element 0 holds only the markup that comes before the first result.
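
      A rough sketch of that splitting step, under stated assumptions: the helper name harvest_results(), the marker strings, and the sample markup below are all made up for illustration. Google's real markup is different and changes over time, so the markers would have to be picked by inspecting the actual page source.

```php
<?php
// Hypothetical helper: split $html on a marker that precedes every result,
// then return the text at the start of each chunk, up to $endMarker.
function harvest_results(string $html, string $marker, string $endMarker): array
{
    $chunks = explode($marker, $html);

    // Index 0 holds everything before the first marker, so skip it.
    array_shift($chunks);

    $results = [];
    foreach ($chunks as $chunk) {
        $end = strpos($chunk, $endMarker);
        // Keep the text before the end marker, or the whole chunk if it is absent.
        $piece = $end === false ? $chunk : substr($chunk, 0, $end);
        // Strip any remaining tags and surrounding whitespace.
        $results[] = trim(strip_tags($piece));
    }

    return $results;
}

// Made-up sample markup standing in for a fetched results page.
$sample = '<html><body><h1>Results</h1>'
    . '<div class="r"><a href="http://www.asdf.com/">asdf</a></div>'
    . '<div class="r"><a href="http://www.cliki.net/asdf">CLiki : asdf</a></div>'
    . '</body></html>';

print_r(harvest_results($sample, '<div class="r">', '</div>'));
```

      Plain explode() is enough when the marker is a fixed string; preg_split() would be the choice if the marker varies, for example when attributes differ between results.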
