Pull part of an external webpage into another

  • orware
    New Member
    • Oct 2006
    • 3

    Pull part of an external webpage into another

    Hi! This is my first post...and when I begin working on this project, it will also be my first PHP project! I do have some programming experience from my first year at college, but that was in Scheme :-). I have not started work on this problem yet, because I am still doing research and figuring out how I can do it using PHP. My goal is to turn this into a Joomla module that I can offer for free on Joomla.org and use on a few of the websites I create.

    Here is the overview of the problem: the page at http://apps.cbp.gov/bwt/index.asp lists the border wait times for all US ports of entry, both Canadian and Mexican. In particular, I am interested in the Calexico ports, but I am sure I can generalize the problem enough to allow any of the other ports to be chosen as well.

    I want to use PHP to save a local copy of the page on the server, or alternately, to grab the page, parse it for the parts that are needed, and cache only that portion. The cache or local copy would have to expire after only a few minutes, but that would at least minimize redundant requests on every page reload of my site, I figure. I know that cURL and fopen (I have both working on my server) allow you to grab external web pages, but I don't really know what I would need to do from there.

    So my main questions are: How do I parse the page to keep only what I need? And how can I set PHP to run the fetch only every couple of minutes, rather than on every page load? If anybody could help point me in the right direction to some functions or pages I could use, it would be much appreciated :-).

    -Omar
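
The fetch-and-cache idea described above can be sketched in PHP roughly as follows. This is a minimal sketch, not a finished module: the cache file path, the five-minute TTL, and the timeout value are all illustrative assumptions.

```php
<?php
// Sketch only: fetch a remote page with cURL and keep a local copy
// that is reused until it is older than a few minutes.

function fetch_url($url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);           // illustrative timeout
    $html = curl_exec($ch);
    curl_close($ch);
    return $html === false ? '' : $html;
}

function cached_fetch($url, $cacheFile, $ttlSeconds = 300) {
    // Serve the cached copy if it exists and is younger than the TTL,
    // so the remote site is not hit on every page load.
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttlSeconds) {
        return file_get_contents($cacheFile);
    }
    $html = fetch_url($url);
    if ($html !== '') {
        file_put_contents($cacheFile, $html);
    }
    return $html;
}
```

Checking the cache file's age on each request like this avoids needing a separate scheduled job: the first visitor after the TTL expires pays for the refresh, and everyone else reads the file.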
  • ronverdonk
    Recognized Expert Specialist
    • Jul 2006
    • 4259

    #2
    Many government orgs offer RSS feeds (XML) that you can tap into. Are you sure this cannot be done via an RSS feed from that organization?

    Ronald :cool:
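
If such a feed existed, reading it would be straightforward with PHP's SimpleXML extension. The feed below is invented purely for illustration (the thread later establishes that CBP offers no wait-time feed):

```php
<?php
// Sketch: pull item titles out of an RSS 2.0 feed with SimpleXML.
// The feed contents are made up for illustration.

function rss_item_titles($rssString) {
    $feed = simplexml_load_string($rssString);
    $titles = array();
    foreach ($feed->channel->item as $item) {
        $titles[] = (string) $item->title;
    }
    return $titles;
}

$sample = '<?xml version="1.0"?>
<rss version="2.0"><channel><title>Border Waits</title>
<item><title>Calexico East: 25 min</title></item>
<item><title>Calexico West: 40 min</title></item>
</channel></rss>';
```

Because the feed is already structured XML, there is no scraping step: each `<item>` maps directly to one wait-time entry.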


    • orware
      New Member
      • Oct 2006
      • 3

      #3
      Thanks for the idea! Not that I would have known what to do with the feed, but being XML, I'm sure it would have been easier to parse. I went to the main area of the site and checked out the RSS feeds, but the ones available are not related to the border wait times. Also, the page I referred to in my last post is on a subdomain, apps.cbp.gov, which only hosts one other script (one for airport wait times), with no feeds to go along with them.

      As I continued my research I read an article about output buffering that should prove helpful; however, I still do not know how to parse a page to extract the data I need. Again, any helpful functions you can mention would cut down my search time on php.net :-).

      -Omar
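
The output-buffering technique mentioned above could fit this module like so: let the code print its HTML snippet normally, capture it in a buffer instead of sending it, and save the captured string as the cached fragment. The render function and cache path here are made up for illustration.

```php
<?php
// Sketch of the output-buffering approach: capture whatever the
// module prints, then save that snippet for reuse on later loads.

function render_wait_times($port, $minutes) {
    echo "<p>Wait at $port: $minutes minutes</p>";
}

ob_start();                         // start buffering; echo output is captured
render_wait_times('Calexico East', 25);
$snippet = ob_get_clean();          // buffered output; nothing reached the browser

file_put_contents('/tmp/bwt_snippet.html', $snippet);  // illustrative cache path
```

This pairs naturally with the timed-cache idea: regenerate and re-save the snippet only when the cached file has expired, otherwise just `readfile()` it.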


      • ronverdonk
        Recognized Expert Specialist
        • Jul 2006
        • 4259

        #4
        If you had an RSS feed, you could tap into it, extract the XML-structured info into a buffer, and parse it for the information you are looking for. For now, you'll have to 'scrape' the web page. Have a look at the free Scraper PHP class, which provides scraping functions.

        Its description states:
        Originally posted by scrape class
        This class is meant to fetch remote HTML pages and parse them to extract structured information into arrays.

        It can take a model of the definition of the structure of a given page and process it to clip the relevant fields of information.
        Ronald :cool:
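
As an alternative to a third-party scraper class, the extraction step can also be sketched with PHP's built-in DOM extension. The sample markup below merely imitates a wait-time table; the real page's structure would have to be inspected first, so the XPath expression is an assumption.

```php
<?php
// Sketch: find the table cell naming a port, then read the
// neighbouring cell that holds its wait time.

function port_wait($html, $portName) {
    $doc = new DOMDocument();
    @$doc->loadHTML($html);              // suppress warnings on messy real-world HTML
    $xpath = new DOMXPath($doc);
    // Locate the <td> containing the port name, then its next sibling <td>.
    $cells = $xpath->query("//td[contains(., '$portName')]/following-sibling::td[1]");
    return $cells->length ? trim($cells->item(0)->textContent) : null;
}

$html = '<table>
<tr><td>Calexico East</td><td>25 min</td></tr>
<tr><td>Otay Mesa</td><td>40 min</td></tr>
</table>';

echo port_wait($html, 'Calexico East');   // prints "25 min"
```

Loading the whole document and querying it with XPath tends to survive small layout changes better than string matching, though any scraper breaks if the site is restructured.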


        • orware
          New Member
          • Oct 2006
          • 3

          #5
          Nice...I'll take a look at it and I'll post back :-).

          Thank you!

          -Omar Ramos
