Functionality similar to PHP's SimpleXML?

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Phillip B Oldham

    Functionality similar to PHP's SimpleXML?

    I'm sure I'll soon figure out how to find these things out for myself,
    but I'd like to get the community's advice on something.

    I'm going to throw together a quick project over the weekend: a
    spider. I want to scan a website for certain elements.

    I come from a PHP background, so normally I'd:
    - throw together a quick REST script to handle http request/responses
    - load the pages into a simplexml object and
    - run an xpath over the dom to find the nodes I need to test

    One of the benefits of PHP's dom implementation is that you can easily
    load both XML and HTML4 documents - the HTML gets normalised to XML
    during the import.

    So, my questions are:

    Is there a python module to easily handle http request/responses?

    Is there a python dom module that works similar to php's when working
    with older html?

    What python module would I use to apply an XPath expression over a dom
    and return the results?

  • Stefan Behnel

    #2
    Re: Functionality similar to PHP's SimpleXML?

    Phillip B Oldham wrote:
    I'm going to throw together a quick project over the weekend: a
    spider. I want to scan a website for certain elements.
    >
    I come from a PHP background, so normally I'd:
    - throw together a quick REST script to handle http request/responses
    Use the urllib/urllib2 module in the stdlib for GET/POST with parameters or
    lxml for simple page requests.

    - load the pages into a simplexml object and
    - run an xpath over the dom to find the nodes I need to test
    Use lxml.



    Stefan

    Comment

    Working...