Best tool to convert html into XHTML for XML parsing?

  • Sebastien B.

    Best tool to convert html into XHTML for XML parsing?

    I'm looking for the best tool to convert everyday HTML into proper XHTML
    so that I can parse it as an XML document.

    So far I've been using TidyLib to do this, but it doesn't handle real-world
    markup as gracefully as browsers do. For example, take the page at
    http://mail.yahoo.com: all browsers display it properly, but tidying it up
    with Tidy (using the tool at http://cgi.w3.org/cgi-bin/tidy) gives a
    result that renders quite differently from the original.

    So are there any tools that would let me convert HTML into proper XHTML
    without producing output that renders differently when viewed in a browser
    (i.e. tools that parse the page as a browser would and create proper XHTML
    from that)?

    I'm programming in C, if you need to know.

    Thx,
    Seb



  • hawat.thufir@gmail.com

    #2
    Re: Best tool to convert html into XHTML for XML parsing?

    In Java, JTidy. It's on SourceForge.

    Thufir
