HTML Parser doubt

  • shajias
    New Member
    • Nov 2006
    • 8

    Hi ,
    I am trying to parse HTML data and retrieve the contents. I am facing a problem, which I have explained below.

    I have imported the HTMLParser class and am using the handle_data function. The issue is that the '<' and '>' characters, which are represented as &lt; and &gt;, are getting stripped off.

    For example, if the HTML contains &lt;This&gt; is an example, it should read as <This> is an example. When I parse it, I get only: This is an example.

    i.e. the '<' and '>' got stripped off.

    Please help
  • shajias
    New Member
    • Nov 2006
    • 8

    #2
    Hi ,
    Does anyone have any clue about this one? I need this info very urgently. :-(

    • bartonc
      Recognized Expert
      • Sep 2006
      • 6478

      #3
      Originally posted by shajias
      Hi ,
      Does anyone have any clue about this one? I need this info very urgently. :-(
      I don't do HTML, but have you tried "<this> is an example"?

      • shajias
        New Member
        • Nov 2006
        • 8

        #4
        Putting it in quotes won't work. This script is a generic one that takes many files as input. The issue is that whenever &lt; and &gt; (which stand for < and >) appear, the handle_data function in the HTMLParser class strips those '<' and '>' characters.
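
One likely explanation for the behaviour described above, offered as a sketch rather than a confirmed diagnosis: HTMLParser does not hand entity references such as &lt; and &gt; to handle_data. It reports them through the separate handle_entityref (named references) and handle_charref (numeric references) callbacks, so a subclass that overrides only handle_data never sees them. The example below assumes Python 3's html.parser module and passes convert_charrefs=False so that those callbacks fire; the older Python 2 HTMLParser module used the same callbacks, but this code would need adapting to run there. The names TextCollector and extract_text are made up for illustration.

```python
from html.entities import name2codepoint
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: collects text and keeps decoded entities."""

    def __init__(self):
        # Assumption: Python 3. convert_charrefs=False makes the parser call
        # handle_entityref/handle_charref instead of decoding by itself.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def handle_data(self, data):
        # Plain text between tags and references.
        self.parts.append(data)

    def handle_entityref(self, name):
        # Named reference such as &lt; (name == "lt").
        codepoint = name2codepoint.get(name)
        if codepoint is None:
            # Unknown name: keep the reference as written.
            self.parts.append("&%s;" % name)
        else:
            self.parts.append(chr(codepoint))

    def handle_charref(self, name):
        # Numeric reference such as &#60; ("60") or &#x3c; ("x3c").
        try:
            if name.lower().startswith("x"):
                self.parts.append(chr(int(name[1:], 16)))
            else:
                self.parts.append(chr(int(name)))
        except (ValueError, OverflowError):
            # Malformed or out-of-range number: keep the reference as written.
            self.parts.append("&#%s;" % name)

    def text(self):
        return "".join(self.parts)


def extract_text(markup):
    """Return the text content of markup with references decoded."""
    parser = TextCollector()
    parser.feed(markup)
    parser.close()
    return parser.text()


print(extract_text("<p>&lt;This&gt; is an example</p>"))
# prints: <This> is an example
```

In Python 3.5 and later, leaving convert_charrefs at its default of True makes handle_data receive already-decoded text, which may be simpler when no per-entity handling is needed.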
