Getting the <!DOCTYPE> from an XHTML document that was parsed using SAX - possible?

This topic is closed.
  • Joshua Beall

    Getting the <!DOCTYPE> from an XHTML document that was parsed using SAX - possible?

    Hi All,

    I have been using the SAX library in PHP to parse XHTML documents, and one
    thing I have noted is that the <!DOCTYPE> line is ignored.

    I am wondering whether there is any way to get the <!DOCTYPE> using the SAX
    functions in PHP? I am looking over the manual, but nothing is jumping out
    at me...

    reference: http://us4.php.net/manual/en/ref.xml.php

    I have thought about loading it using DOM, but I'd rather not consume the
    memory if possible. Another option would be to just use simple string
    parsing methods to pull it out of the original document, but again I am
    hoping that I would be able to do it somehow using the SAX functions... any
    chance of this?

    Sincerely,
    -Josh


  • Ashmodai

    #2
    Re: Getting the <!DOCTYPE> from an XHTML document that was parsed using SAX - possible?

    Joshua Beall scribbled something along the lines of:
    > I have been using the SAX library in PHP to parse XHTML documents, and one
    > thing I have noted is that the <!DOCTYPE> line is ignored.
    >
    > I am wondering is there any way to get the <!DOCTYPE> using the SAX
    > functions in PHP? I am looking over the manual, but nothing is jumping out
    > at me...
    >
    > reference: http://us4.php.net/manual/en/ref.xml.php

    http://us4.php.net/manual/en/functio...lt-handler.php

    The default handler handles "the XML declaration, document type
    declaration, entities or other data for which no other handler exists".

    Sometimes reading the function descriptions helps, y'know ;)
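
For anyone who wants to check this on their own setup, here is a minimal sketch (not from the original posts) that registers a default handler with `xml_set_default_handler()` and records every chunk it receives. The sample document, the `$unhandled` variable, and the closure are illustrative assumptions. Whether the DOCTYPE actually shows up appears to depend on the PHP version and the underlying XML library, so the script reports either outcome rather than assuming one.

```php
<?php
// Sample XHTML document (illustrative only).
$xhtml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Test</title></head>
<body><!-- a comment --><p>Hello</p></body>
</html>
XML;

// Every chunk passed to the default handler ends up here.
$unhandled = [];

$parser = xml_parser_create();

// No other handlers are registered, so anything the parser chooses to
// report goes through the default handler.
xml_set_default_handler($parser, function ($parser, $data) use (&$unhandled) {
    $unhandled[] = $data;
});

// Third argument true: this is the final (and only) piece of data.
$ok = xml_parse($parser, $xhtml, true);

// Look for a chunk that starts with the document type declaration.
$doctype = null;
foreach ($unhandled as $chunk) {
    if (strncmp(ltrim($chunk), '<!DOCTYPE', 9) === 0) {
        $doctype = $chunk;
        break;
    }
}

if ($doctype === null) {
    echo "Default handler did not receive a DOCTYPE on this build.\n";
} else {
    echo "Default handler received: ", $doctype, "\n";
}
```

If the first message is printed, the parser on that build is not passing the declaration to the default handler, and it has to be obtained some other way.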


    --
    Ashmo


    • Joshua Beall

      #3
      Re: Getting the <!DOCTYPE> from an XHTML document that was parsed using SAX - possible?

      "Ashmodai" <ashmodai@mushroom-cloud.com> wrote in message
      news:cth4kq$n5n$03$1@news.t-online.com...
      > Joshua Beall scribbled something along the lines of:
      >> I have been using the SAX library in PHP to parse XHTML documents, and
      >> one thing I have noted is that the <!DOCTYPE> line is ignored.
      >
      > http://us4.php.net/manual/en/functio...lt-handler.php
      >
      > The default handler handles "the XML declaration, document type
      > declaration, entities or other data for which no other handler exists".
      >
      > Sometimes reading the function descriptions helps, y'know ;)

      Unfortunately this function still ignores the <!DOCTYPE> declaration, despite
      claims to the contrary in the manual. Works fine for comments and entities,
      though. Might be a PHP bug; I'm running 5.0.3.

      I've resorted to some simple string manipulation to get the pieces that SAX
      won't let me at. When I have the time, though, I'll work up a short code
      example and post a bug report to php.net.
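
The string-manipulation workaround mentioned above could look roughly like the following sketch. This is not the poster's actual code: `extract_doctype()` and its array keys are hypothetical names, and the pattern assumes a typical XHTML-style declaration with double-quoted identifiers and no internal subset (`[...]`). It will also be fooled by a DOCTYPE that appears inside a comment earlier in the input.

```php
<?php
/**
 * Hypothetical helper: find the first <!DOCTYPE ...> declaration in a string.
 * Returns null when none is found.
 */
function extract_doctype(string $source): ?array
{
    // Root element name, then an optional PUBLIC id pair, then anything up to '>'.
    $pattern = '/<!DOCTYPE\s+([^\s>]+)(?:\s+PUBLIC\s+"([^"]*)"\s+"([^"]*)")?[^>]*>/';

    if (preg_match($pattern, $source, $m) !== 1) {
        return null;
    }

    return [
        'raw'      => $m[0],          // the whole declaration as written
        'root'     => $m[1],          // e.g. "html"
        'publicId' => $m[2] ?? null,  // null when there is no PUBLIC part
        'systemId' => $m[3] ?? null,
    ];
}

// Usage with an illustrative document.
$sample = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>t</title></head><body/></html>
XML;

$info = extract_doctype($sample);
var_dump($info);
```

Because the declaration has to come before the root element, only the first part of the file needs to be read and passed in. That keeps memory use small, which was the reason for avoiding DOM in the first place.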

      -jb

