Opera guesses encoding for "application/xml"

  • Christoph Schneegans

    Opera guesses encoding for "application/xml"

    Hi!

    Okay, so positions on "text/html" XHTML are totally contradictory. Anyway!
    I hope there's more consensus about "application/xml" XHTML.

    I've recently learned that Opera 9.0b2 evaluates not only the HTTP header,
    BOM and XML declaration to determine the character encoding of an XHTML
    document sent as "application/xml", but also the "meta" element. For
    example, <http://schneegans.de/sv/test-cases/?case=meta-only-encoding> is
    rendered as "Česká republika". In contrast, Firefox displays "?esk?
    republika", and IE even aborts parsing.
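
A minimal sketch of the lookup order at issue, assuming an XML processor that follows RFC 3023 for "application/xml": the HTTP charset parameter wins, then a BOM, then the XML declaration, and otherwise UTF-8. The name `sniff_xml_encoding` and the regular expression are illustrative assumptions, not any browser's code, and the declaration check only handles ASCII-compatible encodings. The point is that a "meta" element never enters the decision.

```python
import codecs
import re

# UTF-32 marks are listed before UTF-16 because the UTF-32-LE mark
# starts with the same two bytes as the UTF-16-LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Simplified pattern for the encoding pseudo-attribute of an XML declaration.
_XML_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_xml_encoding(body, http_charset=None):
    """Pick an encoding for XML bytes; a <meta> element is never consulted."""
    if http_charset:                      # 1. charset parameter of Content-Type
        return http_charset.lower()
    for bom, name in _BOMS:               # 2. byte order mark
        if body.startswith(bom):
            return name
    match = _XML_DECL.match(body)         # 3. XML declaration
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"                        # 4. default for XML without other hints
```

Under these rules a document that names its encoding only in a "meta" element falls through to UTF-8, which would be consistent with the replacement characters Firefox shows for the test case.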

    If you agree with me and think that this behavior is wrong, you might
    want to post a follow-up to <news:op.tatddzel952okdpkn@news.opera.com>.

    --
    All free men, wherever they may live, are citizens of Denmark. And
    therefore, as a free man, I take pride in the words "Jeg er dansker!"
  • Spartanicus

    #2
    Re: Opera guesses encoding for "application/xml"

    Christoph Schneegans <Christoph@Schneegans.de> wrote:

    >Okay, so positions on "text/html" XHTML are totally contradictory. Anyway!

    I've yet to see anyone successfully defend a single argument for sending
    XHTML as text/html. The consensus is that Appendix C was a mistake.

    >I hope there's more consensus about "application/xml" XHTML.

    XHTML should be served as application/xhtml+xml.

    --
    Spartanicus

    • Garmt de Vries

      #3
      Re: Opera guesses encoding for "application/xml"

      On Fri, 09 Jun 2006 00:24:19 +0200, Christoph Schneegans
      <Christoph@Schneegans.de> wrote:

      > I've recently learned that Opera 9.0b2 evaluates not only the HTTP
      > header, BOM and XML declaration to determine the character encoding of
      > an XHTML document sent as "application/xml", but also the "meta"
      > element. For example,
      > <http://schneegans.de/sv/test-cases/?case=meta-only-encoding> is
      > rendered as "Česká republika". In contrast, Firefox displays "?esk?
      > republika", and IE even aborts parsing.

      When I go to that page, I see:

      Encoding from server (used by Opera):
      iso-8859-2 (iso-8859-2)

      Doesn't really look like "meta only"...

      --
      Garmt de Vries

      • Christoph Schneegans

        #4
        Re: Opera guesses encoding for "application/xml"

        "Spartanicus" wrote:

        > I've yet to see anyone successfully defend a single argument for
        > sending XHTML as text/html.

        For XHTML? More powerful means for validation, simpler syntax. For
        text/html? IE wouldn't support it otherwise.

        > The consensus is that Appendix C was a mistake.

        Your consensus.

        >> I hope there's more consensus about "application/xml" XHTML.
        >
        > XHTML should be served as application/xhtml+xml.

        That wouldn't change anything, and I guess you know that.

        --
        All free men, wherever they may live, are citizens of Denmark. And
        therefore, as a free man, I take pride in the words "Jeg er dansker!"

        • Christoph Schneegans

          #5
          Re: Opera guesses encoding for "application/xml"

          Garmt de Vries wrote:

          >> <http://schneegans.de/sv/test-cases/?case=meta-only-encoding>
          >
          > When I go to that page, I see:
          >
          > Encoding from server (used by Opera):
          > iso-8859-2 (iso-8859-2)
          >
          > Doesn't really look like "meta only"...

          <http://web-sniffer.net/?url=http://schneegans.de/sv/test-cases/%3Fcase=meta-only-encoding>

          When still in doubt, use Telnet.
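
The Telnet check can be approximated in a few lines. This is a sketch under assumptions: `fetch_raw_headers` and `content_type_of` are hypothetical helper names, and the request is a bare HTTP/1.0 HEAD over a plain socket, so no client library gets a chance to interpret or rewrite the response.

```python
import socket


def fetch_raw_headers(host, path, port=80, timeout=10.0):
    """Send a bare HTTP/1.0 HEAD request and return the undecoded header block."""
    request = f"HEAD {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(request)
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:          # server closed the connection
                break
            chunks.append(data)
    # Keep only the part before the blank line that ends the headers.
    return b"".join(chunks).split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")


def content_type_of(raw_headers):
    """Return the Content-Type field value from a raw header block, or None."""
    for line in raw_headers.split("\r\n")[1:]:   # skip the status line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "content-type":
            return value.strip()
    return None
```

Calling `content_type_of(fetch_raw_headers("schneegans.de", "/sv/test-cases/?case=meta-only-encoding"))` would show whether the server sends a charset parameter at all, provided the test case is still online.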

          --
          All free men, wherever they may live, are citizens of Denmark. And
          therefore, as a free man, I take pride in the words "Jeg er dansker!"

          • Spartanicus

            #6
            Re: Opera guesses encoding for "application/xml"

            Christoph Schneegans <Christoph@Schneegans.de> wrote:

            >> I've yet to see anyone successfully defend a single argument for
            >> sending XHTML as text/html.
            >
            >For XHTML? More powerful means for validation,

            Yawn, we've disproved that plenty of times
            http://www.spartanicus.utvinternet.ie/no-xhtml.htm

            >simpler syntax.

            Nonsense.

            >For text/html? IE wouldn't support it otherwise.

            IE doesn't support XHTML period, falsely labeling it as text/html merely
            prevents that from being demonstrated more clearly.

            >> The consensus is that Appendix C was a mistake.
            >
            >Your consensus.

            The consensus amongst the members of this group. To date no-one has
            managed to uphold an argument for serving XHTML as text/html, feel free
            to confirm this via the archives.

            >>> I hope there's more consensus about "application/xml" XHTML.
            >>
            >> XHTML should be served as application/xhtml+xml.
            >
            >That wouldn't change anything, and I guess you know that.

            You professed a hope for "consensus" about serving XHTML as
            application/xml. There is broad consensus, but it is to serve XHTML
            as application/xhtml+xml as per w3c's recommendation:
            http://www.w3.org/TR/xhtml-media-types/

            Specifically:

            3.3. 'application/xml'

            The 'application/xml' media type [RFC3023] is a generic media type for
            XML documents, and the definition of 'application/xml' does not preclude
            serving XHTML documents as that media type. Any XHTML Family document
            MAY be served as 'application/xml'.

            However, authors should be aware that such a document may not always be
            processed as XHTML (e.g. hyperlinks may not be recognized), depending on
            user agents. Generic XML processors might recognize it as just an XML
            document which includes elements and attributes from the XHTML namespace
            (and others), and may not have a priori knowledge what to do with such a
            document beyond they can do for generic XML documents.
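
To make the quoted caveat concrete: with "application/xml", a generic processor has no media type telling it the document is XHTML, so the namespace of the root element is the only cue left. The helper below is a hypothetical sketch of such a check (the name `root_is_xhtml` is an assumption), not a description of how any particular user agent decides.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def root_is_xhtml(document):
    """True if the text is well-formed XML whose root is <html> in the XHTML namespace."""
    try:
        root = ET.fromstring(document)
    except ET.ParseError:
        return False          # not even well-formed XML
    # ElementTree spells namespaced tags as "{namespace}localname".
    return root.tag == f"{{{XHTML_NS}}}html"
```

A processor that skips this kind of check would treat the document as plain XML, which is the situation the note warns about.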

            --
            Spartanicus

            • Andreas Prilop

              #7
              Re: Opera guesses encoding for "application/xml"

              On Fri, 9 Jun 2006, Spartanicus wrote:

              > IE doesn't support XHTML period, falsely labeling it as text/html merely
              > prevents that from being demonstrated more clearly.

              What does "support" or "doesn't support" mean?





              are identical resources - only the URL is different. Yet IE 6
              behaves differently because one of the URLs ends in ".html".
              *.x.html is displayed, *.xhtml is offered for download.
              Silly IE!

              • Spartanicus

                #8
                Re: Opera guesses encoding for "application/xml"

                Andreas Prilop <nhtcapri@rrzn-user.uni-hannover.de> wrote:

                >> IE doesn't support XHTML period, falsely labeling it as text/html merely
                >> prevents that from being demonstrated more clearly.
                >
                >What does "support" or "doesn't support" mean?

                Parse it as XHTML, not HTML.

                --
                Spartanicus

                • Christoph Schneegans

                  #9
                  Re: Opera guesses encoding for "application/xml"

                  "Spartanicus" wrote:

                  >> For XHTML? More powerful means for validation,
                  >
                  > Yawn, we've disproved that plenty of times
                  > http://www.spartanicus.utvinternet.ie/no-xhtml.htm

                  You surely intended to point to
                  <http://www.spartanicus.utvinternet.ie/custom_dtd.htm>. I'd like to know
                  how you want to spot improper or error-prone markup such as

                  <p title=""style=""></p>

                  <!------><hr><!------>

                  <span lang="klingon"> ...</span>

                  with this custom DTD.

                  >> simpler syntax.
                  >
                  > Nonsense.

                  Yeah,

                  <p ltr<span></span</p>

                  is obviously simpler syntax than

                  <p dir="ltr"><span></span></p>

                  because it's shorter.

                  > IE doesn't support XHTML period,

                  It doesn't support HTML either, see e.g.
                  <http://schneegans.de/web/xhtml/shorttag/>. Now I want you to present an
                  XHTML 1.0 document that conforms to Appendix C and is not supported by IE.

                  > You professed a hope for "consensus" about serving XHTML as
                  > application/xml. There is broad consensus, but it is to serve
                  > XHTML as application/xhtml+xml as per w3c's recommendation:

                  Just show me a user agent that supports "application/xhtml+xml" but does
                  not support "application/xml".

                  > http://www.w3.org/TR/xhtml-media-types/

                  Did you really overlook
                  <http://www.w3.org/TR/xhtml-media-types/#text-html>?

                  --
                  All free men, wherever they may live, are citizens of Denmark. And
                  therefore, as a free man, I take pride in the words "Jeg er dansker!"

                  • Spartanicus

                    #10
                    Re: Opera guesses encoding for "application/xml"

                    Christoph Schneegans <Christoph@Schneegans.de> wrote:

                    >>> For XHTML? More powerful means for validation,
                    >>
                    >> Yawn, we've disproved that plenty of times
                    >> http://www.spartanicus.utvinternet.ie/no-xhtml.htm
                    >
                    >You surely intended to point to
                    ><http://www.spartanicus.utvinternet.ie/custom_dtd.htm>.

                    A misconception like "XHTML is stricter" is common. If you are not
                    specific and only say "More powerful means for validation", then all I
                    can do is point to the main page that includes a refutation of this
                    most common misconception.

                    >I'd like to know
                    >how you want to spot improper or error-prone markup such as
                    >
                    > <p title=""style=""></p>
                    >
                    > <!------><hr><!------>
                    >
                    > <span lang="klingon"> ...</span>
                    >
                    >with this custom DTD.

                    All additional XHTML constraints can be emulated for HTML validation.
                    The resource you quoted makes no claim that it emulates all these
                    additional constraints, it merely demonstrates how it can be done using
                    a common subset.

                    Not all DTD validator checkable constraints are governed by the DTD.
                    This is noted on the quoted resource, and it links to an explanation of
                    how to change non DTD constraints, again using a common example.

                    That said, the practical value of being able to machine check even for
                    all additional constraints that are part of XHTML 1.x (some of which are
                    only part of XHTML 1.1 such as your "<span lang="klingon"> ...</span>"
                    example) is nil as long as the result is parsed by HTML clients.
                    Ultimately this invalidates all potential arguments in favour of XHTML.

                    >>> simpler syntax.
                    >>
                    >> Nonsense.
                    >
                    >Yeah,

                    Again, if you want to make a point you need to be specific. Loose
                    unqualified remarks such as "simpler syntax" don't allow for a proper
                    response.

                    > <p ltr<span></span</p>
                    >
                    >is obviously simpler syntax than
                    >
                    > <p dir="ltr"><span></span></p>
                    >
                    >because it's shorter.

                    You've lost me, are you suggesting that "<p ltr<span></span</p>" is
                    proper syntax and/or valid under XHTML?

                    >> IE doesn't support XHTML period,
                    >
                    >It doesn't support HTML either, see e.g.
                    ><http://schneegans.de/web/xhtml/shorttag/>.

                    Other flaws do not form an argument for a claim that IE supports XHTML.

                    >Now I want you to present an
                    >XHTML 1.0 document that conforms to Appendix C and is not supported by IE.
                    >
                    >> You professed a hope for "consensus" about serving XHTML as
                    >> application/xml. There is broad consensus, but it is to serve
                    >> XHTML as application/xhtml+xml as per w3c's recommendation:
                    >
                    >Just show me a user agent that supports "application/xhtml+xml" but does
                    >not support "application/xml".

                    You are avoiding the point that, contrary to your claim that the
                    media type used makes no difference, a document served as
                    application/xml may not be recognized as XHTML.

                    >> http://www.w3.org/TR/xhtml-media-types/
                    >
                    >Did you really overlook
                    ><http://www.w3.org/TR/xhtml-media-types/#text-html>?

                    The consensus I referred to pertained to this group, it has rejected
                    serving XHTML as text/html in favour of the view that if XHTML is to be
                    used at all then it should be served as application/xhtml+xml.

                    --
                    Spartanicus

                    • Christoph Schneegans

                      #11
                      Re: Opera guesses encoding for "application/xml"

                      "Spartanicus" wrote:

                      > If you are not specific and only say "More powerful means for
                      > validation", then all I can do is point to the main page that
                      > includes a refutation of this most common misconception.

                      Do you dispute the fact that XML Schema validation is more powerful than
                      DTD validation?

                      > All additional XHTML constraints can be emulated for HTML validation.

                      Using freaky regular expressions?

                      > This is noted on the quoted resource, and it links to an explanation of
                      > how to change non DTD constraints, again using a common example.

                      Most authors don't want to create a custom DTD or a custom SGML
                      declaration. As long as there's no "strict" HTML validation service
                      available on the web, your remarks have no practical implications.

                      > That said, the practical value of being able to machine check even for
                      > all additional constraints that are part of XHTML 1.x (some of which are
                      > only part of XHTML 1.1 such as your "<span lang="klingon"> ...</span>"
                      > example) is nil as long as the result is parsed by HTML clients.

                      Thank you. Appendix C documents are parsed by HTML clients as well.

                      > Loose unqualified remarks such as "simpler syntax" don't allow for a
                      > proper response.

                      Do you dispute the fact that XML syntax is simpler than SGML syntax?

                      > You've lost me, are you suggesting that "<p ltr<span></span</p>" is
                      > proper syntax and/or valid under XHTML?

                      '<p ltr<span></span</p>' is valid HTML, '<p dir="ltr"><span></span></p>'
                      is the corresponding XHTML syntax. So which one is simpler IYO?

                      > Other flaws do not form an argument for a claim that IE supports XHTML.

                      Nobody in this thread claims that IE supports XHTML.

                      Do you dispute the fact that IE supports neither XHTML nor HTML?

                      >> Now I want you to present an XHTML 1.0 document that conforms to
                      >> Appendix C and is not supported by IE.

                      You forgot to answer this one.

                      > You are avoiding the point that, contrary to your claim that the
                      > media type used makes no difference, a document served as
                      > application/xml may not be recognized as XHTML.

                      That's what the W3C says. It does not happen nevertheless.

                      --
                      All free men, wherever they may live, are citizens of Denmark. And
                      therefore, as a free man, I take pride in the words "Jeg er dansker!"

                      • Eric B. Bednarz

                        #12
                        Re: Opera guesses encoding for "application/xml"

                        Christoph Schneegans <Christoph@Schneegans.de> writes:

                        > Most authors don't want to create a custom DTD or a custom SGML
                        > declaration. As long as there's no "strict" HTML validation service
                        > available on the web, your remarks have no practical implications.

                        Authors don't necessarily have to *create* them themselves. For the
                        creation of static files the idea of online validation is totally lost
                        on me anyway. That should and easily can be part of the local authoring
                        process, whatever way you like it.

                        I know that you are also aware of the inherent SGML syntax features that
                        cannot be made XML(or even HTML UA)-compatible with any custom SGML
                        declaration, and I'm surprised that you didn't mention them :)

                        (For dynamic content -- any CMS of sorts -- I'd actually prefer XHTML
                        syntax, since even lowest-end Linux hosting usually comes with PHP
                        bundled with the expat library, and _trivially_ allows for verifying
                        fully-tagged input and some kind of homegrown content model restraints.)
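
A rough illustration of that idea, using Python's expat binding as a stand-in for the PHP one mentioned above. Everything here is an assumption for the sake of the example: the `ALLOWED_CHILDREN` table is a deliberately tiny, homegrown content model, and `check_fragment` is a hypothetical helper name.

```python
import xml.parsers.expat

# Hypothetical content model: parent element -> child elements it may contain.
ALLOWED_CHILDREN = {
    "div": {"p", "ul"},
    "ul": {"li"},
    "li": {"p", "em", "strong"},
    "p": {"em", "strong", "br"},
    "em": set(),
    "strong": set(),
    "br": set(),
}


def check_fragment(fragment, root="div"):
    """Return a list of problems in an XML-syntax fragment (empty if none)."""
    problems = []
    stack = []                      # currently open elements

    def start(name, attrs):
        if not stack:
            if name != root:
                problems.append(f"root must be <{root}>, found <{name}>")
        elif name not in ALLOWED_CHILDREN.get(stack[-1], set()):
            problems.append(f"<{name}> not allowed inside <{stack[-1]}>")
        stack.append(name)

    def end(name):
        stack.pop()

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    try:
        parser.Parse(fragment, True)   # True marks the final chunk of input
    except xml.parsers.expat.ExpatError as exc:
        problems.append(f"not well-formed: {exc}")
    return problems
```

Because expat stops at the first well-formedness error, missing end tags are caught for free, and the start-element handler only has to enforce nesting rules.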


                        As a personal note, I am a bit flabbergasted by the origin of discussion
                        at large. Any service that doesn't advertise itself inappropriately is
                        useful within the bounds of its documentation. While I actually do
                        believe that Christoph's service is not about to hit (and educate) its
                        target audience at all, the W3C markup validation service (in the SGML
                        sense, I won't even mention X(HT)ML) is deliberately aiming at the
                        clue-underprivileged and seldom criticised as such over here.


                        --
                        ||| hexadecimal EBB
                        o-o decimal 3771
                        --oOo--( )--oOo-- octal 7273
                        205 goodbye binary 111010111011

                        • Jim Moe

                          #13
                          Re: Opera guesses encoding for "application/xml"

                          Christoph Schneegans wrote:

                          > '<p ltr<span></span</p>' is valid HTML, [...]

                          HTML, yes. XHTML, no.
                          It is, however, meaningless. A parser ignores all the mystery components
                          and it reduces to '<p>'.

                          --
                          jmm (hyphen) list (at) sohnen-moe (dot) com
                          (Remove .AXSPAMGN for email)

                          • Christoph Schneegans

                            #14
                            Re: Opera guesses encoding for "application/xml"

                            Jim Moe wrote:

                            >> '<p ltr<span></span</p>' is valid HTML, [...]
                            >
                            > It is, however, meaningless.

                            It is exactly equivalent to '<p dir="ltr"><span></span></p>' in HTML. If
                            you don't believe it, feed <http://validator.w3.org/fragment-upload.html>
                            with

                            <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
                            <title></title>
                            <p ltr<span></span</p>

                            and

                            <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
                            <title></title>
                            <p dir="ltr"><span></span></p>

                            and enable the "Show Parse Tree" option.

                            > A parser ignores all the mystery components and it reduces to '<p>'.

                            That depends on the parser.

                            --
                            All free men, wherever they may live, are citizens of Denmark. And
                            therefore, as a free man, I take pride in the words "Jeg er dansker!"

                            • Spartanicus

                              #15
                              Re: Opera guesses encoding for "application/xml"

                              Christoph Schneegans <Christoph@Schneegans.de> wrote:

                              >> If you are not specific and only say "More powerful means for
                              >> validation", then all I can do is point to the main page that
                              >> includes a refutation of this most common misconception.
                              >
                              >Do you dispute the fact that XML Schema validation is more powerful than
                              >DTD validation?

                              Before changing to another topic, first let's finish off your original
                              claim; do you agree that XHTML DTD based validation is not "more
                              powerful" than HTML DTD validation?

                              >> All additional XHTML constraints can be emulated for HTML validation.
                              >
                              >Using freaky regular expressions?

                              By editing the files used by the validator, this doesn't involve regular
                              expressions.

                              >> This is noted on the quoted resource, and it links to an explanation of
                              >> how to change non DTD constraints, again using a common example.
                              >
                              >Most authors don't want to create a custom DTD or a custom SGML
                              >declaration.

                              No-one has anything to gain from checking text/html documents for the
                              extra constraints that XHTML requires, by definition this includes "most
                              authors".

                              >As long as there's no "strict" HTML validation service
                              >available on the web, your remarks have no practical implications.

                              If you want to claim practical relevance then you need to acknowledge
                              the at best very limited value of validation on the whole, and the fact
                              that checking for the extra constraints has no benefit at all for
                              content served as text/html.

                              That said, I'm aware of at least one online HTML validator that has an
                              option to impose additional constraints (Nick Kew's Page Valet).

                              >> That said, the practical value of being able to machine check even for
                              >> all additional constraints that are part of XHTML 1.x (some of which are
                              >> only part of XHTML 1.1 such as your "<span lang="klingon"> ...</span>"
                              >> example) is nil as long as the result is parsed by HTML clients.
                              >
                              >Thank you. Appendix C documents are parsed by HTML clients as well.

                              So do you now concede that there is no basis whatsoever for your
                              suggestion that checking text/html documents for the additional XHTML
                              constraints has any practical relevance or value?

                              >> Loose unqualified remarks such as "simpler syntax" don't allow for a
                              >> proper response.
                              >
                              >Do you dispute the fact that XML syntax is simpler than SGML syntax?

                              Again before having another change of subject, I'm still trying to find
                              out what you meant by your original claim that XHTML syntax is simpler
                              than HTML syntax.

                              >> You've lost me, are you suggesting that "<p ltr<span></span</p>" is
                              >> proper syntax and/or valid under XHTML?
                              >
                              >'<p ltr<span></span</p>' is valid HTML, '<p dir="ltr"><span></span></p>'
                              >is the corresponding XHTML syntax. So which one is simpler IYO?

                              What you refer to as "valid HTML" is an error that isn't picked up by a
                              validator using the public DTD. This does not form an argument to
                              declare XHTML "simpler" (strange way to raise a point about a certain
                              HTML error being missed). If the fact that this error isn't flagged
                              under the public DTD bothers you, it is easily fixed:
                              "<p ltr<span></span</p>" doesn't validate in my HTML validation process.

                              >> Other flaws do not form an argument for a claim that IE supports XHTML.
                              >
                              >Nobody in this thread claims that IE supports XHTML.

                              You have a short or selective memory:

                              >>>For XHTML? More powerful means for validation, simpler syntax. For
                              >>>text/html? IE wouldn't support it otherwise.

                              >Do you dispute the fact that IE supports neither XHTML nor HTML?

                              Again: other flaws such as possibly not parsing a certain HTML construct
                              correctly do not form an argument for your claim that IE supports XHTML.

                              >>> Now I want you to present an XHTML 1.0 document that conforms to
                              >>> Appendix C and is not supported by IE.
                              >
                              >You forgot to answer this one.

                              Again: IE does not support XHTML at all. Like almost all other HTML
                              parsers IE's error recovery mechanism allows XHTML served as text/html
                              to be parsed as pseudo HTML without necessarily causing problems. This
                              does not demonstrate "support", it merely demonstrates error recovery at
                              work when parsing tag soup.
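
The difference between parsing as XML and recovering as HTML can be shown with two standard-library parsers. This is only an illustration under assumptions: Python's `html.parser` stands in for a forgiving HTML tokenizer and is not IE's parser, and `TagCollector` and `parse_both_ways` are hypothetical names.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record start tags; the HTML tokenizer does not stop on malformed input."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parse_both_ways(markup):
    """Return (xml_error_or_None, start_tags_seen_by_the_html_parser)."""
    try:
        ET.fromstring(markup)          # strict: any well-formedness error is fatal
        xml_error = None
    except ET.ParseError as exc:
        xml_error = str(exc)
    collector = TagCollector()
    collector.feed(markup)             # lenient: keeps going regardless
    collector.close()
    return xml_error, collector.tags
```

Feeding it `<p>one<br>two</p>` yields an XML error but the same tag list the HTML side reports for `<p>one<br />two</p>`, which is the sense in which a lenient parser accepting a document says nothing about XHTML support.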

                              >> You are avoiding the point that, contrary to your claim that the
                              >> media type used makes no difference, a document served as
                              >> application/xml may not be recognized as XHTML.
                              >
                              >That's what the W3C says. It does not happen nevertheless.

                              You mean that you haven't seen it happen. Excuse me for attaching little
                              to no value to that.

                              --
                              Spartanicus
