xhtml vs html 4 strict

  • RobG

    #31
    Re: xhtml vs html 4 strict

    Spartanicus wrote:
    > "Philipp Lenssen" <info@outer-court.com> wrote:
    >
    > [XHTML]
    >
    >>>>It's an option if you want to force yourself or a team to be
    >>>>case-sensitive and to always close tags.
    >>>
    >>>XHTML documents must use lower case for all HTML element and attribute
    >>>names. That's not the same as being case sensitive.
    >>
    >>XHTML is case-sensitive for elements. HTML4 is not. How is that not
    >>being case-sensitive?
    >
    > Because upper case is invalid in XHTML element and attribute names.
    > In Javascript function names are case sensitive, Foobar!=fooBar, but
    > both are valid.

    Your opinion that XHTML is case-insensitive is in direct opposition to
    the view of the W3C. Read section 4.2 of the XHTML 1.0 specification:

    "XHTML documents must use lower case for all HTML element and
    attribute names. This difference is necessary because XML is
    case-sensitive..."

    <URL:http://www.w3.org/TR/xhtml1/>
    >>>Using XHTML doesn't force anything on anyone. Validating against an
    >>>XHTML DTD will fail if elements and/or attribute names are not lower
    >>>case, and if elements are not closed. This is not particular to XHTML
    >>>DTDs.
    >>
    >>But in HTML4, you may leave certain elements open. This is particular
    >>to XHTML/XML then.
    >
    > Only for empty elements, which unlike closing non empty elements has no
    > relevance for a parser. In fact a conforming HTML parser would choke on
    > XHTML.

    Of course non-empty elements with optional closing tags, such as <P>,
    are conveniently ignored. They are perfectly valid in HTML 4.01, but
    not XHTML, which is the point Philipp was making.


    --
    Rob


    • Jan Roland Eriksson

      #32
      Re: xhtml vs html 4 strict

      On 01 Jun 2005 15:27:58 GMT, "Philipp Lenssen" <info@outer-court.com>
      wrote:
      >XHTML is case-sensitive for elements. HTML4 is not. How is that not
      >being case-sensitive?

      It is not a question of being "case sensitive"; that kind of wording does
      not even exist in the "Handbook".

      The true thing is all about "case folding", where a true validating piece
      of software looks into the SGML declaration for the doc instance in
      process to find out which characters need to be "folded" into
      some other characters.

      In the traditional "SGML Concrete Syntax" that has always been taken to
      mean e.g. 'a' folds to 'A' but SGML does not stipulate requirements for
      that. You are allowed to fold anything into anything.
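
A minimal Python sketch of the folding idea described above, taking that description at face value. The tables `UPPER_FOLD` and `ODD_FOLD` and the helpers `fold_name` and `same_name` are made up for illustration; they are not SGML declaration syntax and not any parser's API.

```python
import string

# Conventional folding: each lower-case ASCII letter maps to its upper-case form.
UPPER_FOLD = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)

# An arbitrary remapping (hypothetical): the table need not be about letter case.
ODD_FOLD = str.maketrans({"a": "4", "e": "3", "o": "0"})


def fold_name(name: str, table: dict) -> str:
    """Apply a character-substitution table to a markup name."""
    return name.translate(table)


def same_name(left: str, right: str, table: dict) -> bool:
    """Two names match when they are identical after folding."""
    return fold_name(left, table) == fold_name(right, table)


print(fold_name("Span", UPPER_FOLD))          # SPAN
print(same_name("span", "SPAN", UPPER_FOLD))  # True
print(same_name("foo", "f00", ODD_FOLD))      # True
```

With the conventional table, "span" and "SPAN" are the same name; with the odd table they are not, but "foo" and "f00" are. Which names count as equal depends entirely on the table.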

      In the alternative "XML" version of an SGML syntax, designers went out
      of the way to create a markup model that would allow just about any
      "dummy" to have his/her say and still stay at least conforming to some
      simplistic way of looking at markup in principle.

      We are going downhill from there...

      --
      Rex



      • Spartanicus

        #33
        Re: xhtml vs html 4 strict

        RobG <rgqld@iinet.net.auau> wrote:
        >>>>XHTML documents must use lower case for all HTML element and attribute
        >>>>names. That's not the same as being case sensitive.
        >>>
        > Your opinion that XHTML is case-insensitive is in direct opposition to
        > the view of the W3C.

        I never said that XHTML is case-insensitive.
        >Read section 4.2 of the XHTML 1.0 specification:
        >
        > "XHTML documents must use lower case for all HTML element and
        > attribute names. This difference is necessary because XML is
        > case-sensitive..."

        Note that it says "XML", not "XHTML". XML is indeed case sensitive.
        >>>But in HTML4, you may leave certain elements open. This is particular
        >>>to XHTML/XML then.
        >>
        >> Only for empty elements, which unlike closing non empty elements has no
        >> relevance for a parser. In fact a conforming HTML parser would choke on
        >> XHTML.
        >
        > Of course non-empty elements with optional closing tags, such as <P>,
        > are conveniently ignored.

        They're not.

        --
        Spartanicus


        • RobG

          #34
          Re: xhtml vs html 4 strict

          Spartanicus wrote:
          > RobG <rgqld@iinet.net.auau> wrote:
          >
          >>>>>XHTML documents must use lower case for all HTML element and attribute
          >>>>>names. That's not the same as being case sensitive.
          >>>>
          >> Your opinion that XHTML is case-insensitive is in direct opposition to
          >> the view of the W3C.
          >
          > I never said that XHTML is case-insensitive.

          You questioned the statement that XHTML was case-sensitive, but also
          reckon it's not case-insensitive. Is there some state in between that
          is neither?
          >>Read section 4.2 of the XHTML 1.0 specification:
          >>
          >> "XHTML documents must use lower case for all HTML element and
          >> attribute names. This difference is necessary because XML is
          >> case-sensitive..."
          >
          > Note that it says "XML", not "XHTML". XML is indeed case sensitive.

          And, since XHTML is HTML as XML, XHTML is case-sensitive.

          The fact that the authors of the XHTML specification chose to define
          all their tags and attributes solely in lower case does not remove
          case-sensitivity.
          >>>>But in HTML4, you may leave certain elements open. This is particular
          >>>>to XHTML/XML then.
          >>>
          >>>Only for empty elements, which unlike closing non empty elements has no
          >>>relevance for a parser. In fact a conforming HTML parser would choke on
          >>>XHTML.
          >>
          >> Of course non-empty elements with optional closing tags, such as <P>,
          >> are conveniently ignored.
          >
          > They're not.

          They - optional closing tags - are being ignored by you. You introduced
          empty tags, the treatment of which is also different in XHTML and HTML.

          The OP's point was that, other differences aside, what passes as valid
          HTML may well not pass as valid XHTML due to XHTML's case sensitivity and
          requirement for closing tags even where they are optional in HTML.

          I've yet to see a convincing argument to the contrary.



          --
          Rob


          • Philipp Lenssen

            #35
            Re: xhtml vs html 4 strict

            RobG wrote:
            > Spartanicus told me:
            > >> Your opinion that XHTML is case-insensitive is in direct
            > >> opposition to the view of the W3C.
            > >
            >
            > The fact that the authors of the XHTML specification chose to define
            > all their tags and attributes solely in lower case does not remove
            > case-sensitivity.

            Exactly. Saying "all elements are lower-case, but XHTML is not
            case-sensitive" would make "lower-case" a mere suggestion in XHTML;
            valid XHTML elements could therefore then also be upper-case. And this
            is obviously not the case. Spartanicus surely will now begin to
            understand something can be *both* case-sensitive *and* lower-case (or
            camel-case, or upper-case).
            > The OP's point was that, other differences aside, what passes as valid
            > HTML may well not pass as valid XHTML due XHTML's case sensitivity and
            > requirement for closing tags even where they are optional in HTML.
            >
            > I've yet to see a convincing argument to the contrary.

            Me too. I just don't see how the valid HTML4

            <p>.......

            would somehow force you to close the paragraph (as does valid XHTML
            force you).


            Again, here's some valid HTML4 with mixed case, and non-closed tags:

            <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
            <html><head>
            <title>Title</title>
            <meta http-equiv="Content-Type" content="text/html; CHARSET=iso-8859-1">
            </head>
            <body>

            <p>Test
            <P>Test

            </body>
            </html>


            And the same is invalid in XHTML1 in both instances, case and missing
            closing tag (I could write a parser for this *without* knowing about
            the DTD -- I couldn't do this in HTML4, as I'd need the DTD to tell me
            which closing tags are optional and which aren't)

            <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
            "DTD/xhtml1-strict.dtd">
            <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
            <head>
            <title>Title</title>
            </head>
            <body>

            <p>Test
            <P>Test

            </body>
            </html>
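
For what it's worth, the well-formedness half of this can be checked without any DTD. A minimal sketch using Python's standard-library XML parser; the helper `is_well_formed` and the sample strings are assumptions made for illustration, and the check covers XML well-formedness only, not validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as well-formed XML (no DTD consulted)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Body of the example above: paragraphs left open, mixed-case tag names.
unclosed = "<body><p>Test<P>Test</body>"
# Same content with every paragraph closed and names in lower case.
closed = "<body><p>Test</p><p>Test</p></body>"
# Start and end tag differ only in case: XML treats them as different names.
mismatched_case = "<body><p>Test</P></body>"
# Upper-case but consistently paired tags.
upper_pair = "<body><P>Test</P></body>"

for sample in (unclosed, closed, mismatched_case, upper_pair):
    print(sample, is_well_formed(sample))
```

Only `closed` and `upper_pair` parse. The upper-case pair is still well-formed XML; rejecting it as XHTML takes the DTD or some other knowledge of the declared element names.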


            Does that mean XHTML is superior to HTML4? No! It may just be a team
            preference for the reasons stated (forced case and closing tags). There
            are also arguments against XHTML which may make others feel HTML is
            superior (like the misleading non-XML content-type necessary for
            browsers like Internet Explorer).

            --
            Google Blogoscoped
            A daily news blog and community covering Google, search, and technology.


            • Spartanicus

              #36
              Re: xhtml vs html 4 strict

              RobG <rgqld@iinet.net.auau> wrote:
              >>>>>>XHTML documents must use lower case for all HTML element and attribute
              >>>>>>names. That's not the same as being case sensitive.
              >>>>>
              >>> Your opinion that XHTML is case-insensitive is in direct opposition to
              >>> the view of the W3C.
              >>
              >> I never said that XHTML is case-insensitive.
              >
              > You questioned the statement that XHTML was case-sensitive

              I didn't "question", I corrected an incorrect statement.
              >, but also
              > reckon it's not case-insensitive. Is there some state in between that
              > is neither?

              Yes, read again what I wrote (which was a direct quote from the spec) if
              you want to have another go at understanding what that is.
              >>>Read section 4.2 of the XHTML 1.0 specification:
              >>>
              >>> "XHTML documents must use lower case for all HTML element and
              >>> attribute names. This difference is necessary because XML is
              >>> case-sensitive..."
              >>
              >> Note that it says "XML", not "XHTML". XML is indeed case sensitive.
              >
              > And, since XHTML is HTML as XML, XHTML is case-sensitive.

              XHTML is a reformulation of HTML in XML, thus valid XHTML is
              automatically also valid XML, the reverse however is emphatically not
              the case.

              In the case sensitive XML "<Span>foobar</Span>" is valid, in XHTML it is
              always invalid.

              > The fact that the authors of the XHTML specification chose to define
              > all their tags and attributes solely in lower case does not remove
              > case-sensitivity.

              You are missing the point in thinking that there are only 2 states.
              >>>>>But in HTML4, you may leave certain elements open. This is particular
              >>>>>to XHTML/XML then.
              >>>>
              >>>>Only for empty elements, which unlike closing non empty elements has no
              >>>>relevance for a parser. In fact a conforming HTML parser would choke on
              >>>>XHTML.
              >>>
              >>> Of course non-empty elements with optional closing tags, such as <P>,
              >>> are conveniently ignored.
              >>
              >> They're not.
              >
              > They - optional closing tags - are being ignored by you.

              Philipp stated that "in HTML4, you may leave certain elements open."
              "This" [required closing of elements] "is particular to XHTML/XML then."
              It is not. Using a custom HTML DTD a validator can also produce
              validation errors for non closed non empty elements. See the "Example
              custom DTD" at the end of


              --
              Spartanicus


              • Spartanicus

                #37
                Re: xhtml vs html 4 strict

                "Philipp Lenssen" <info@outer-court.com> wrote:
                >Saying "all elements are lower-case, but XHTML is not
                >case-sensitive"

                But no-one is saying that.

                --
                Spartanicus


                • Jukka K. Korpela

                  #38
                  Re: xhtml vs html 4 strict

                  Spartanicus <invalid@invalid.invalid> wrote:

                  > Yes, read again what I wrote (which was a direct quote from the
                  > spec) if you want to have another go at understanding what that is.

                  It's hard to tell what you are really arguing about. You have written,
                  among other things:
                  "XHTML documents must use lower case for all HTML element and attribute
                  names. That's not the same as being case sensitive."
                  It's surely not the same, but _more_. Surely any requirement to use
                  lower case implicitly _includes_ case sensitivity; without case
                  sensitivity, there can be no requirement on case.
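
A minimal Python sketch of that argument. The set `XHTML_ELEMENT_NAMES` is a deliberately tiny, hypothetical subset of the real element list, and both helper functions are made up for illustration. A "must be lower case" rule only rejects anything when names are compared exactly; once case is folded away before the comparison, the rule has nothing left to check.

```python
# Hypothetical subset of XHTML element names, all declared in lower case.
XHTML_ELEMENT_NAMES = frozenset({"html", "head", "title", "body", "p", "span"})


def is_declared_exact(name: str) -> bool:
    """Case-sensitive lookup: the name must match a declared name exactly."""
    return name in XHTML_ELEMENT_NAMES


def is_declared_folded(name: str) -> bool:
    """Case-insensitive lookup: case is folded away before comparing."""
    return name.lower() in XHTML_ELEMENT_NAMES


for candidate in ("span", "Span", "SPAN"):
    print(candidate, is_declared_exact(candidate), is_declared_folded(candidate))
```

Only the exact lookup tells "span" apart from "Span" and "SPAN"; the folded lookup accepts all three.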
                  > XHTML is a reformulation of HTML in XML, thus valid XHTML is
                  > automatically also valid XML, the reverse however is emphatically
                  > not the case.

                  Why the emphasis? Sounds like an emphatic strawman.
                  > In the case sensitive XML "<Span>foobar</Span>" is valid,

                  Oh no, that depends on the DTD. There need not be a DTD. If there is,
                  it may or may not contain an element named "Span", and its content
                  model may or may not allow character data.
                  > in XHTML it is always invalid.

                  It is invalid in XHTML as currently defined, of course, because there
                  is no element named "Span".
                  >> The fact that the authors of the XHTML specification chose to
                  >> define all their tags and attributes solely in lower case does
                  >> not remove case-sensitivity.
                  >
                  > You are missing the point in thinking that there are only 2 states.

                  Case-sensitive and case-insensitive _do_ form a dichotomic division.
                  Anything that is not completely case-insensitive is by definition case
                  sensitive.
                  > Philipp stated that "in HTML4, you may leave certain elements
                  > open." "This" [required closing of elements] "is particular to
                  > XHTML/XML then." It is not.

                  It was a poor formulation. That's usual when discussing SGML or XML
                  syntax. Of course closing of elements is always required. Whether an
                  explicit closing _tag_ is required is a different issue.

                  You seem to be saying that SGML allows declarations that make closing
                  tags mandatory. That's certainly true; most element declarations in
                  HTML DTDs do that.
                  > Using a custom HTML DTD a validator can
                  > also produce validation errors for non closed non empty elements.

                  It isn't an HTML DTD if it isn't one of the DTDs mentioned in HTML
                  specifications. It might be an SGML DTD.

                  --
                  Yucca, http://www.cs.tut.fi/~jkorpela/
                  Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html


                  • Spartanicus

                    #39
                    Re: xhtml vs html 4 strict

                    "Jukka K. Korpela" <jkorpela@cs.tut.fi> wrote:

                    >> Yes, read again what I wrote (which was a direct quote from the
                    >> spec) if you want to have another go at understanding what that is.
                    >
                    >It's hard to tell what you are really arguing about. You have written,
                    >among other things:
                    >"XHTML documents must use lower case for all HTML element and attribute
                    >names. That's not the same as being case sensitive."
                    >It's surely not the same, but _more_.

                    A requirement to use lower case for element and attribute names is not
                    an additional syntactical requirement to case sensitivity, it is a
                    different syntactical requirement.

                    Declaring XHTML "case sensitive" as Philipp did does not cover the
                    syntactical requirement regarding element and attribute case, "XHTML
                    documents must use lower case for all HTML element and attribute names."
                    does cover the syntactical requirement.
                    >> In the case sensitive XML "<Span>foobar</Span>" is valid,
                    >
                    >Oh no, that depends on the DTD. There need not be a DTD. If there is,
                    >it may or may not contain an element named "Span", and its content
                    >model may or may not allow character data.
                    >
                    >> in XHTML it is always invalid.
                    >
                    >It is invalid in XHTML as currently defined, of course, because there
                    >is no element named "Span".

                    A "Span" element cannot be added to a XHTML DTD without altering the
                    syntactical requirement regarding element case in the XHTML spec's
                    prose. Adding the qualifier "as currently defined" is pointless since
                    any sensible discussion can only cover the present.
                    >> Using a custom HTML DTD a validator can
                    >> also produce validation errors for non closed non empty elements.
                    >
                    >It isn't an HTML DTD if it isn't one of the DTDs mentioned in HTML
                    >specifications. It might be an SGML DTD.

                    Distinction noted, but it doesn't make a difference to the incorrect
                    statement Philipp made.

                    --
                    Spartanicus


                    • Guy Macon

                      #40
                      Re: xhtml vs html 4 strict




                      Jukka K. Korpela wrote:
                      >
                      >Spartanicus <invalid@invalid.invalid> wrote:
                      >
                      >> In the case sensitive XML "<Span>foobar</Span>" is valid,
                      >
                      >Oh no, that depends on the DTD. There need not be a DTD. If there is,
                      >it may or may not contain an element named "Span", and its content
                      >model may or may not allow character data.

                      Simple answer. If <Span>, <SPAN>, <span> and <sPaN> are all
                      treated as the same thing in all respects, it is case-insensitive.
                      If they are treated differently under any conditions, it is case-
                      sensitive.
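
That test is easy to run against two parsers in the Python standard library. This is only a sketch of the definition above, not a statement about any browser or validator: `html.parser.HTMLParser` reports tag names converted to lower case, while `xml.etree.ElementTree` keeps them exactly as written. The class `TagCollector` and the `root` wrapper element are made up for the example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagCollector(HTMLParser):
    """Record the name of every start tag the HTML parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


markup = "<root><Span>a</Span><SPAN>b</SPAN><span>c</span><sPaN>d</sPaN></root>"

# HTML view: names are lower-cased before the handler sees them.
collector = TagCollector()
collector.feed(markup)
html_names = set(collector.tags) - {"root"}

# XML view: names are preserved, so the four spellings stay distinct.
xml_names = {child.tag for child in ET.fromstring(markup)}

print(html_names)        # one name
print(sorted(xml_names)) # four names
```

The HTML parser sees one element type; the XML parser sees four. By the test above, the first treatment is case-insensitive and the second is case-sensitive.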



                      • Jan Roland Eriksson

                        #41
                        Re: xhtml vs html 4 strict

                        On Thu, 2 Jun 2005 10:54:45 +0000 (UTC), "Jukka K. Korpela"
                        <jkorpela@cs.tut.fi> wrote:

                        >Spartanicus <invalid@invalid.invalid> wrote:
                        >> Yes, read again what I wrote (which was a direct quote from the
                        >> spec) if you want to have another go at understanding what that is.

                        >It's hard to tell what you are really arguing about...

                        And for the real thing it's hard to discover what the heck any one in
                        this thread is arguing about.

                        You Jukka, for one, should know from your own long time experience that
                        there is no such thing as "case sensitive" defined anywhere within given
                        rules for how to build a markup language.

                        SGML gives the right and means to fold any individual character into any
                        other individual character, and that right is fully trickled down into
                        the SGML declaration for XML.

                        "Case (in)sensitivty" is an idiotic term.

                        "Case Folding" is the correct term.

                        But maybe it should have been named "character re mapping" in order to
                        express exactly what it is all about.

                        --
                        Rex



                        • Jukka K. Korpela

                          #42
                          Re: xhtml vs html 4 strict

                          Jan Roland Eriksson <jrexon@newsguy.com> wrote:

                          > And for the real thing it's hard to discover what the heck any one
                          > in this thread is arguing about.

                          It's nice that we can agree on _something_. :-)
                          > You Jukka, for one, should know from your own long time experience
                          > that there is no such thing as "case sensitive" defined anywhere
                          > within given rules for how to build a markup language.

                          There is; the name and implementation for it may vary.

                          Of course, I'm not going to cite the HTML specifications for use of
                          phrases like "case sensitive". We know that the terminology of the
                          specifications is vague and wouldn't do in a standard.
                          > SGML gives the right and means to fold any individual character
                          > into any other individual character,

                          Case folding is one way to implement case insensitivity, and in many
                          ways an efficient way. It is more efficient to canonicalize letters to,
                          say, upper case when storing names into a symbol table and then just do
                          simple string comparisons than to store them as received and perform
                          any string comparisons in a case insensitive manner. But that's
                          implementation. Besides, such an implementation causes problems in
                          error diagnostics. Once you have canonicalized the case of letters, you
                          cannot report certain types of errors in an informative manner. You
                          cannot tell that case is the problem when you've lost the case. :-)
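
A minimal Python sketch of that trade-off. Both classes are hypothetical and not taken from any real parser: `FoldingTable` canonicalizes names to upper case when storing them, while `PreservingTable` keeps the spelling as declared and folds only for the comparison.

```python
class FoldingTable:
    """Stores names already folded to upper case; the original spelling is lost."""

    def __init__(self):
        self._names = set()

    def declare(self, name: str) -> None:
        self._names.add(name.upper())

    def explain(self, name: str) -> str:
        # Only "known" or "unknown" can be reported: case information is gone.
        return "known" if name.upper() in self._names else "unknown"


class PreservingTable:
    """Keeps the declared spelling and folds only when comparing."""

    def __init__(self):
        self._by_folded = {}

    def declare(self, name: str) -> None:
        self._by_folded.setdefault(name.upper(), name)

    def explain(self, name: str) -> str:
        declared = self._by_folded.get(name.upper())
        if declared is None:
            return f"unknown name {name!r}"
        if declared != name:
            return f"{name!r} differs only in case from declared {declared!r}"
        return f"{name!r} matches exactly"


folding, preserving = FoldingTable(), PreservingTable()
folding.declare("span")
preserving.declare("span")

print(folding.explain("SPAN"))     # known
print(preserving.explain("SPAN"))  # 'SPAN' differs only in case from declared 'span'
```

The folding table answers lookups with plain string comparisons, but it can no longer say that case was the problem. The preserving table pays for an extra fold on each lookup and keeps that diagnostic.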
                          > and that right is fully
                          > trickled down into the SGML declaration for XML.

                          There is no SGML declaration for XML, because XML is formally defined
                          by a standalone specification, which makes no normative reference to
                          the SGML standard. All that prose about XML as a "subset" or "profile"
                          of SGML is misleading.

                          > "Case (in)sensitivty" is an idiotic term.

                          No, just misspelled. :-)
                          > "Case Folding" is the correct term.

                          Well, "case folding" is a suitable term for an operation or principle
                          of canonicalizing case. But surely we can define case insensitivity
                          quite independently of such terms. Case insensitivity means that case
                          is ignored when comparing strings for equality. For example, that "Foo"
                          and "foo" are treated as equal. Whether you achieve it by folding both
                          to "FOO", or by folding both to "foo", or by some other means, is
                          really something that a markup language definition shouldn't even
                          mention, unless it needs _for some other reason_ a concept of
                          canonicalized format for a name (and even then, that format could be
                          defined separately).
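
A small Python sketch of that independence, assuming names restricted to ASCII letters. The three function names are made up for illustration. Each one implements the same relation ("equal when case is ignored") by a different route, and a caller cannot tell which route was taken.

```python
def equal_by_upper(a: str, b: str) -> bool:
    """Fold both names to upper case, then compare."""
    return a.upper() == b.upper()


def equal_by_lower(a: str, b: str) -> bool:
    """Fold both names to lower case, then compare."""
    return a.lower() == b.lower()


def equal_by_scan(a: str, b: str) -> bool:
    """No canonical form at all: compare character by character, ignoring case."""
    return len(a) == len(b) and all(
        x == y or x.swapcase() == y for x, y in zip(a, b)
    )


pairs = [("Foo", "foo"), ("FOO", "fOo"), ("foo", "bar"), ("foo", "fooo")]
for left, right in pairs:
    results = {f(left, right) for f in (equal_by_upper, equal_by_lower, equal_by_scan)}
    print(left, right, results)
```

All three agree on every pair here, which is the sense in which the folding direction is an implementation detail and not part of the definition.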

                          --
                          Yucca, http://www.cs.tut.fi/~jkorpela/
                          Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html


                          • Jan Roland Eriksson

                            #43
                            Re: xhtml vs html 4 strict

                            On Fri, 3 Jun 2005 04:09:43 +0000 (UTC), "Jukka K. Korpela"
                            <jkorpela@cs.tut.fi> wrote:

                            >Jan Roland Eriksson <jrexon@newsguy.com> wrote:
                            >> trickled down into the SGML declaration for XML.

                            >There is no SGML declaration for XML,

                            I beg to differ.



                            >because XML is formally defined by a standalone
                            >specification, which makes no normative reference to
                            >the SGML standard.

                            The work on what was to become 'XML' started long before W3C was ever
                            thought of. Names like Goldfarb and Naggum come to mind as original
                            idea generators/definition writers in the very early 1990s.

                            >All those prose about XML as a "subset" or "profile"
                            >of SGML is misleading.

                            There's a lot of prose in that spec that is misleading, but just this
                            back ref to SGML is not since it has a true SGML based ref available
                            for it.

                            Without the 'Web SGML TC', and the work behind it, there would be no
                            XML in the first place.

                            --
                            Rex



                            • David Håsäther

                              #44
                              Re: xhtml vs html 4 strict

                              Jan Roland Eriksson <jrexon@newsguy.com> wrote:

                              > "Case Folding" is the correct term.
                              >
                              > But maybe it should have been named "character re mapping" in
                              > order to express exactly what it is all about.

                              The SGML Handbook talks about case substitution. I think that's clearer
                              than case folding, since case folding makes it less obvious that any
                              character can be substituted.
                              But yea, character re mapping might be even clearer.

                              --
                              David Håsäther


                              • Jukka K. Korpela

                                #45
                                Re: xhtml vs html 4 strict

                                Jan Roland Eriksson <jrexon@newsguy.com> wrote:

                                >>There is no SGML declaration for XML,
                                >
                                > I beg to differ.
                                >
                                > http://www.y12.doe.gov/sgml/wg8/document/1955.htm

                                It describes XML as if it were SGML, but in fact XML has been defined
                                independently of SGML - for rather obvious reasons.

                                It says: "XML documents implicitly contain the following SGML
                                declaration." That's hypothetical language, describing how XML _could
                                have been defined_ on SGML basis. (And this has some practical value of
                                course in software design.)
                                > Without the 'Web SGML TC', and the work behind it, there would be no
                                > XML in the first place.

                                I don't think so. The marked demand for a trivialization of XML was too
                                big.

                                --
                                Yucca, http://www.cs.tut.fi/~jkorpela/
                                Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html
