XHTML user agent behavior regarding empty elements

**David Madore** · Jul 20 '05, 03:41 PM

Re: empty <div> and <span> elements (was: Re: XHTML user agent behavior...)

"Jukka K. Korpela" in litteris
<Xns93EAABEF181 Ejkorpelacstutf i@193.229.0.31> scripsit:
> The meaning of the clear property is to stop floating, so I cannot see why
> you could not use it the way I suggested. It seems to be that you are
> imitating <br clear="..."> in CSS, rather than making full use of CSS
> possibilities. I don't see how you would "clear the border"; a border
> property affects the element that it is assigned to, and you can assign a
> height property to the element if you wish to make it taller than its
> content needs.

The clear attribute is deprecated in HTML4 or XHTML. Certainly <br
style="clear: both" /> does the trick, but I fail to see in what way
it is any better than <div style="clear: both"></div>. The HTML4 spec
says that "The BR element forcibly breaks (ends) the current line of
text", and that's not what it's being used for: I'd say that using
<br/> except within a <p> (or somesuch) with text immediately
preceding and following is far worse style than using an empty <div>.

To restate that with an example: I believe that
<URL: http://www.eleves.ens.fr:8080/home/m...st/float1.html >
is better HTML style than
<URL: http://www.eleves.ens.fr:8080/home/m...st/float3.html >
(and the two should be rendered more or less identically).

(Note: I use the style attribute in my examples merely so they can be
written more concisely, but it doesn't have to be so; the class
attribute and an appropriate stylesheet would do just as well, of
course.)

> You're referring to content generated by server- or client-side scripting
> or preprocessing, right?

Yes, sorry, my wording was confusing.

> In any case, the tools you use for
> generating content e.g. server-side should be selected to match the needs,
> not vice versa.

That's very well in theory, but in practice we have to make do with the
tools we have. Unless you were to give a *compelling* reason for not
using empty <div> and <span> elements, which you failed to do.
Altering an entire production chain merely to avoid empty <div> and
<span> elements is hardly a serious suggestion.

> Well, they did in a sense - but browsers have not implemented the SGML way
> of using entities (except in the trivial sense of supporting a predefined
> set of entity references that expand to character references).

Precisely.

> I agree with the idea that a simple markup system like HTML should have
> had a simple include feature. But CSS is _not_ the solution to that. There
> are several better approaches, as described in the c.i.w.a.h. FAQ.

If you refer to question 5.1 in the FAQ (at <URL:
http://www.faqs.org/faqs/www/authoring-faq/ >), well, server-side
includes are not an option for me for reasons that I won't go into.
I'm willing to hear about any other proposal (or any other bit of the
FAQ that I might have missed).

>> <p>Stylesheet name (if applicable): [<span
>> id="insert-stylesheet-name-here"></span>]</p>
>
> I fail to see what this relates to. Why would a document contain style
> sheet names that way?

Because we can have alternate stylesheets, and a well-designed
browser, or a little bit of ECMAscript magic, lets the user choose
among them. It might be nice to let the stylesheet name appear
somewhere within the content.

>> Yes, and so? There's nothing wrong with browser-specific inventions
>> if they're useful and are employed in a way that gracefully degrades
>> on other browsers.
>
> The point is that you make arguments in favor of hacks, on the grounds
> that some hacks need them.

And so what? Hacks won't go away merely because we call them "hacks".
Sometimes, regretfully, they are needed, because existing tools don't
do the job.

>> It seems that in every case I've given (except the first, where I
>> still see no workaround) you've told me "this isn't absolutely
>> necessary" and I've answered "yes, but it's convenient".
>
> I think that for every case, including the first, I have shown that
> there is no need for using a <div> or <span> with empty content.

You have shown that there is no logical necessity to use them, indeed,
but you have not shown that they are not useful or that they are
harmful.

>> On the other hand, if you have an important practical reason for not
>> using empty <div> and <span> tags
>
> First, there is no practical need for <div> and <span> elements with empty
> content (to use the proper terms).

There is no practical need, but I maintain that there is practical
usefulness.

> Second, we have the precedent of <p></p>, which has caused much confusion
> - it has been used for layout, and the HTML specification explicitly says
> that it should not be used, and that browsers should ignore such elements.
> And browsers do not generally do that, so we really have a confusion.

But an empty paragraph is certainly an absurdity: a paragraph, by
definition, cannot be empty. But what is a <div> or a <span>, anyway?
The HTML specification does not enlighten us.

And saying that "it will cause confusion" is handwaving. What
practical problems do you expect should turn up?

> Third, to take a simple example, such elements mess up the document
> appearance when a user style sheet is used in order to make all <div>
> elements bordered, so that the structure can be seen.

That's hardly convincing. We'll get either empty borders, or
horizontal rules, or simply nothing at all, and in any case it
accurately reflects the document's structure (there *is* a <div>
there, and it's empty - knowing whether it should be shown or not is
as pointless as knowing how many angels can stand on the head of a
pin).

> Followups trimmed - I think we are now so far from general XML that this
> belongs to the HTML group only.

Seems appropriate. I suggest that we drop this discussion, anyway,
since it is getting us nowhere and I think that each of us understands
the other's arguments: do we agree to disagree? Other people can make
up their own minds as to whether empty <div> and <span> elements are, or
are not, needed / useful / harmful / dangerous.

I would be interested, however, in any suggestions on how to include
foreign inline-level content in HTML without using the CSS content
property or the Mozilla-invented XBL language, given that server-side
includes are ruled out.

Cheers,

--
David A. Madore
(david.madore@e ns.fr,
http://www.eleves.ens.fr:8080/home/madore/ )

**Jukka K. Korpela** · Jul 20 '05, 03:41 PM

Re: empty <div> and <span> elements (was: Re: XHTML user agent behavior...)

david.madore@en s.fr (David Madore) wrote:

> The clear attribute is deprecated in HTML4 or XHTML.

Certainly. What I referred to was the fact that the use of the clear
property for empty elements means that you are simulating <br clear="...">
in CSS. And I noted that the natural approach is to use the clear property
for content elements, not for elements without content (like <br>, or
<div> with empty content).

>> In any case, the tools you use for
>> generating content e.g. server-side should be selected to match the
>> needs, not vice versa.
>
> That's very well in theory, but in practice we have to make do with the
> tools we have. Unless you were to give a *compelling* reason for not
> using empty <div> and <span> elements, which you failed to do.

You might just as well argue in favor of using <font> because some
generating software produces it, or generates something for which you need
<font>.

> Altering an entire production chain merely to avoid empty <div> and
> <span> elements is hardly a serious suggestion.

Surely many more production chains rely on <font>.

> If you refer to question 5.1 in the FAQ (at <URL:
> http://www.faqs.org/faqs/www/authoring-faq/ >), well, server-side
> includes are not an option for me for reasons that I won't go into.

The FAQ describes other approaches too, and actually doesn't favor server-
side includes as much as many other suggestions do.

But I really cannot see why server-side processing is excluded when you
actually refer to content generated server-side, as it seems.

>>> <p>Stylesheet name (if applicable): [<span
>>> id="insert-stylesheet-name-here"></span>]</p>
>>
>> I fail to see what this relates to. Why would a document contain
>> style sheet names that way?
>
> Because we can have alternate stylesheets, and a well-designed
> browser, or a little bit of ECMAscript magic, lets the user choose
> among them.

A well-designed browser lets the user choose between style sheets, but
here you seem to be trying to do something that creates a page-specific
method for the same purpose. This might be useful, in the present
situation, but it's difficult to see how your code would relate to that.
Do you mean that people using CSS-disabled browsers should see
"Stylesheet name (if applicable): ", instead of not seeing such a thing?

> And so what? Hacks won't go away merely because we call them "hacks".

If we agree on the observation that <div></div> is a hack, we have reached
an obvious conclusion after quite some discussion.

> But an empty paragraph is certainly an absurdity: a paragraph, by
> definition, cannot be empty. But what is a <div> or a <span>, anyway?

A <div> or <span> with empty content is the same as <p></p>, just without
the paragraph semantics...

> The HTML specification does not enlighten us.

... and no _explicit_ statement against them in the specs. But if <p></p>
is not recommended and should be ignored by user agents, doesn't the same
apply to <div></div> and <span></span> a fortiori?

> I would be interested, however, in any suggestions on how to include
> foreign inline-level content in HTML without using the CSS content
> property or the Mozilla-invented XBL language, given that server-side
> includes are ruled out.

Just write it. And here you have genuine use for <span> or <div>, since
the lang attribute needs some markup element to which it can be attached.
If your document should have some content and should not have it, I'm
afraid you need to explain a bit.

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

**David Madore** · Jul 20 '05, 03:41 PM

Re: empty <div> and <span> elements (was: Re: XHTML user agent behavior...)

"Jukka K. Korpela" in litteris
<Xns93EAD74A431 C3jkorpelacstut fi@193.229.0.31 > scripsit:
> Certainly. What I referred to was the fact that the use of the clear
> property for empty elements means that you are simulating <br clear="...">
> in CSS. And I noted that the natural approach is to use the clear property
> for content elements, not for elements without content (like <br>, or
> <div> with empty content).

Could you stop beating about the bush and spell out the meaning of "use
the clear property for content elements, not for elements without
content"? Can you write down explicitly the markup that you would
use? I give my very simple example again: the basic page (written
with an empty <div> element) is <URL:
http://www.eleves.ens.fr:8080/home/m...st/float1.html >. At
first I understood that you meant me to write <URL:
http://www.eleves.ens.fr:8080/home/m...st/float2.html > instead,
which puts the clear property on the following <p>, and I underlined
that it does not render in the same way (so it is not acceptable).
Then I understood that you meant me to write as in <URL:
http://www.eleves.ens.fr:8080/home/m...st/float3.html >, which
uses <br/> instead. And now you say that I shouldn't be using clear
for elements without content.

So, please put an end to the confusion, and rewrite <URL:
http://www.eleves.ens.fr:8080/home/m...st/float1.html > to
produce the same presentation effect, in the way you think it should
be done. This should take you five seconds and put an end to this
silly dialogue of the deaf.

>> If you refer to question 5.1 in the FAQ (at <URL:
>> http://www.faqs.org/faqs/www/authoring-faq/ >), well, server-side
>> includes are not an option for me for reasons that I won't go into.
>
> The FAQ describes other approaches too, and actually doesn't favor server-
> side includes as much as many other suggestions do.

Would you mind pointing to a specific place within the FAQ?

> But I really cannot see why server-side processing is excluded when you
> actually refer to content generated server-side, as it seems.

Yes, but the content that needs to be included is not available to the
server that does the processing, strange as it may seem. The main
HTML content is processed on one computer, is served from another
computer, and further inline content (which is dynamic, whereas the
rest is mostly static) should be inserted that is served from a third
computer. Details are unimportant, but the bottom line is that the
main content cannot use server-side includes.

>> And so what? Hacks won't go away merely because we call them "hacks".
>
> If we agree on the observation that <div></div> is a hack, we have reached
> an obvious conclusion after quite some discussion.

No, I do not agree with this, and I wish you weren't so condescending.
But I think that we've both given ample arguments by now and it is
useless to continue discussing along this line. I'd just like to know
how you propose to replace the clear property on the empty <div>
(example above), and if you have any totally novel suggestion for
including inline-level HTML content from an HTML document.

> A <div> or <span> with empty content is the same as <p></p>, just without
> the paragraph semantics...

Precisely, and it is the paragraph semantics which pose the problem,
because a paragraph should not be empty, whereas I see no *a priori*
reason why an abstract container element should not be empty. But as
I just said, I drop this line of discussion.

--
David A. Madore
(david.madore@e ns.fr,
http://www.eleves.ens.fr:8080/home/madore/ )

**Jukka K. Korpela** · Jul 20 '05, 03:41 PM

Re: empty <div> and <span> elements (was: Re: XHTML user agent behavior...)

david.madore@en s.fr (David Madore) wrote:

> Could you stop beating about the bush and spell out the meaning of "use
> the clear property for content elements, not for elements without
> content"?

The formulation is very explicit. An element without content is an
element that has no content.

> Can you write down explicitly the markup that you would use?

I would use no extra markup, except a class attribute when needed.

> So, please put an end to the confusion, and rewrite <URL:
> http://www.eleves.ens.fr:8080/home/m...st/float1.html > to
> produce the same presentation effect

What "same presentation effect"? It renders essentially differently e.g.
on IE 6 and Mozilla 1.3.

If you wish to get consultation help from me with your specific problems,
you need to be prepared for discussions concerning what you really want,
disclosing the real life case, and negotiating on the fee beforehand.

> Would you mind pointing to a specific place within the FAQ?

You had already found the right place. Now read it - it does _not_ present
SSI as the only answer to "How do I include a file".

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

**Mikko Ohtamaa** · Jul 20 '05, 03:41 PM

Re: XHTML user agent behavior regarding empty elements

Johannes Koch <koch@w3develop ment.de> wrote in message news:<bin4k6$bc btd$1@ID-61067.news.uni-berlin.de>...
> Mikko Ohtamaa wrote:
> > I am using MSXML (Microsoft XML engine) to transform XML data to XHTML
> > reports.
>
> Why do you want to create _X_HTML reports, when several browsers don't
> know about _X_HTML? Produce HTML instead.

Yes, we fell back to HTML. The original reason for using XHTML was
character encoding difficulties with MSXML and HTML, but we managed to
work around this another way.

> > In XSLT it is too heavy to check if each element will be empty and
> > implement a wrapper for it.
>
> <xsl:template match="foo">
>   <xsl:if test="normalize-space(.) != ''">
>     <div class="{local-name()}">
>       <xsl:value-of select="."/>
>     </div>
>   </xsl:if>
> </xsl:template>
>
> Is this really too heavy?

The empty <div/> problem was not in XSLT itself. When MSXML transforms
XML data to an XHTML document, the target document resides in memory as
an MSXML DOM tree. The DOM tree has no information about start tags
and end tags. When the DOM tree is serialized to XML, all empty elements
are written using self-closing tags. So there is no difference in the
output whether you use <div></div> or <div/> in XSLT stylesheets.
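
The point about the DOM tree can be reproduced outside MSXML. What follows is only a minimal sketch in Python's standard xml.etree.ElementTree, chosen for illustration; it is not MSXML, and the sample document is made up. Once parsed, <div></div> and <div/> are the same node, so the choice of empty-element syntax is made by the serializer, not by the stylesheet.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the same empty element written in both syntaxes.
SOURCE = "<root><div></div><div/></root>"

root = ET.fromstring(SOURCE)

# After parsing, the two <div> elements are indistinguishable: neither has
# text or children, and the tree keeps no record of how the tags were written.
first, second = list(root)
assert (first.text, len(first)) == (second.text, len(second))

# Default XML serialization collapses every empty element.
short_form = ET.tostring(root, encoding="unicode")

# Asking the serializer for explicit end tags gives <div></div> instead.
long_form = ET.tostring(root, encoding="unicode", short_empty_elements=False)

print(short_form)  # <root><div /><div /></root>
print(long_form)   # <root><div></div><div></div></root>
```

Whether MSXML offers a comparable serializer switch is outside the scope of this sketch.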

(Empty <div/> elements were produced by the XSLT because missing
fields are accepted in the input XML.)

Also, the resulting XHTML is stored in a file instead of being served
directly from a web server with MIME type support. Even if there are
<?xml...?> and <!doctype...> declarations, browsers fail to identify
the file contents as XHTML.

-Mikko

**Ernest Cline** · Jul 20 '05, 03:42 PM

Re: XHTML user agent behavior regarding empty elements

"Headless" <me@privacy.net > wrote:
> David Madore wrote:
>
> Point of order; don't cross post replies.
>
> >Note that Mozilla is about the only browser which supports the
> >application/xhtml+xml content-type anyway.
>
> Ahem: Opera.

Opera is nice but it has what I consider to be a stupid design decision on
its part concerning application/xhtml+xml. It fails to handle entities such
as &eacute; correctly. It is true that the XML specs say that a user agent
can do that for XML in general and if the document were being served as
application/xml, I would agree with their decision, but since Opera
indicates that it supports xhtml+xml then in my opinion if it gets back a
document served as xhtml+xml it should parse the entities. Since they have
to support the entities for HTML anyway, I fail to see why they are being so
obstinate.

**Henri Sivonen** · Jul 20 '05, 03:43 PM

Re: XHTML user agent behavior regarding empty elements

In article <9L56b.4066$_26 .1035@newsread2 .news.atl.earth link.net>,
"Ernest Cline" <ernestcline@mi ndspring.commun ism> wrote:

> Opera is nice but it has what I consider to be a stupid design decision on
> its part concerning application/xhtml+xml. It fails to handle entities such
> as &eacute; correctly. It is true that the XML specs say that a user agent
> can do that for XML in general and if the document were being served as
> application/xml, I would agree with their decision, but since Opera
> indicates that it supports xhtml+xml then in my opinion if it gets back a
> document served as xhtml+xml it should parse the entities. Since they have
> to support the entities for HTML anyway, I fail to see why they are being so
> obstinate.

As I understand it, the main reason why the XML specification allows
non-validating XML processors not to process external entities is to
accommodate browsers. Therefore, it would be silly for browsers not to
use this opportunity for optimizing performance.

The XHTML DTDs are huge. It doesn't make sense to parse them (even from
a local catalog) in an interactive application only to get some
character entities.
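
A rough illustration of what skipping the DTD means in practice, using Python's xml.etree.ElementTree as a stand-in for a non-validating parser that does not fetch the external DTD. The sample documents and the helper name `text_of` are made up for this sketch, and browsers may differ in how they report the problem.

```python
import xml.etree.ElementTree as ET

# The DOCTYPE names the XHTML DTD, but this parser never downloads it.
DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
)


def text_of(body):
    """Parse a tiny document (hypothetical helper) and return its character data."""
    root = ET.fromstring(DOCTYPE + "<html><body><p>" + body + "</p></body></html>")
    return "".join(root.itertext())


# A numeric character reference needs no DTD.
numeric = text_of("caf&#233;")

# Neither does the literal character itself.
literal = text_of("caf\u00e9")

# The named entity is declared only in the DTD, which is never read here,
# so the parser reports it as undefined.
try:
    text_of("caf&eacute;")
    named_ok = True
except ET.ParseError:
    named_ok = False

print(numeric == literal == "caf\u00e9", named_ok)
```

Only the named form depends on the DTD having been read; the other two spellings of the same character parse either way.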

By the way, Safari doesn't support the character entities, either.
Mozilla cheats and uses an abridged DTD. What Mozilla does is quite icky.

Entity support in HTML has nothing to do with this, since HTML gets a
tag soup treatment.

--
Henri Sivonen
hsivonen@iki.fi

Henri Sivonen's pages

http://www.iki.fi/hsivonen/

Mozilla Web Author FAQ: http://mozilla.org/docs/web-developer/faq.html

**Joel Shepherd** · Jul 20 '05, 03:43 PM

Re: XHTML user agent behavior regarding empty elements

Henri Sivonen wrote:
> In article <9L56b.4066$_26 .1035@newsread2 .news.atl.earth link.net>,
> "Ernest Cline" <ernestcline@mi ndspring.commun ism> wrote:
>
>> [Opera] fails to handle entities such as &eacute; correctly.
>
> As I understand it, the main reason why the XML specification
> allows non-validating XML processors not to process external
> entities is to accommodate browsers. Therefore, it would be silly
> for browsers not to use this opportunity for optimizing
> performance.

Optimizing performance sounds like a very weak rationale for not
rendering characters correctly. I could make my own software much
faster as well, if I could chop out some of the basic functional
requirements. That wouldn't make it *better* though.

> The XHTML DTDs are huge. It doesn't make sense to parse them (even
> from a local catalog) in an interactive application only to get
> some character entities.

How many XHTML DTDs would a browser need to know about? Why would it
not make sense to cache the *parsed* version of each, thereby
protecting performance and enabling the browser to render entities
correctly and efficiently?

--
Joel.

**Henri Sivonen** · Jul 20 '05, 03:49 PM

Re: XHTML user agent behavior regarding empty elements

In article <M7L6b.2697$Yt. 1826@newsread4. news.pas.earthl ink.net>,
Joel Shepherd <joelshep@ix.ne tcom.com> wrote:

> Henri Sivonen wrote:
> > In article <9L56b.4066$_26 .1035@newsread2 .news.atl.earth link.net>,
> > "Ernest Cline" <ernestcline@mi ndspring.commun ism> wrote:
> >
> >> [Opera] fails to handle entities such as &eacute; correctly.
> >
> > As I understand it, the main reason why the XML specification
> > allows non-validating XML processors not to process external
> > entities is to accommodate browsers. Therefore, it would be silly
> > for browsers not to use this opportunity for optimizing
> > performance.
>
> Optimizing performance sounds like a very weak rationale for not
> rendering characters correctly.

There are already two other ways of representing characters correctly.

Getting a *third* way for representing characters when the two other
ways can represent all of Unicode and this third way can represent only
a small subset sounds like a very weak rationale for requiring user
agents to parse additional fluff every time a document is parsed.

> I could make my own software much
> faster as well, if I could chop out some of the basic functional
> requirements. That wouldn't make it *better* though.

The two other ways of representing all the characters that are allowed
in XML are
1) using an encoding that can represent all of Unicode (UTF-*)
and
2) using numeric character references (&#1234;).
I consider support for UTF-8 a basic functional requirement for software
that is used for authoring XML documents. Do you?

> > The XHTML DTDs are huge. It doesn't make sense to parse them (even
> > from a local catalog) in an interactive application only to get
> > some character entities.
>
> How many XHTML DTDs would a browser need to know about?

To meet what requirement? Modularization of XHTML makes it possible for
anyone to concoct a new language variant that is in the "XHTML family"
of languages. Also, anyone can take an existing W3C XHTML DTD and refer
to it using a local system id without the public id.

> Why would it
> not make sense to cache the *parsed* version of each, thereby
> protecting performance and enabling the browser to render entities
> correctly and efficiently?

Not really. Grammar caching is hard because declarations in the internal
DTD subset can substantially affect the external DTD subset. It would be
possible to optimize the common cases somewhat, though. The browser
could cache the data structures built when parsing a couple of W3C XHTML
DTDs and use the cached versions if the internal DTD subset is empty.
However, inflicting this kind of complexity on user agents just in order
to get a third way of representing some characters doesn't make sense.
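
For what it is worth, the "optimize the common case" shortcut can be sketched without parsing any DTD at all, by keeping one fixed table of entity names. The sketch below uses Python's xml.etree.ElementTree and the HTML 4 entity table from html.entities; `ENTITY_TABLE`, `parse_xhtml` and the sample document are names invented here. It assumes the parser's `entity` dictionary is consulted for undeclared entities when the document carries a DOCTYPE, which is how I understand that parser to behave. It also ignores anything an internal DTD subset might redefine, which is exactly the complication described above.

```python
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# Built once and shared by every parse: entity name -> replacement text
# for the HTML 4 character entities.
ENTITY_TABLE = {name: chr(cp) for name, cp in name2codepoint.items()}


def parse_xhtml(source):
    """Parse XHTML text, resolving named entities from the prebuilt table.

    Hypothetical helper: no DTD is read, and an internal subset is ignored.
    """
    parser = ET.XMLParser()
    parser.entity.update(ENTITY_TABLE)
    parser.feed(source)
    return parser.close()


SAMPLE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
    "<html><body><p>caf&eacute; &amp; th&eacute;</p></body></html>"
)

root = parse_xhtml(SAMPLE)
rendered = "".join(root.itertext())
print(rendered)
```

A lookup table of this kind is cheap, but it is still a special case bolted onto the parser.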

--
Henri Sivonen
hsivonen@iki.fi

Henri Sivonen's pages

http://www.iki.fi/hsivonen/

Mozilla Web Author FAQ: http://mozilla.org/docs/web-developer/faq.html
