non-breaking hyphen

  • Jukka K. Korpela

    #16
    Re: non-breaking hyphen

    Lachlan Hunt <spam.my.gspot@gmail.com> wrote:

    >> <nobr>a/b</nobr> says that a/b is a unit of information where all
    >> characters belong together.
    >
    > That sounds like you're just trying to apply semantics to an element
    > that is defined as purely presentational.

    That's because you have already decided so. Think about
    rmdir /foo
    versus
    rmdir / foo
    Is the difference purely presentational? That's what the Unicode
    consortium thinks, when it allows the first expression to be divided as
    rmdir /
    foo
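
A minimal sketch of why that break is more than presentational, assuming a POSIX-style shell reads the text. Python's shlex module stands in for the shell's tokenizer here, and the variable names are illustrative only, not from the thread:

```python
import shlex

# Tokenize both command lines the way a POSIX-style shell would.
joined = shlex.split("rmdir /foo")      # one argument: the directory /foo
separated = shlex.split("rmdir / foo")  # two arguments: the root directory and foo

print(joined)
print(separated)
print(joined == separated)
```

The two token lists differ, so a reader who retypes the wrapped form with a space runs a different command.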

    >> It's surely _more_ semantic than the W3C approach which moves us
    >> down to the character level.
    >
    > It depends. Some situations may be more appropriately marked up
    > using elements, and others may be better left at the character level.

    Moving it to character level means that presentational features have been
    wired into the document's textual content. Isn't this worse than
    wiring it into markup around the content? Things may change, of
    course, if we regard line breaking issues as potentially belonging to
    logical structure or semantics.

    > There are also a huge number of situations where I might want bold
    > text.

    Not really. Ignoring headings, table cells and things like that and
    considering inline emphasis only, the odds are that the reason for
    bolding text is strong emphasis. Whether this is too coarse a concept is
    an interesting question, but it corresponds to the <strong> element.
    Except for a small number of special cases, <b> is just the vulgar way of
    writing <strong> (and the original designers of HTML should be blamed for
    this - _they_ decided to make the logical alternative's name five times
    as long as the physical alternative's name).

    > I find the news: URIs more useful since clicking on one will
    > automatically launch my newsreader for me

    It won't launch any newsreader unless the browser has been configured to
    use one - and this is normally _not_ handled in any default settings.

    >>>In this case, <code> seems most appropriate.
    >>
    >> Is the name computer code? I think it's a borderline case, and I
    >> think you are just interpreting the semantics of <code> very freely
    >
    > Yes, it was a very loose interpretation, however <code> is very
    > loosely defined in the spec.

    We agree on that, though maybe for different values of "very". But the
    reason for your choosing it was that you felt that you _needed_ _any_
    element that you can regard as logical. That is, in an attempt to avoid
    <nobr> and <span>, you would have picked up virtually anything, even an
    element that you wouldn't have dreamt of otherwise.

    But it's not necessarily a bad choice.

    > It just states that it is a fragment of
    > computer code, and I interpreted that very loosely as content that
    > can be processed in some meaningful way by a computer.

    That would mean that anything is <code>, wouldn't it? Surely you can feed
    any text into a computer and process it in some meaningful way.

    But a newsgroup name could be marked up as <code> because it is "computer
    code" in the sense of having been _defined_ separately for use as input
    to computer software, as an identifier of a group. This becomes more
    obvious, perhaps, if you think how newsgroup names often have to be
    distorted from the natural language expressions that they have been
    derived from, e.g. by dropping accents away.

    On the practical side, some automatic translation software (BabelFish)
    treats text inside <code> as a literal string that remains invariant in
    translation. And this is very natural and very desirable, since if we
    have, say, some text about Unix, mentioning the <code>cat</code> command,
    then we don't want that "cat" to become "chat" when translating into
    French.
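
A toy sketch of that behaviour, not a description of how BabelFish actually worked: the one-word glossary, the regular expression and the function name toy_translate are all assumptions made for illustration.

```python
import re

# Hypothetical one-entry English-to-French glossary; a real translator is far more involved.
GLOSSARY = {"cat": "chat"}

# The capturing group makes re.split() keep the <code> spans in its result.
CODE_SPAN = re.compile(r"(<code>.*?</code>)", re.DOTALL)


def toy_translate(html_text: str) -> str:
    """Translate glossary words but leave <code>...</code> spans untouched."""
    parts = CODE_SPAN.split(html_text)
    out = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            # Odd indexes are the captured <code> spans: keep them literally.
            out.append(part)
        else:
            out.append(
                re.sub(r"\b\w+\b", lambda m: GLOSSARY.get(m.group(0), m.group(0)), part)
            )
    return "".join(out)


print(toy_translate("The cat uses the <code>cat</code> command."))
```

Only the markup tells the translator which "cat" is an identifier, which is the practical argument for marking it up as <code>.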

    >> in order to avoid the inevitable conclusion: in the great majority
    >> of cases, the real alternative to <nobr> is <span>, which by
    >> definition lacks _all_ semantics.
    >
    > As does <nobr>, so in a sense you are correct.

    No it doesn't. Even if you regard <nobr> as purely presentational,
    marking something with <nobr> says _more_ than marking it with <span>.
    Just as <b class="vector"> says more than <span class="vector">. The
    former says, loosely speaking, 'here we have an element with undefined
    meaning, but the preferred visual rendering is bold'. It does not say
    what the meaning is, but it may give a hint.

    > Classes can be used to give author defined semantics, even to
    > semantically empty elements.

    What author defined semantics? The class name has no meaning; it is
    simply a string. The author may have something in his mind, and someone
    reading the source code might get a hint if he happens to know the
    natural language from which the name had been taken. But this is
    different from the hint given by <b> (or by <nobr>, even if you regard it
    as presentational only), as defined by the _markup language_.
    Would you understand the author defined semantics of
    class="lauseke" or class="korostus"?

    --
    Yucca, http://www.cs.tut.fi/~jkorpela/
    Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html


    • Henri Sivonen

      #17
      Re: non-breaking hyphen

      In article <Xns95A0E6BA54379jkorpelacstutfi@193.229.0.31>,
      "Jukka K. Korpela" <jkorpela@cs.tut.fi> wrote:

      > Similar considerations might even apply to
      > the use of &nbsp; - which is universally supported by browsers but which
      > still might cause problems in cut & paste operations for example, since
      > it is by definition a character distinct from the space character.
      >
      > Using <nobr> avoids the problem.

      If having a non-breaking space is important, isn't it important to copy
      it as well?
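
A small sketch of the cut-and-paste issue under discussion, using Python's standard unicodedata module. The sample string "10 km" and the variable names are assumptions for illustration:

```python
import unicodedata

NBSP = "\u00a0"   # the character that &nbsp; resolves to
SPACE = "\u0020"  # the ordinary space

copied = f"10{NBSP}km"  # text copied with the no-break space preserved
typed = f"10{SPACE}km"  # what someone would type by hand

# The strings look alike on screen but do not compare equal.
print(copied == typed)
print(unicodedata.name(NBSP))

# Compatibility normalization (NFKC) folds the no-break space into a plain space.
print(unicodedata.normalize("NFKC", copied) == typed)
```

Whether the copied text should keep the no-break space is exactly the question raised here; the sketch only shows that the two characters are distinct until something normalizes them.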

      --
      Henri Sivonen
      hsivonen@iki.fi

      Mozilla Web Author FAQ: http://mozilla.org/docs/web-developer/faq.html


      • David Ross

        #18
        Re: non-breaking hyphen

        The Bicycling Guitarist wrote:
        >
        > Hi. I found the following when trying to learn if there is such a thing as a
        > non-breaking hyphen. Apparently Unicode has a ‑ but that is not
        > well-supported, especially in older browsers. Somebody somewhere said:
        >
        > Alternately, you can use CSS to declare a class having:
        >
        > .nowrap { white-space:nowrap }
        >
        > ... and then wrap the compound word in a <span class=nowrap></span> tag (or
        > any other suitable inline tag). You can also try { white-space:pre } ...
        >
        > I wasn't sure where to post this, because part of the question is about the
        > character entity that apparently is NOT defined in html? However, what about
        > the CSS idea for non-wrapping? On one of my pages
        > www.TheBicyclingGuitarist.net/newstuff.htm I give credit to some folks at
        > comp.infosystems.www.authoring.site-design. I want the hyphen in between
        > site and design to be a non-breaking one.

        I don't understand. In Mozilla, hyphens (&minus;, ISO 8859-1
        &#045;, 0x2D) are non-breaking.

        --

        David E. Ross
        <http://www.rossde.com/>

        I use Mozilla as my Web browser because I want a browser that
        complies with Web standards. See <http://www.mozilla.org/>.


        • Jukka K. Korpela

          #19
          Re: non-breaking hyphen

          David Ross <nobody@nowhere.not> wrote:

          > I don't understand. In Mozilla, hyphens (&minus;, ISO 8859-1
          > &#045;, 0x2D) are non-breaking.

          The entity reference &minus; denotes the minus sign, which is not a
          hyphen at all.

          ISO 8859-1 is irrelevant here.

          The character reference &#045; denotes the hyphen-minus character
          (ASCII hyphen), and 0x2D is a common way of mentioning its hexadecimal
          code in several standards.
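
A quick way to check which characters those references denote, sketched with Python's standard html and unicodedata modules. The helper name describe and the extra references &#x2D; and &#8209; are assumptions added for comparison:

```python
import html
import unicodedata


def describe(reference: str) -> tuple[str, str]:
    """Resolve an HTML character/entity reference and return (code point, Unicode name)."""
    ch = html.unescape(reference)
    return f"U+{ord(ch):04X}", unicodedata.name(ch)


for ref in ["&minus;", "&#045;", "&#x2D;", "&#8209;"]:
    code_point, name = describe(ref)
    print(ref, code_point, name)
```

The output separates the minus sign (U+2212) from the hyphen-minus (U+002D) and from the non-breaking hyphen (U+2011) that started the thread.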

          The Unicode line breaking rules allow a line break after a hyphen-minus
          character, and IE (and Opera) applies this principle. The problem we are
          discussing is that such breaks are often undesirable.

          The rules don't imply that a program _must_ break a line after a
          hyphen-minus character in any particular occasion. But IE (and Opera)
          rather mechanically breaks after it.
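
One character-level workaround, sketched in Python under the assumption that the target browsers render U+2011 at all (the original question notes that support was poor in older browsers). The function name protect_hyphens is hypothetical:

```python
import re

NON_BREAKING_HYPHEN = "\u2011"


def protect_hyphens(text: str) -> str:
    """Replace a hyphen-minus between two word characters with U+2011."""
    # Lookbehind/lookahead leave free-standing hyphens such as "a - b" alone.
    return re.sub(r"(?<=\w)-(?=\w)", NON_BREAKING_HYPHEN, text)


protected = protect_hyphens("comp.infosystems.www.authoring.site-design")
print(protected.endswith("site\u2011design"))
```

This changes the document's text rather than its markup, so anyone who copies the newsgroup name gets U+2011 instead of the hyphen-minus a newsreader expects; that is the trade-off against <nobr> or white-space:nowrap argued over in this thread.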

          --
          Yucca, http://www.cs.tut.fi/~jkorpela/
          Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html


          • Eric B. Bednarz

            #20
            Re: non-breaking hyphen

            Lachlan Hunt <spam.my.gspot@gmail.com> writes:

            > Using <nobr> is tag soup, so that's just being hypocritical.

            Your favourite in-itself-hypocritical spec-of-the-week-club aside, it's
            no more tag soup than PI substitutes like BR or HR or mystery-meat
            attributes like WIDTH and HEIGHT.


            --
            | ) Più Cabernet,
            -( meno Internet.
            | ) http://bednarz.nl/


            • Malcolm Dew-Jones

              #21
              Re: non-breaking hyphen

              Jukka K. Korpela (jkorpela@cs.tut.fi) wrote:
              : Lachlan Hunt <spam.my.gspot@gmail.com> wrote:

              : >> It's surely _more_ semantic than the W3C approach which moves us
              : >> down to the character level.
              : >
              : > It depends. Some situations may be more appropriately marked up
              : > using elements, and others may be better left at the character level.

              : Moving it to character level means that presentational features have been
              : wired in into the document's textual content. Isn't this worse than
              : wiring it in into markup around the content?

              Thank you for those words. That was what I was trying to get at.

              No matter what the desirable semantics of html might be, it still seems
              backwards to me that the "low level logic" of the individual symbols used
              in the document could have more control over the presentation than the
              "high level logic" of the markup language used by a tool that is very much
              concerned with the presentation details of the document.


              In another part of this thread I said
              >> Surely they don't have any applicability in any text except as the
              >> application chooses them to have applicability.

              and Alan J. Flavell responded

              >That looks like a tautology to me!

              What I meant was that the unicode standard, in my opinion, should not
              define anything but the mapping of character values to the character's name
              and an acceptable glyph for it. Everything else should be handled via
              higher level logic. The exact details appropriate to that higher logic
              would depend on the technologies being used, not on unicode. At least
              that is my opinion after seeing how complex the whole issue of unicode
              seems to have become, compared to the simple simple simple original idea
              of solving character set problems by defining a new standard character set
              that simply defined far more characters than the old standard ascii
              character set and enabled this by simply requiring computers and software
              to use more bits per character.

              Well ok, for various reasons the characters appear to need to be able to
              indicate certain things such as line breaks, but even that level of
              formatting information in the character set should be de-supported, except
              to define sets of reserved values available to applications to use as they
              see fit (and obviously some of those values would end up having "well
              known semantics").

              Somehow, the original discussion seemed to touch on those ideas, that's
              all.


              • Lachlan Hunt

                #22
                Re: non-breaking hyphen

                Eric B. Bednarz wrote:
                > Lachlan Hunt <spam.my.gspot@gmail.com> writes:
                >>Using <nobr> is tag soup, so that's just being hypocritical.
                >
                > Your favourite in-itself-hypocritical spec-of-the-week-club aside,

                I have no idea which "club" you're talking about. I try to avoid being
                hypocritical, and if I have, could you please explain so I can correct
                myself?

                > it's no more tag soup than PI substitutes like BR or HR or mystery-meat
                > attributes like WIDTH and HEIGHT.

                Unlike <nobr>, those elements and attributes do actually exist in the
                HTML specifications. Although, if your point is that they are also
                presentational, then I would somewhat agree.

                It is true that in some cases, those elements and attributes can be
                considered presentational, especially given the poorly structured design
                of the <br> and <hr> elements, which I'm sure has been discussed many
                times before. However, if they are used correctly, they can be
                reasonably semantic and their use is certainly not as bad as <nobr>.

                --
                Lachlan Hunt

                http://GetFirefox.com/ Rediscover the Web
                http://SpreadFirefox.com/ Igniting the Web


                • Alan J. Flavell

                  #23
                  Re: non-breaking hyphen

                  On Mon, 14 Nov 2004, Malcolm Dew-Jones wrote:

                  > What I meant was that the unicode standard, in my opinion, should
                  > not define anything but the mapping of character values to the
                  > character's name and an acceptable glyph for it.

                  I think that's what the iso-10646 part aims to do. You'll recall that
                  there were originally two separate pushes trying to address the i18n
                  problem: iso-10646, and Unicode, and that they sort-of spliced
                  themselves together. This would have been in the early 1990's,
                  roughly, IIRC. But the joins still show in a number of places.

                  The Unicode specification goes quite some way beyond merely assigning
                  code points to characters. It not only codifies characters (in terms
                  of case mapping, directionality, combining properties etc.) but also
                  defines a number of characters meant to exercise control functions
                  without corresponding to any displayable glyph (zero-width joiner and
                  non-joiner, directionality-control etc.). These are defined primarily
                  for their use in plain-text data; their applicability in particular
                  applications such as a markup language is less obvious, and needs to
                  be codified by the markup language (as indeed occurs to some extent in
                  HTML).
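
A short sketch of the kinds of properties and control characters mentioned above, read from Python's standard unicodedata module. The sample characters and the helper name properties are choices made for illustration:

```python
import unicodedata


def properties(ch: str) -> tuple[str, str]:
    """Return the Unicode name and general category of a single character."""
    return unicodedata.name(ch), unicodedata.category(ch)


# Control-like characters with no visible glyph: general category Cf ("format").
for ch in ["\u200d", "\u200c", "\u202c"]:
    name, category = properties(ch)
    print(f"U+{ord(ch):04X}", name, category)

# Directionality: U+202C carries its own bidirectional class.
print(unicodedata.bidirectional("\u202c"))

# Combining property: the canonical combining class of COMBINING ACUTE ACCENT.
print(unicodedata.combining("\u0301"))

# Case mapping is not always one character to one character.
print("\u00df".upper())
```

None of this is code point assignment alone, which illustrates how far the standard goes beyond a plain character table.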

                  You might be of the opinion that that was inadvisable - was being done
                  at the wrong protocol level etc. - and for sure you'd have quite a few
                  arguments in your favour, but I'm afraid we have to take Unicode as it
                  is now, whatever we might think about such details.

                  > Everything else should be handled via higher level logic. The exact
                  > details appropriate to that higher logic would depend on the
                  > technologies being used, not on unicode.

                  Same answer, I guess. HTML -could- have said that characters like
                  zero-width joiner, pop directional format, etc. had no business being
                  in an HTML source document, and that such matters had to be resolved
                  at the markup or presentation level; but HTML didn't say that - quite
                  the contrary, in fact. For better or for worse.


                  • Spartanicus

                    #24
                    Re: non-breaking hyphen

                    Lachlan Hunt <spam.my.gspot@gmail.com> wrote:

                    >Unlike <nobr>, those elements and attributes do actually exist in the
                    >HTML specifications. Although, if your point is that they are also
                    >presentational, then I would somewhat agree.
                    >
                    >It is true that in some cases, those elements and attributes can be
                    >considered presentational, especially given the poorly structured design
                    >of the <br> and <hr> elements, which I'm sure has been discussed many
                    >times before. However, if they are used correctly, they can be
                    >reasonably semantic and their use is certainly not as bad as <nobr>.

                    UAs don't give a flying monkey if the markup is valid. Proper use of
                    <nobr> causes no problems, it prevents several UAs from applying
                    ludicrous unicode breaking rules, and a custom DTD solves the errors on
                    validation.

                    So what is your argument on why it is "bad"?

                    --
                    Spartanicus


                    • Eric B. Bednarz

                      #25
                      Re: non-breaking hyphen

                      Lachlan Hunt <spam.my.gspot@gmail.com> writes:

                      > Eric B. Bednarz wrote:

                      >> Your favourite in-itself-hypocritical spec-of-the-week-club aside,
                      >
                      > I have no idea which "club" you're talking about.

                      I was forward-guessing that the real upshot about 'tag soup' was the
                      mere absence of NOBR in W3C specs.

                      >> it's no more tag soup than PI substitutes like BR or HR or mystery-meat
                      >> attributes like WIDTH and HEIGHT.
                      >
                      > Unlike <nobr>, those elements and attributes do actually exist in the
                      > HTML specifications.

                      Well, who cares a rat's private parts.

                      > It is true that in some cases, those elements and attributes can be
                      > considered presentational, especially given the poorly structured
                      > design of the <br> and <hr> elements,

                      Well, the vocabulary of HTML being so blunt a tool is the reason that
                      sometimes presentational markup is better than nothing (e.g. <I> for
                      anything that is denoted with italics in conventional typography but
                      falls short of a corresponding element type in HTML is still richer than
                      SPAN -- or nothing). You can argue about the virtue of such issues
                      until the cows come home.

                      BR, however, is not about *descriptive markup* (in SGML: tags) at all.
                      It tells the application to *do* something (e.g. explode, play some
                      music, render a new line; in SGML: processing instructions), though one
                      could probably also argue that a character reference should do the
                      trick: the parser collapses the HTML whitespace chars, and resolved
                      character references for CR/LF are passed to the application for literal
                      rendering. This -- like anything SGML related in HTML -- doesn't have
                      anything to do with web browsers, or real life in general, of course.

                      > However, if they are used correctly, they can be reasonably semantic
                      > and their use is certainly not as bad as <nobr>.

                      I still do not see what is bad about NOBR (or WBR, for that matter). In
                      the worst case scenario nothing happens.

                      Let's look at a slightly modified version of your earlier statement, and
                      pretend the double hyphen/minus was an em dash.

                      | Those elements and attributes--unlike NOBR--do actually exist in the
                      | HTML specifications.

                      If you observe UA behaviour and Unicode line breaking rules, you'll
                      realise that you need some presentational markup to the rescue:

                      | Those elements and attributes<wbr>--<wbr><nobr>unlike
                      | NOBR</nobr><wbr>--<wbr>do actually exist in the
                      | HTML specifications.

                      Neat, no? :)


                      --
                      | ) Più Cabernet,
                      -( meno Internet.
                      | ) http://bednarz.nl/


                      • Shmuel (Seymour J.) Metz

                        #26
                        Re: non-breaking hyphen

                        In <41985667@news.victoria.tc.ca>, on 11/14/2004
                        at 11:10 PM, yf110@vtn1.victoria.tc.ca (Malcolm Dew-Jones) said:

                        >No matter what the desirable semantics of html might be, it still
                        >seems backwards to me that the "low level logic" of the individual
                        >symbols used in the document could have more control over the
                        >presentation than the "high level logic" of the markup language used
                        >by a tool that is very much concerned with the presentation details
                        >of the document.

                        Actually, it is backwards for the HTML to be very much concerned with
                        the presentation details of the document. That's not what HTML was
                        intended for.

                        >What I meant was that the unicode standard, in my opinion, should not
                        >define anything but the mapping of character values to the character's
                        >name and an acceptable glyph for it.

                        I strongly disagree. That might be acceptable for character data in
                        your HTML, but it breaks entry of data into forms.

                        >At least that is my opinion after seeing how complex the whole issue
                        >of unicode seems to have become,

                        It isn't just Unicode, and it didn't "become" complex; it was always
                        complex. You're looking at it from the perspective of an Indo-European
                        language, and aren't seeing all of the issues.

                        --
                        Shmuel (Seymour J.) Metz, SysProg and JOAT <http://patriot.net/~shmuel>

                        Unsolicited bulk E-mail subject to legal action. I reserve the
                        right to publicly post or ridicule any abusive E-mail. Reply to
                        domain Patriot dot net user shmuel+news to contact me. Do not
                        reply to spamtrap@library.lspace.org


                        • Malcolm Dew-Jones

                          #27
                          Re: non-breaking hyphen

                          Shmuel (Seymour J.) Metz (spamtrap@library.lspace.org.invalid) wrote:
                          : In <41985667@news.victoria.tc.ca>, on 11/14/2004
                          : at 11:10 PM, yf110@vtn1.victoria.tc.ca (Malcolm Dew-Jones) said:

                          : >No matter what the desirable semantics of html might be, it still
                          : >seems backwards to me that the "low level logic" of the individual
                          : >symbols used in the document could have more control over the
                          : >presentation than the "high level logic" of the markup language used
                          : >by a tool that is very much concerned with the presentation details
                          : >of the document.

                          : Actually, it is backwards for the HTML to be very much concerned with
                          : the presentation details of the document. That's not what HTML was
                          : intended for.

                          I said that HTML was used by a _tool_ that is concerned with presentation.

                          Such tools make extensive presentation decisions based on html, and the
                          ability of the tools to make correct presentation decisions in a variety
                          of environments has always been a prime purpose for html.


                          : >What I meant was that the unicode standard, in my opinion, should not
                          : >define anything but the mapping of character values to the character's
                          : >name and an acceptable glyph for it.

                          : I strongly disagree. That might be acceptable for character data in
                          : your HTML, but it breaks entry of data into forms.

                          How does it break the entry of data into forms?

                          : >At least that is my opinion after seeing how complex the whole issue
                          : >of unicode seems to have become,

                          : It isn't just Unicode, and it didn't "become" complex; it was always
                          : complex. You're looking at it from the perspective of an Indo-European
                          : language, and aren't seeing all of the issues.

                          That's right, I am. Various language issues should not be dealt with at
                          the level of character data exactly because some human language issues do
                          not easily map to character data. Trying to do so just makes everything
                          unnecessarily complicated for the languages that do map reasonably well to
                          such a simple system.

                          Those other issues should be dealt with by a higher level protocol.

                          If unicode had been kept simple then we all would have been using it for
                          all western-style languages many years ago, and all the effort currently
                          being spent would instead be used working on systems that work more
                          naturally for non-western-languages.

                          However, as has been pointed out, the decisions have been made and we have
                          to live with them.


                          • Shmuel (Seymour J.) Metz

                            #28
                            Re: non-breaking hyphen

                            In <419af24c@news.victoria.tc.ca>, on 11/16/2004
                            at 10:40 PM, yf110@vtn1.victoria.tc.ca (Malcolm Dew-Jones) said:

                            >I said that HTML was used by a _tool_ that is concerned with
                            >presentation.

                            And water is wet. Your point?

                            >How does it break the entry of data into forms?

                            Because it fails to present the data as the user expects when the
                            user enters Unicode data that are intended to have an effect on
                            presentation.

                            >That's right, I am. Various language issues should not be dealt
                            >with at the level of character data exactly because some human
                            >language issues do not easily map to character data.

                            The issues dealt with by Unicode do map easily.

                            >Those other issues should be dealt with by a higher level protocol.

                            No. What you want would destroy interoperability between applications.

                            >If unicode had been kept simple then we all would have been using it
                            >for all western-style languages many years ago,

                            Why? And why would its adoption matter if it didn't include
                            true internationalization?

                            --
                            Shmuel (Seymour J.) Metz, SysProg and JOAT <http://patriot.net/~shmuel>

                            Unsolicited bulk E-mail subject to legal action. I reserve the
                            right to publicly post or ridicule any abusive E-mail. Reply to
                            domain Patriot dot net user shmuel+news to contact me. Do not
                            reply to spamtrap@library.lspace.org
