number or name for special character

  • The Bicycling Guitarist

    number or name for special character

    A browser conforming to HTML 4.0 is required to recognize &#number;
    notations.


    If I use XHTML 1.0 and charset UTF-8 though, does &eacute; have as much
    support as &#233; ?

    Sometimes when I run the TIDY utility on my code, it replaces my character
    notations with weird looking things I don't recognize. Also, when I
    converted to UTF-8 from ISO-8859-1, I discovered many special characters
    didn't make the transition well.

    Thanks for all the help from the wonderful people in this group.


  • Mark Parnell

    #2
    Re: number or name for special character

    On Tue, 02 Nov 2004 01:36:55 GMT, The Bicycling Guitarist
    <Chris@TheBicyclingGuitarist.net> declared in
    comp.infosystems.www.authoring.html:

    > If I use XHTML 1.0 and charset UTF-8 though, does &eacute; have as much
    > support as &#233; ?

    In general numeric entities have better support than their named
    equivalents. Not sure about that one specifically.

    Not entirely what you were asking, but this shows how well supported
    each of the *numeric* entities are (a bit outdated now, though newer
    versions of browsers would presumably support at least the same ones
    that earlier versions did).

    > Sometimes when I run the TIDY utility on my code, it replaces my character
    > notations with weird looking things I don't recognize.

    Examples (ideally before and after URLs)?

    > Also, when I
    > converted to UTF-8 from ISO-8859-1, I discovered many special characters
    > didn't make the transition well.

    That's presumably because those characters are part of UTF-8 but not
    ISO-8859-1.

    --
    Mark Parnell

    • Tim

      #3
      Re: number or name for special character

      On Tue, 02 Nov 2004 01:36:55 GMT,
      "The Bicycling Guitarist" <Chris@TheBicyclingGuitarist.net> posted:

      > A browser conforming to HTML 4.0 is required to recognize &#number;
      > notations.
      >
      > If I use XHTML 1.0 and charset UTF-8 though, does &eacute; have as much
      > support as &#233; ?

      Quite probably. I haven't found a browser that didn't support entities
      for any of the characters that aren't too unusual. You're probably more
      likely to strike problems with the browser trying to display something that
      the current font is inadequate for than the browser not supporting the
      character.

      Personally, I prefer named entities to numerical references. If, for
      some reason, my browser can't display some reference it's going to show the
      code for what it can't do (*). I've got a fair guess at what I should have
      seen if it writes &eacute; on the page, but I'd have to look up what &#233;
      referred to.
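
For anyone who, like Tim, would have to look up what a number stands for, the lookup can be scripted. The following is only a sketch using Python's standard `html.entities` and `unicodedata` modules; the helper name `describe_reference` is made up for this illustration and handles decimal references only.

```python
import html
import re
import unicodedata
from html.entities import codepoint2name


def describe_reference(ref: str) -> str:
    """Describe a decimal character reference such as '&#233;'."""
    match = re.fullmatch(r"&#(\d+);", ref)
    if match is None:
        raise ValueError(f"not a decimal character reference: {ref!r}")
    code_point = int(match.group(1))
    char = chr(code_point)
    # codepoint2name only covers characters that have an HTML 4 entity name.
    entity = codepoint2name.get(code_point)
    named = f"&{entity};" if entity else "(no HTML 4 entity name)"
    # Some code points (e.g. control characters) have no Unicode name.
    return f"{ref} = {char} = {named} = {unicodedata.name(char, 'UNNAMED')}"


print(describe_reference("&#233;"))
# Both notations decode to the same character.
print(html.unescape("&eacute;") == html.unescape("&#233;"))
```

The first line printed pairs the number with its mnemonic name, which is the information a bare `&#233;` on the page does not give you.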

      * Some browsers don't show details for what you're missing out on, they'll
      just print a ? or a blank box. Very unhelpful...

      There's a point of view that says to avoid one thing in particular, though:
      The euro. With a recommendation to write the name, normally, rather than
      try and use a symbol for it. (There isn't always a symbol for it on the
      system, or it's not authored right and the browser, therefore, doesn't
      display it. Also, not all countries use it - in Australia you'd get very
      little comprehension about what a euro is.)

      > Sometimes when I run the TIDY utility on my code, it replaces my character
      > notations with weird looking things I don't recognize. Also, when I
      > converted to UTF-8 from ISO-8859-1, I discovered many special characters
      > didn't make the transition well.

      Usually that's because you're not actually authoring in the encoding system
      that you think you are. If you're editing in plain text editors on older
      Windows systems, like Win98SE, you're probably better off telling tidy that
      it's receiving win1252 encoding. For newer systems, or fancier editors, it
      might be one of the UTF schemes, but perhaps not UTF-8 (Windows' idea of
      Unicode might be UTF-16 or UTF-8, depending on the application - but with no
      indication of which it's using, you'll have to test things).

      Say what system and software you're using, someone might be able to tell
      you what it's doing, or let you know if it has any peculiar foibles.
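
The mismatch Tim describes (bytes written in one encoding, read as another) is easy to reproduce. The sketch below is an illustration only: the sample string is invented, and `cp1252` is Python's codec name for what Tim calls win1252.

```python
text = "café Ö ©"

latin1_bytes = text.encode("iso-8859-1")  # é is the single byte 0xE9
utf8_bytes = text.encode("utf-8")         # é is the two bytes 0xC3 0xA9

# UTF-8 bytes misread as Windows-1252: every non-ASCII character
# turns into two "weird looking" characters.
mojibake = utf8_bytes.decode("cp1252")
print(mojibake)

# ISO-8859-1 bytes mislabelled as UTF-8 are not even valid UTF-8.
try:
    latin1_bytes.decode("utf-8")
    strict_decode_failed = False
except UnicodeDecodeError as exc:
    strict_decode_failed = True
    print("not valid UTF-8:", exc.reason)

# A lenient decoder substitutes U+FFFD, which many fonts draw as a
# question mark or an empty box.
replaced = latin1_bytes.decode("utf-8", errors="replace")
print(replaced)
```

If a page shows either pattern, the declared charset and the bytes actually saved by the editor probably disagree.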

      --
      If you insist on e-mailing me, use the reply-to address (it's real but
      temporary). But please reply to the group, like you're supposed to.

      This message was sent without a virus, please delete some files yourself.

      • Pierre Goiffon

        #4
        Re: number or name for special character

        "Tim" <tim@mail.localhost.invalid> a écrit dans le message de
        news:1a3t0xd3bur4k.9svh92lazgno.dlg@40tude.net

        >> If I use XHTML 1.0 and charset UTF-8 though, does &eacute; have as
        >> much support as &#233; ?

        If your page is really encoded in UTF-8, you shouldn't need to use entities!

        > There's a point of view that says to avoid one thing in particular,
        > though: The euro.

        I thought support for the Euro symbol nowadays was really wide: Windows
        and Macintosh at least have supported it for years... Are the computers outside the
        Euro zone running such old software that they don't support the Euro symbol?

        • Alan J. Flavell

          #5
          Re: number or name for special character

          On Tue, 2 Nov 2004, Mark Parnell wrote:

          > On Tue, 02 Nov 2004 01:36:55 GMT, The Bicycling Guitarist
          > <Chris@TheBicyclingGuitarist.net> declared in
          > comp.infosystems.www.authoring.html:
          >
          > > If I use XHTML 1.0 and charset UTF-8 though, does &eacute; have as much
          > > support as &#233; ?

          The answer to the question doesn't really depend on those
          preconditions. Support for character entities and numeric references
          doesn't somehow vary according to what version of (X)HTML you are
          writing. No properly-written browser (i.e. that excludes Netscape 4.*
          versions) changes its support for &-notation depending on what
          character encoding scheme ("charset") is in use.

          > In general numeric entities

          ...correctly known as "numeric character references"...

          > have better support than their named equivalents.

          "In general", this is true, agreed. But for the Latin-1 entities
          (that were defined in the appendix to RFC1866/HTML2.0), the coverage
          of the two &-notations is by now essentially the same - and at least
          the character entities have more mnemonic value.

          But the same can not be said for most of the additional character
          entities which were introduced in HTML4.

          However, if you are -really- using utf-8 (instead of just pretending,
          as suggested in
          http://ppewww.ph.gla.ac.uk/~flavell/...cklist.html#s6 ), then
          you don't need to use either of the &-notations (except of course for
          HTML-significant characters "<" and "&").

          I think you'll find the various options listed in that checklist are
          compatible with reality. If not, then I'm keen to hear about it.
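
As a rough illustration of the "really using utf-8" option, the sketch below escapes only the markup-significant characters and leaves everything else as literal characters in UTF-8 bytes. The function name `to_utf8_html_text` and the sample string are invented here, and the sketch assumes the result is also served with a matching charset declaration.

```python
import html


def to_utf8_html_text(text: str) -> bytes:
    """Escape only markup-significant characters, then encode as UTF-8."""
    # With quote=False, html.escape rewrites &, < and > and leaves every
    # other character (é, ö, © ...) untouched.
    escaped = html.escape(text, quote=False)
    return escaped.encode("utf-8")


body = to_utf8_html_text("Motörhead & café <live>")
print(body.decode("utf-8"))
```

Inside attribute values the quote characters need escaping as well, which is what `html.escape` does by default when `quote` is left at `True`.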

          • The Bicycling Guitarist

            #6
            Re: number or name for special character


            "Pierre Goiffon" <pgoiffon@nowhere.invalid> wrote in message
            news:41875a0f$0$7524$626a14ce@news.free.fr...
            > "Tim" <tim@mail.localhost.invalid> a écrit dans le message de
            > news:1a3t0xd3bur4k.9svh92lazgno.dlg@40tude.net
            >
            >>> If I use XHTML 1.0 and charset UTF-8 though, does &eacute; have as
            >>> much support as &#233; ?
            >
            > If your page is really encoded in UTF-8, you shouldn't need to use
            > entities!

            Enough said about the tech support at my host's i.s.p. I do include a meta
            tag specifying UTF-8.

            Is it bad though to put &copy; (for example) in the code instead of the
            copyright symbol character? I am especially curious about special characters
            in meta tags such as descriptions. I think I've seen some of my descriptions
            with the character entity rendered as text (i.e. spelling out the code for
            the entity instead of rendering the specified character) by some search
            engines or search engine simulators. Sorry I don't remember which ones. An
            interesting one to check though would be "Shopzilla"
            https://www.TheBicyclingGuitarist.ne.../shopzilla.htm where I use &Ouml; in meta
            tags and body text.

            separate thread issue: Should I wrap attributes in the code? Is there a
            recommended length of a line of code at which I should wrap? a maximum
            length? I was able to reduce file size by ten percent by eliminating some
            whitespace in some files while still preserving some "pretty print" to make
            editing easier. I have a lot of spaces to delete throughout my web site!

            Chris Watson a.k.a. "The Bicycling Guitarist"
            A guy, his bicycle and a Fender Stratocaster guitar. Lyrics, videos, essays and stories on various subjects.



            • Andreas Prilop

              #7
              Re: number or name for special character

              On Tue, 2 Nov 2004, The Bicycling Guitarist wrote:

              > X-Newsreader: Microsoft Outlook Express 6.00.2900.2180
              >
              >> "Tim" <tim@mail.localhost.invalid> a ?crit dans le message de

              You need to select

              Tools > Options > Send
              Mail Sending Format > Plain Text Settings > Message format MIME
              News Sending Format > Plain Text Settings > Message format MIME
              Encode text using: None

              to send special, non-ASCII characters.

              > Is it bad though to put &copy; (for example) in the code instead of the
              > copyright symbol character?

              No, why?

              > I am especially curious about special characters
              > in meta tags such as descriptions. I think I've seen some of my descriptions
              > with the character entity rendered as text (i.e. spelling out the code for
              > the entity instead of rendering the specified character) by some search
              > engines or search engine simulators.

              AltaVista did this in the past - but no longer.

              > I have a lot of spaces to delete throughout my web site!

              Use tabs instead of spaces.

              --
              Top-posting.
              What's the most irritating thing on Usenet?

              • Jukka K. Korpela

                #8
                Re: number or name for special character

                "Alan J. Flavell" <flavell@ph.gla.ac.uk> wrote:

                > But for the Latin-1 entities
                > (that were defined in the appendix to RFC1866/HTML2.0), the coverage
                > of the two &-notations is by now essentially the same

                _Except_ when genuine XHTML is used. A non-validating XML parser is not
                required to process an external subset, and technically the entities are
                defined in an external subset. If you serve an XHTML document genuinely
                as XHTML, i.e. with Content-Type: application/xhtml+xml, then a
                conforming browser is not required to recognize predefined entity
                references. And Opera indeed fails to recognize them; and it has been
                reported that so does Safari.

                Hence, there's little point in using entities for characters, if you use
                XHTML.

                Character references such as &#233; work on all browsers.
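
The behaviour described above can be observed with any XML parser that does not read the DTD. The sketch below uses Python's `xml.etree.ElementTree` (built on the non-validating expat parser) as a stand-in for a browser's XML mode; that substitution is an assumption made for illustration, not a test of Opera or Safari.

```python
import xml.etree.ElementTree as ET

# A numeric character reference needs no DTD, so it always resolves.
numeric = ET.fromstring("<p>caf&#233;</p>")
print(numeric.text)

# A named HTML entity is undefined unless the DTD that declares it is
# actually read, so a parser that skips the DTD rejects the document.
try:
    ET.fromstring("<p>caf&eacute;</p>")
    named_rejected = False
except ET.ParseError as exc:
    named_rejected = True
    print("parse error:", exc)

# The five entities predefined by XML itself are always available.
predefined = ET.fromstring("<p>&amp; &lt; &gt; &quot; &apos;</p>")
print(predefined.text)
```

Note that `&amp;`, `&lt;`, `&gt;`, `&quot;` and `&apos;` are part of XML 1.0 itself, so they do not depend on the external subset the way `&eacute;` or `&copy;` do.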

                --
                Yucca, http://www.cs.tut.fi/~jkorpela/
                Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

                • The Bicycling Guitarist

                  #9
                  Re: number or name for special character


                  "Jukka K. Korpela" <jkorpela@cs.tut.fi> wrote in message
                  news:Xns9595C54569916jkorpelacstutfi@193.229.0.31...
                  > "Alan J. Flavell" <flavell@ph.gla.ac.uk> wrote:
                  >
                  >> But for the Latin-1 entities
                  >> (that were defined in the appendix to RFC1866/HTML2.0), the coverage
                  >> of the two &-notations is by now essentially the same
                  >
                  > _Except_ when genuine XHTML is used. [...]
                  >
                  > Hence, there's little point in using entities for characters, if you use
                  > XHTML.
                  >
                  > Character references such as &#233; work on all browsers.

                  Should I convert all the &quot; &copy; &Ouml; and so on throughout my site
                  into the &#number; form? What about &amp;? Should it also be in a &#number;
                  form?

                  A man's got to know his limitations
                  (Dirty Harry)


                  • Alan J. Flavell

                    #10
                    Re: number or name for special character

                    On Tue, 2 Nov 2004, Jukka K. Korpela wrote:

                    > "Alan J. Flavell" <flavell@ph.gla.ac.uk> wrote:
                    >
                    > > But for the Latin-1 entities
                    > > (that were defined in the appendix to RFC1866/HTML2.0), the coverage
                    > > of the two &-notations is by now essentially the same
                    >
                    > _Except_ when genuine XHTML is used.

                    OK...

                    > A non-validating XML parser is not required to process an external
                    > subset, and technically the entities are defined in an external
                    > subset.

                    That's the theory, yes.

                    I'd remind anyone interested that the three forms of character
                    representation (&name; &#number; and the actual "coded character") are
                    genuine alternatives in HTML, and can be straightforwardly converted
                    into each other. Anyone who in future starts to encounter problems
                    with non-validating XML parsers should (assuming they have been
                    writing valid syntax) be able to pass their stuff through a trivial
                    convertor and have no worries at all about which form they prefer at
                    authoring time.
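
One possible shape for such a converter, sketched in Python under some loud assumptions: it works on raw text with a regular expression, so it knows nothing about comments, CDATA sections or script content, and the names `named_to_numeric`, `XML_PREDEFINED` and `ENTITY_RE` are invented for this example. The entity table is the HTML 4 list from the standard `html.entities` module.

```python
import re
from html.entities import name2codepoint

# Entities that XML itself predefines; these can stay as they are.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(markup: str) -> str:
    """Rewrite HTML 4 named entities as decimal character references."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        # Leave XML's own entities and unknown names untouched.
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return f"&#{name2codepoint[name]};"

    return ENTITY_RE.sub(replace, markup)


sample = "&copy; 2004 &Ouml;l &amp; caf&eacute; &quot;x&quot;"
print(named_to_numeric(sample))
```

Because the mapping is one-to-one, the reverse direction (numbers back to names, or either form to the literal character) is equally mechanical, which is why the authoring-time choice can be a matter of taste.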

                    > If you serve an XHTML document genuinely as XHTML, i.e. with
                    > Content-Type: application/xhtml+xml, then a conforming browser is
                    > not required to recognize predefined entity references. And Opera
                    > indeed fails to recognize them; and it has been reported that so
                    > does Safari.

                    AFAICS the only point in doing that is where you have a mind to *add*
                    something to HTML, such as SVG or mathML. It's utterly futile to go
                    writing XHTML/1.1 or later if in fact you're wanting nothing more than
                    what HTML/4.01 can provide.

                    > Hence, there's little point in using entities for characters, if you use
                    > XHTML.
                    >
                    > Character references such as &#233; work on all browsers.

                    I don't disagree. But those who are writing SVG and mathML and the
                    like, and expecting them to do something useful in the WWW context,
                    have quite a lot of other things to concern themselves with too.

                    But what's the question, *really* ?

                    * what should the author type (HTML notations, keyboard methods...) ?

                    * what processing should the editing software perform ?

                    and so on.

                    As an author, I'm not wanting to go typing in literally &#1234; (for
                    some value of 1234, which I'd have to learn for each character) for
                    every non-ascii character. It's a compromise.

                    • Spartanicus

                      #11
                      Re: number or name for special character

                      "Jukka K. Korpela" <jkorpela@cs.tut.fi> wrote:

                      >as XHTML, i.e. with Content-Type: application/xhtml+xml, then a
                      >conforming browser is not required to recognize predefined entity
                      >references. And Opera indeed fails to recognize them;

                      Again: Opera recognizes character references in x(ht)ml mode since 7.5.

                      --
                      Spartanicus

                      • Brian

                        #12
                        Re: number or name for special character

                        The Bicycling Guitarist wrote:
                        > "Jukka K. Korpela" wrote ...
                        >
                        >> _Except_ when genuine XHTML is used.
                        >>
                        >> Hence, there's little point in using entities for characters, if
                        >> you use XHTML.
                        >>
                        >> Character references such as &#233; work on all browsers.
                        >
                        >
                        > Should I convert all the &quot; &copy; &Ouml; and so on throughout my
                        > site into the &#number; form?

                        No. Just use HTML 4.01 (strict).

                        --
                        Brian (remove "invalid" to email me)

                        • Jan Roland Eriksson

                          #13
                          Re: number or name for special character

                          On Tue, 2 Nov 2004 17:24:53 +0000 (UTC), "Jukka K. Korpela"
                          <jkorpela@cs.tut.fi> wrote:

                          [...]

                          >Character references such as &#233; work on all browsers.

                          Agreed; but it, sort of, defeats the idea of a "late binding" between
                          what gets put into a web page and how that thing gets handled by a UA.

                          --
                          Rex


                          • The Bicycling Guitarist

                            #14
                            Re: number or name for special character


                            "Jan Roland Eriksson" <jrexon@newsguy.com> wrote in message
                            news:91vfo0t4i2rtjg37nih10eiotv8skm5tjt@4ax.com...
                            > On Tue, 2 Nov 2004 17:24:53 +0000 (UTC), "Jukka K. Korpela"
                            > <jkorpela@cs.tut.fi> wrote:
                            >
                            > [...]
                            >
                            >>Character references such as &#233; work on all browsers.
                            >
                            > Agreed; but it, sort of, defeats the idea of a "late binding" between
                            > what gets put into a web page and how that thing gets handled by a UA.

                            LOL. This morning I replaced most of the named entities in my web site by
                            their equivalent &#number; forms.

                            Two characters I am not certain of are &quot; and &amp;. Should these
                            be replaced too, assuming I stick with XHTML instead of going back to HTML
                            4.01? I couldn't find their number forms in the table of characters I was
                            using.

                            Thanks to everyone, again. Feel free to point to my site as an example of a
                            personal web site whose author at least *tries* to conform to W3C
                            recommendations.

                            Chris Watson a.k.a. "The Bicycling Guitarist"
                            A guy, his bicycle and a Fender Stratocaster guitar. Lyrics, videos, essays and stories on various subjects.



                            • The Bicycling Guitarist

                              #15
                              Re: number or name for special character


                              "The Bicycling Guitarist" <Chris@TheBicyclingGuitarist.net> wrote in message
                              news:MpUhd.17058$6q2.16643@newssvr14.news.prodigy.com...
                              > Two characters I am not certain of are the &quot; and the &amp; Should
                              > these be replaced too, assuming I stick with xhtml instead of going back
                              > to html 4.01? I couldn't find their number forms in the table of
                              > characters I was using.

                              I found &#34; for &quot; and &#38; for &amp;.
                              I still don't know if I should replace these (or any of the) named forms
                              with the numbered forms. I really don't want to go back to a de facto
                              standard that was introduced in 1997, even though I was shocked to learn
                              that none of the entities named or numbered are recognized by xhtml 1.

