preferred charset?

  • Jane Withnolastname

    preferred charset?

    I have been using the charset windows-1252 for a while, but it was
    pointed out to someone else in this group recently that it's a
    Microsoft creation (I'm sure I'm getting my facts wrong or skewed) and
    therefore not good for cross-platform browsing.
    Anyway, I am beginning my road to recovery (ie, breaking my addiction
    to authoring only for IE) and I would like to know what is the
    preferred charset?
    I have tried a search and only find immense lists that make me
    cross-eyed without ever telling me which to use to utilize a full
    range of characters and have them display the way I intend on
    English-speaking machines.
    I'm not sure of the proper term, but I always use the & character
    substitutes for anything that doesn't show up on my keyboard so,
    ideally, the charset should display those, right? (For instance, if I
    want to display Montréal, I would input Montr&eacute;al.)
    Thanks!
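
On whether the charset affects the "&" substitutes: a small sketch suggesting it does not. Python is used here only as a convenient checker (the thread itself involves no Python), and the variable names are illustrative. `html.unescape` is from the standard library and follows HTML5 rules, which agree with HTML 4 for `&eacute;`.

```python
import html

# Source text that uses only ASCII characters plus an entity reference.
source = "Montr&eacute;al"

# Every character is ASCII, so the bytes on disk are identical
# whichever of these charsets the page declares.
encodings = ["ascii", "iso-8859-1", "windows-1252", "utf-8"]
encoded = {name: source.encode(name) for name in encodings}
same_bytes = len(set(encoded.values())) == 1

# The entity is resolved after decoding, so the result is charset-independent.
displayed = html.unescape(source)

print(same_bytes)
print([hex(ord(ch)) for ch in displayed if ord(ch) > 127])
```

In other words, a page written entirely with ASCII plus entity references should render the same under any of the charsets discussed here.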
  • Mad Bad Rabbit

    #2
    Re: preferred charset?

    Jane Withnolastname <JaneWithnolastnameNOSPAM@yahoo.com> wrote:
    [color=blue]
    > Anyway, I am beginning my road to recovery (ie, breaking my addiction
    > to authoring only for IE) and I would like to know what is the
    > preferred charset?[/color]

    Probably UTF-8.

    [color=blue]
    >;K[/color]


    • Jane Withnolastname

      #3
      Re: preferred charset?

      On Wed, 27 Aug 2003 22:13:47 -0500, Mad Bad Rabbit
      <madbadrabbit@yahoo.com> wrote:
      [color=blue]
      >Jane Withnolastname <JaneWithnolastnameNOSPAM@yahoo.com> wrote:
      >[color=green]
      >> Anyway, I am beginning my road to recovery (ie, breaking my addiction
      >> to authoring only for IE) and I would like to know what is the
      >> preferred charset?[/color]
      >
      >Probably UTF-8.
      >[/color]

      I tried UTF-8 a while ago (thinking it was the right one) and went
      back to the windows one because I got odd results from it. However,
      now that you have suggested it, I have gone back to the file I had
      problems with and see that I was using open and close quotes (instead
      of the regular quotes on the keyboard). I don't know how they got in
      there, because I really had to search to figure out how I got them, so
      I must have copy&pasted it.
      Anyway, here's a sorta related question: is it acceptable to write
      ASCII codes into html? As in the above example, it would be &#147; for
      the open quote, and &#148; for the close quote.
      Is that acceptable, or is there another way, similar to the preferred
      method of using &eacute; rather than &#233;?
      Or would I be better advised to stick with regular quotes and never
      mind special ASCII-only characters?
      And while we're on quotes ... is it acceptable to use the quote key to
      put them in a file, or is it better to use &quot;?
      Thanks again. I'm feeling quite stupid right now :)

      P.S. Is there a list somewhere of all the alternate characters? I'd
      try a search, but I don't know the proper term for these.
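
One plausible explanation for the "odd results" with UTF-8, sketched in Python. This is an assumption: the thread does not say how the file was saved. If the pasted curly quotes were stored as windows-1252 bytes, those bytes are not valid UTF-8.

```python
# Bytes 0x93 and 0x94 are the curly double quotes in windows-1252.
raw = b"\x93quoted\x94"

as_cp1252 = raw.decode("windows-1252")

# The same bytes are not valid UTF-8, so a strict decode fails ...
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# ... and a lenient decode substitutes U+FFFD replacement characters.
as_utf8_lenient = raw.decode("utf-8", errors="replace")

print([hex(ord(ch)) for ch in (as_cp1252[0], as_cp1252[-1])])
print(utf8_ok)
```

A browser told the page is UTF-8 has to do something with such bytes, which is typically where question marks or boxes come from.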


      • Mad Bad Rabbit

        #4
        Re: preferred charset?

        Jane Withnolastname <JaneWithnolastnameNOSPAM@yahoo.com> wrote:
        [color=blue]
        > Anyway, here's a sorta related question: is it acceptable to write
        > ASCII codes into html? As in the above example, it would be &#147; for
        > the open quote, and &#148; for the close quote.[/color]

        Not if you're declaring the codeset to be Unicode.
        &#147; isn't a double-quote unless you use codeset 1252.

        To find a given character in Unicode, go to:



        and look at (for example) "General Punctuation" chart.
        In Unicode, the double-quotes you want are assigned
        codes &#x201C; and &#x201D;
        [color=blue]
        > Is that acceptable, or is there another way, similar
        > to the preferred method of using &eacute; rather than &#233;?[/color]

        Good question: yes there are. See

        A table of the HTML 4 entities for markup-significant and internationalization characters.


        You can use entities &ldquo; and &rdquo; for these characters.
        (which is a lot easier to remember than the numeric codes).
        [color=blue]
        > P.S. Is there a list somewhere of all the alternate characters? I'd
        > try a search, but I don't know the proper term for these.[/color]

        They're called "HTML entities", and the above site has lists.


        HTH
        [color=blue]
        >;K[/color]
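
A quick check, sketched in Python, that the named entity and the numeric references mentioned above all denote the same characters. The decimal forms `&#8220;` and `&#8221;` are added here for comparison; they are simply 0x201C and 0x201D written in base 10.

```python
import html

# Three spellings of the left curly double quote, three of the right one.
left_forms = ["&ldquo;", "&#x201C;", "&#8220;"]
right_forms = ["&rdquo;", "&#x201D;", "&#8221;"]

left = [html.unescape(form) for form in left_forms]
right = [html.unescape(form) for form in right_forms]

print([hex(ord(ch)) for ch in left])
print([hex(ord(ch)) for ch in right])
```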


        • Headless

          #5
          Re: preferred charset?

          Jane Withnolastname wrote:
          [color=blue]
          >I have been using the charset windows-1252 for a while, but it was
          >pointed out to someone else in this group recently that it's a
          >Microsoft creation (I'm sure I'm getting my facts wrong or skewed) and
          >therefore not good for cross-platform browsing.
          >Anyway, I am beginning my road to recovery (ie, breaking my addiction
          >to authoring only for IE) and I would like to know what is the
          >preferred charset?
          >I have tried a search and only find immense lists that make me
          >cross-eyed without ever telling me which to use to utilize a full
          >range of characters and have them display the way I intend on
          >English-speaking machines.
          >I'm not sure of the proper term, but I always use the & character
          >substitutes for anything that doesn't show up on my keyboard so,
          >ideally, the charset should display those, right? (For instance, if I
          >want to display Montréal, I would input Montr&eacute;al .)[/color]

          I use ISO-8859-1 because it allows me to dispense with character
          references like &eacute; the source readability is much better without
          those codes.


          Headless

          --
          Email and usenet filter list: http://www.headless.dna.ie/usenet.htm
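
Typing é directly works only if the declared charset matches the encoding the file was actually saved in. A minimal Python sketch of the difference, with illustrative variable names:

```python
text = "Montr\u00e9al"  # the word "Montreal" with e-acute

# One byte for e-acute in ISO-8859-1, two bytes in UTF-8.
latin1_bytes = text.encode("iso-8859-1")
utf8_bytes = text.encode("utf-8")

# Reading UTF-8 bytes as if they were ISO-8859-1 yields two wrong characters.
misread = utf8_bytes.decode("iso-8859-1")

print(latin1_bytes)
print(utf8_bytes)
print(len(text), len(misread))
```

The last line prints different lengths because the single é has turned into two characters in the misread string.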


          • Jukka K. Korpela

            #6
            Re: preferred charset?

            Mad Bad Rabbit <madbadrabbit@yahoo.com> wrote:
            [color=blue][color=green]
            >> Anyway, here's a sorta related question: is it acceptable to write
            >> ASCII codes into html? As in the above example, it would be “ for
            >> the open quote, and ” for the close quote.[/color]
            >
            > Not if you're declaring the codeset to be Unicode.
            > “ isn't a double-quote unless you use codeset 1252.[/color]

            “ is undefined in HTML, no matter what you "declare" anywhere.
            This has been discussed dozens of times.

            --
            Yucca, http://www.cs.tut.fi/~jkorpela/
            Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html


            • Alan J. Flavell

              #7
              Re: preferred charset?

              On Thu, Aug 28, Mad Bad Rabbit inscribed on the eternal scroll:
              [color=blue]
              > Jane Withnolastname <JaneWithnolastnameNOSPAM@yahoo.com> wrote:
              >[color=green]
              > > Anyway, here's a sorta related question: is it acceptable to write
              > > ASCII codes into html?[/color][/color]

              ASCII is a 7-bit code, and has displayable characters in the range 32
              (space) to 126 inclusive. (x20 to x7e).
              [color=blue][color=green]
              > > As in the above example, it would be &#147; for
              > > the open quote, and &#148; for the close quote.[/color][/color]

              No.
              [color=blue]
              > Not if you're declaring the codeset to be Unicode.[/color]

              It would be helpful if you'd refrain from offering answers until you
              understand them.
              [color=blue]
              > &#147; isn't a double-quote unless you use codeset 1252.[/color]

              &#number; notations in the range 127 to 159 inclusive are undefined in
              HTML, and illegal in XHTML. No matter what this or that browser might
              happen to display when presented with them.

              I don't believe your term "codeset" means anything in HTML, SGML, XML
              or XHTML. It seems to be some confused conflation of the terms
              "character code" (or maybe "code page") and "character set". These
              are distinct concepts in HTML/XHTML, and any attempt to muddle them up
              is sure to be unhelpful.

              have fun.
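
A Python sketch of the distinction drawn above between a byte value in a particular encoding and a numeric character reference. One caveat, stated as an assumption about tooling rather than about HTML 4: Python's `html.unescape` follows the later HTML5 parsing rules, which remap these references for compatibility with existing pages, so its output shows browser-style error recovery, not a definition in HTML 4.

```python
import html
import unicodedata

# In Unicode / ISO 10646, code points 128-159 are all control characters.
c1_categories = {unicodedata.category(chr(n)) for n in range(128, 160)}

# windows-1252 assigns printable characters to some of the same byte values.
byte_147 = bytes([147]).decode("windows-1252")

# HTML5-style recovery maps the reference to what windows-1252 authors meant.
remapped = html.unescape("&#147;")

print(c1_categories)
print(hex(ord(byte_147)), hex(ord(remapped)))
```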


              • Stan Brown

                #8
                Re: preferred charset?

                In article <in0rkvkoa880c1t1qsvkmnvlarnoaj0eli@4ax.com> in
                comp.infosystems.www.authoring.html, Jane Withnolastname
                <JaneWithnolastnameNOSPAM@yahoo.com> wrote:[color=blue]
                >On Wed, 27 Aug 2003 22:13:47 -0500, Mad Bad Rabbit
                ><madbadrabbit@yahoo.com> wrote:[color=green]
                >>Jane Withnolastname <JaneWithnolastnameNOSPAM@yahoo.com> wrote:[color=darkred]
                >>> I would like to know what is the preferred charset?[/color]
                >>
                >>Probably UTF-8.[/color][/color]

                Good advice. See
                <http://ppewww.ph.gla.ac.uk/~flavell/charset/checklist.html>.
                [color=blue]
                >Anyway, here's a sorta related question: is it acceptable to write
                >ASCII codes into html?[/color]

                Yes, though it's unnecessary excel;t for > and &.
                [color=blue]
                >As in the above example, it would be &#147; for
                >the open quote, and &#148; for the close quote.[/color]

                No, those are not ASCII. They're not even Unicode; they are
                Microsoft creations. Any reference between &#128; and &#159;
                inclusive is wrong.

                You can create an open quote in three legal ways:
                " &ldquo; &#8220;
                and a close quote in three legal ways:
                " &rdquo; &#8221;

                The straight quote " works in all browsers without exception. If its
                appearance is acceptable to you (and it should be, since a great
                many Web sites use it), you need look no further.

                If you really want curly quotes, use the "entities" or the numeric
                references. Most browsers treat them exactly the same; a few (like
                Netscape 4 if I recall correctly) will handle the numeric references
                correctly but not the entities.
                [color=blue]
                >Is that acceptable, or is there another way, similar to the preferred
                >method of using &eacute; rather than &#233;?[/color]

                How is that the "preferred method"? The two are the same. As far as
                I know, browser support is the same.
                [color=blue]
                >Or would I be better advised to stick with regular quotes and never
                >mind special ASCII-only characters?[/color]

                Yes, I think so, if by "regular quotes" you mean the standard
                double-quote character on the keyboard. I don't know what you mean
                by "ASCII-only characters".
                [color=blue]
                >And while we're on quotes ... is it acceptable to use the quote key to
                >put them in a file, or is it better to use &quot;?[/color]

                There is no reason to use &quot; ever, that I am aware of.
                [color=blue]
                >P.S. Is there a list somewhere of all the alternate characters? I'd
                >try a search, but I don't know the proper term for these.[/color]

                "Numeric character references", but here's a terrific list:

                An indexed list of the 252 character entity references for letters with diacritics, Greek letters and various special characters that are supported in HTML 4.01, including many alternative names for characters.


                For numbers up to 255, it shouldn't matter whether you use the
                number or the entity. For higher numbers, some browsers do a better
                job with the number than with the entity.

                --
                Stan Brown, Oak Road Systems, Cortland County, New York, USA

                HTML 4.01 spec: http://www.w3.org/TR/html401/
                validator: http://validator.w3.org/
                CSS 2 spec: http://www.w3.org/TR/REC-CSS2/
                validator: http://jigsaw.w3.org/css-validator/
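
For anyone who would rather generate references than look them up, a rough Python sketch. The helper name `to_named_references` is hypothetical; `codepoint2name` is the standard library's table of HTML 4.01 entity names. The helper deliberately leaves `<`, `>` and `&` alone, so it is not a full escaping routine.

```python
from html.entities import codepoint2name

text = "Montr\u00e9al"

# Built-in error handler: non-ASCII characters become numeric references.
numeric = text.encode("ascii", "xmlcharrefreplace").decode("ascii")

def to_named_references(s):
    """Replace non-ASCII characters with named entities where one exists."""
    out = []
    for ch in s:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)
        elif cp in codepoint2name:
            out.append("&" + codepoint2name[cp] + ";")
        else:
            out.append("&#%d;" % cp)
    return "".join(out)

named = to_named_references(text)
quoted = to_named_references("\u201cHi\u201d")

print(numeric)
print(named)
print(quoted)
```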


                • Chris Morris

                  #9
                  Re: preferred charset?

                  Stan Brown <the_stan_brown@fastmail.fm> writes:[color=blue]
                  > There is no reason to use &quot; ever, that I am aware of.[/color]

                  <img src="quotechar.jpg" alt="&quot;">

                  --
                  Chris


                  • Stan Brown

                    #10
                    Re: preferred charset?

                    In article <87r835hmtg.fsf@dinopsis.dur.ac.uk> in
                    comp.infosystems.www.authoring.html, Chris Morris
                    <c.i.morris@durham.ac.uk> wrote:[color=blue]
                    >Stan Brown <the_stan_brown@fastmail.fm> writes:[color=green]
                    >> There is no reason to use &quot; ever, that I am aware of.[/color]
                    >
                    ><img src="quotechar.jpg" alt="&quot;">[/color]

                    <img src="quotechar.jpg" alt='"'> -- even aside from the fact that
                    the example is extremely unlikely to occur in practice. :-)

                    "Single quote marks can be included within the attribute value when
                    the value is delimited by double quote marks, and vice versa."
                    http://www.w3.org/TR/html401/intro/sgmltut.html#h-3.2.2

                    --
                    Stan Brown, Oak Road Systems, Cortland County, New York, USA



                    • Stan Brown

                      #11
                      Re: preferred charset?

                      In article <MPG.19b7db3ad76ea21c98b22f@news.odyssey.net> in
                      comp.infosystems.www.authoring.html, Stan Brown
                      <the_stan_brown@fastmail.fm> wrote:[color=blue]
                      >In article <in0rkvkoa880c1t1qsvkmnvlarnoaj0eli@4ax.com> in
                      >comp.infosystems.www.authoring.html, Jane Withnolastname[color=green]
                      >>Anyway, here's a sorta related question: is it acceptable to write
                      >>ASCII codes into html?[/color]
                      >
                      >Yes, though it's unnecessary excel;t for > and &.[/color]

                      Hmm -- I'm not sure how that got mangled. It should have read
                      "except for < > and &."

                      ASCII codes run 0 to 127; of them numbers 32 to 126 are displayable
                      (though 32 "displays" as a space).

                      --
                      Stan Brown, Oak Road Systems, Cortland County, New York, USA

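
The ASCII ranges quoted above can be confirmed with a few lines of Python. This is only a sanity check using the language's own notion of "printable", which happens to agree with the 32 to 126 range.

```python
# ASCII covers code points 0-127; collect the ones Python calls printable.
printable = [n for n in range(128) if chr(n).isprintable()]

first, last = printable[0], printable[-1]

# Anything above 127, such as e-acute, is outside ASCII.
e_acute_is_ascii = "\u00e9".isascii()

print(first, last, len(printable))
print(e_acute_is_ascii)
```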


                      • Chris Morris

                        #12
                        Re: preferred charset?

                        Stan Brown <the_stan_brown@fastmail.fm> writes:[color=blue]
                        > In article <87r835hmtg.fsf@dinopsis.dur.ac.uk> in
                        > comp.infosystems.www.authoring.html, Chris Morris
                        > <c.i.morris@durham.ac.uk> wrote:[color=green]
                        > >Stan Brown <the_stan_brown@fastmail.fm> writes:[color=darkred]
                        > >> There is no reason to use &quot; ever, that I am aware of.[/color]
                        > >
                        > ><img src="quotechar.jpg" alt="&quot;">[/color]
                        >
                        > <img src="quotechar.jpg" alt='"'> -- even aside from the fact that
                        > the example is extremely unlikely to occur in practice. :-)[/color]

                        <img src="quoteandapos.png" alt="&quot '">

                        Even more unlikely, yes, but user input could potentially contain
                        both. More realistically on the image:

                        <img src="quotation.png" alt="&quot;Quotation&quot; - John O'Name">
                        [color=blue]
                        > "Single quote marks can be included within the attribute value when
                        > the value is delimited by double quote marks, and vice versa."
                        > http://www.w3.org/TR/html401/intro/sgmltut.html#h-3.2.2[/color]

                        Doing both at once remains a bit more difficult.

                        --
                        Chris
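
When attribute text comes from user input and may contain both kinds of quote, escaping it programmatically sidesteps the delimiter question. A minimal sketch with Python's standard `html.escape`; the file name and alt text are taken from the example above, and the variable names are illustrative.

```python
import html

alt_text = '"Quotation" - John O\'Name'

# quote=True escapes both double and single quotes, plus &, < and >.
escaped = html.escape(alt_text, quote=True)

tag = '<img src="quotation.png" alt="%s">' % escaped

# Unescaping gives back the original text, so nothing is lost.
round_trip = html.unescape(escaped)

print(tag)
```

Note that `html.escape` writes the apostrophe as the numeric reference `&#x27;`, which is safe inside either kind of delimiter.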


                        • Andreas Prilop

                          #13
                          Re: preferred charset?

                          On Thu, 28 Aug 2003, Stan Brown wrote:
                          [color=blue][color=green]
                          >> <img src="quotechar.jpg" alt="&quot;">[/color]
                          >
                          > <img src="quotechar.jpg" alt='"'> -- even aside from the fact that
                          > the example is extremely unlikely to occur in practice. :-)[/color]

                          Do not use the ASCII character 0x60 (grave accent) as a left quotation mark together with 0x27 (apostrophe), because it will look silly with modern correct fonts.

                          Users of German and other European keyboards frequently confuse the ASCII apostrophe with the ISO 8859-1 acute accent.
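
The look-alike characters mentioned above are easy to tell apart by their official Unicode names. A short Python sketch; the two curly single quotes are added here for comparison and are not named in the post.

```python
import unicodedata

# Grave accent, apostrophe, acute accent, and the two curly single quotes.
candidates = ["\u0060", "\u0027", "\u00b4", "\u2018", "\u2019"]
names = {hex(ord(ch)): unicodedata.name(ch) for ch in candidates}

# Only the first two are ASCII; the acute accent needs ISO-8859-1 or better.
in_ascii = [ch.isascii() for ch in candidates]

for code, name in names.items():
    print(code, name)
```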



                          • Andreas Prilop

                            #14
                            Re: preferred charset?

                            On Thu, 28 Aug 2003, Jane Withnolastname wrote:
                            [color=blue][color=green][color=darkred]
                            >>> Anyway, I am beginning my road to recovery (ie, breaking my addiction
                            >>> to authoring only for IE) and I would like to know what is the
                            >>> preferred charset?[/color][/color][/color]

                            There is none. It depends on *your* special situation.

                            [color=blue]
                            > Or would I be better advised to stick with regular quotes and never
                            > mind special ASCII-only characters?[/color]

                            It might be preferable to use only ASCII quotes (" '). See

                            [color=blue]
                            > Thanks again. I'm feeling quite stupid right now :)[/color]

                            No, no. Perfectly valid questions.
                            [color=blue]
                            > Is there a list somewhere of all the alternate characters?[/color]

                            You probably don't need all. Take

                            as a starting point.


                            • Jane Withnolastname

                              #15
                              Re: preferred charset?

                              On Thu, 28 Aug 2003 08:24:25 +0100, Headless <me@privacy.net> wrote:
                              [color=blue]
                              >Jane Withnolastname wrote:
                              >[color=green]
                              >>I have been using the charset windows-1252 for a while, but it was
                              >>pointed out to someone else in this group recently that it's a
                              >>Microsoft creation (I'm sure I'm getting my facts wrong or skewed) and
                              >>therefore not good for cross-platform browsing.
                              >>Anyway, I am beginning my road to recovery (ie, breaking my addiction
                              >>to authoring only for IE) and I would like to know what is the
                              >>preferred charset?
                              >>I have tried a search and only find immense lists that make me
                              >>cross-eyed without ever telling me which to use to utilize a full
                              >>range of characters and have them display the way I intend on
                              >>English-speaking machines.
                              >>I'm not sure of the proper term, but I always use the & character
                              >>substitutes for anything that doesn't show up on my keyboard so,
                              >>ideally, the charset should display those, right? (For instance, if I
                              >>want to display Montréal, I would input Montr&eacute;al .)[/color]
                              >
                              >I use ISO-8859-1 because it allows me to dispense with character
                              >references like &eacute; the source readability is much better without
                              >those codes.
                              >
                              >
                              >Headless[/color]

                              So I've got one vote for utf-8 and one vote for iso-8859-1 and
                              everybody else just wants to argue about quotes, which was so not the
                              point, to begin with.
                              Can I get a consensus?
                              It depends on what I'm using it for? OK, it's a general-use site aimed
                              at an English-speaking audience that may, at some time or another,
                              need to use non-English characters, such as é or ç. I need it to
                              display on all browsers and would be nice (but not necessary) if it
                              was printable on most printers.

                              If I understand correctly, this ISO charset will allow me to simply
                              input é and it will display correctly in all browsers?

                              Someone questioned my saying that entity rather than number was the
                              preferred method. Well, it's what I read on this newsgroup only a few
                              days ago, when someone was asking about the Euro character. The person
                              had said that it was written with the numerical identifier and was
                              advised to change it to the entity.

                              I apologize for apparently having no idea that ASCII stopped at 127. I
                              learned everything I know about ASCII in high school, something like
                              15 years ago. Some of it may have been wrong and some may have meshed
                              with what I *thought* was fact.... Anyway, thanks for straightening me
                              out on that.

                              Thanks!
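
As a rough way to weigh the two suggestions, this Python sketch checks which of a few sample characters can be typed directly (that is, without an entity) under each charset from the thread. The helper `fits` and the sample list are hypothetical; the result says nothing about fonts or printers.

```python
def fits(ch, charset):
    """Return True if the character can be stored directly in this charset."""
    try:
        ch.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

samples = {
    "e-acute": "\u00e9",
    "c-cedilla": "\u00e7",
    "euro sign": "\u20ac",
    "left curly quote": "\u201c",
}
charsets = ("iso-8859-1", "windows-1252", "utf-8")

table = {
    label: {cs: fits(ch, cs) for cs in charsets}
    for label, ch in samples.items()
}

for label, row in table.items():
    print(label, row)
```

Under these assumptions é and ç fit in all three, while the euro sign and curly quotes need either UTF-8 or an entity when the page is declared ISO-8859-1.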
