Simple high-ascii character encoding

  • Henri Sivonen

    #16
    Re: Simple high-ascii character encoding

    In article <pQrPe.222$uQ6.8297@news.optus.net.au>,
    RobG <rgqld@iinet.net.au> wrote:

    > Harlan Messinger wrote:
    > [...]
    >
    > > Then there are all the non-standard arrangements that font designers
    > > used in the past to map alphabets and symbol sets other than the basic
    > > English one to the sub-128 positions so that foreign text
    >
    > While we're being pedantic about words, should the phrase 'foreign text'
    > be 'non-English text'? Or in the context of ASCII, are the two terms
    > identical?

    Isn't 'non-English' the definition of 'foreign'? :-) Although nowadays
    it is politically correct to say 'international' instead of 'foreign'.

    (I suppose you might get along with ASCII when writing Dutch and
    Afrikaans.)

    --
    Henri Sivonen
    hsivonen@iki.fi

    Mozilla Web Author FAQ: http://mozilla.org/docs/web-developer/faq.html


    • Jukka K. Korpela

      #17
      Re: Simple high-ascii character encoding

      RobG wrote:

      > While we're being pedantic about words, should the phrase 'foreign text'
      > be 'non-English text'? Or in the context of ASCII, are the two terms
      > identical?

      It's easy to out-pedant you here: You cannot write even English
      correctly using ASCII only. ASCII lacks orthographically correct
      quotation marks (and apostrophe), en and am dashes, and horizontal
      ellipsis, though the last one is debatable. (Is horizontal ellipsis just
      a presentational form of "..."? The difference is real, anyway, and
      English style guides require the use of "spaced dots", and achieving
      this without using the horizontal ellipsis character is awkward.)

      ASCII also lacks letters like e with acute accent, o with diaeresis, and
      letter (ligature) ae, which belong to _some_ forms of written English at
      least.


      • Alan J. Flavell

        #18
        Re: Simple high-ascii character encoding

        On Thu, 25 Aug 2005, RobG wrote:

        > Harlan Messinger wrote:
        > [...]
        >
        > > Then there are all the non-standard arrangements that font
        > > designers used in the past to map alphabets and symbol sets other
        > > than the basic English one to the sub-128 positions so that
        > > foreign text
        >
        > While we're being pedantic about words, should the phrase 'foreign
        > text' be 'non-English text'?

        Even English uses some characters which are "foreign" to the ASCII
        character set.

        > Or in the context of ASCII, are the two terms identical?

        Well, to be extra pedantic, ASCII is an American Standard. They don't
        have an exclusive hold on English.


        • Pierre Goiffon

          #19
          Re: Simple high-ascii character encoding

          Andreas Prilop wrote:
          >>but perhaps it is easier to simply ask
          >>if there are any character sets that are *not* identical to
          >>ASCII in the first 127 characters...
          >
          > ^
          > (Characters 0 to 127 are the first 128 characters.)
          >
          > All of these
          > http://www.unicode.org/Public/MAPPIN...MICSFT/EBCDIC/
          > http://czyborra.com/charsets/iso646.html#EBCDIC

          I had to deal with EBCDIC encodings recently; they are used on most IBM
          platforms, for example iSeries servers (OS/400).
          To give an idea of how many such encodings there are, here is the list of
          charsets supported by the Open statement in LotusScript
          (Notes/Domino R6):

          IBM037 US and Canadian English, Dutch, Portuguese
          IBM273 German
          IBM277 Danish, Norwegian
          IBM278 Finnish, Swedish
          IBM280 Italian
          IBM284 Spanish
          IBM285 International English
          IBM297 French
          IBM420 Arabic
          IBM424 Hebrew
          IBM500 Intl. English, Latin-1, Albanian, Belgian English, French
          IBM838 Thai
          IBM870 Latin-2, Croatian, Czech, Hungarian, Polish
          IBM871 Icelandic
          IBM875 Greek
          IBM1025 Bulgarian, Russian, Serbian Cyrillic
          IBM1026 Turkish
          IBM1047 Latin-1 Open Systems
          IBM1112 Latvian, Lithuanian
          IBM1122 Estonian
          IBM930 Japanese Katakana
          IBM933 Korean
          IBM935 Simplified Chinese
          IBM937 Traditional Chinese
          IBM939 Japanese Latin
          IBM1388 Simplified Chinese
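
As a rough illustration of why code pages like these answer the question quoted above, the following Python sketch counts how many of the byte values 0 to 127 decode to a different character than they would in ASCII. It assumes that Python's standard-library codecs `cp037` and `cp500` correspond to IBM037 and IBM500 in the list; the helper name `ascii_mismatches` is made up for this example.

```python
def ascii_mismatches(codec: str) -> int:
    """Count byte values 0..127 that decode to a different character than in ASCII."""
    mismatches = 0
    for value in range(128):
        # errors="replace" keeps the loop going if a codec leaves a byte undefined.
        decoded = bytes([value]).decode(codec, errors="replace")
        if decoded != chr(value):
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    # latin-1 serves as an ASCII-compatible baseline; cp037 and cp500 are
    # assumed to be Python's names for IBM037 and IBM500.
    for codec in ("latin-1", "cp037", "cp500"):
        print(codec, ascii_mismatches(codec))
    # The letter "A" is byte 0x41 in ASCII but 0xC1 in EBCDIC code page 037.
    print("A".encode("ascii").hex(), "A".encode("cp037").hex())
```

An ASCII-compatible encoding such as Latin-1 should report zero mismatches, while the EBCDIC code pages should report a large number, since letters and digits sit above position 127 there.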


          • Andreas Prilop

            #20
            Re: Simple high-ascii character encoding

            On Fri, 26 Aug 2005, Alan J. Flavell wrote:

            > Even English uses some characters which are "foreign" to the ASCII
            > character set.

            £


            • Andreas Prilop

              #21
              Re: Simple high-ascii character encoding

              On Thu, 25 Aug 2005, RobG wrote:

              > While we're being pedantic about words, should the phrase 'foreign text'
              > be 'non-English text'?

              No! English text is foreign text; German text is non-foreign text.
              ;-)


              • Andreas Prilop

                #22
                Re: Simple high-ascii character encoding

                On Fri, 26 Aug 2005, Henri Sivonen wrote:

                > (I suppose you might get along with ASCII when writing Dutch and
                > Afrikaans.)

                Find the non-ASCII characters on




                • Harlan Messinger

                  #23
                  Re: Simple high-ascii character encoding

                  RobG wrote:
                  > Harlan Messinger wrote:
                  > [...]
                  >
                  >> Then there are all the non-standard arrangements that font designers
                  >> used in the past to map alphabets and symbol sets other than the basic
                  >> English one to the sub-128 positions so that foreign text
                  >
                  >
                  > While we're being pedantic about words, should the phrase 'foreign text'
                  > be 'non-English text'? Or in the context of ASCII, are the two terms
                  > identical?

                  Touché.


                  • Harlan Messinger

                    #24
                    Re: Simple high-ascii character encoding

                    Henri Sivonen wrote:
                    > In article <pQrPe.222$uQ6.8297@news.optus.net.au>,
                    > RobG <rgqld@iinet.net.au> wrote:
                    >
                    >
                    >>Harlan Messinger wrote:
                    >>[...]
                    >>
                    >>
                    >>>Then there are all the non-standard arrangements that font designers
                    >>>used in the past to map alphabets and symbol sets other than the basic
                    >>>English one to the sub-128 positions so that foreign text
                    >>
                    >>While we're being pedantic about words, should the phrase 'foreign text'
                    >>be 'non-English text'? Or in the context of ASCII, are the two terms
                    >>identical?
                    >
                    >
                    > Isn't 'non-English' the definition of 'foreign'? :-) Although nowadays
                    > it is politically correct to say 'international' instead of 'foreign'.
                    >
                    > (I suppose you might get along with ASCII when writing Dutch and
                    > Afrikaans.)

                    (I've set the encoding for this message to UTF-8.)

                    Nope. Dutch:
                    Ĳsselmeer (the former Zuider Zee) (the first character
                    should look like an IJ ligature)
                    Ik heb het maar één keer gezien. (I've only seen it once).
                    Ons tweeëns (the two of us) (funny--I tried to corroborate
                    my recollection of this from the web, but I can't
                    find it with either two or three "e"s. Can someone
                    tell me whether I'm making this up?)
                    Afrikaans:
                    Ek sal hê (I will have).
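
A small Python sketch of how one might list the non-ASCII characters in samples like these. It is only an illustration: the helper name `non_ascii_characters` is invented here, and the Dutch sample assumes the IJ ligature described above is U+0132.

```python
import unicodedata


def non_ascii_characters(text: str) -> list[tuple[str, str]]:
    """Return (code point, Unicode name) for each distinct non-ASCII character,
    in order of first appearance."""
    seen = []
    for ch in text:
        if ord(ch) > 127 and ch not in seen:
            seen.append(ch)
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in seen]


if __name__ == "__main__":
    # Escapes are used so the samples do not depend on this file's own encoding.
    samples = {
        "Dutch": "\u0132sselmeer; Ik heb het maar \u00e9\u00e9n keer gezien.",
        "Afrikaans": "Ek sal h\u00ea.",
    }
    for language, text in samples.items():
        print(language, non_ascii_characters(text))
```

Any character reported by the helper lies outside the 0 to 127 range and so cannot be written in plain ASCII.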


                    • Stan Brown

                      #25
                      Re: Simple high-ascii character encoding

                      On Fri, 26 Aug 2005 10:44:13 +0300, "Jukka K. Korpela"
                      <jkorpela@cs.tut.fi> wrote:

                      >You cannot write even English
                      >correctly using ASCII only. ASCII lacks orthographically correct
                      >quotation marks (and apostrophe), en and am dashes, and horizontal
                      >ellipsis, though the last one is debatable.

                      This claim does not become true no matter how often repeated.

                      Aside from the question of whether _punctuation_ can properly be
                      said to be part of a _language_ at all, there is no divine ordinance
                      that says opening and closing quotes need to look different, or that
                      apostrophes and single quotes need to look different.

                      Indeed, a counter-example comes readily to mind: the King James
                      bible, the standard English bible for centuries, was printed
                      entirely without quote marks, en dashes, "am dashes", and ellipses.

                      Perhaps, meaning no disrespect, you might yield just a tiny bit to a
                      native speaker in your pronouncements about what is and is not
                      correct English?

                      --
                      Stan Brown, Oak Road Systems, Tompkins County, New York, USA

                      HTML 4.01 spec: http://www.w3.org/TR/html401/
                      validator: http://validator.w3.org/
                      CSS 2.1 spec: http://www.w3.org/TR/CSS21/
                      validator: http://jigsaw.w3.org/css-validator/
                      Why We Won't Help You:


                      • Harlan Messinger

                        #26
                        Re: Simple high-ascii character encoding

                        Stan Brown wrote:
                        > On Fri, 26 Aug 2005 10:44:13 +0300, "Jukka K. Korpela"
                        > <jkorpela@cs.tut.fi> wrote:
                        >
                        >
                        >>You cannot write even English
                        >>correctly using ASCII only. ASCII lacks orthographically correct
                        >>quotation marks (and apostrophe), en and am dashes, and horizontal
                        >>ellipsis, though the last one is debatable.
                        >
                        >
                        > This claim does not become true no matter how often repeated.
                        >
                        > Aside from the question of whether _punctuation_ can properly be
                        > said to be part of a _language_ at all, there is no divine ordinance
                        > that says opening and closing quotes need to look different, or that
                        > apostrophes and single quotes need to look different.

                        There's no ordinance that says a lower-case "l" ("L") and a numeral "1"
                        (one) need to look different, and indeed in the typeface in which this
                        very sentence appears on my screen they look identical. Ditto for
                        capital "O" ("o") and numeral "0" (zero). That doesn't alter the fact
                        that they are different characters with different purposes and that they
                        shouldn't be treated as one in the character scheme. (Alan, don't resent
                        me for introducing "scheme". I couldn't resist.)

                        > Indeed, a counter-example comes readily to mind: the King James
                        > bible, the standard English bible for centuries, was printed
                        > entirely without quote marks, en dashes, "am dashes", and ellipses.
                        >
                        > Perhaps, meaning no disrespect, you might yield just a tiny bit to a
                        > native speaker in your pronouncements about what is and is not
                        > correct English?

                        When I *write* quotation marks, my opening quotes are different from my
                        closing quotes, and that's the way they usually are in print. So,
                        indeed, I consider them separate characters. Using a single character to
                        represent both may not be the end of the world, but it can pose
                        disadvantages.


                        • Andreas Prilop

                          #27
                          Re: Simple high-ascii character encoding

                          On Fri, 26 Aug 2005, Stan Brown wrote:

                          > Indeed, a counter-example comes readily to mind: the King James
                          > bible, the standard English bible for centuries, was printed
                          > entirely without quote marks, en dashes, "am dashes", and ellipses.

                          .... but with a doubled hyphen and lots of ligatures (e.g. U+FB06).
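
For readers who want to look up the code point cited above, here is a minimal Python sketch using the standard `unicodedata` module. The helper name `describe` is invented for this example, and the NFKC folding is shown only to illustrate how far the ligature is from anything ASCII can hold.

```python
import unicodedata


def describe(ch: str) -> tuple[str, str, bool]:
    """Return the Unicode name, the NFKC compatibility folding, and whether
    the character itself is representable in ASCII."""
    return (unicodedata.name(ch), unicodedata.normalize("NFKC", ch), ch.isascii())


if __name__ == "__main__":
    # U+FB06 is the code point mentioned in the post.
    print(describe("\ufb06"))
```

The compatibility folding replaces the single ligature character with two ordinary letters, which is the nearest ASCII can get to it.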


                          • Tim

                            #28
                            Re: Simple high-ascii character encoding

                            Jukka K. Korpela:

                            >> You cannot write even English correctly using ASCII only. ASCII
                            >> lacks orthographically correct quotation marks (and apostrophe),
                            >> en and am dashes, and horizontal ellipsis, though the last one
                            >> is debatable.


                            Stan Brown sent:

                            > This claim does not become true no matter how often repeated.

                            Claiming that some of those things aren't true, no matter how many
                            times repeated, does NOT make it true.

                            > Aside from the question of whether _punctuation_ can properly be
                            > said to be part of a _language_ at all, there is no divine ordinance
                            > that says opening and closing quotes need to look different, or that
                            > apostrophes and single quotes need to look different.

                            Whatever they *look* like, apostrophes are not quote marks (neither are
                            they accents, as they're all-too-frequently abused for), and properly used
                            quote marks are not the same as what almost passes for quote marks in
                            ASCII (there are two separate opening and closing quote symbols, and ASCII
                            provides neither, certainly nothing that's opening quotes with a
                            corresponding something as closing quotes).

                            Anyone who's been taught to use the English language properly knows full
                            well that opening and closing quotes are two dissimilar symbols. Stylised
                            fonts are something else again, but standard punctuation marks do have
                            *proper* ways of being drawn (dots with tails going in particular
                            directions), even the letters have proper ways of being drawn. Anybody
                            who's *properly* taught children to write knows this.

                            > Indeed, a counter-example comes readily to mind: the King James
                            > bible, the standard English bible for centuries, was printed
                            > entirely without quote marks, en dashes, "am dashes", and ellipses.

                            That a document doesn't do something is no proof. It also doesn't have
                            the word computer in it, nor various other parts of our current language.

                            > Perhaps, meaning no disrespect, you might yield just a tiny bit to a
                            > native speaker in your pronouncements about what is and is not
                            > correct English?

                            Maybe you ought to listen to some native speakers who assert that you're
                            wrong. Perhaps some who've been taught properly.

                            --
                            If you insist on e-mailing me, use the reply-to address (it's real but
                            temporary). But please reply to the group, like you're supposed to.

                            This message was sent without a virus, please destroy some files yourself.


                            • John Dunlop

                              #29
                              Re: Simple high-ascii character encoding

                              Jukka K. Korpela wrote:

                              > You cannot write even English correctly using ASCII only. ASCII lacks
                              > orthographically correct quotation marks (and apostrophe), en and am
                              > dashes, and horizontal ellipsis, though the last one is debatable.

                              Thanks a lot, Jukka, now I'm sweer to post any followups in
                              case I run afoul of ISO Standard English!!

                              --
                              Jock


                              • Stan Brown

                                #30
                                Re: Simple high-ascii character encoding

                                On Sat, 27 Aug 2005 01:32:54 +0900, Tim <tim@mail.localhost.invalid>
                                wrote:

                                >Anyone who's been taught to use the English language properly knows full
                                >well that opening and closing quotes are two dissimilar symbols.

                                Sure they are - in some type faces. In other type faces they're the
                                same. In other type faces they don't exist. In handwriting they're
                                the same. The point is, they're not part of English, they're part of
                                typography.

                                >standard punctuation marks do have
                                >*proper* ways of being drawn (dots with tails going in particular
                                >directions), even the letters have proper ways of being drawn. Anybody
                                >who's *properly* taught children to write knows this.

                                Uh-huh. Children are all taught to make commas as dots with tails.
                                Suuuuure they are.

                                --
                                Stan Brown, Oak Road Systems, Tompkins County, New York, USA

                                HTML 4.01 spec: http://www.w3.org/TR/html401/
                                validator: http://validator.w3.org/
                                CSS 2.1 spec: http://www.w3.org/TR/CSS21/
                                validator: http://jigsaw.w3.org/css-validator/
                                Why We Won't Help You:
