Re: XHTML 1.0 Strict and the Apostrophe

  • Thomas 'PointedEars' Lahn

    Re: XHTML 1.0 Strict and the Apostrophe

    Jukka K. Korpela wrote:
    Scripsit Andy Dingley:
    >As to the difference between ' or ’, we had a long thread on
    >this fairly recently (few months), centred on the fact that "single
    >quote" and "apostrophe" are really not very clearly defined as
    >distinct in the available character sets, even Unicode.
    >
    I don't know who these "we" are, but the references denote distinct
    characters without doubt, and the only confusion is around the
    unfortunate _names_. The _Unicode name_ of the Ascii apostrophe, ' (or
    '), is APOSTROPHE, but that's just a name, an identifier, and not
    descriptive of meaning (actually, it's misleading, but Unicode names
    will never be changed).
    >
    >With much less consensus, the general outcome was that you can
    >reasonably use whichever you like, neither is ever "wrong" (except
    >that ‛ should be paired with ’, but not with ') and
    >that you'd quite reliably get a visually different glyph for each,
    >either straight or curly. Apart from that, there's no hard-and-fast
    >rule ' is only ever an "apostrophe" and never a "quote".
    >
    Sorry, but that paragraph has far too much confusion to be analyzed.
    >
    Here's the picture:
    >
    The ASCII apostrophe ' works fairly universally in text, but it's almost
    never the _right_ character for anything, except in computer languages.
    Consider it as a poor man's excuse for a surrogate for a large
    collection of characters. Use it that way if you are lazy or have made
    an informed decision (a compromise), but don't you ever be proud of
    that.
    IBTD. For example, in English it is customary (and AIUI expected) that
    the character that ’ represents is used to delimit a quotation
    within direct speech (which itself should be delimited by “ and
    ”). (I gathered that from reading several English books.)

    I think you would agree that it would make especially English text with
    quotations in direct speech (say, in a novel where one person tells another
    what a third said) quite badly legible if somewhere there is an apostrophe
    represented by ’ in the inner quotation, because you would have to
    look very hard at the character and the context to see whether the inner
    quotation ends or there is just an apostrophe in it. (BTDT, but YMMV if you
    are a speaker of English as first language.)

    Since apostrophes appear to occur quite often in English texts, I have
    therefore decided that in my English texts, ' (the straight apostrophe,
    &apos; or &#39;) is the appropriate character for all apostrophes as it
    is clearly distinguishable from "the curly one" using the standard fonts
    provided by common UIs. If you want to call that a compromise -- I call
    it an informed design decision in support of usability (that should have
    been made by the Unicode people instead if what you say below is correct).

    To be proud about that is yet another thing. But what reasonable
    alternative to the aforementioned approach would you suggest instead?
    For other characters, consult the applicable language and style guides
    (for _human_ languages).
    >
    Note that ’ _should_ have a curly (curved) glyph but it's similar
    to a prime (yard symbol) in some fonts. It is explicitly recommended as
    punctuation apostrophe in the Unicode standard, and the standard also
    explicitly says that it is the same character as the right single
    quotation mark.
    So it would seem that the standard recommends nonsense, or at least
    something not universally applicable, here.


    PointedEars
    --
    Use any version of Microsoft Frontpage to create your site.
    (This won't prevent people from viewing your source, but no one
    will want to steal it.)
    -- from <http://www.vortex-webdesign.com/help/hidesource.htm>
  • Jukka K. Korpela

    #2
    Re: XHTML 1.0 Strict and the Apostrophe

    Scripsit Thomas 'PointedEars' Lahn:
    I think you would agree that it would make especially English text
    with quotations in direct speech (say, in a novel where one person
    tells another what a third said) quite badly legible if somewhere
    there is an apostrophe represented by ’ in the inner quotation,
    No I wouldn't. Such usage is _standard_ English, to the extent anything
    is standard in English. Consult the applicable style guide and then the
    Unicode Standard, which identifies the punctuation marks at the level of
    coded characters.
    Since apostrophes appear to occur quite often in English texts, I have
    therefore decided that in my English texts, ' (the straight
    apostrophe, &apos; or &#39;) is the appropriate character for all
    apostrophes
    That's computerese or typewriterese - abhorred, disliked, and frowned
    upon by typographers and grammars.

    --
    Jukka K. Korpela ("Yucca")



    • Ben C

      #3
      Re: XHTML 1.0 Strict and the Apostrophe

      On 2008-04-14, Jukka K. Korpela <jkorpela@cs.tut.fi> wrote:
      Scripsit Thomas 'PointedEars' Lahn:
      >
      >I think you would agree that it would make especially English text
      >with quotations in direct speech (say, in a novel where one person
      >tells another what a third said) quite badly legible if somewhere
      >there is an apostrophe represented by ’ in the inner quotation,
      >
      No I wouldn't. Such usage is _standard_ English, to the extent anything
      is standard in English. Consult the applicable style guide and then the
      Unicode Standard, which identifies the punctuation marks at the level of
      coded characters.
      >
      >Since apostrophes appear to occur quite often in English texts, I have
      >therefore decided that in my English texts, ' (the straight
      >apostrophe, &apos; or &#39;) is the appropriate character for all
      >apostrophes
      >
      That's computerese or typewriterese - abhorred, disliked, and frowned
      upon by typographers and grammars.
      Style guides and grammar books just tell you when to use apostrophes.
      They don't say anything about whether you should use U+0027 or U+2019 to
      represent them.

      PointedEars is right that using U+2019 to write an apostrophe is
      obviously illogical, although I don't agree that it causes any real
      ambiguity or legibility problems for human readers.

      But do you know _why_ the Unicode Standard recommends using U+2019?

      U+0027 is called the "apostrophe" but then the description says "neutral
      (vertical) glyph with mixed usage" (whatever that's supposed to mean-- I
      thought we were talking about a character not a glyph) and then goes on
      about how the wonderful U+2019 is preferred for practically everything.

      Typographers may be right that a curlier glyph looks better, but then why
      not just map the curlier glyph to both U+2019 and U+0027 in the font?

      I don't understand the case against U+0027.
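
      For readers who want to check the names being argued about, here is a
      small sketch using Python's standard unicodedata module. It is not from
      the thread; the helper name describe() is made up for illustration. It
      only reports what the Unicode character database says about U+0027 and
      the two curly single quotes, and says nothing about which is "right".

```python
import unicodedata

def describe(ch):
    """Return (code point, Unicode name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

# The straight ASCII character and the two curly single quotation marks.
for ch in ("'", "\u2018", "\u2019"):
    print(describe(ch))
```

      The general category differs too: U+0027 is "Po" (other punctuation),
      while U+2018 and U+2019 are "Pi" and "Pf" (initial and final quote).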


      • Eric B. Bednarz

        #4
        Re: XHTML 1.0 Strict and the Apostrophe

        Ben C <spamspam@spam.eggs> writes:
        Style guides and grammar books just tell you when to use apostrophes.
        They don't say anything about whether you should use U+0027 or U+2019 to
        represent them.
        Pardon? Style guides *can* and sometimes actually just *do* that.
        U+0027 is called the "apostrophe"
        For starters, Unicode names have no semantics (they cannot even be
        changed if considered ambiguous or even outright wrong later).
        Typographers may be right that a curlier glyph looks better, but then why
        not just map the curlier glyph to both U+2019 and U+0027 in the font?
        That would be funny, but not very practical. Have you ever copied and
        pasted programming code from one of those auto-smart-quoting comment
        systems on techno web logs, by the way?
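
        The copy-and-paste problem hinted at above can be reproduced with a
        short sketch. Python stands in here for "programming code" in
        general, and the helper name compiles() is hypothetical. The second
        string imitates what an auto-smart-quoting system might do to the
        first.

```python
def compiles(source):
    """Return True if `source` is syntactically valid Python."""
    try:
        compile(source, "<pasted>", "exec")
        return True
    except SyntaxError:
        return False

straight = "print('hello')"            # U+0027 as the string delimiter
curled = "print(\u2018hello\u2019)"    # U+2018/U+2019 after "smart" quoting

print(compiles(straight), compiles(curled))
```

        The curled version is rejected because U+2018 and U+2019 are not
        string delimiters in the language, whatever glyph a font gives them.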


        --
        ||| hexadecimal EBB
        o-o decimal 3771
        --oOo--( )--oOo-- octal 7273
        205 goodbye binary 111010111011


        • Harlan Messinger

          #5
          Re: XHTML 1.0 Strict and the Apostrophe

          Thomas 'PointedEars' Lahn wrote:
          I think you would agree that it would make especially English text with
          quotations in direct speech (say, in a novel where one person tells another
          what a third said) quite badly legible if somewhere there is an apostrophe
          represented by ’ in the inner quotation, because you would have to
          look very hard at the character and the context to see whether the inner
          quotation ends or there is just an apostrophe in it. (BTDT, but YMMV if you
          are a speaker of English as first language.)
          I think that's an exaggeration. Except in rare cases where the
          apostrophe is at the end of a word it's quite easy to distinguish them
          from closing single quotes, which are always at the end of a word or
          after a punctuation mark.
          >
          Since apostrophes appear to occur quite often in English texts, I have
          therefore decided that in my English texts, ' (the straight apostrophe,
          &apos; or &#39;) is the appropriate character for all apostrophes as it
          is clearly distinguishable from "the curly one" using the standard fonts
          provided by common UIs. If you want to call that a compromise -- I call
          it an informed design decision in support of usability (that should have
          been made by the Unicode people instead if what you say below is correct).
          Like it or not, this isn't a problem that's newly sprung. Even before
          computers, this was the convention in printed material (where the ugly
          little ASCII apostrophe didn't exist--it was confined to the typewriter
          and, later, to computer programming), and it didn't cause massive
          difficulties. So there isn't a massive need to "fix" it now with a
          mongrelization of two unrelated practices.
          To be proud about that is yet another thing. But what reasonable
          alternative to the aforementioned approach would you suggest instead?
          The expected one, the familiar one, the one that's been in use for a
          very long time.
          >For other characters, consult the applicable language and style guides
          >(for _human_ languages).
          >>
          >Note that ’ _should_ have a curly (curved) glyph but it's similar
          >to a prime (yard symbol) in some fonts. It is explicitly recommended as
          >punctuation apostrophe in the Unicode standard, and the standard also
          >explicitly says that it is the same character as the right single
          >quotation mark.
          >
          So it would seem that the standard recommends nonsense,
          No, it recommends existing mainstream practice.
          or at least
          something not universally applicable, here.


          • Harlan Messinger

            #6
            Re: XHTML 1.0 Strict and the Apostrophe

            Ben C wrote:
            On 2008-04-14, Jukka K. Korpela <jkorpela@cs.tut.fi> wrote:
            >Scripsit Thomas 'PointedEars' Lahn:
            >>
            >>I think you would agree that it would make especially English text
            >>with quotations in direct speech (say, in a novel where one person
            >>tells another what a third said) quite badly legible if somewhere
            >>there is an apostrophe represented by ’ in the inner quotation,
            >No I wouldn't. Such usage is _standard_ English, to the extent anything
            >is standard in English. Consult the applicable style guide and then the
            >Unicode Standard, which identifies the punctuation marks at the level of
            >coded characters.
            >>
            >>Since apostrophes appear to occur quite often in English texts, I have
            >>therefore decided that in my English texts, ' (the straight
            >>apostrophe, &apos; or &#39;) is the appropriate character for all
            >>apostrophes
            >That's computerese or typewriterese - abhorred, disliked, and frowned
            >upon by typographers and grammars.
            >
            Style guides and grammar books just tell you when to use apostrophes.
            They don't say anything about whether you should use U+0027 or U+2019 to
            represent them.
            >
            PointedEars is right that using U+2019 to write an apostrophe is
            obviously illogical, although I don't agree that it causes any real
            ambiguity or legibility problems for human readers.
            It's "illogical" in the semantic sense but since the single closing
            quote and the apostrophe are assigned the same appearance by convention
            it isn't any more illogical than using the Unicode exclamation point for
            factorials, instead of setting off a separate code point for the
            factorial mark so that some day a wacko type designer can design a font
            in which the factorial symbol looks different from the exclamation point.
            But do you know _why_ the Unicode Standard recommends using U+2019?
            >
            U+0027 is called the "apostrophe"
            Well, that's what it was called when it was the typewriter apostrophe
            and there were no curly quotes to be seen anywhere--and their use as
            single quotes was infrequent.
            but then the description says "neutral
            (vertical) glyph with mixed usage" (whatever that's supposed to mean-- I
            thought we were talking about a character not a glyph) and then goes on
            about how the wonderful U+2019 is preferred for practically everything.
            >
            Typographers may be right that a curlier glyph looks better, but then why
            not just map the curlier glyph to both U+2019 and U+0027 in the font?
            Because we don't want no curly apostrophes in our stinkin' C++.
            I don't understand the case against U+0027.


            • Eric B. Bednarz

              #7
              Re: XHTML 1.0 Strict and the Apostrophe

              Harlan Messinger <hmessinger.removethis@comcast.net> writes:
              [...] it isn't any more illogical than using the Unicode
              exclamation point for factorials, instead of setting off a separate
              code point for the factorial mark so that some day a wacko type
              designer can design a font in which the factorial symbol looks
              different from the exclamation point.
              :)

              Still, there’s a lot left to be desired in digital typography. The next
              best practical enemy of good taste I can think of is the hyphen; U+002D
              is as bad a substitute for it as U+0027 is for a proper apostrophe (this
              would become apparent with typefaces that feature a distinctive canted
              hyphen; I don’t personally care much for proper minus signs, though, I’m
              not really geek enough to know what to look for ;).

              I don’t think it is safe to use it on the web, but it has been quite
              a while since I checked.

              --
              ||| hexadecimal EBB
              o-o decimal 3771
              --oOo--( )--oOo-- octal 7273
              205 goodbye binary 111010111011


              • Ben C

                #8
                Re: XHTML 1.0 Strict and the Apostrophe

                On 2008-04-14, Harlan Messinger <hmessinger.removethis@comcast.net> wrote:
                Ben C wrote:
                [...]
                >PointedEars is right that using U+2019 to write an apostrophe is
                >obviously illogical, although I don't agree that it causes any real
                >ambiguity or legibility problems for human readers.
                >
                It's "illogical" in the semantic sense but since the single closing
                quote and the apostrophe are assigned the same appearance by convention
                it isn't any more illogical than using the Unicode exclamation point for
                factorials, instead of setting off a separate code point for the
                factorial mark so that some day a wacko type designer can design a font
                in which the factorial symbol looks different from the exclamation point.
                Well not quite the same, because there isn't a separate factorial code
                point.

                Suppose there were. Then it would be like being told that in spite of
                that we were supposed to use the exclamation mark even for factorials
                and to ignore the despicable factorial code point altogether.
                >But do you know _why_ the Unicode Standard recommends using U+2019?
                >>
                >U+0027 is called the "apostrophe"
                >
                Well, that's what it was called when it was the typewriter apostrophe
                and there were no curly quotes to be seen anywhere--and their use as
                single quotes was infrequent.
                I think Bednarz may have hinted at the true explanation when he said
                something about how the names cannot ever be changed.
                >but then the description says "neutral
                >(vertical) glyph with mixed usage" (whatever that's supposed to mean-- I
                >thought we were talking about a character not a glyph) and then goes on
                >about how the wonderful U+2019 is preferred for practically everything.
                >>
                >Typographers may be right that a curlier glyph looks better, but then why
                >not just map the curlier glyph to both U+2019 and U+0027 in the font?
                >
                Because we don't want no curly apostrophes in our stinkin' C++.
                Then you'd just use a horrible font for your stinkin' C++ in which they
                appeared as nasty abhorrent typewriterized neutral vertical glyphs.

                Anyway there are no apostrophes in C++, only single quotes, for which
                you use apostrophes.


                • Thomas 'PointedEars' Lahn

                  #9
                  Re: XHTML 1.0 Strict and the Apostrophe

                  Jukka K. Korpela wrote:
                  Scripsit Thomas 'PointedEars' Lahn:
                  >I think you would agree that it would make especially English text
                  >with quotations in direct speech (say, in a novel where one person
                  >tells another what a third said) quite badly legible if somewhere
                  >there is an apostrophe represented by ’ in the inner quotation,
                  >
                  No I wouldn't. Such usage is _standard_ English, to the extent anything
                  is standard in English. Consult the applicable style guide and then the
                  Unicode Standard, which identifies the punctuation marks at the level of
                  coded characters.
                  Well, compare

                  | Paul took two deep breaths. “She said a thing.” He closed his eyes,
                  | calling up the words, and when he spoke his voice unconsciously took on
                  | some of the old woman's tone: “ ‘You, Paul Atreides, descendant of kings,
                  | son of a Duke, you must learn to rule. It's something none of your
                  | ancestors learned.’ ” Paul opened his eyes, said: “That made me angry and
                  | I said my father rules an entire planet. And she said, ‘He's losing it.’
                  | And I said my father was getting a richer planet. And she said, ‘He'll
                  | lose that one, too.’ And I wanted to run and warn my father, but she said
                  | he'd already been warned—by you, by Mother, by many people.”
                  (from: Frank Herbert, Dune, book 1, chapter 4)

                  against

                  | Paul took two deep breaths. “She said a thing.” He closed his eyes,
                  | calling up the words, and when he spoke his voice unconsciously took on
                  | some of the old woman’s tone: “ ‘You, Paul Atreides, descendant of kings,
                  | son of a Duke, you must learn to rule. It’s something none of your
                  | ancestors learned.’ ” Paul opened his eyes, said: “That made me angry and
                  | I said my father rules an entire planet. And she said, ‘He’s losing it.’
                  | And I said my father was getting a richer planet. And she said, ‘He’ll
                  | lose that one, too.’ And I wanted to run and warn my father, but she said
                  | he’d already been warned—by you, by Mother, by many people.”

                  Which one do you consider more legible?
                  >Since apostrophes appear to occur quite often in English texts, I have
                  >therefore decided that in my English texts, ' (the straight
                  >apostrophe, &apos; or &#39;) is the appropriate character for all
                  >apostrophes
                  >
                  That's computerese or typewriterese - abhorred, disliked, and frowned
                  upon by typographers and grammars.
                  IBTD. At least as for regular grammars, having the straight apostrophe only
                  as the apostrophe and ’ only for closing single quote makes it a lot
                  easier to parse the text.
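
                  The parsing point can be sketched with a deliberately naive
                  regular expression. This is only an illustration under the
                  assumption that an inner quotation runs from U+2018 to the
                  next U+2019; the pattern name INNER_QUOTE is made up, and the
                  sample sentences are taken from the Dune passage above.

```python
import re

# Naive rule: an inner quotation starts at U+2018 and ends at the next U+2019.
INNER_QUOTE = re.compile("\u2018([^\u2019]*)\u2019")

# Same sentence, once with a straight and once with a curly apostrophe.
straight = "And she said, \u2018He's losing it.\u2019 And I said"
curly = "And she said, \u2018He\u2019s losing it.\u2019 And I said"

print(INNER_QUOTE.findall(straight))  # whole quotation is captured
print(INNER_QUOTE.findall(curly))     # capture stops at the apostrophe
```

                  With the curly apostrophe the naive rule ends the quotation
                  after "He". A parser would need extra context (word
                  boundaries, following punctuation) to tell the two uses apart.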


                  PointedEars
                  --
                  realism: HTML 4.01 Strict
                  evangelism: XHTML 1.0 Strict
                  madness: XHTML 1.1 as application/xhtml+xml
                  -- Bjoern Hoehrmann


                  • Jukka K. Korpela

                    #10
                    Re: XHTML 1.0 Strict and the Apostrophe

                    Scripsit Thomas 'PointedEars' Lahn:
                    >No I wouldn't. Such usage is _standard_ English, to the extent
                    >anything is standard in English. Consult the applicable style guide
                    >and then the Unicode Standard, which identifies the punctuation
                    >marks at the level of coded characters.
                    >
                    Well, compare
                    Which style guide did you consult?
                    (from: Frank Herbert, Dune, book 1, chapter 4)
                    >
                    against
                    Which style was used in the printed book? I haven't read it, but I think
                    I know the answer.
                    IBTD. At least as for regular grammars, having the straight
                    apostrophe only as the apostrophe and ’ only for closing single
                    quote makes it a lot easier to parse the text.
                    And using "." only as a full stop and never as a decimal point or an
                    abbreviation point would make parsing even easier. But that's
                    completely irrelevant here. It's not feasible to resolve ambiguities
                    that way, especially since the world around won't listen to your
                    rationalizing arguments.

                    The only relevant thing in HTML perspective is that &apos; (when
                    implemented at all) denotes the Ascii apostrophe and - against common
                    superstition of unknown origin - not the typographically and
                    orthographically correct apostrophe of English and other human
                    languages. This entity reference is best forgotten: it's almost never
                    needed, and should you need it, the character reference is much safer.
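
                    What the references resolve to can be checked with Python's
                    standard html module, as a rough sketch. Note the
                    assumptions: html.unescape() follows the HTML5 entity
                    table, which does define &apos;, whereas
                    html.entities.name2codepoint holds the HTML 4.01 names,
                    where it is absent. The variable names are illustrative
                    only.

```python
import html
from html.entities import name2codepoint

# Map each reference to the code point it decodes to (HTML5 rules).
decoded = {ref: f"U+{ord(html.unescape(ref)):04X}"
           for ref in ("&apos;", "&#39;", "&rsquo;", "&#8217;")}
for ref, code_point in decoded.items():
    print(ref, code_point)

# The HTML 4.01 entity table has rsquo but no apos.
has_apos_in_html4 = "apos" in name2codepoint
has_rsquo_in_html4 = "rsquo" in name2codepoint
print(has_apos_in_html4, has_rsquo_in_html4)
```

                    So &apos; and &#39; both give U+0027, while the curly
                    character needs &rsquo; or &#8217;. The numeric form does
                    not depend on which entity table a processor implements.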

                    --
                    Jukka K. Korpela ("Yucca")


