Zero width space still unsafe?

  • Andreas Prilop

    Zero width space still unsafe?

    Jukka reports on
    http://www.cs.tut.fi/~jkorpela/chars/spaces.html
    that Internet Explorer 6 fails on the "zero width space" U+200B.

    Is this observation still valid? For which versions of MS Windows
    does it apply? Does it depend on the encoding (charset)?
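    I have a test page in three encodings:
    http://www.unics.uni-hannover.de/nht...temp/zwsp.html
    http://www.unics.uni-hannover.de/nht...mp/zwsp.html11
    http://www.unics.uni-hannover.de/nhtcapri/temp/zwsp.tis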
    After each letter "z" there is a "zero width space". Do you see
    an empty box instead? The correct browser behaviour would be
    to allow a line break after "zero width space".
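
A minimal sketch of how such a test string might be produced in several encodings. This is not the actual source of the test pages; the helper names are hypothetical, and `iso8859_11` and `tis_620` are simply the Python codec names for the Thai charsets. Since neither Thai charset has a code point for U+200B, the sketch assumes the character is written as a numeric character reference there.

```python
# Sketch: build "z + ZERO WIDTH SPACE" text and encode it three ways.
# Function names are illustrative, not taken from the test pages.
ZWSP = "\u200b"  # ZERO WIDTH SPACE


def make_test_text(repeats: int = 20) -> str:
    """Return 'z' followed by ZWSP, repeated; a break is allowed after each ZWSP."""
    return ("z" + ZWSP) * repeats


def encode_for_html(text: str, charset: str) -> bytes:
    """Encode text; characters the charset lacks become &#NNNN; references."""
    return text.encode(charset, errors="xmlcharrefreplace")


if __name__ == "__main__":
    sample = make_test_text(3)
    for charset in ("utf-8", "iso8859_11", "tis_620"):
        print(charset, encode_for_html(sample, charset))
```

In UTF-8 the character is stored as three raw bytes, while in the two Thai charsets it can only appear as `&#8203;`, so the browser reaches the same character by two different routes.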


    http://validator.w3.org does not recognize ISO-8859-11.
    Why not?

  • Alan J. Flavell

    #2
    Re: Zero width space still unsafe?

    On Mon, 20 Dec 2004, Andreas Prilop wrote:

    > http://validator.w3.org does not recognize ISO-8859-11.
    > Why not?

    Hmmm, Google's hit for this:



    (which leads to

    http://mail.apps.ietf.org/ietf/charsets/msg01363.html )

    says that (as of April 2003) it hadn't been registered at IANA.

    And it's still not registered with IANA (although 8859-16, which
    I think came in at around the same time, is there)


    • Jan Roland Eriksson

      #3
      Re: Zero width space still unsafe?

      On Mon, 20 Dec 2004 15:46:54 +0100, Andreas Prilop
      <nhtcapri@rrzn-user.uni-hannover.de> wrote:

      >Jukka reports on
      > http://www.cs.tut.fi/~jkorpela/chars/spaces.html
      >that Internet Explorer 6 fails on the "zero width space" U+200B
      >
      >Is this observation still valid? For which versions of MS Windows
      >does it apply? Does it depend on the encoding (charset)?
      >I have a test page in three encodings:
      > http://www.unics.uni-hannover.de/nht...temp/zwsp.html
      > http://www.unics.uni-hannover.de/nht...mp/zwsp.html11
      > http://www.unics.uni-hannover.de/nhtcapri/temp/zwsp.tis
      >After each letter "z" there is a "zero width space". Do you see
      >an empty box instead? The correct browser behaviour would be
      >to allow a line break after "zero width space".

      Mozilla and Firefox behave as required, i.e. no "empty box" and
      correct line breaks at various points depending on UA window width.

      IE6 (+latest SP) is also correct for UTF-8 but...

      ...it shows the box for the other two examples but still linebreaks at
      points either before or after the boxes depending on window width.
      Peculiar behavior :-)

      > http://validator.w3.org does not recognize ISO-8859-11.
      >Why not?

      Que Nick?

      --
      Rex



      • Alan J. Flavell

        #4
        Re: Zero width space still unsafe?

        On Mon, 20 Dec 2004, Jan Roland Eriksson wrote:

        [IE...]
        > ... shows the box for the other two examples but still linebreaks at
        > points either before or after the boxes depending on window width.

        Strange, it doesn't do that for me (neither IE6 Win2K nor XP SP2).

        However, I do believe that both of them have the Japanese language
        option installed. Yup: control panel -> regional options shows that
        my Win2k has Japanese and various other language options enabled,
        though *not* Thai; whereas this XP has the boxes turned on for
        "complex script... including Thai" and "East Asian languages".


        • Jan Roland Eriksson

          #5
          Re: Zero width space still unsafe?

          On Mon, 20 Dec 2004 16:22:53 +0000, "Alan J. Flavell"
          <flavell@ph.gla.ac.uk> wrote:

          >On Mon, 20 Dec 2004, Jan Roland Eriksson wrote:
          >[IE...]
          >> ... shows the box for the other two examples but still linebreaks at
          >> points either before or after the boxes depending on window width.
          >
          >Strange, it doesn't do that for me (neither IE6 Win2K nor XP SP2).
          >...I do believe that both of them have the Japanese language
          >option installed. Yup: control panel -> regional options shows that
          >my Win2k has Japanese and various other language options enabled,
          >though *not* Thai; whereas this XP has the boxes turned on for
          >"complex script... including Thai" and "East Asian languages".

          XP-Pro+SP2 here and IE6+latest SP (plus all the latest security stuff of
          course) but no "fancy" languages, only English and Swedish AFAICS.

          (I can't read anything but text in Western alphabets anyway :-)

          --
          Rex



          • Alan J. Flavell

            #6
            Re: Zero width space still unsafe?

            On Mon, 20 Dec 2004, Jan Roland Eriksson wrote:

            > (I can't read anything but text in Western alphabets anyway :-)

            Neither can I, but by installing Japanese I found I got a load of
            interesting symbols to display in IE, which were otherwise
            unavailable, even though they had no evident relevance to Japanese.

            (AFAIR, most of them were previously displaying just fine in Mozilla,
            which was finding them from somewhere or other - but IE wasn't finding
            them, as I discuss on my browsers-fonts web page.)


            • Henri Sivonen

              #7
              Re: Zero width space still unsafe?

              In article <Pine.GSO.4.44.0412201534460.12988-100000@s5b004>,
              Andreas Prilop <nhtcapri@rrzn-user.uni-hannover.de> wrote:

              > http://www.unics.uni-hannover.de/nht...temp/zwsp.html
              > http://www.unics.uni-hannover.de/nht...mp/zwsp.html11
              > http://www.unics.uni-hannover.de/nhtcapri/temp/zwsp.tis
              > After each letter "z" there is a "zero width space". Do you see
              > an empty box instead?

              I see a box in Firefox (trunk) on OS X.

              --
              Henri Sivonen
              hsivonen@iki.fi

              Mozilla Web Author FAQ: http://mozilla.org/docs/web-developer/faq.html


              • Jim Ley

                #8
                Re: Zero width space still unsafe?

                On Mon, 20 Dec 2004 17:59:50 +0000, "Alan J. Flavell"
                <flavell@ph.gla.ac.uk> wrote:

                >On Mon, 20 Dec 2004, Jan Roland Eriksson wrote:
                >
                >> (I can't read anything but text in Western alphabets anyway :-)
                >
                >Neither can I, but by installing Japanese I found I got a load of
                >interesting symbols to display in IE, which were otherwise
                >unavailable, even though they had no evident relevance to Japanese.

                It works for me, no fancy language packs installed. Is it perhaps
                font related?

                Jim.
                --
                comp.lang.javascript FAQ - http://jibbering.com/faq/

                • Alan J. Flavell

                  #9
                  Re: Zero width space still unsafe?

                  On Mon, 20 Dec 2004, Jim Ley wrote:

                  > It works for me, no fancy language packs installed.

                  That's a useful data point, thanks. Would that be XP?

                  > Is it perhaps font related?

                  Could well be - I'm afraid my understanding of Windows internals
                  is quite lacking - most of what I think I've grasped has been done
                  by experimenting. And installing and de-installing fonts and language
                  packs to prove a point rapidly gets stale, as I'm sure you'd agree.

                  There do seem to be some typographical issues that can only be
                  resolved by installing the relevant language pack. I'm afraid I
                  don't really know whether this is one of them or not.


                  • Jim Ley

                    #10
                    Re: Zero width space still unsafe?

                    On Mon, 20 Dec 2004 21:23:36 +0000, "Alan J. Flavell"
                    <flavell@ph.gla.ac.uk> wrote:

                    >On Mon, 20 Dec 2004, Jim Ley wrote:
                    >
                    >> It works for me, no fancy language packs installed.
                    >
                    >That's a useful data point, thanks. Would that be XP?

                    Yes XP SP2

                    The only thing that might be thought of as increasing support for more
                    chars was manually installing Arial Unicode.

                    Jim.
                    --
                    comp.lang.javascript FAQ - http://jibbering.com/faq/

                    • Jukka K. Korpela

                      #11
                      Re: Zero width space still unsafe?

                      Andreas Prilop <nhtcapri@rrzn-user.uni-hannover.de> wrote:

                      > Jukka reports on
                      > http://www.cs.tut.fi/~jkorpela/chars/spaces.html
                      > that Internet Explorer 6 fails on the "zero width space" U+200B

                      ... in "normal" conditions, yes. By "normal" I mean that the font used
                      is not Arial Unicode MS or Lucida Sans Unicode (or some special font).

                      It seems to me that the behavior mostly depends on fonts, which in turn
                      depend on many things. If an author style sheet suggests
                      font-family: Arial Unicode MS, Lucida Sans Unicode;
                      then I would say that the great majority of users would see the
                      document rendered properly in this respect. But such settings may have
                      drawbacks.

                      The problem, as I understand it, is this:
                      - IE 6 (and even IE 4 and IE 5) knows the basic property of U+200B that
                      a line break is permitted after it
                      - however it does not know that it has zero width so that the browser
                      need not render anything for it
                      - so it uses whatever the font in use has for the character
                      - and it fails to scan through the available fonts to pick up one that
                      contains a glyph for the character.
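
The four points above can be modelled roughly as follows. This is only a toy sketch of the behaviour described in this thread, not IE's actual rendering code; the function names, the `font_coverage` sets, and the use of a white square for the "empty box" are all assumptions made for illustration.

```python
ZWSP = "\u200b"           # ZERO WIDTH SPACE
MISSING_GLYPH = "\u25a1"  # WHITE SQUARE, standing in for the "empty box"


def render_font_driven(text, font_coverage):
    """Model of the reported behaviour: ZWSP is looked up in the current font only."""
    shown = []
    for ch in text:
        if ch in font_coverage:
            # The font has an entry; for ZWSP that entry is an empty, zero-width glyph.
            shown.append("" if ch == ZWSP else ch)
        else:
            # No fallback to other installed fonts: the missing-glyph box is drawn.
            shown.append(MISSING_GLYPH)
    return "".join(shown)


def render_expected(text, font_coverage):
    """Model of the expected behaviour: ZWSP never needs a glyph at all."""
    shown = []
    for ch in text:
        if ch == ZWSP:
            continue  # nothing to draw, whatever the font contains
        shown.append(ch if ch in font_coverage else MISSING_GLYPH)
    return "".join(shown)
```

With a font whose coverage is just `set("xyz")`, the font-driven model turns `"z\u200bz"` into `z□z`, whereas adding U+200B to the coverage set, or using the expected model, yields `zz`.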

                      So my practical conclusion is that U+200B is not ready for prime time,
                      and if it is important to suggest permissible line breaks in a long
                      string, the nonstandard <wbr> is still the practical solution.
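
As a rough sketch of that workaround, assuming the text is kept with U+200B internally and converted only when HTML is generated (the function name is hypothetical):

```python
import html

ZWSP = "\u200b"  # ZERO WIDTH SPACE


def zwsp_to_wbr(text: str) -> str:
    """Escape text for HTML, then swap each ZWSP for the nonstandard <wbr> tag."""
    escaped = html.escape(text, quote=False)  # ZWSP itself is left untouched here
    return escaped.replace(ZWSP, "<wbr>")
```

Escaping first and replacing afterwards keeps the inserted `<wbr>` tags from being escaped themselves.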

                      For some additional notes, see
                      http://www.cs.tut.fi/~jkorpela/html/nobr.html#zwsp
                      where I mention that the HTML 4.01 specification explicitly leaves the
                      rendering of ZWSP (as one of the white space characters for which
                      rendering is _not_ defined) explicitly undefined.

                      --
                      Yucca, http://www.cs.tut.fi/~jkorpela/
                      Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html


                      • Jan Roland Eriksson

                        #12
                        Re: Zero width space still unsafe?

                        On Mon, 20 Dec 2004 21:23:36 +0000, "Alan J. Flavell"
                        <flavell@ph.gla.ac.uk> wrote:

                        >On Mon, 20 Dec 2004, Jim Ley wrote:
                        [...]
                        >> Is it perhaps font related?

                        >Could well be - I'm afraid my understanding of Windows internals
                        >is quite lacking...

                        The real basic fact is that there is no one single person that knows how
                        Windows is supposed to work today, not even within MS themselves.

                        That comes as a result of "outsourcing" for coding works. Most parts of
                        MS products are today produced in so called low cost countries, India,
                        Russia, China and every other country that is willing to sell the souls
                        of their people just to get the money in.

                        For quite some time back it's all about the money, and protection of the
                        "monopoly". Heck, MS is in a "full control" position of just about every
                        hard disk producing company in the world. Proved by the fact that it is
                        cheaper to buy a new HD with Win-something pre installed than it is to
                        get the same drive all blank from the start :-)

                        >- most of what I think I've grasped has been done by experimenting.

                        So have we all, but the target keeps moving around :-)

                        Allow me to predict (as based on last days "experimenting") that, given
                        the right tool, every and all Win NT/XP user can find at least a 1000
                        dead entries in his registry data base.

                        The "registry database" is just another con played on MS users that made
                        it possible for MS to hide away all the basic idiocy that is buried in
                        that OP-system.

                        From what I have found it looks like a garbage dump for both MS and
                        other applications that get installed in the Win environment.

                        I'm pretty sure that this (ab)usage of the "registry database" was not
                        an original idea of Dave Cutler.

                        --
                        Rex



                        • Alan J. Flavell

                          #13
                          Re: Zero width space still unsafe?

                          On Mon, 20 Dec 2004, Jukka K. Korpela wrote:

                          > It seems to me that the behavior mostly depends on fonts, which in turn
                          > depend on many things. If an author style sheet suggests
                          > font-family: Arial Unicode MS, Lucida Sans Unicode;
                          > then I would say that the great majority of users would see the
                          > document rendered properly in this respect. But such settings may have
                          > drawbacks.

                          I believe that Tahoma is likely to rate better than L.S.U in this
                          regard, whereas we shouldn't assume that most people have A.U.MS.

                          Whereas, if they have a font that's well tuned to their writing
                          system, then telling MSIE to use any of the above will be a
                          disservice to them. It's a difficult choice to have to make.

                          > So my practical conclusion is that U+200B is not ready for prime time,

                          In general I'd have to agree with you. However, the context was
                          browsing of the Thai writing system, so one might presume that anyone
                          interested in that would be willing to equip themselves with an
                          appropriate font and browser settings. The fact that it'll make a
                          hopeless mess for the rest of us is neither here nor there, since we
                          can't read it anyway. IMHO and YMMV...

                          > and if it is important to suggest permissible line breaks in a long
                          > string, the nonstandard <wbr> is still the practical solution.

                          I don't know why that cited Thai page claims that this non-standard
                          <wbr> is no longer working (for some practical value of the term
                          "working" ;-)

                          Mind you, the marker could just as well be <foobar> or <secam>, for
                          all that most browsers seem to care. Or <x> if you prefer less typing
                          ;-)

                          > For some additional notes, see
                          > http://www.cs.tut.fi/~jkorpela/html/nobr.html#zwsp
                          > where I mention that the HTML 4.01 specification explicitly leaves the
                          > rendering of ZWSP (as one of the white space characters for which
                          > rendering is _not_ defined) explicitly undefined.

                          Possibly; but there are hints elsewhere that browsers are expected to
                          apply appropriate typography for the writing system in use, and
                          Thai evidently needs this, so it's still on the agenda for browser
                          implementers, no matter that HTML doesn't demand it in so many words.


                          • Jim Ley

                            #14
                            Re: Zero width space still unsafe?

                            On Mon, 20 Dec 2004 23:55:56 +0100, Jan Roland Eriksson
                            <jrexon@newsguy.com> wrote:

                            >That comes as a result of "outsourcing" for coding works. Most parts of
                            >MS products are today produced in so called low cost countries, India,
                            >Russia, China and every other country that is willing to sell the souls
                            >of their people just to get the money in.

                            Good, I'm very, very glad that they're using low cost developers;
                            almost all the problems I've seen with outsourcing have been because of
                            poor management by the western countries, not low cost developers. It
                            certainly makes sense for them.

                            >Heck, MS is in a "full control" position of just about every
                            >hard disk producing company in the world. Proved by the fact that it is
                            >cheaper to buy a new HD with Win-something pre installed than it is to
                            >get the same drive all blank from the start :-)

                            Could you tell me where I get to buy these hard disks? I've never
                            even seen a hard disk for sale with an operating system on it.

                            >Allow me to predict (as based on last days "experimenting") that, given
                            >the right tool, every and all Win NT/XP user can find at least a 1000
                            >dead entries in his registry data base.

                            I think there's a good chance that any computer user could find 1000
                            dead lines of config data.

                            Jim.
                            --
                            comp.lang.javascript FAQ - http://jibbering.com/faq/

                            • Andreas Prilop

                              #15
                              Re: Zero width space still unsafe?

                              On Mon, 20 Dec 2004, Henri Sivonen wrote:

                              >> http://www.unics.uni-hannover.de/nhtcapri/temp/zwsp.tis
                              >> After each letter "z" there is a "zero width space". Do you see
                              >> an empty box instead?
                              >
                              > I see a box in Firefox (trunk) on OS X.

                              Firefox (Solaris 9) does not display a box - it shows only the letters
                              and breaks, if necessary, after "z".

                              The MacThai character set includes the zero width space:

                              If you don't mind, you might (temporarily) install Thai language
                              support and see what happens.

                              I regard the "zero width space" not as a graphic character, but as
                              a control character like "newline" or "zero width joiner". There's
                              nothing to display with these characters. What's the point of including
                              glyphs for "newline" or "zero width space" in a font? Consider a
                              program that wouldn't do a newline when the font has no glyph for it!
                              A bit stupid. There's something wrong with programs when they insist
                              on displaying certain glyphs for the control characters "newline" or
                              "zero width space".

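For what it's worth, the Unicode character database seems to agree with this view. A small sketch, assuming the Unicode data bundled with a current Python (the helper name is made up for illustration): "newline" is classed as a control character (Cc), and both "zero width space" and "zero width joiner" as format characters (Cf), unlike an ordinary letter.

```python
import unicodedata


def is_control_or_format(ch: str) -> bool:
    """True for characters in general category Cc (control) or Cf (format)."""
    return unicodedata.category(ch) in ("Cc", "Cf")


if __name__ == "__main__":
    samples = [
        ("newline", "\n"),
        ("zero width space", "\u200b"),
        ("zero width joiner", "\u200d"),
        ("letter z", "z"),
    ]
    for label, ch in samples:
        print(f"U+{ord(ch):04X} {label}: {unicodedata.category(ch)}")
```
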
                              The mystery is:
                              How are existing Thai pages written?
