utf-8 and xhtml 1.0 strict

  • The Bicycling Guitarist

    #16
    Re: utf-8 and xhtml 1.0 strict


    "Leif K-Brooks" <eurleif@ecritters.biz> wrote in message
    news:2ucs4eF27vmn8U1@uni-berlin.de...
    > The Bicycling Guitarist wrote:
    >> My web site has not been spidered by Googlebot since April 2003. The site
    >> in question is at www.TheBicyclingGuitarist.net/ I [...]
    >
    > Seems to be sending the incorrect MIME type "text/*" instead of text/html:
    >
    > [leif@localhost leif]$ telnet TheBicyclingGuitarist.net 80
    > Trying 216.229.101.149 ...
    > Connected to TheBicyclingGuitarist.net (216.229.101.149).
    > Escape character is '^]'.
    > GET / HTTP/1.1
    > Host: TheBicyclingGuitarist.net
    >
    > HTTP/1.1 200 OK
    > Server: Microsoft-IIS/5.0
    > Content-Location: http://TheBicyclingGuitarist.net/index.htm
    > Date: Thu, 28 Oct 2004 18:24:11 GMT
    > Content-Type: text/*;charset=utf-8
    > Accept-Ranges: bytes
    > Last-Modified: Wed, 27 Oct 2004 19:48:04 GMT
    > ETag: "fab99adf5dbcc41:9f9"
    > Content-Length: 5169
    >
    Oh no. I just asked the tech to change from text/html to text/* on the
    advice of someone in another NG. I am sorry about posting the same question
    to two similar NG's. I was told "If you accept text/* you get your page. It
    doesn't seem to be linked to the charset."

    The problem existed for a year and a half using "text/html". The change to
    "text/*" just happened today or yesterday. Should the tech change it back?
    How do I get these two NG threads back together?

    Chris Watson a.k.a. "The Bicycling Guitarist"



    • Jan Roland Eriksson

      #17
      Re: utf-8 and xhtml 1.0 strict

      On Thu, 28 Oct 2004 19:50:17 GMT, Invalid User <user@domain.invalid>
      wrote:
      [...]
      >Where do you think I copied this from:
      ><?xml version="1.0"?>
      ><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
      >"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
      ><html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">

      Hmm, let's quote you from an earlier post...

      "On Thu, 28 Oct 2004 14:53:30 GMT, Invalid User
      <user@domain.invalid> wrote:
      [...]
      I have small site in XHTML 1.1 and UTF-8 and I have no
      problem with Google. So, your host's tech guy don't know
      what his talking about!"

      You claimed; "I have small site in XHTML 1.1 and UTF-8"
      but your document prolog indicates XHTML 1.0

      There's a difference specified at W3C on what MIME type to use for these
      two different versions of XHTML as you may want to find out.

      Hence the confusion that did develop in this thread.

      --
      Rex



      • The Bicycling Guitarist

        #18
        Re: utf-8 and xhtml 1.0 strict


        "Alan J. Flavell" <flavell@ph.gla.ac.uk> wrote in message
        news:Pine.LNX.4.61.0410281437240.19356@ppepc56.ph.gla.ac.uk...
        > On Thu, 28 Oct 2004, The Bicycling Guitarist wrote:
        >
        > $ telnet www.thebicyclingguitarist.net 80
        > Trying 216.229.101.149 ...
        > Connected to www.thebicyclingguitarist.net.
        > Escape character is '^]'.
        > GET / HTTP/1.0
        > Host: www.thebicyclingguitarist.net
        > Accept: text/html,text/plain
        >
        > HTTP/1.1 406 No acceptable objects were found
        > Server: Microsoft-IIS/5.0
        > Date: Thu, 28 Oct 2004 13:51:57 GMT
        > Content-Length: 3906
        > Content-Type: text/html
        >
        Thank you for your help, Alan. I notice the test date is Oct 28. On Oct 27
        the tech changed the MIME from text/html to text/* at my request. Is that
        wrong?
        Chris Watson a.k.a. "The Bicycling Guitarist"
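
For anyone who wants to repeat the telnet check quoted above without typing the request by hand, here is a minimal Python sketch. The names `build_probe` and `probe` are made up for illustration, and the request mirrors the quoted session only loosely (HTTP/1.0, a Host header and one Accept header); it is not the exact tool anyone in the thread used.

```python
import socket


def build_probe(host: str, accept: str, path: str = "/") -> bytes:
    """Return the raw bytes of an HTTP/1.0 GET carrying the given Accept header."""
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}",
        f"Accept: {accept}",
        "",  # first empty string yields the CRLF that ends the Accept line
        "",  # second one yields the blank line that ends the header block
    ]
    return "\r\n".join(lines).encode("ascii")


def probe(host: str, accept: str, port: int = 80, timeout: float = 10.0) -> str:
    """Send the probe over a plain TCP socket and return only the status line."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_probe(host, accept))
        with sock.makefile("rb") as reply:
            status_line = reply.readline()
    return status_line.decode("iso-8859-1").strip()
```

Calling `probe(host, "text/html,text/plain")` and then `probe(host, "text/*")` against the same host should show whether the status line changes with the Accept header, which is the symptom being discussed here.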



        • Leif K-Brooks

          #19
          Re: utf-8 and xhtml 1.0 strict

          The Bicycling Guitarist wrote:
          > Oh no. I just asked the tech to change from text/html to text/* on the
          > advice of someone in another NG. I am sorry about posting the same question
          > to two similar NG's. I was told "If you accept text/* you get your page. It
          > doesn't seem to be linked to the charset."

          He was talking about the Accept header that the user agent (e.g.
          browser) sends to the server, not the Content-Type that your server
          sends to the browser. Wildcarding is acceptable for Accept headers,
          but for Content-Type headers, it's ridiculous.

          By the way, your broken content-type header makes the site not work in
          working browsers (like Mozilla).
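
To make the Accept versus Content-Type distinction concrete, here is a small Python sketch. Both function names are hypothetical, and the matching is deliberately simplified: it handles a single Accept range and ignores parameters such as q-values.

```python
def media_type_matches(accept_range: str, media_type: str) -> bool:
    """True if one Accept range such as 'text/*' covers a concrete media type."""
    want_type, _, want_sub = accept_range.strip().lower().partition("/")
    have_type, _, have_sub = media_type.strip().lower().partition("/")
    return want_type in ("*", have_type) and want_sub in ("*", have_sub)


def is_concrete(media_type: str) -> bool:
    """A Content-Type has to name one real type, so '*' is not allowed in it."""
    return "*" not in media_type
```

With these definitions, `media_type_matches("text/*", "text/html")` is true, because a client may ask for a range of types, while `is_concrete("text/*")` is false, because a server has to say which single type it actually sent.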


          • The Bicycling Guitarist

            #20
            Charset in server header? (was utf-8 and xhtml 1.0 strict)


            "Alan J. Flavell" <flavell@ph.gla.ac.uk> wrote in message
            news:Pine.LNX.4.61.0410281437240.19356@ppepc56.ph.gla.ac.uk...
            > On Thu, 28 Oct 2004, The Bicycling Guitarist wrote:
            >> My host's tech guy just sent me the following.
            >
            > Once more with feeling:
            > $ telnet www.thebicyclingguitarist.net 80
            > Trying 216.229.101.149 ...
            > Connected to www.thebicyclingguitarist.net.
            > Escape character is '^]'.
            > GET / HTTP/1.0
            > Host: www.thebicyclingguitarist.net
            > Accept: text/html
            >
            > HTTP/1.1 406 No acceptable objects were found
            > Server: Microsoft-IIS/5.0
            > Date: Thu, 28 Oct 2004 13:53:59 GMT
            >
            > You need to concentrate on why content-type negotiation is failing.

            The following is what my host's tech guy just sent me. Now he says it's the
            fault of specifying the charset in the server header. I thought that's what
            we're supposed to do. Is he doing it wrong?

            Chris,
            While I agree that the charset of usf-8 [sic] is hardly "custom" the fact
            that you had us enter it as a server injected http header as documented at
            http://www.w3.org/International/O-HTTP-charset makes it a custom setting on
            your site on IIS. Also doing a search on the M$ KB I found a list of IIS
            error codes which included "406 - Client browser does not accept the MIME
            type of the requested page." again leading me to believe this issue has to
            do with the MIME charset specification. You also quoted in a previous
            message "Google sends Accept: text/html,text/plain; which of course makes
            good sense for a robot as it doesn't want anything else" further reinforcing
            my theory.

            When you telnet in and request the page manually with an accept of
            "text/html" or "text/html,text/plain: the server tries to return
            "text/html;charset=utf-8" which isn't what the client requested so it kicks
            back the 406 error. If you specify the content type in the accept field, it
            matches what the server is offering (see below) and gives the http 200
            status. I also temporarily removed the charset=usf-8 setting and tested it
            again sending nothing but an "Accept: text/html" and "text/html,text/plain"
            and it responded correctly. If there is another way in IIS that you would
            like me to specify the charset, other than outlined at w3.org please let me
            know.

            $ telnet thebicyclingguitarist.net 80
            Trying 216.229.101.149 ...
            Connected to www.thebicyclingguitarist.net.
            Escape character is '^]'.
            GET / HTTP/1.0
            Host: www.thebicyclingguitarist.net
            Accept Content-type: text/html,text/plain,usf-8

            HTTP/1.1 200 OK
            Server: Microsoft-IIS/5.0
            Content-Location: http://www.thebicyclingguitarist.net/index.htm
            Date: Thu, 28 Oct 2004 21:01:12 GMT
            Content-Type: text/html;charset=utf-8
            Accept-Ranges: bytes
            Last-Modified: Wed, 27 Oct 2004 19:48:04 GMT
            ETag: "fab99adf5dbcc41:9fa"
            Content-Length: 5169


            • Daniel R. Tobias

              #21
              Re: utf-8 and xhtml 1.0 strict

              Alan J. Flavell wrote:
              > My suspicion is that we're going to end up with XHTML-flavoured tag soup,
              > thus losing any of the benefits which XML was *supposed* to bring us.

              We're seeing some of that already, in tag-soup pages in which
              XHTML-style tags like <br /> are sprinkled randomly (interspersed with
              their HTML-style equivalents like <br>).

              --
              == Dan ==
              Dan's Mail Format Site: http://mailformat.dan.info/
              Dan's Web Tips: http://webtips.dan.info/
              Dan's Domain Site: http://domains.dan.info/


              • Nick Kew

                #22
                Re: Charset in server header? (was utf-8 and xhtml 1.0 strict)

                In article <ncfgd.1879$zx1.747@newssvr13.news.prodigy.com>,
                "The Bicycling Guitarist" <Chris@TheBicyclingGuitarist.net> writes:

                > The following is what my host's tech guy just sent me. Now he says it's the
                > fault of specifying the charset in the server header. I thought that's what
                > we're supposed to do. Is he doing it wrong?

                Yes.

                Well, I don't know IIS - it may be impossible to do it right.
                But what Alan reported shows that it doesn't support HTTP, at least
                as currently configured. If your host can't fix that, then they're
                as fit for use on the Internet as a washing machine that only works
                on 115 volts DC.

                Hmmm, I recently ran an accessibility evaluation on an IIS site -
                just checked it. They return "text/html; charset=UTF-8" and an HTML
                page, regardless of any Accept header I send. Maybe that would be
                an improvement for your host, if they lack the basic common sense to
                upgrade to Apache.

                --
                Nick Kew


                • Rijk van Geijtenbeek

                  #23
                  Re: utf-8 and xhtml 1.0 strict

                  On Thu, 28 Oct 2004 21:32:47 -0400, Daniel R. Tobias <dan@tobias.name>
                  wrote:
                  > Alan J. Flavell wrote:
                  >
                  >> My suspicion is that we're going to end up with XHTML-flavoured tag
                  >> soup, thus losing any of the benefits which XML was *supposed* to bring
                  >> us.
                  >
                  > We're seeing some of that already, in tag-soup pages in which
                  > XHTML-style tags like <br /> are sprinkled randomly (interspersed with
                  > their HTML-style equivalents like <br>).

                  Don't forget the XHTML pages with topmargin=0 added to the BODY tag...
                  Even systems geared to producing standards based markup like some blogging
                  tools are not vigorously checking input and comments, so you end up with
                  invalid pages anyway. Not a big deal, but this means that browsers will
                  not be able to treat XHTML differently from HTML: it is and will stay tag
                  soup. There is nothing (more) wrong with sending XHTML tag soup instead of
                  sending HTML tag soup, from the browser's and reader's point of view.

                  --
                  Rijk van Geijtenbeek

                  The Web is a procrastination apparatus:
                  It can absorb as much time as is required to ensure that you
                  won't get any real work done. - J.Nielsen


                  • The Bicycling Guitarist

                    #24
                    Re: Charset in server header? (was utf-8 and xhtml 1.0 strict)


                    "Nick Kew" <nick@hugin.webthing.com> wrote in message
                    news:7eh852-0j1.ln1@hugin.webthing.com...
                    > In article <ncfgd.1879$zx1.747@newssvr13.news.prodigy.com>,
                    > "The Bicycling Guitarist" <Chris@TheBicyclingGuitarist.net> writes:
                    >
                    > But what Alan reported shows that it doesn't support HTTP, at least
                    > as currently configured. If your host can't fix that, then they're
                    > as fit for use on the Internet as a washing machine that only works
                    > on 115 volts DC.
                    >
                    > Hmmm, I recently ran an accessibility evaluation on an IIS site -

                    > Nick Kew

                    Hi Nick. I like "fussy" mode. Your online tools have greatly assisted me
                    many times. I am very grateful for your assistance, and to everyone else who
                    has contributed to this thread. I may need another host soon.
                    Chris Watson a.k.a. "The Bicycling Guitarist"



                    • Nick Kew

                      #25
                      Re: utf-8 and xhtml 1.0 strict

                      In article <opsgmjp5qzcvfty8@news.individual.net>,
                      "Rijk van Geijtenbeek" <rijk@operaremovethiz.com> writes:

                      > Even systems geared to producing standards based markup like some blogging
                      > tools are not vigorously checking input and comments, so you end up with
                      > invalid pages anyway.

                      Huh? Either tools generate valid markup, or they don't. You want to
                      accept markup from users and guarantee it's valid, see for example
                      http://www.apachetutor.org/apps/annot

                      --
                      Nick Kew


                      • Rijk van Geijtenbeek

                        #26
                        Re: utf-8 and xhtml 1.0 strict

                        On Fri, 29 Oct 2004 12:08:42 +0100, Nick Kew <nick@hugin.webthing.com>
                        wrote:

                        > In article <opsgmjp5qzcvfty8@news.individual.net>,
                        > "Rijk van Geijtenbeek" <rijk@operaremovethiz.com> writes:
                        >
                        >> Even systems geared to producing standards based markup like some blogging
                        >> tools are not vigorously checking input and comments, so you end up with
                        >> invalid pages anyway.
                        >
                        > Huh? Either tools generate valid markup, or they don't.

                        Many don't. Even though they use nice standards-based table-less CSS
                        driven markup, that's what I meant.

                        > You want to
                        > accept markup from users and guarantee it's valid, see for example
                        > http://www.apachetutor.org/apps/annot

                        --
                        Rijk van Geijtenbeek

                        The Web is a procrastination apparatus:
                        It can absorb as much time as is required to ensure that you
                        won't get any real work done. - J.Nielsen


                        • Pierre Goiffon

                          #27
                          Re: Charset in server header? (was utf-8 and xhtml 1.0 strict)

                          "The Bicycling Guitarist" <Chris@TheBicyclingGuitarist.net> wrote in message
                          news:ncfgd.1879$zx1.747@newssvr13.news.prodigy.com
                          > The following is what my host's tech guy just sent me.
                          (...)
                          > Chris,
                          (...)
                          > When you telnet in and request the page manually with an accept of
                          > "text/html" or "text/html,text/plain: the server tries to return
                          > "text/html;charset=utf-8" which isn't what the client requested so it
                          > kicks back the 406 error.

                          There seems to be a big confusion here: the MIME type and the charset
                          share the same Content-Type HTTP header. For content-type negotiation, of
                          course, only the MIME type is compared. The charset information is, as said
                          before, very important and should always be returned.
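
The point that negotiation looks only at the MIME type, not at the charset parameter, can be sketched in Python as follows. The names `split_content_type` and `acceptable` are hypothetical, and the sketch makes simplifying assumptions: q-values and other Accept parameters are dropped rather than honoured, so it is an illustration and not a complete negotiation algorithm.

```python
from typing import Dict, Tuple


def split_content_type(value: str) -> Tuple[str, Dict[str, str]]:
    """Split 'text/html;charset=utf-8' into the media type and its parameters."""
    media_type, *params = [part.strip() for part in value.split(";")]
    parsed: Dict[str, str] = {}
    for param in params:
        if not param:
            continue  # tolerate a stray trailing semicolon
        name, _, val = param.partition("=")
        parsed[name.strip().lower()] = val.strip().strip('"')
    return media_type.lower(), parsed


def acceptable(accept_header: str, content_type: str) -> bool:
    """Compare only the media type against each Accept range; charset is ignored."""
    media_type, _params = split_content_type(content_type)
    have_type, _, have_sub = media_type.partition("/")
    for item in accept_header.split(","):
        # Assumption: parameters such as q-values are simply discarded here.
        accept_range = item.split(";")[0].strip().lower()
        want_type, _, want_sub = accept_range.partition("/")
        if want_type in ("*", have_type) and want_sub in ("*", have_sub):
            return True
    return False
```

Under this reading, a client sending `Accept: text/html,text/plain` is satisfied by `text/html;charset=utf-8`, since the charset parameter plays no part in the comparison.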


                          • The Bicycling Guitarist

                            #28
                            Re: Charset in server header? (was utf-8 and xhtml 1.0 strict)


                            "Pierre Goiffon" <pgoiffon@nowhere.invalid> wrote in message
                            news:418237ee$0$27298$636a15ce@news.free.fr...
                            > "The Bicycling Guitarist" <Chris@TheBicyclingGuitarist.net> wrote in message
                            > news:ncfgd.1879$zx1.747@newssvr13.news.prodigy.com
                            >> The following is what my host's tech guy just sent me.
                            > (...)
                            >> Chris,
                            > (...)
                            >> When you telnet in and request the page manually with an accept of
                            >> "text/html" or "text/html,text/plain: the server tries to return
                            >> "text/html;charset=utf-8" which isn't what the client requested so it
                            >> kicks back the 406 error.
                            >
                            > There seems to be a big confusion here: the MIME type and the charset
                            > share the same Content-Type HTTP header. For content-type negotiation, of
                            > course, only the MIME type is compared. The charset information is, as said
                            > before, very important and should always be returned.

                            Should I also forward the above paragraph to my host's tech? Will *this*
                            finally fix things (assuming he understands it and can implement it on IIS)?
                            Chris Watson a.k.a. "The Bicycling Guitarist"



                            • Pierre Goiffon

                              #29
                              Re: Charset in server header? (was utf-8 and xhtml 1.0 strict)

                              "The Bicycling Guitarist" <Chris@TheBicyclingGuitarist.net> wrote in message
                              news:33sgd.2099$zx1.102@newssvr13.news.prodigy.com
                              > Should I also forward the above paragraph to my host's tech?

                              Hu. Well no problem. You should just rephrase because I think my English is
                              far from perfect :)
                              The idea is that the Content-Type HTTP header contains two distinct pieces
                              of information: the MIME content-type itself and the charset. The content
                              negotiation must be done with the _MIME_ content-type.


                              • Spartanicus

                                #30
                                Re: utf-8 and xhtml 1.0 strict

                                Invalid User <user@domain.invalid> wrote:

                                >>> Therefore more XHTML coded sites is published almost every day now.

                                Every day more Lemmings are jumping off a cliff.

                                >> Mostly by people who don't understand the issues involved. Instead, like
                                >> you, they've been seduced by the "X" into thinking it's somehow cool or
                                >> modern.
                                >
                                >Where do you think I copied this from:
                                ><?xml version="1.0"?>
                                ><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
                                >"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
                                ><html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
                                >
                                >No, it's not any of my pages. It's from the W3C site! Don't even the
                                >guys at W3C understand?

                                They do, it's a mistake to see their usage of xhtml served as text/html
                                as a recommendation. It's an aspiration that has proven to be pointless
                                for the foreseeable future. Even the things they do recommend should be
                                evaluated for value, usefulness and practicality. This evaluation is
                                left to the coding community and the verdict on xhtml is: don't. Look
                                through the archives of this group; every time it is discussed the
                                mythical benefits of xhtml have been dispelled by the people who've been
                                there and done that.

                                >I know a lot more sites by people who know a lot
                                >more about this than I do, and who use XHTML served as text/html

                                Lemmings, do you want to be one?

                                --
                                Spartanicus
