Character encoding

  • Mambo Bananapatch

    Character encoding

    I'm preparing a site for a client which includes several pages
    containing Cyrillic characters. I used the UTF-8 charset, but the
    Cyrillic characters appeared as question marks (and, oddly, some
    Chinese characters as well.) I tried every Cyrillic charset I could
    find and nothing worked.

    I usually just hand-code all my PHP and HTML, but I swallowed hard and
    went to Dreamweaver CS3, searched around, and found that I could set
    each file's encoding to UTF-8 using the Modify => Page Properties =>
    Title/Encoding command.

    Now it works fine, but I don't really understand what the command did.
    It didn't add any code, and it didn't change the http-equiv tag. In
    fact, I have to perform the command on every file that is included in
    the PHP file.

    So: a) what exactly did Dreamweaver do, and b) how could I have hand-
    coded whatever it is?

    Thank you in advance.

    (Also posted in alt.html -- my apologies if I've violated etiquette.)
  • Martin Honnen

    #2
    Re: Character encoding

    Mambo Bananapatch wrote:
    > I'm preparing a site for a client which includes several pages
    > containing Cyrillic characters. I used the UTF-8 charset, but the
    > Cyrillic characters appeared as question marks (and, oddly, some
    > Chinese characters as well.) I tried every Cyrillic charset I could
    > find and nothing worked.
    > So: a) what exactly did Dreamweaver do, and b) how could I have hand-
    > coded whatever it is?

    Well, it all depends on what exactly you did when you say "I used the
    UTF-8 charset" or "I tried every Cyrillic charset". Have you used an
    editor that supports saving as UTF-8 (or a Cyrillic charset), and have
    you used it so that it saved your documents as UTF-8 (or a Cyrillic
    charset)? That is all you need to do to ensure your files are
    properly encoded. Then, when serving them over HTTP, you need to make
    sure the server sends an HTTP Content-Type response header indicating the
    charset used as a parameter, e.g.
    Content-Type: text/html; charset=UTF-8

    --

    Martin Honnen
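
As a rough illustration of the two steps described in this reply (save the bytes as UTF-8, then label them as UTF-8), here is a minimal Python sketch. The sample markup and the file name `page.html` are assumptions made up for the example; any editor or script that lets you choose the encoding when saving does the same job.

```python
import os
import tempfile

# Sample Cyrillic markup (illustrative only).
text = "<p>Привет, мир</p>"

# Step 1: saving "as UTF-8" just means choosing how characters become bytes.
path = os.path.join(tempfile.mkdtemp(), "page.html")  # hypothetical file name
with open(path, "w", encoding="utf-8", newline="") as f:
    f.write(text)

# Read the file back as raw bytes: this is what a server would send.
with open(path, "rb") as f:
    raw = f.read()
print(raw)

# Step 2: the label that tells the browser how to decode those bytes,
# matching the header quoted in the reply above.
content_type = "text/html; charset=UTF-8"
print("Content-Type:", content_type)
```

The encoding step leaves no visible markup behind, which would be consistent with the observation that the Dreamweaver command "didn't add any code": only the bytes on disk change.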


    • Jukka K. Korpela

      #3
      Re: Character encoding

      Scripsit Mambo Bananapatch:
      > (Also posted in alt.html -- my apologies if I've violated etiquette.)

      Oh, you'll just be ignored in the sequel. No problem.

      --
      Jukka K. Korpela ("Yucca")


      • Mambo Bananapatch

        #4
        Re: Character encoding

        On Apr 26, 7:53 am, Martin Honnen <mahotr...@yahoo.de> wrote:
        > Mambo Bananapatch wrote:
        >> I'm preparing a site for a client which includes several pages
        >> containing Cyrillic characters. I used the UTF-8 charset, but the
        >> Cyrillic characters appeared as question marks (and, oddly, some
        >> Chinese characters as well.) I tried every Cyrillic charset I could
        >> find and nothing worked.
        >> So: a) what exactly did Dreamweaver do, and b) how could I have hand-
        >> coded whatever it is?
        >
        > Well it all depends on what exactly you do when you say "I used the
        > UTF-8 charset" or "I tried every Cyrillic charset"? Have you used an
        > editor that supports saving as UTF-8 (or a Cyrillic charset) and have
        > you used it so that it saved your documents as UTF-8 (or a Cyrillic
        > charset)? That is all what you need to do to ensure your files are
        > properly encoded. Then, when serving them over HTTP you need to make
        > sure the server sends a HTTP Content-Type response header indicating the
        > used charset as a parameter e.g.
        > Content-Type: text/html; charset=UTF-8
        >
        > --
        >
        > Martin Honnen
        > http://JavaScript.FAQTs.com/

        Thanks Martin, that's exactly what I did. Dreamweaver saved the files
        with the correct encoding, and I used the response header you
        suggested, and all's well.

        I guess my question was more about what Dreamweaver did; if I were to
        hand-code a page with Cyrillic characters, and didn't have access to
        Dreamweaver, how would I encode each file? And why must I encode each
        file, in addition to including the UTF-8 Content-Type response
        header?

        I just wanted to understand what I was doing.

        Thanks for your time.

        MB
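
One way to look at the "why both?" question: the header is only a label, while the encoding of the file decides which bytes actually exist. The short Python sketch below assumes a file that was saved as Windows-1251 but is served with a UTF-8 label; the sample word is made up for the example.

```python
text = "Привет"

# The file on disk was saved as Windows-1251 ...
on_disk = text.encode("cp1251")

# ... but the header says UTF-8, so the browser decodes it as UTF-8.
# errors="replace" mimics a browser substituting U+FFFD for invalid bytes.
shown = on_disk.decode("utf-8", errors="replace")
print(ascii(shown))

# When the label matches the bytes, the text survives intact.
correct = text.encode("utf-8").decode("utf-8")
print(correct == text)
```

Changing the header alone never changes the bytes, so every included file has to be saved in the encoding that the header announces.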


        • Paul Gorodyansky

          #5
          Re: Character encoding

          Hello!

          You did not really answer Martin's question: what did you do _before_
          you decided to use Dreamweaver?
          On a non-Russian OS one can get question marks in many cases, for
          example:
          - typing in an editor such as Notepad and saving as "ANSI", that is, in
          the character encoding of the system code page
          - using copy/paste between Unicode and non-Unicode programs
          - converting to UTF-8 without explicitly providing the source encoding,
          so that the system code page is assumed
          - etc.
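
Two of the cases listed above can be reproduced in a few lines of Python. This is only a sketch: it assumes a Western European system code page (cp1252) and a source file in Windows-1251, neither of which is stated in the thread.

```python
text = "Привет"

# Case 1: saving as "ANSI" on a non-Russian system. The code page has no
# Cyrillic letters, so each one is replaced by "?" and the text is lost.
lossy = text.encode("cp1252", errors="replace")
print(lossy)  # b'??????'

# Case 2: converting to UTF-8 while the source encoding is guessed wrongly.
cp1251_bytes = text.encode("cp1251")          # real source encoding
wrong_guess = cp1251_bytes.decode("cp1252")   # system code page assumed
converted = wrong_guess.encode("utf-8")       # valid UTF-8, wrong letters
print(ascii(converted.decode("utf-8")))

# Naming the source encoding explicitly gives the intended result.
fixed = cp1251_bytes.decode("cp1251").encode("utf-8")
print(fixed == text.encode("utf-8"))
```

The first case is irreversible, because the question marks are stored in the file itself. The second case produces accented Latin letters rather than question marks and can be undone as long as the wrongly assumed code page is known.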

          You may want to read some explanations on my site:
          - section "for developers: Cyrillic (Russian) in HTML"
          - section "for developers: Cyrillic (Russian) in Multilingual HTML -
          UTF-8"
          - chapter "Copy/Paste; Word, .TXT" in the section
          "Unicode and Cyrillic"

          :)

          --
          Regards,
          Paul


          • Andreas Prilop

            #6
            Re: Character encoding

            On Sun, 27 Apr 2008, Mambo Bananapatch wrote:
            > if I were to
            > hand-code a page with Cyrillic characters, and didn't have access to
            > Dreamweaver, how would I encode each file?

            You do not write with a pencil, do you? You have some editor
            (word-processor, etc.) on some operating system on some computer.
            We don't know what they are - but you know. Your editor saves
            files in some character set, such as

            MacCyrillic
            ISO-8859-5
            Windows-1251
            Unicode UTF-8
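
To make the difference between these character sets concrete, the following Python sketch encodes one word in each of them and prints the resulting bytes. The codec names are Python's own spellings for the four encodings listed above; the sample word is an assumption made up for the example.

```python
text = "Привет"

# Forum label -> Python codec name for the same encoding.
codecs_by_label = {
    "MacCyrillic": "mac_cyrillic",
    "ISO-8859-5": "iso8859_5",
    "Windows-1251": "cp1251",
    "UTF-8": "utf-8",
}

# The same six letters become a different byte sequence in each encoding.
encoded = {label: text.encode(codec) for label, codec in codecs_by_label.items()}
for label, raw in encoded.items():
    print(f"{label:13} {len(raw):2} bytes  {raw.hex(' ')}")
```

The three single-byte encodings use one byte per letter but disagree on which byte, while UTF-8 uses two bytes per Cyrillic letter. That is why the reader of a file has to be told which encoding the writer used.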

            > And why must I encode each
            > file, in addition to including the UTF-8 Content-Type response
            > header?

            I don't understand what this question means.

            --
            Top-posting.
            What's the most irritating thing on Usenet?


            • Ben C

              #7
              Re: Character encoding

              On 2008-05-01, David Trimboli <david@trimboli.name> wrote:
              [...]
              > Normally the browser learns what encoding to read by the server's HTTP
              > headers. An http-equiv declaration in an HTML file is a way to override
              > a server's content-type (encoding).

              It doesn't override it-- if both are present, the server header wins.

              > You only use this if your server isn't serving files with the correct
              > content-type.

              Yes, or because you're using file:// urls during development.
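
The precedence described here (HTTP header first, http-equiv declaration only as a fallback) can be sketched in Python as follows. The function names `header_charset`, `meta_charset`, and `effective_charset` are hypothetical, and the sketch deliberately ignores byte-order marks and the other sniffing steps that real browsers perform.

```python
import re
from email.message import Message

def header_charset(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()

# Rough pattern for a charset declared inside a <meta> tag.
META_RE = re.compile(rb"""<meta[^>]+charset=["']?([\w-]+)""", re.IGNORECASE)

def meta_charset(html_bytes):
    """Return the charset declared in a <meta> tag near the top, or None."""
    match = META_RE.search(html_bytes[:1024])
    return match.group(1).decode("ascii").lower() if match else None

def effective_charset(content_type, html_bytes):
    """The server header wins; the meta declaration is only a fallback."""
    return header_charset(content_type) or meta_charset(html_bytes)

page = b'<meta http-equiv="Content-Type" content="text/html; charset=windows-1251">'
print(effective_charset("text/html; charset=UTF-8", page))  # header present
print(effective_charset(None, page))                        # e.g. a file:// URL
```

With a file:// URL there is no HTTP response at all, which is modelled here by passing `None` as the header value.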


              • Andreas Prilop

                #8
                Re: Character encoding

                On Thu, 1 May 2008, David Trimboli wrote:
                > An http-equiv declaration in an HTML file is a way to override
                > a server's content-type (encoding).

                No, it is not. See



                --
                Bugs in Internet Explorer 7
