Pierre Goiffon Oct 6 2004, 4:29 am
Newsgroups: comp.infosystems.www.authoring.html
>> The problem with charset UTF-8 on pages with forms for e.g.
>> guestbooks, formmail and bloggs is that writing in a non-english
>> language can give garbage characters from the letters that is not
>> represented in the english language. That's because what is writed in
>> the text box don't get encoded, as text done with HTML editors does.
>
>I really can't understand your post. A server that sends a form to a client
>with the appropriate charset headers should get in return all the users
>input encoded in that charset. If the form is sent with a UTF-8 header, you
>should get all the characters encoded in UTF-8. And so user could input any
>character included in Unicode.
I saw this old post and decided that I did not understand it.
Suppose I have a form on a webpage and that form has a UTF-8 charset
header. Suppose there is also a textarea in that form, and a submit
button. Suppose I write something in Microsoft Word and use lots of
strange characters, then I copy and paste it into the textarea and hit
the submit button. At the other end, receiving the form, is a PHP
script which takes that text and makes it a webpage, with a UTF-8
charset header.
If I understand what Pierre Goiffon is saying, then it sounds as if no
garbage characters will appear on that page, no matter how many strange
characters I used in the Word document. It sounds to me as if he is
saying that everything will magically get transformed into a character
that makes sense in UTF-8.
Am I missing something? Surely that is not how it works?
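
To make the question concrete, here is a minimal sketch of the round trip Pierre describes. It is not from the thread: it uses Python's standard library as a stand-in for the browser and the PHP script, and the field name `comment` and the sample text are made up for illustration. The assumption is that the browser encodes the textarea contents as UTF-8 because the form page was served as UTF-8, and that text pasted from Word arrives in the textarea as ordinary Unicode characters (curly quotes, dashes, the euro sign).

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical text pasted from Word: curly quotes, accented letters,
# an en dash and a euro sign. All of these are plain Unicode characters.
pasted = "\u201cna\u00efve\u201d caf\u00e9 \u2013 \u20ac5"

# Roughly what a browser does for a form on a UTF-8 page: encode the field
# value as UTF-8 bytes, then percent-escape those bytes for the request body.
body = urlencode({"comment": pasted}, encoding="utf-8")

# A receiving script that also treats the bytes as UTF-8 gets the text back intact.
received = parse_qs(body, encoding="utf-8")["comment"][0]

# A receiving script that wrongly assumes Windows-1252 sees garbage instead.
garbled = parse_qs(body, encoding="cp1252", errors="replace")["comment"][0]

print(body)
print(received == pasted)
print(garbled == pasted)
```

If this sketch reflects what really happens, nothing is "magically transformed": the characters were Unicode all along, and garbage only appears when one side of the exchange assumes a different encoding than the other.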