utf8 encoding problem

**Erik Max Francis** · Jul 18 '05, 07:48 AM

Re: utf8 encoding problem

Wichert Akkerman wrote:
> I'm struggling with what should be a trivial problem but I can't seem to
> come up with a proper solution: I am working on a CGI that takes utf-8
> input from a browser. The input is nicely encoded so you get something
> like this:
>
> firstname=t%C3%A9s
>
> where %C3%A9 is a single character in utf-8 encoding. Passing this
> through urllib.unquote does not help:
>
> >>> urllib.unquote(u't%C3%A9st')
> u't%C3%A9st'

Unquote it as a normal (byte) string, then convert the result to Unicode by decoding it as UTF-8.
>>> import urllib
>>> x = 't%C3%A9s'
>>> y = urllib.unquote(x)
>>> y
't\xc3\xa9s'
>>> z = unicode(y, 'utf-8')
>>> z
u't\xe9s'
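
The session above is Python 2. For anyone reading this on Python 3, a rough equivalent might look like the sketch below. It assumes the standard `urllib.parse` module; the helper name `decode_percent_utf8` is made up for illustration and is not part of any library.

```python
from urllib.parse import parse_qs, unquote_to_bytes


def decode_percent_utf8(quoted):
    """Hypothetical helper: percent-decode to bytes, then decode as UTF-8."""
    raw = unquote_to_bytes(quoted)   # 't%C3%A9s' -> b't\xc3\xa9s'
    return raw.decode('utf-8')       # b't\xc3\xa9s' -> 't\xe9s'


# Same two steps as above: unquote to bytes first, then decode.
text = decode_percent_utf8('t%C3%A9s')

# For a whole query string, parse_qs can do both steps per field.
fields = parse_qs('firstname=t%C3%A9s', encoding='utf-8')
```

The order is the same as in the Python 2 session: the percent-escapes describe bytes, so they are turned back into bytes before anything is interpreted as UTF-8.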

--
__ Erik Max Francis && max@alcyone.com && http://www.alcyone.com/max/
/ \ San Jose, CA, USA && 37 20 N 121 53 W && &tSftDotIotE
\__/ I do not promise to consider race or religion in my appointments.
I promise only that I will not consider them. -- John F. Kennedy
