Dear all,
could somebody please just put an end to the unicode misery I'm in,
man... The situation is that I have a Tkinter program that lets the
user enter data in some Entries, and this data needs to be transformed
to an encoding compatible with an .rtf file. In fact I only need to
handle a few of the usual symbols, like ë.
Here's the function that I am using:
def pythonUnicodeToRTFAscii(self, s):
    if isinstance(s, str):
        return s
    s_str = repr(s.encode('UTF-8'))
    replDic = {'\xc3\xa0': "\\'e0", '\xc3\xa4': "\\'e4", '\xc3\xa1': "\\'e1",
               '\xc3\xa8': "\\'e8", '\xc3\xab': "\\'eb", '\xc3\xa9': "\\'e9",
               '\xc3\xb2': "\\'f2", '\xc3\xb6': "\\'f6", '\xc3\xb3': "\\'f3",
               '\xe2\x82\xac': "\\'80"}
    for k in replDic.keys():
        if repr(k) in s_str:
            s_str = s_str.replace(repr(k), replDic[k])
    return s_str
So replDic represents the mapping from one encoding to the other. Now,
if I enter e.g. 'Arjën' in the Entry, then s_str in the above function
becomes 'Arj\xc3\xabn' and since replDic contains the key \xc3\xab I
would expect the replacement in the final lines of the function to
kick in. This, however, doesn't happen: there's no match.
However, interactively:
>>> '\xc3\xab' in 'Arj\xc3\xabn'
True
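For comparison, here is a minimal sketch of what the two tests actually look at. It is written for Python 3 (an assumption; the function above is Python 2, where repr() of a byte string has no b prefix but still adds the surrounding quotes). The names encoded, key and escape_text are only illustrative.

```python
# Python 3 sketch; in Python 2 the repr() of a byte string lacks the b prefix
# but still includes the surrounding quotes, so the same mismatch occurs.
encoded = 'Arj\u00ebn'.encode('utf-8')   # the bytes b'Arj\xc3\xabn'
key = b'\xc3\xab'

# Raw bytes: the two-byte sequence for the accented e is a substring.
raw_match = key in encoded

# repr() strings: repr(key) is the text b'\xc3\xab' including its own
# prefix and quotes, and that text never occurs inside b'Arj\xc3\xabn'.
repr_match = repr(key) in repr(encoded)

# Stripping the prefix and quotes leaves only the escape text, which does occur.
escape_text = repr(key)[2:-1]
stripped_match = escape_text in repr(encoded)

print(raw_match, repr_match, stripped_match)
```

If this reading is right, the interactive test compares raw byte strings, while the function compares repr() output, where each key drags its own quotes along.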
I just don't get it, what's the difference? And is the above even the
best way to attack such a problem?
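One possible alternative, sketched under assumptions rather than offered as the definitive answer: skip the UTF-8/repr() detour and the hand-written table, and encode each non-ASCII character to the code page the RTF file declares. The sketch assumes Python 3 and Windows-1252 (cp1252), which agrees with the table above mapping the euro sign to \'80. The name to_rtf_ascii and its codepage parameter are hypothetical.

```python
def to_rtf_ascii(text, codepage='cp1252'):
    """Return text with non-ASCII characters written as RTF \\'hh escapes."""
    pieces = []
    for ch in text:
        if ord(ch) < 128:
            # Plain ASCII passes through unchanged.
            pieces.append(ch)
        else:
            # Encode to the assumed code page; emit one escape per byte.
            for byte in ch.encode(codepage):
                pieces.append("\\'%02x" % byte)
    return ''.join(pieces)


print(to_rtf_ascii('Arj\u00ebn'))
```

This sketch does not escape backslashes or braces, which RTF also treats specially, and a character that has no cp1252 equivalent raises UnicodeEncodeError.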
Thanks & best wishes, Kees