Replacement in unicodestrings?

**Martin v. Löwis** · Oct 5 '08, 05:45 AM

Re: Replacement in unicodestrings?

s_str=repr(s.encode('UTF-8'))

It would be easier to encode this in cp1252 here, as this is apparently
the encoding that you want to use in the RTF file, too. You could then
loop over the string, replacing all bytes >= 128 with \\'%.2x (that is,
the byte's value formatted as two hex digits).
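
A minimal sketch of that loop, for illustration only. It is written for Python 3, where iterating over an encoded string yields integers (the thread itself is about Python 2), and the function name `rtf_escape_cp1252` is made up here. It does not escape backslashes or braces, which a real RTF writer would probably also need to handle.

```python
def rtf_escape_cp1252(text):
    """Encode text as cp1252 and write every byte >= 128 as \\'xx."""
    raw = text.encode('cp1252')  # raises UnicodeEncodeError outside cp1252
    parts = []
    for byte in raw:  # on Python 3, each item is an int in range(256)
        if byte >= 128:
            parts.append("\\'%.2x" % byte)
        else:
            parts.append(chr(byte))
    return ''.join(parts)


print(rtf_escape_cp1252('Arj\xebn'))  # prints: Arj\'ebn
```

The euro sign comes out as \'80 this way, because cp1252 puts it at byte 0x80.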

As yet another alternative, you could create a Unicode error handler
(call it 'rtf'), and then do

return s.encode('ascii', errors='rtf')
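
One possible shape for such a handler, as a sketch under assumptions: it targets Python 3 (where encode() returns bytes), it uses cp1252 to pick the byte values, and the names `rtf_error_handler` and `to_rtf` are invented for this example. Only `codecs.register_error` and the attributes of `UnicodeEncodeError` are standard.

```python
import codecs


def rtf_error_handler(exc):
    """Replace unencodable characters with RTF \\'xx escapes (cp1252 bytes)."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    # exc.object[exc.start:exc.end] is the run of characters that failed.
    chunk = exc.object[exc.start:exc.end]
    replacement = ''.join("\\'%.2x" % b for b in chunk.encode('cp1252'))
    # Return the replacement text and the position where encoding resumes.
    return replacement, exc.end


codecs.register_error('rtf', rtf_error_handler)


def to_rtf(s):
    return s.encode('ascii', errors='rtf')


print(to_rtf('Arj\xebn').decode('ascii'))  # prints: Arj\'ebn
```

Characters that cp1252 cannot represent would still raise UnicodeEncodeError from inside the handler; what to do with those is left open here.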

replDic={'\xc3\xa0':"\\'e0",'\xc3\xa4':"\\'e4",'\xc3\xa1':"\\'e1",
         '\xc3\xa8':"\\'e8",'\xc3\xab':"\\'eb",'\xc3\xa9':"\\'e9",
         '\xc3\xb2':"\\'f2",'\xc3\xb6':"\\'f6",'\xc3\xb3':"\\'f3",
         '\xe2\x82\xac':"\\'80"}
for k in replDic.keys():
    if repr(k) in s_str:
        s_str=s_str.replace(repr(k),replDic[k])
return s_str

However, interactively:

>>> '\xc3\xab' in 'Arj\xc3\xabn'
True

I just don't get it, what's the difference?

It's the repr():

py> '\xc3\xab' in 'Arj\xc3\xabn'
True
py> repr('\xc3\xab') in repr('Arj\xc3\xabn')
False
py> repr('\xc3\xab')
"'\\xc3\\xab'"
py> repr('Arj\xc3\xabn')
"'Arj\\xc3\\xabn'"

repr('\xc3\xab') starts with an apostrophe, which doesn't
appear before the \\xc3 in repr('Arj\xc3\xabn').
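
So one way out, if you want to keep the dictionary approach, is to drop the repr() calls and replace directly in the encoded string. A rough sketch, assuming Python 3 (so the encoded string is a bytes object and the keys are bytes literals); the table is shortened and the names `REPLACEMENTS` and `replace_utf8_sequences` are made up for this example:

```python
# Replacement table keyed by UTF-8 byte sequences (shortened for illustration).
REPLACEMENTS = {
    b'\xc3\xab': b"\\'eb",      # e with diaeresis
    b'\xe2\x82\xac': b"\\'80",  # euro sign
}


def replace_utf8_sequences(s):
    """Encode s as UTF-8 and swap known byte sequences for RTF escapes."""
    data = s.encode('UTF-8')
    for key, value in REPLACEMENTS.items():
        data = data.replace(key, value)  # no repr() involved
    return data


# The repr() comparison fails because of the leading quote characters:
print(repr(b'\xc3\xab') in repr(b'Arj\xc3\xabn'))          # prints: False
# Working on the bytes themselves finds the sequence:
print(replace_utf8_sequences('Arj\xebn').decode('ascii'))  # prints: Arj\'ebn
```

Any non-ASCII character missing from the table stays in the result as raw UTF-8 bytes, so the final decode('ascii') in the usage line only works when every such character is covered.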

HTH,
Martin
