The reverse of encode('...', 'backslashreplace')

  • Tor Erik Sønvisen

    The reverse of encode('...', 'backslashreplace')

    Hi,

    How can I transform b so that the assertion holds? I.e., how can I
    reverse the backslash-replaced encoding, while retaining the str-type?
    >>> a = u'æ'
    >>> b = a.encode('ascii', 'backslashreplace')
    >>> b
    '\\xe6'
    >>> assert isinstance(b, str) and b == 'æ'
    Traceback (most recent call last):
      File "<pyshell#59>", line 1, in <module>
        assert isinstance(b, str) and b == 'æ'
    AssertionError

    Regards,
    tores
  • Duncan Booth

    #2
    Re: The reverse of encode('...', 'backslashreplace')

    "Tor Erik Sønvisen" <torerik81@gmail.com> wrote:
    > How can I transform b so that the assertion holds? I.e., how can I
    > reverse the backslash-replaced encoding, while retaining the str-type?
    >
    > >>> a = u'æ'
    > >>> b = a.encode('ascii', 'backslashreplace')
    > >>> b
    > '\\xe6'
    > >>> assert isinstance(b, str) and b == 'æ'
    >
    > Traceback (most recent call last):
    >   File "<pyshell#59>", line 1, in <module>
    >     assert isinstance(b, str) and b == 'æ'
    > AssertionError
    >
    The simple answer is that you cannot: backslashreplace isn't a
    reversible operation. For example, try:
    >>> a = u'\\xe6æ'
    >>> print a
    \xe6æ
    >>> b = a.encode('ascii', 'backslashreplace')
    >>> b
    '\\xe6\\xe6'
    >>>
    There is no way after the encoding that you can tell which of the \xe6
    sequences needs reversing and which doesn't. Perhaps the following is
    what you want:
    >>> b = a.encode('unicode_escape')
    >>> print b
    \\xe6\xe6
    >>> print b.decode('unicode_escape')
    \xe6æ
    >>>
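
For readers on Python 3, the same point can be sketched as below. This is only an illustration, not part of the original reply: the thread uses Python 2, where `encode` returns a `str`, whereas in Python 3 it returns `bytes`. The variable names (`original`, `lossy`, `escaped`, `restored`) are made up for the example.

```python
# Python 3 sketch (assumption: the thread itself is Python 2).
# A literal backslash-x-e-6 sequence followed by the character æ.
original = '\\xe6æ'

# backslashreplace leaves the literal backslash alone and escapes æ,
# so both parts end up as the same six ASCII characters.
lossy = original.encode('ascii', 'backslashreplace')
print(lossy)      # b'\\xe6\\xe6'

# unicode_escape also doubles backslashes that were already present,
# which keeps the two cases distinguishable.
escaped = original.encode('unicode_escape')
print(escaped)    # b'\\\\xe6\\xe6'

# Decoding with the same codec recovers the original text.
restored = escaped.decode('unicode_escape')
print(restored == original)   # True
```

The round trip works because unicode_escape escapes the backslash itself, so no information is lost at encoding time.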
