Re: convert unicode characters to visibly similar ascii characters

  • Terry Reedy


    Peter Bulychev wrote:
    > Hello.
    >
    > I want to convert Unicode characters into ASCII ones.
    > The method ".encode('ASCII')" can convert only those Unicode
    > characters that fall into the 0..127 range.
    >
    > But there are still lots of characters beyond this range that can be
    > manually converted to visibly similar ASCII characters. For
    > instance, there are several quotation marks in Unicode that can be
    > converted into the ASCII quotation mark.
    >
    > Can this conversion be performed automatically? After googling,
    > I've only found that there is a Unicode database, which stores
    > human-readable information on the notation of all Unicode characters
    > (ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt), and that there
    > is a Python adapter for this database
    > (http://docs.python.org/lib/module-unicodedata.html). Using this
    > database I can do something like `if
    > notation.find(' QUOTATION')!=-1:\n\treturn "'"`. I believe there is a more
    > elegant way. Am I right?

    I believe you will have to make up your own translation dictionary for
    the translations *you* want. You should then be able to use that with
    the .translate() method.
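
    A minimal sketch of that approach, assuming Python 3, where
    str.translate() accepts a dict keyed by code point. The table
    contents, the name ASCII_LOOKALIKES, and the helper
    to_ascii_lookalike() are hypothetical and only for illustration;
    you would fill the table with whatever substitutions you consider
    "visibly similar".

```python
# Hypothetical hand-picked table: Unicode code point -> ASCII replacement.
# Extend it with the substitutions *you* want.
ASCII_LOOKALIKES = {
    0x2018: "'",    # LEFT SINGLE QUOTATION MARK
    0x2019: "'",    # RIGHT SINGLE QUOTATION MARK
    0x201C: '"',    # LEFT DOUBLE QUOTATION MARK
    0x201D: '"',    # RIGHT DOUBLE QUOTATION MARK
    0x00AB: '"',    # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
    0x00BB: '"',    # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
    0x2013: "-",    # EN DASH
    0x2014: "-",    # EM DASH
    0x2026: "...",  # HORIZONTAL ELLIPSIS (one character becomes three)
}


def to_ascii_lookalike(text, table=ASCII_LOOKALIKES):
    """Replace characters listed in `table`, then force the result to ASCII.

    Characters that are neither ASCII nor in the table come out as '?',
    because the 'replace' error handler is used when encoding.
    """
    translated = text.translate(table)
    return translated.encode("ascii", "replace").decode("ascii")


if __name__ == "__main__":
    print(to_ascii_lookalike("\u201cHi\u201d \u2014 it\u2019s\u2026"))
```

    Anything missing from the table falls through to '?', so a quick
    way to grow the table is to run it over real input and look at
    which characters still come out as question marks.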

    tjr
