How to emit Cyrillic and Chinese via unicode from console mode?

This topic is closed.
  • Siegfried Heintze

    How to emit Cyrillic and Chinese via unicode from console mode?

    Can someone point me to an example of a little program that emits non-ascii
    Unicode characters (Russian or Chinese perhaps)? The unicode
    Russian/Cyrillic alphabet starts at 0x410. Is this possible to do in a
    console mode program? If not, I guess I would want to use pywin32 to create
    a window and a message pump and display it there. I anticipate using pywin32
    for some other function calls.

    Thanks!
    Siegfried
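For reference, the starting point mentioned in the question can be sketched as follows. This is a minimal illustration in Python 3 syntax (an assumption; the thread itself uses Python 2), and the helper name `cyrillic_upper` is hypothetical. It builds the 32 uppercase letters from U+0410 through U+042F and shows the bytes a UTF-8 console would need to receive.

```python
def cyrillic_upper():
    """Return the Cyrillic capital letters U+0410 through U+042F as one string."""
    return "".join(chr(cp) for cp in range(0x410, 0x430))

alphabet = cyrillic_upper()
# Each of these letters takes two bytes in UTF-8.
encoded = alphabet.encode("utf-8")

if __name__ == "__main__":
    print(alphabet)
```

Whether the printed letters display correctly still depends on the console's encoding and font, which is what the rest of the thread is about.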


  • rs387

    #2
    Re: How to emit Cyrillic and Chinese via unicode from console mode?

    On Sep 14, 2:03 am, "Siegfried Heintze" <siegfr...@heintze.com> wrote:
    > Can someone point me to an example of a little program that emits non-ascii
    > Unicode characters (Russian or Chinese perhaps)?
    The following doesn't quite work, but I'll post it anyway since it
    actually ends up printing the characters. Perhaps someone can point
    out what causes the exception at the end?

    The important thing is to set the console codepage to 65001, which is
    UTF-8. This lets you output utf8-encoded text and see the Unicode
    chars displayed.



    import sys
    import encodings.utf_8
    import win32console

    sys.stdout = encodings.utf_8.StreamWriter(sys.stdout)

    win32console.SetConsoleCP(65001)
    win32console.SetConsoleOutputCP(65001)

    s = "English: ok\n"
    s += u'Russian: \u0420\u0443\u0441\u0441\u043a\u0438\u0439\n'
    s += u'Greek: \u03bc\u03b5\u03b3\u03b1\u03bb\u03cd\u03c4\u03b5\u03c1\u03b7\n'

    print s



    If redirected to a file, all is well: this prints everything properly in
    UTF-8. If run on the console, this also prints everything correctly,
    but then throws a mysterious exception:

    English: ok
    Russian: Русский
    Greek: μεγαλύτερη
    Traceback (most recent call last):
      File "I:\Temp\utf8console.py", line 18, in <module>
        print s
      File "C:\Progs\Python25\lib\codecs.py", line 304, in write
        self.stream.write(data)
    IOError: [Errno 0] Error

    Any ideas?

    Roman

    P.S. This really ought to Just Work in this day and age, and do so
    without all those 65001/utf8 incantations. Pity that it doesn't. Sigh.
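The encoding half of the approach above (wrapping a byte stream in a UTF-8 stream writer) can be tried without a Windows console. The sketch below uses Python 3 syntax and the standard `codecs.getwriter`. The helper name `write_utf8` and the use of `io.BytesIO` as a stand-in for the console are assumptions for illustration. It does not reproduce or explain the IOError shown in the traceback.

```python
import codecs
import io

def write_utf8(stream, text):
    """Encode `text` as UTF-8 and write the resulting bytes to a binary stream."""
    writer = codecs.getwriter("utf-8")(stream)
    writer.write(text)

# io.BytesIO stands in for the console's byte stream (an assumption for
# illustration; on a real console this would be sys.stdout.buffer).
buf = io.BytesIO()
write_utf8(buf, "Russian: \u0420\u0443\u0441\u0441\u043a\u0438\u0439\n")
captured = buf.getvalue()
```

The captured bytes are what a console set to code page 65001 would be asked to render.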


    • Gertjan Klein

      #3
      Re: How to emit Cyrillic and Chinese via unicode from console mode?

      rs387 wrote:
      >sys.stdout = encodings.utf_8.StreamWriter(sys.stdout)
      >
      >win32console.SetConsoleCP(65001)
      >win32console.SetConsoleOutputCP(65001)
      [...]
      >If redirected to file, all is well, this prints everything properly in
      >UTF-8. If ran on the console, this also prints everything correctly,
      >but then throws a mysterious exception:
      Interesting. On my system (Windows XP) the console codepage does not
      change, and hence the characters don't print properly (I get some of the
      CP437 line drawing characters, for example). I have never been able to
      convince Windows to assume/support UTF-8 encoding in the console,
      programmatically or otherwise. :(

      Gertjan.

      --
      Gertjan Klein <gklein@xs4all.nl>
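The line-drawing characters described above are what one would expect if UTF-8 bytes are interpreted as CP437. A minimal sketch in Python 3 syntax, assuming that is what happens on such a console (the helper name `as_cp437` is hypothetical):

```python
def as_cp437(text):
    """Show how UTF-8 encoded text looks when its bytes are read as CP437."""
    return text.encode("utf-8").decode("cp437")

original = "\u0420\u0443\u0441\u0441\u043a\u0438\u0439"  # the Russian word from the script
garbled = as_cp437(original)

if __name__ == "__main__":
    # Each Cyrillic letter becomes two CP437 characters, the first of which
    # falls in the box-drawing range for these particular letters.
    print(garbled)
```

Reversing the two steps (`garbled.encode("cp437").decode("utf-8")`) recovers the original text, which suggests the bytes arrive intact and only the console's interpretation is wrong.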


      • rs387

        #4
        Re: How to emit Cyrillic and Chinese via unicode from console mode?

        On Sep 14, 11:51 am, Gertjan Klein <gkl...@xs4all.nl> wrote:
        > Interesting. On my system (Windows XP) the console codepage does not
        > change, and hence the characters don't print properly (I get some of the
        > CP437 line drawing characters, for example). I have never been able to
        > convince Windows to assume/support UTF-8 encoding in the console,
        > programmatically or otherwise. :(
        I found that a useful test is to create a directory whose name
        contains chars from various languages, then run "cmd" and see if "dir"
        displays them correctly. This worked fine for me in WinXP, even though
        my system locale is Russian. Would be interesting to know if you can
        get the console to display international chars this way.


        • pataphor

          #5
          Re: How to emit Cyrillic and Chinese via unicode from console mode?

          On Sun, 14 Sep 2008 01:02:39 -0700 (PDT)
          rs387 <rstarkov@gmail.com> wrote:
          > On Sep 14, 2:03 am, "Siegfried Heintze" <siegfr...@heintze.com> wrote:
          >> Can someone point me to an example of a little program that emits
          >> non-ascii Unicode characters (Russian or Chinese perhaps)?
          > The following doesn't quite work, but I'll post it anyway since it
          > actually ends up printing the characters.
          That's more like it! Just answer with whatever one has. Here's another
          gem:

          from Tkinter import *
          from collections import deque

          def byn(x, n=5):
              # Yield the items of x joined into strings of n characters each.
              L = deque(x)
              R = []
              while L:
                  R.append(L.popleft())
                  if len(R) == n:
                      yield ''.join(R)
                      R = []
              if R:
                  # Emit the final, shorter chunk.
                  yield ''.join(R)

          root = Tk()
          # Code points U+16A6 through U+16F0 (within the Runic block).
          start = int('16A6', 16)
          end = int('16F0', 16)
          g = (unichr(i) for i in xrange(start, end+1))
          L = byn(g, 16)
          s = '\n'.join(L)
          w = Label(root, text=s, font=("freemono", "80"))
          w.pack()

          root.mainloop()

          P.


          • Martin v. Löwis

            #6
            Re: How to emit Cyrillic and Chinese via unicode from console mode?

            > Can someone point me to an example of a little program that emits non-ascii
            > Unicode characters (Russian or Chinese perhaps)? The unicode
            > Russian/Cyrillic alphabet starts at 0x410. Is this possible to do in a
            > console mode program? If not, I guess I would want to use pywin32 to create
            > a window and a message pump and display it there. I anticipate using pywin32
            > for some other function calls.
            It also depends on your console. On Linux, with a UTF-8
            capable-and-enabled console,

            py> print u"\u0413"
            Г

            works just fine.

            Regards,
            Martin
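Since the result depends on the console, one way to check in advance is to test whether the console's reported encoding can represent the text at all. The following is a hedged sketch in Python 3 syntax; the helper name `can_encode` is hypothetical, and `sys.stdout.encoding` only reports what Python believes the console uses, not what the console can actually render.

```python
import sys

def can_encode(text, encoding):
    """Return True if `text` can be encoded in `encoding` without errors."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True

if __name__ == "__main__":
    # sys.stdout.encoding may be None when output is redirected; fall back
    # to ASCII in that case (an assumption made for this sketch).
    console_encoding = sys.stdout.encoding or "ascii"
    if can_encode("\u0413", console_encoding):
        print("\u0413")
    else:
        print("console encoding %s cannot represent U+0413" % console_encoding)
```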
