does raw_input() return unicode?

  • Stuart McGraw

    does raw_input() return unicode?

    In the announcement for Python-2.3

    it says "raw_input(): can now return Unicode objects".

    But I didn't see anything about this in Andrew Kuchling's
    "2.3 What's New", nor do the current Python docs for
    raw_input() say anything about this. A test on an MS
    Windows system with a cp932 (Japanese) default locale
    shows the object returned by raw_input() is a str object
    containing cp932-encoded text. This remained true even
    when I set Python's default encoding to cp932 (in
    sitecustomize.py).

    So, does raw_input() ever return unicode objects and if
    so, under what conditions?

  • Duncan Booth

    #2
    Re: does raw_input() return unicode?

    "Stuart McGraw" <smcg4191zz@friizz.RimoovAllZZs.com> wrote:
    > So, does raw_input() ever return unicode objects and if
    > so, under what conditions?

    It returns unicode if reading from sys.stdin returns unicode.

    Unfortunately, I can't tell you how to make sys.stdin return unicode for
    use with raw_input. I tried what I thought should work and as you can see
    it messed up the buffering on stdin. Does anyone else know how to wrap
    sys.stdin so it returns unicode but is still unbuffered?

    Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit (Intel)]
    on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import sys, codecs
    >>> sys.stdin.encoding
    'cp437'
    >>> sys.stdin = codecs.getreader(sys.stdin.encoding)(sys.stdin)
    >>> raw_input()
    hello world
    still going?
    ^Z
    ^Z
    u'hello world'
    >>>
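
For readers who want to see the read-ahead in isolation, here is a minimal sketch that uses an in-memory byte stream as a stand-in for the console. The name `fake_stdin` and the sample text are assumptions made for illustration. The snippet is written to run on Python 3, where `raw_input()` no longer exists, so it only shows how the `codecs.getreader` wrapper pulls more than one line from the underlying stream. That chunked reading appears to be why the session above kept waiting for more input.

```python
import codecs
import io

# Stand-in for a console byte stream; cp437 matches the encoding in the session.
fake_stdin = io.BytesIO(u"hello world\nstill going?\n".encode("cp437"))

# Same wrapping idea as in the session: a StreamReader that decodes bytes to text.
reader = codecs.getreader("cp437")(fake_stdin)
first_line = reader.readline()

print(repr(first_line))
# Position in the underlying stream after asking for a single line.
print(fake_stdin.tell())
```

Only the first line is returned, yet the position in the underlying stream has already moved past the second line as well.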

    • Martin v. Löwis

      #3
      Re: does raw_input() return unicode?

      Stuart McGraw wrote:
      > So, does raw_input() ever return unicode objects and if
      > so, under what conditions?

      At the moment, it only returns unicode objects when invoked
      in the IDLE shell, and only if the character entered cannot
      be represented in the locale's charset.

      Regards,
      Martin

      • Theerasak Photha

        #4
        Re: does raw_input() return unicode?

        On 10/10/06, "Martin v. Löwis" <martin@v.loewis.de> wrote:
        > Stuart McGraw wrote:
        >> So, does raw_input() ever return unicode objects and if
        >> so, under what conditions?
        >
        > At the moment, it only returns unicode objects when invoked
        > in the IDLE shell, and only if the character entered cannot
        > be represented in the locale's charset.

        Why only IDLE? Does urwid or another console UI toolkit avoid this somehow?

        -- Theerasak

        • Fredrik Lundh

          #5
          Re: does raw_input() return unicode?

          Theerasak Photha wrote:
          >>> So, does raw_input() ever return unicode objects and if
          >>> so, under what conditions?
          >>
          >> At the moment, it only returns unicode objects when invoked
          >> in the IDLE shell, and only if the character entered cannot
          >> be represented in the locale's charset.
          >
          > Why only IDLE? Does urwid or another console UI toolkit avoid this somehow?

          Martin was probably thinking of the standard distribution.

          The 2.3 note says that "raw_input() *can* return Unicode", not that it
          "should" or "must" do it.

          </F>

          • Theerasak Photha

            #6
            Re: does raw_input() return unicode?

            On 10/10/06, Fredrik Lundh <fredrik@pythonware.com> wrote:
            > Martin was probably thinking of the standard distribution.
            >
            > The 2.3 note says that "raw_input() *can* return Unicode", not that it
            > "should" or "must" do it.

            Practically speaking, at the heart of the matter: as of Python 2.5
            final, does or can raw_input() return Unicode under the appropriate
            circumstances, according to user wishes?

            (Yes, I would test, but I am presently away from my Linux box with
            Python, and can't install it here.)

            -- Theerasak

            • Fredrik Lundh

              #7
              Re: does raw_input() return unicode?

              Theerasak Photha wrote:
              > Practically speaking, at the heart of the matter: as of Python 2.5
              > final, does or can raw_input() return Unicode under the appropriate
              > circumstances, according to user wishes?

              didn't Martin just answer that question?

              </F>

              • Theerasak Photha

                #8
                Re: does raw_input() return unicode?

                On 10/10/06, Fredrik Lundh <fredrik@pythonware.com> wrote:
                > Theerasak Photha wrote:
                >> Practically speaking, at the heart of the matter: as of Python 2.5
                >> final, does or can raw_input() return Unicode under the appropriate
                >> circumstances, according to user wishes?
                >
                > didn't Martin just answer that question?

                *slaps forehead* D'oh!

                -- Theerasak

                • Martin v. Löwis

                  #9
                  Re: does raw_input() return unicode?

                  Theerasak Photha wrote:
                  >> At the moment, it only returns unicode objects when invoked
                  >> in the IDLE shell, and only if the character entered cannot
                  >> be represented in the locale's charset.
                  >
                  > Why only IDLE? Does urwid or another console UI toolkit avoid this somehow?

                  I admit I don't know what urwid is; from a shallow description I find
                  ("a console user interface library") I can't see the connection to
                  raw_input(). How would raw_input() ever use urwid?

                  Regards,
                  Martin

                    • Stuart McGraw

                      #11
                      Re: does raw_input() return unicode?

                      "Martin v. Löwis" <martin@v.loewis.de> wrote in message news:452b5190$0$29833$9b622d9e@news.freenet.de...
                      > Stuart McGraw wrote:
                      >> So, does raw_input() ever return unicode objects and if
                      >> so, under what conditions?
                      >
                      > At the moment, it only returns unicode objects when invoked
                      > in the IDLE shell, and only if the character entered cannot
                      > be represented in the locale's charset.

                      Thanks for the answer.

                      Also, if anyone has a solution for Duncan Booth's attempt
                      to wrap stdin, I would find it very useful too!

                      "Duncan Booth" <duncan.booth@invalid.invalid> wrote:
                      > "Stuart McGraw" <smcg4191zz@friizz.RimoovAllZZs.com> wrote:
                      >> So, does raw_input() ever return unicode objects and if
                      >> so, under what conditions?
                      >
                      > It returns unicode if reading from sys.stdin returns unicode.
                      >
                      > Unfortunately, I can't tell you how to make sys.stdin return unicode for
                      > use with raw_input. I tried what I thought should work and as you can see
                      > it messed up the buffering on stdin. Does anyone else know how to wrap
                      > sys.stdin so it returns unicode but is still unbuffered?
                      >
                      > Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit (Intel)]
                      > on win32
                      > Type "help", "copyright", "credits" or "license" for more information.
                      > >>> import sys, codecs
                      > >>> sys.stdin.encoding
                      > 'cp437'
                      > >>> sys.stdin = codecs.getreader(sys.stdin.encoding)(sys.stdin)
                      > >>> raw_input()
                      > hello world
                      > still going?
                      > ^Z
                      > ^Z
                      > u'hello world'
                      > >>>

                      • Theerasak Photha

                        #12
                        Re: does raw_input() return unicode?

                        On 10/10/06, "Martin v. Löwis" <martin@v.loewis.de> wrote:
                        > Theerasak Photha wrote:
                        >>> At the moment, it only returns unicode objects when invoked
                        >>> in the IDLE shell, and only if the character entered cannot
                        >>> be represented in the locale's charset.
                        >> Why only IDLE? Does urwid or another console UI toolkit avoid this somehow?
                        >
                        > I admit I don't know what urwid is; from a shallow description I find
                        > ("a console user interface library") I can't see the connection to
                        > raw_input(). How would raw_input() ever use urwid?

                        The other way around: would urwid use raw_input() or other Python
                        input functions anywhere?

                        And what causes Unicode input to work in IDLE alone?

                        -- Theerasak

                        • Leo Kislov

                          #13
                          Re: does raw_input() return unicode?


                          Theerasak Photha wrote:
                          > On 10/10/06, "Martin v. Löwis" <martin@v.loewis.de> wrote:
                          >> Theerasak Photha wrote:
                          >>>> At the moment, it only returns unicode objects when invoked
                          >>>> in the IDLE shell, and only if the character entered cannot
                          >>>> be represented in the locale's charset.
                          >>> Why only IDLE? Does urwid or another console UI toolkit avoid this somehow?
                          >>
                          >> I admit I don't know what urwid is; from a shallow description I find
                          >> ("a console user interface library") I can't see the connection to
                          >> raw_input(). How would raw_input() ever use urwid?
                          >
                          > The other way around: would urwid use raw_input() or other Python
                          > input functions anywhere?
                          >
                          > And what causes Unicode input to work in IDLE alone?

                          Applications other than Python are actually free to implement unicode
                          stdin. Python cannot do it because of backward compatibility. You can
                          argue that the Python interactive console could do it too, but think about
                          it this way: the Python interactive console deliberately behaves like a
                          running Python program would.

                          • Leo Kislov

                            #14
                            Re: does raw_input() return unicode?


                            Duncan Booth wrote:
                            > "Stuart McGraw" <smcg4191zz@friizz.RimoovAllZZs.com> wrote:
                            >> So, does raw_input() ever return unicode objects and if
                            >> so, under what conditions?
                            >
                            > It returns unicode if reading from sys.stdin returns unicode.
                            >
                            > Unfortunately, I can't tell you how to make sys.stdin return unicode for
                            > use with raw_input. I tried what I thought should work and as you can see
                            > it messed up the buffering on stdin. Does anyone else know how to wrap
                            > sys.stdin so it returns unicode but is still unbuffered?

                            Considering that all consoles are ASCII based, the following should
                            work where Python was able to determine the terminal encoding:

                            import sys

                            class ustdio(object):
                                # Wrap a byte stream and decode each line to unicode.
                                def __init__(self, stream):
                                    self.stream = stream
                                    self.encoding = stream.encoding
                                def readline(self):
                                    # Read one raw line, then decode it with the stream's encoding.
                                    return self.stream.readline().decode(self.encoding)

                            sys.stdin = ustdio(sys.stdin)

                            answer = raw_input()
                            print type(answer)
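
The same line-at-a-time idea can be tried without a console. The sketch below is only an illustration: `DecodingStdin` and `fake_console` are hypothetical names, the encoding is passed in explicitly because an in-memory stream has no `encoding` attribute, and the code is written to run on Python 3, so it does not call `raw_input()` at all. It shows that delegating to the underlying stream's own `readline()` consumes exactly one line.

```python
import io


class DecodingStdin(object):
    """Hypothetical wrapper: decode each line read from a byte stream."""

    def __init__(self, stream, encoding):
        self.stream = stream
        self.encoding = encoding

    def readline(self):
        # Delegate to the byte stream's readline(), then decode that one line.
        return self.stream.readline().decode(self.encoding)


# Stand-in for console input, encoded in cp437 as in the earlier session.
fake_console = io.BytesIO(u"caf\u00e9\nsecond line\n".encode("cp437"))
wrapped = DecodingStdin(fake_console, "cp437")

line = wrapped.readline()
print(repr(line))
# Position in the underlying stream after reading a single line.
print(fake_console.tell())
```

Because nothing is read ahead, the second line is still waiting in the underlying stream after the first call.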

                            • Martin v. Löwis

                              #15
                              Re: does raw_input() return unicode?

                              Theerasak Photha wrote:
                              > The other way around: would urwid use raw_input() or other Python
                              > input functions anywhere?

                              Since I still don't know what urwid is, I can't answer the question.
                              It should be easy enough to grep its source code to find out whether
                              it ever uses raw_input.

                              > And what causes Unicode input to work in IDLE alone?

                              Because in IDLE, it is possible to enter characters that are not
                              in the user's charset. For example, if the user's charset is
                              cp1252 (Western European), you can still enter Cyrillic characters
                              into IDLE. This is not possible in a regular terminal.

                              Regards,
                              Martin
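
To make "cannot be represented in the locale's charset" concrete, here is a small sketch that checks whether a piece of text fits a given charset. The helper name `fits_charset` and the sample strings are assumptions for illustration. It says nothing about how IDLE itself decides what to return.

```python
def fits_charset(text, charset):
    """Return True if every character in text can be encoded in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


western = u"caf\u00e9"                               # Latin text with an accented e
cyrillic = u"\u043f\u0440\u0438\u0432\u0435\u0442"   # a Cyrillic word

print(fits_charset(western, "cp1252"))
print(fits_charset(cyrillic, "cp1252"))
```

Text of the first kind could be handed back as an encoded byte string, while text of the second kind has no cp1252 representation, which is the case described above.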
