Problem with curses and UTF-8

  • Thomas Dickey

    #16
    Re: Problem with curses and UTF-8

    "Martin v. Löwis" <martin@v.loewis.de> wrote:
    >> I'll test it if someone would dumb down "link with ncursesw instead of
    >> ncurses" a little for me.
    >>
    >> I tried:
    >> ./configure --with-libs="ncursesw5"
    >>
    >> and it failed saying:
    >> checking size of wchar_t... configure: error: cannot compute sizeof
    >> (wchar_t), 77

    > If that was Python's configure: don't do that. Instead, hack setup.py

    yes - python's configure script needs a lot of work
    (alternatively, it is not the sort of script I would write).

    > to make it change the compiler/linker settings, or even edit the
    > compiler/linker line manually at first.

    that works

    --
    Thomas E. Dickey
    Thomas Dickey develops/maintains widely-used tools and libraries for software development (diffstat, yacc, mawk) and terminals (ncurses, lynx, xterm)

    ftp://invisible-island.net


    • Thomas Dickey

      #17
      Re: Problem with curses and UTF-8

      Ian Ward <ian@excess.org> wrote:
      > Martin v. Löwis wrote:

      >> If that was Python's configure: don't do that. Instead, hack setup.py
      >> to make it change the compiler/linker settings, or even edit the
      >> compiler/linker line manually at first.

      > Ok, that compiled.

      same here - though it was not immediately clear which copy of ncurses it's
      using (not the shared libraries since I installed those with tracing - a
      little odd for it to use the static library, but that's what the access time
      tells me).

      To check on that (since I wanted to read the ncurses trace),
      I ran strace and ltrace to look for clues.

      > Now when I run the same test:

      > import curses
      > s = curses.initscr()
      > s.addstr('\xc3\x85 U+00C5 LATIN CAPITAL LETTER A WITH RING ABOVE\n')
      > s.addstr('\xc3\xa5 U+00F5 LATIN SMALL LETTER O WITH TILDE')
      > s.refresh()
      > s.getstr()
      > curses.endwin()

      Testing this, and looking to see what's going on, I notice that python
      is doing a

      setlocale(LC_ALL, "C");

      before the addstr is actually called. (ncurses never sets the locale;
      it calls setlocale in one place to ask what it is).

      That makes ncurses think it's not really doing UTF-8, of course. What I
      see on the screen is the U+00C5 comes out with a box and a "~E" (the
      latter being ncurses' representation in POSIX for \0x85).
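
The locale state described above can be inspected from Python itself. What follows is a minimal sketch, assuming a Python 3 interpreter (newer than the Python discussed in this thread) and only the standard `locale` module. The helper names `current_ctype_locale` and `adopt_environment_locale` are illustrative and do not come from the thread.

```python
import locale


def current_ctype_locale():
    """Return the LC_CTYPE setting currently in effect without changing it."""
    # Calling setlocale() with no second argument only queries the setting.
    return locale.setlocale(locale.LC_CTYPE)


def adopt_environment_locale():
    """Switch to the locale named by LANG/LC_* and return its name.

    Returns None if the environment names a locale the C library rejects.
    """
    try:
        return locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        return None


if __name__ == "__main__":
    print("before:", current_ctype_locale())
    print("after: ", adopt_environment_locale())
```

If the first value printed is "C" (or "POSIX"), a library that asks for the locale, as ncurses does, has no reason to treat the byte stream as UTF-8.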

      > This is what I see:

      > +00C5 LATIN CAPITAL LETTER A WITH RING ABOVE
      > +00F5 LATIN SMALL LETTER O WITH TILDE

      > so, the UTF-8 characters didn't appear and the " U" at the beginning
      > became just " ".

      well - running in uxterm I see the second line properly. But some more
      tinkering is needed to make python work properly.

      --
      Thomas E. Dickey

      ftp://invisible-island.net


      • Thomas Dickey

        #18
        Re: Replacing curses

        Grant Edwards <grante@visi.com> wrote:

        > Depending on what you're trying to do, slang might be an option,

        perhaps not - he's trying to use UTF-8. I haven't seen any plausible
        comment that indicates John Davis is interested in updating newt to
        work with slang2 (though of course he's welcome to show the code ;-)

        --
        Thomas E. Dickey

        ftp://invisible-island.net


        • Thomas Dickey

          #19
          Re: Replacing curses

          Ian Ward <ian@excess.org> wrote:
          > Grant Edwards wrote:
          >> Depending on what you're trying to do, slang might be an option,

          > I've looked at newt and snack, but all I really need is:
          > - a way to position the cursor at (0,0)
          > - a way to hide and show the cursor
          > - a way to detect when the terminal is resized
          > - a way to query the terminal size

          ...and send UTF-8 text, keeping track of where you really are on the screen.
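
For reference, most of the wish list quoted above can be sketched without curses. This is only a rough illustration, assuming a VT100/xterm-compatible terminal: the escape sequences are hard-coded rather than looked up in terminfo, so they will not be right for every terminal. The function names are made up for this example. Resize detection and tracking the cursor after writing UTF-8 text are not covered here.

```python
import shutil
import sys

CSI = "\x1b["  # Control Sequence Introducer used by VT100/xterm-style terminals


def cursor_home():
    """Escape sequence that moves the cursor to the top-left corner."""
    return CSI + "H"


def cursor_visible(show):
    """Escape sequence that shows (True) or hides (False) the cursor."""
    return CSI + ("?25h" if show else "?25l")


def terminal_size():
    """Return (columns, lines); falls back to 80x24 when there is no terminal."""
    size = shutil.get_terminal_size(fallback=(80, 24))
    return size.columns, size.lines


if __name__ == "__main__":
    cols, lines = terminal_size()
    sys.stdout.write(cursor_visible(False) + cursor_home())
    sys.stdout.write("terminal is %d columns by %d lines\n" % (cols, lines))
    sys.stdout.write(cursor_visible(True))
    sys.stdout.flush()
```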

          --
          Thomas E. Dickey

          ftp://invisible-island.net


          • Thomas Dickey

            #20
            Re: Replacing curses

            Ian Ward <ian@excess.org> wrote:
            > Thomas Dickey wrote:
            >> hmm - I've read Urwid, and most of the comments I've read in that regard
            >> reflect problems in Urwid. Perhaps it's time for you to do a little analysis.
            >>
            >> (looking forward to bug reports, rather than line noise)

            > A fair request. My apologies for the inflammatory subject :-)

            > When trying to check for user input without waiting I use code like:
            > window_object.nodelay(1)
            > curses.cbreak()
            > input = window_object.getch()

            > Occasionally (hard to reproduce reliably) the cbreak() call will raise
            > an exception, but if I call it a second time before calling getch the
            > code will work properly. This problem might be related to a signal
            > interrupting the function call, I'm not sure.

            perhaps a more complete test-case would let me test it and see.

            > Also, screen resizing only seems to be reported once by getch() even if
            > the user continues to resize the window. I have worked around this by
            > calling curses.doupdate() between calls to getch(). Maybe this is by design?

            Or perhaps it's some interaction with python - I don't know.
            The applications that I use with resizing (and ncurses' test
            programs) work smoothly enough.

            > Finally, the curses escape sequence detection could be broadened. The
            > top part of the curses_display module in Urwid defines many escape
            > sequences I've run into that curses doesn't detect.

            That's data (terminfo). ncurses is data-driven, doesn't "detect"
            features of the terminal (though it does of course use environment
            variables for locale, etc.).

            xterm's terminfo lists a lot of function keys, for instance.

            The limit for predefined function-key names for terminfo is 60.
            ncurses can accept extended terminfo descriptions, but I like to
            limit the length and style of names so it's possible to access them
            from termcap. One could define names like shift_f1, but then termcap
            applications couldn't see them. (The last I knew, slang doesn't either,
            but that's a different thread).
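
Because key sequences are terminfo data, they can be read back from a script. The sketch below assumes Python's standard `curses` module on a system with a terminfo database that has an entry for the terminal name passed in. `lookup_key_caps` is a hypothetical helper, and the capability names in the usage section are only examples.

```python
import curses
import os


def lookup_key_caps(names, term="xterm"):
    """Return {capability name: bytes or None} from the terminfo entry for `term`.

    Returns None if the terminfo entry cannot be loaded on this system.
    """
    fd = os.open(os.devnull, os.O_WRONLY)  # setupterm() wants an output descriptor
    try:
        curses.setupterm(term, fd)
        # tigetstr() gives None for capabilities the entry does not define.
        return {name: curses.tigetstr(name) for name in names}
    except curses.error:
        return None
    finally:
        os.close(fd)


if __name__ == "__main__":
    caps = lookup_key_caps(["kf1", "kf49", "kUP5"])
    if caps is None:
        print("no usable terminfo entry")
    else:
        for name, value in caps.items():
            print(name, repr(value))
```

A capability that comes back as None is simply absent from that terminal description, which is a data problem rather than something the library could detect on its own.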

            That's been true for about 6 years.

            Current xterm's terminfo includes these names, which apply to your
            comment. The ones at the end are extended names that ncurses' tic
            deduces from the terminfo file when it compiles it:

            comparing xterm-new to xterm-xf86-v44.
            comparing booleans.
            comparing numbers.
            comparing strings.
            kf49: '\EO3P', NULL.
            kf50: '\EO3Q', NULL.
            kf51: '\EO3R', NULL.
            kf52: '\EO3S', NULL.
            kf53: '\E[15;3~', NULL.
            kf54: '\E[17;3~', NULL.
            kf55: '\E[18;3~', NULL.
            kf56: '\E[19;3~', NULL.
            kf57: '\E[20;3~', NULL.
            kf58: '\E[21;3~', NULL.
            kf59: '\E[23;3~', NULL.
            kf60: '\E[24;3~', NULL.
            kf61: '\EO4P', NULL.
            kf62: '\EO4Q', NULL.
            kf63: '\EO4R', NULL.
            kind: '\E[1;2B', NULL.
            kri: '\E[1;2A', NULL.
            kDN: '\E[1;2B', NULL.
            kDN5: '\E[1;5B', NULL.
            kDN6: '\E[1;6B', NULL.
            kLFT5: '\E[1;5D', NULL.
            kLFT6: '\E[1;6D', NULL.
            kRIT5: '\E[1;5C', NULL.
            kRIT6: '\E[1;6C', NULL.
            kUP: '\E[1;2A', NULL.
            kUP5: '\E[1;5A', NULL.
            kUP6: '\E[1;6A', NULL.

            --
            Thomas E. Dickey

            ftp://invisible-island.net


            • Donn Cave

              #21
              Re: Replacing curses

              In article <11ul2eskh5q2pab@corp.supernews.com>,
              Thomas Dickey <dickey@saltmine.radix.net> wrote:

              > Ian Ward <ian@excess.org> wrote:
              ...
              > > Also, screen resizing only seems to be reported once by getch() even if
              > > the user continues to resize the window. I have worked around this by
              > > calling curses.doupdate() between calls to getch(). Maybe this is by
              > > design?
              >
              > Or perhaps it's some interaction with python - I don't know.
              > The applications that I use with resizing (and ncurses' test
              > programs) work smoothly enough.

              I have no idea about the present application, but just as a
              general observation, when Python traps a signal, it saves
              the signal number, and makes a note to check for trapped signals
              as the next Python operation. That check iterates through the
              list of possible signals to see if any have been caught, and
              executes their respective handlers if any.

              Since an external function call is an operation, no signal
              handler will execute until it returns. At that time, the
              signal handler will execute once, at most.
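
The deferred handling described above can be illustrated with a flag-setting handler. This is a sketch only, assuming a Unix-like system where `signal.SIGWINCH` exists. `ResizeWatcher` is a made-up name, and installing a Python-level SIGWINCH handler may interfere with whatever resize handling the curses library does on its own.

```python
import signal


class ResizeWatcher:
    """Records that SIGWINCH arrived; the main loop polls and clears the flag."""

    def __init__(self):
        self.pending = False

    def _on_winch(self, signum, frame):
        # Python runs this in the main thread between Python operations,
        # not while an external C function is still executing.
        self.pending = True

    def install(self):
        # SIGWINCH exists on Unix-like systems only.
        signal.signal(signal.SIGWINCH, self._on_winch)

    def consume(self):
        """Return True once per batch of resize signals, then reset the flag."""
        was_pending = self.pending
        self.pending = False
        return was_pending
```

Several resizes that arrive during one long external call collapse into a single True from `consume()`, which matches the "once, at most" behaviour described above.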

              > > Finally, the curses escape sequence detection could be broadened. The
              > > top part of the curses_display module in Urwid defines many escape
              > > sequences I've run into that curses doesn't detect.
              >
              > That's data (terminfo). ncurses is data-driven, doesn't "detect"
              > features of the terminal (though it does of course use environment
              > variables for locale, etc.).
              >
              > xterm's terminfo lists a lot of function keys, for instance.

              This is just my opinion, but any application that depends
              on function keys in terminfo is broken, automatically.
              Optional support for function keys is a nice touch, but the
              data isn't good enough out there to depend on it.

              Donn Cave, donn@u.washington.edu


              • Damjan

                #22
                Re: Problem with curses and UTF-8

                I just recompiled my python to link to ncursesw, and tried your example
                with a little modification:

                import curses, locale
                locale.setlocale(locale.LC_ALL, '')
                s = curses.initscr()
                s.addstr(u'\u00c5 U+00C5 LATIN CAPITAL LETTER A WITH RING ABOVE\n'.encode('utf-8'))
                s.addstr(u'\u00f5 U+00F5 LATIN SMALL LETTER O WITH TILDE\n'.encode('utf-8'))
                s.refresh()
                s.getstr()
                curses.endwin()

                And it works ok for me, Slackware-10.2, python-2.4.2, ncurses-5.4 all
                in KDE's konsole.
                My locale is mk_MK.UTF-8.

                Now it would be great if python's curses module worked with unicode
                strings directly.


                • Ross Ridge

                  #23
                  Re: Replacing curses

                  Thomas Dickey wrote:
                  > ...and send UTF-8 text, keeping track of where you really are on the screen.

                  You make that sound so easy.

                  Ross Ridge


                  • Ian Ward

                    #24
                    Re: Replacing curses

                    Ross Ridge wrote:
                    > Thomas Dickey wrote:
                    >>...and send UTF-8 text, keeping track of where you really are on the screen.
                    > You make that sound so easy.

                    I'll have to deal with that anyway, since I'm doing all my own wrapping,
                    justification and clipping of text. (don't talk to me about RtoL text,
                    I'm getting to it)

                    I'm going to look at the Mined text editor for some terminal behavior
                    detection code. Mined is able to produce good UTF-8 output on a variety
                    of terminals, and it links against ncurses, not ncursesw... Interesting.

                    Ian Ward


                    • Thomas Dickey

                      #25
                      Re: Replacing curses

                      Ian Ward <ian@excess.org> wrote:

                      > I'm going to look at the Mined text editor for some terminal behavior

                      mined_2000 (there's more than one program named mined, and the other
                      doesn't do UTF-8).

                      > detection code. Mined is able to produce good UTF-8 output on a variety
                      > of terminals, and it links against ncurses, not ncursesw... Interesting.

                      It's probably using termcap (and the wide-character functions declared
                      in wchar.h).

                      --
                      Thomas E. Dickey

                      ftp://invisible-island.net


                      • Ian Ward

                        #26
                        cursesw+setlocale fixes it! (was: Re: Problem with curses and UTF-8)

                        Damjan wrote:
                        > import curses, locale
                        > locale.setlocale(locale.LC_ALL, '')
                        > s = curses.initscr()

                        Hey, that works for me. Combined characters and wide characters are
                        working too.

                        Now the real problem.. how do I convince the python higher-ups to link
                        against cursesw by default?

                        At the very least all distros that use UTF-8 as their default encoding
                        should switch to cursesw.

                        Ian Ward


                        • Ross Ridge

                          #27
                          Re: Replacing curses

                          Ian Ward wrote:
                          > I'll have to deal with that anyway, since I'm doing all my own wrapping,
                          > justification and clipping of text.

                          In general it's impossible to know how many display positions some
                          random Unicode character might use. For example, Chinese characters
                          normally take two display positions, but the terminal you're using might
                          not support them and display a single width replacement character.
                          Hopefully, you're limited in the character set you actually need to
                          support and the terminals that your application will be using.
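
A best-effort width estimate can still be written down, with the caveat above that the terminal has the final say. The sketch below uses only Python 3's standard `unicodedata` module. `estimated_width` is an illustrative name, and the rules (combining marks count as zero columns, East Asian wide and fullwidth characters as two, everything else as one) are a simplification that ignores control characters and ambiguous-width characters.

```python
import unicodedata


def estimated_width(text):
    """Estimate how many terminal columns `text` occupies."""
    columns = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks stack on the previous character
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            columns += 2  # wide and fullwidth characters
        else:
            columns += 1
    return columns


if __name__ == "__main__":
    for sample in ("abc", "\u4e2d\u6587", "e\u0301", "\u00c5"):
        print(repr(sample), estimated_width(sample))
```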


                          • Martin v. Löwis

                            #28
                            Re: cursesw+setlocale fixes it!

                            Ian Ward wrote:
                            > Hey, that works for me. Combined characters and wide characters are
                            > working too.
                            >
                            > Now the real problem.. how do I convince the python higher-ups to link
                            > against cursesw by default?

                            That's very easy. Contribute a working patch. That patch should support
                            all possible situations (e.g. curses is ncurses, and ncursesw is
                            available, curses is ncurses, and ncursesw is not available, curses
                            is not ncurses), and submit that patch to sf.net/projects/python.
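
The three situations listed above amount to a preference order. As a rough illustration only, and not a description of how Python's setup.py detects libraries, the sketch below probes for shared libraries at run time with the standard `ctypes.util.find_library`. `pick_curses_library` is a hypothetical helper, and a real build-time check would also need to look at headers and linker paths.

```python
import ctypes.util


def pick_curses_library(find=ctypes.util.find_library):
    """Return the name of the curses flavour to prefer, or None if none is found."""
    # Prefer the wide-character build, then plain ncurses, then any other curses.
    for candidate in ("ncursesw", "ncurses", "curses"):
        if find(candidate) is not None:
            return candidate
    return None


if __name__ == "__main__":
    print("would link against:", pick_curses_library())
```

The `find` parameter exists only so the preference order can be exercised without depending on which libraries are installed.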

                            Regards,
                            Martin

                              • Ian Ward

                                #30
                                Re: cursesw+setlocale fixes it!

                                Martin v. Löwis wrote:
                                > That's very easy. Contribute a working patch. That patch should support
                                > all possible situations (e.g. curses is ncurses, and ncursesw is
                                > available, curses is ncurses, and ncursesw is not available, curses
                                > is not ncurses), and submit that patch to sf.net/projects/python.

                                Done.



                                Ian Ward
