u'a' in string.letters fails: a Python 2.3 bug?

  • Edward K. Ream

    u'a' in string.letters fails: a Python 2.3 bug?

    From the documentation for the string module at:

    C:\Python23\Doc\Python-Docs-2.3.1\lib\module-string.html

    [quote]
    letters: The concatenation of the strings lowercase and uppercase described
    below. The specific value is locale-dependent, and will be updated when
    locale.setlocale() is called.
    [end quote]

    If uch is a unicode character, the operation

    uch in string.letters

    may (will?) fail in Python 2.3. I've never seen it fail in previous
    versions. Examples:

    Python 2.3.1 (#47, Sep 23 2003, 23:47:32) [MSC v.1200 32 bit (Intel)] on
    win32
    [snip]
    IDLE 1.0
    >>> import string
    >>> '\xa6' in string.digits
    False
    >>> '\xa6' in string.letters
    False
    >>> u'\xa6' in string.letters

    Traceback (most recent call last):
    File "<pyshell#3>", line 1, in -toplevel-
    u'\xa6' in string.letters
    UnicodeDecodeError: 'ascii' codec can't decode byte 0x83 in position 52:
    ordinal not in range(128)
    >>> u'\xa6' in string.ascii_letters
    False
    >>> u'a' in string.letters

    Traceback (most recent call last):
    File "<pyshell#1>", line 1, in -toplevel-
    u'a' in string.letters
    UnicodeDecodeError: 'ascii' codec can't decode byte 0x83 in position 52:
    ordinal not in range(128)

    Questions:

    1. Is this a bug, or am I missing something?

    2. Is this an issue only with Idle? I think not completely: this kind of
    code seems to work for my app on XP, and not for some of my app's users on
    Linux.

    3. Is replacing string.letters by string.ascii_letters the recommended
    workaround?

    Edward

    P.S.
    >>> string.letters
    'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz\x83\x8a\x8c\x8e\x9a
    \x9c\x9e\x9f\xaa\xb5\xba\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc
    \xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0
    \xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3
    \xf4\xf5\xf6\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'
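
The traceback is consistent with Python 2 coercing the byte string to unicode with the default ASCII codec before doing the containment test: index 52 of the value in the P.S. is the first byte past the 52 ASCII letters. A minimal sketch of that decoding step, assuming only the first few bytes of that value (`letters_sample` and `first_non_ascii` are names made up for illustration):

```python
# Sketch: emulate the implicit ASCII decode that Python 2 applies to a byte
# string when it is mixed with a unicode operand.
# `letters_sample` is a hypothetical, shortened stand-in for the
# locale-dependent value of string.letters shown in the P.S.
letters_sample = (b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
                  b'abcdefghijklmnopqrstuvwxyz'
                  b'\x83\x8a\x8c')


def first_non_ascii(data):
    """Return the index of the first byte ASCII cannot decode, or None."""
    try:
        data.decode('ascii')
    except UnicodeDecodeError as exc:
        return exc.start
    return None


print(first_non_ascii(letters_sample))  # 52
```

The index matches "position 52" in the error message, which suggests the failure comes from the contents of string.letters rather than from the character being looked up.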

    EKR
    --------------------------------------------------------------------
    Edward K. Ream email: edreamleo@charter.net
    Leo: Literate Editor with Outlines
    Leo: http://webpages.charter.net/edreamleo/front.html
    --------------------------------------------------------------------


  • Michael Hudson

    #2
    Re: u'a' in string.letters fails: a Python 2.3 bug?

    "Edward K. Ream" <edreamleo@charter.net> writes:

    > From the documentation for the string module at:
    >
    > C:\Python23\Doc\Python-Docs-2.3.1\lib\module-string.html
    >
    > [quote]
    > letters: The concatenation of the strings lowercase and uppercase described
    > below. The specific value is locale-dependent, and will be updated when
    > locale.setlocale() is called.
    > [end quote]
    >
    > If uch is a unicode character, the operation
    >
    > uch in string.letters
    >
    > may (will?) fail in Python 2.3. I've never seen it fail in previous
    > versions.

    Must be because you weren't looking <wink>:

    Python 2.2.1 (#1, Apr 9 2002, 13:10:27)
    [GCC 2.96 20000731 (Red Hat Linux 7.1 2.96-98)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> u'a' in string.letters
    Traceback (most recent call last):
    File "<stdin>", line 1, in ?
    UnicodeError: ASCII decoding error: ordinal not in range(128)

    (python from Sean's 2.2.1 RPM on redhat 7.2-ish).

    > 1. Is this a bug, or am I missing something?

    What you may be missing is that factors including but not limited to
    readline, the way python was invoked, the order of imports, locale
    settings and the phase of the moon may have an effect on whether
    "ordinals not in range(128)" get into string.letters.

    I think the interaction of readline and locale settings got a going
    over for 2.3 which *might* explain any differences you're seeing.

    > 2. Is this an issue only with Idle? I think not completely: this
    > kind of code seems to work for my app on XP, and not for some of my
    > app's users on Linux.

    See above :-)

    > 3. Is replacing string.letters by string.ascii_letters the recommended
    > workaround?

    Err, probably. Depends what you're testing for, I guess. Wouldn't
    uch.isalpha() or one of the unicodedata thingies be more appropriate
    most of the time?
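
A small sketch of the two alternatives suggested here, for comparison. The helper name `is_letter` is made up, and treating every Unicode category that starts with `L` as a letter is an assumption about what the test should accept, not something stated in the thread:

```python
import unicodedata


def is_letter(ch):
    """True if the single character ch is in a Unicode letter category (L*)."""
    return unicodedata.category(ch).startswith('L')


# u'\xe9' is LATIN SMALL LETTER E WITH ACUTE; u'\xa6' is BROKEN BAR.
for ch in (u'a', u'\xe9', u'\xa6', u'1'):
    print(repr(ch), ch.isalpha(), is_letter(ch))
```

Both checks work on the unicode character itself, so neither depends on the current contents of string.letters.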

    Cheers,
    mwh

    PS: on typing control-D into the 2.2.1 session above, I get a
    segfault. Now *that's* got to be a bug!

    --
    surely, somewhere, somehow, in the history of computing, at least
    one manual has been written that you could at least remotely
    attempt to consider possibly glancing at. -- Adam Rixey

    • Edward K. Ream

      #3
      Re: u'a' in string.letters fails: a Python 2.3 bug?

      > > 1. Is this a bug, or am I missing something?
      >
      > What you may be missing is that factors including but not limited to
      > readline, the way python was invoked, the order of imports, locale
      > settings and the phase of the moon may have an effect on whether
      > "ordinals not in range(128)" get into string.letters.

      Thanks for this info. I wonder why string.letters remains. Shouldn't it be
      deprecated?

      I've substituted string.ascii_letters for string.letters as a temporary
      expedient, and will consider ch.isalpha() for future work. Thanks again.
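
A minimal sketch of that temporary substitution, assuming the goal is to test whether a single character is an ASCII letter. The helper `is_ascii_letter` is hypothetical and is not taken from Leo's source:

```python
import string


def is_ascii_letter(ch):
    """True only for a single character in A-Z or a-z.

    string.ascii_letters is pure ASCII and does not change with the locale,
    so comparing it with a unicode character cannot hit a non-ASCII byte.
    The length check matters because `in` on strings is a substring test,
    which would otherwise accept '' or a run such as 'ab'.
    """
    return len(ch) == 1 and ch in string.ascii_letters


print(is_ascii_letter(u'a'))     # True
print(is_ascii_letter(u'\xa6'))  # False
```

This only recognizes unaccented letters, so it fits identifier-style checks better than tests on natural-language text.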

      Edward
      --------------------------------------------------------------------
      Edward K. Ream email: edreamleo@charter.net
      Leo: Literate Editor with Outlines
      Leo: http://webpages.charter.net/edreamleo/front.html
      --------------------------------------------------------------------

