On all platforms \w matches all Unicode letters when used with the flag
re.UNICODE, but this doesn't work on SuSE 9.2:
Python 2.3.4 (#1, Dec 17 2004, 19:56:48)
[GCC 3.3.4 (pre 3.3.5 20040809)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> re.compile(ur'\w+', re.U).match(u'\xe4')
>>>
BTW, unicodedata correctly recognizes this character as a lowercase letter:
>>> import unicodedata
>>> unicodedata.category(u'\xe4')
'Ll'
I've looked through all the SuSE patches applied, but found nothing related.
What is the reason for the broken behavior? Incorrect configure options?
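
To narrow this down, here is a minimal diagnostic sketch that runs the checks above side by side. The helper name check_unicode_word_match is made up for illustration, and the code is written so that it should also run on a current Python interpreter (hence u'\\w+' instead of the ur'' prefix). It only reports what each layer says about the character; it does not by itself identify the cause.

```python
import re
import sys
import unicodedata


def check_unicode_word_match(ch=u'\xe4'):
    """Collect what the different layers report for one character.

    Hypothetical helper, not part of any library.
    """
    pattern = re.compile(u'\\w+', re.U)
    return {
        # Answer from the Unicode character database.
        'category': unicodedata.category(ch),
        # Answer from the string type's own classification method.
        'isalpha': ch.isalpha(),
        # Answer from the regex engine with re.UNICODE set.
        'matches_w': pattern.match(ch) is not None,
        # Largest code point the build supports (narrow vs. wide build).
        'maxunicode': sys.maxunicode,
    }


report = check_unicode_word_match()
for key in ('category', 'isalpha', 'matches_w', 'maxunicode'):
    print('%s: %r' % (key, report[key]))
```

If 'category' says 'Ll' while 'isalpha' or 'matches_w' is False, the disagreement is between the Unicode database and the character classification the interpreter was built with, which would point at the build rather than at the re module alone.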
--
Denis S. Otkidach
http://www.python.ru/ [ru]