hello python-list!
the other day, i was trying to match unicode character sequences that
looked like this:
\\uAD0X...
my issue is that the pattern i used was returning:
[ '\\uAD0X', '\\u1BF3', ... ]
when i expected:
[ '\\uAD0X\\u1BF3', ]
the code looks something like this:
pat = re.compile("(\\\u[0-9A-F]{4})+", re.UNICODE|re.LOCALE)
#print pat.findall(txt_line)
results = pat.finditer(txt_line)
i ran the pattern past a couple of my colleagues and they all agreed
that it should have matched correctly.
is this a simple case of a messed up regex or am i not using the regex
api correctly?
cheers,
ct
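
for reference, here is a minimal sketch (python 3) of two things that could produce output like the above. it is not a diagnosis: the sample strings and the names txt_line, doubled, capturing and non_capturing are made up for illustration, and the samples use real hex digits in place of the X placeholder. the first case shows how findall() reports a repeated capturing group; the second assumes the text really holds two backslashes in front of each u.

```python
import re

# Hypothetical sample: two literal backslash-u escapes back to back, then one more later.
txt_line = r"\uAD0C\u1BF3 foo \u00FF"

# Same idea as the pattern in the post, written as raw strings.
capturing = re.compile(r"(\\u[0-9A-F]{4})+")
non_capturing = re.compile(r"(?:\\u[0-9A-F]{4})+")

# With one capturing group, findall() returns that group, not the whole match.
# A repeated group only keeps its last repetition.
print(capturing.findall(txt_line))

# With no capturing groups, findall() returns the whole match.
print(non_capturing.findall(txt_line))

# finditer() yields match objects; group(0) is always the whole match.
print([m.group(0) for m in capturing.finditer(txt_line)])

# Assumption: if the text holds TWO backslashes before each u, the extra
# backslash separates the escapes, so the + can never join them.
doubled = r"\\uAD0C\\u1BF3"
print(non_capturing.findall(doubled))
```

if the first case applies, switching to a non-capturing group or reading m.group(0) from finditer() gives the joined sequences. if the second applies, the pattern itself would need to allow for the extra backslash.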