Hi group,
I need to parse various text files in Python. I was wondering if there is a
general-purpose tokenizer available. I know about split(), but this
(otherwise very handy) method does not allow me to specify a list of
splitting characters, only one at a time, and it removes my splitting
operators (OK for spaces and \n's, but not for =, /, etc.). Furthermore, I
tried tokenize, but that is specific to Python source code and way too heavy
for me. I am looking for something like this:
splitchars = [' ', '\n', '=', '/', ....]
tokenlist = tokenize(rawfile, splitchars)
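
For what it's worth, here is a rough sketch of how such a tokenize() could be
built with only the standard library: re.split() with a capturing group keeps
the delimiters in the output, which str.split() does not. The function name
and sample input are just for illustration.

```python
import re

def tokenize(text, splitchars):
    # Build a character class from the split characters; re.escape()
    # protects characters that are special inside a class (e.g. ']').
    # The surrounding parentheses make it a capturing group, so
    # re.split() keeps the delimiters in the result.
    pattern = '([' + re.escape(''.join(splitchars)) + '])'
    # re.split() yields empty strings between adjacent delimiters;
    # drop those.
    return [tok for tok in re.split(pattern, text) if tok]

splitchars = [' ', '\n', '=', '/']
print(tokenize("a=b/c d", splitchars))
# → ['a', '=', 'b', '/', 'c', ' ', 'd']
```

Whether empty tokens (or the whitespace delimiters themselves) should be kept
or dropped is a design choice; the list comprehension above drops the empties
but keeps every delimiter.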
Is there something like this available in Python, or has anyone already
made this? Thank you in advance.
Maarten
--
======================================================================
Maarten van Reeuwijk Heat and Fluid Sciences
PhD student dept. of Multiscale Physics
www.ws.tn.tudelft.nl Delft University of Technology