Hi All,
I couldn't find a regular expressions group to ask this in, so I
thought I'd ask here as I'm a little familiar with PHP's regular
expressions syntax.
I have a comma-delimited text file that I need to convert to
tab-delimited.
My problem is that commas also appear inside the values of one of my
columns. I'm trying to think of a graceful way of changing only the
commas that delimit fields to tabs, without affecting the commas that
appear within that column's values.
An example of the contents of the file would be:
1,"1","20040301 ","08-08","BOOK, RETAIL",20.00,23.56
2,"1","20040301 ","03-09","BOOK, WHOLESALE, DISTRIBUTOR",15.99,22.00
So, I'm trying to create a regular expression that will change all the
commas to tabs, except where the comma(s) appear within quotes.
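One way to phrase "a comma that is not within quotes" as a single pattern is a lookahead that requires an even number of double quotes between the comma and the end of the line. The following is only a minimal sketch in Python, assuming each record sits on one line, quotes are balanced, and values contain no embedded line breaks; the function name `tabs_outside_quotes` is made up for illustration. The pattern uses only syntax that PCRE-style engines generally accept, so it may carry over to PHP's preg functions, though that is untested here.

```python
import re

# A comma qualifies only if the rest of the line contains an even number
# of double quotes, i.e. the comma is not inside an open quoted value.
# Assumption: balanced quotes and one record per line.
COMMA_OUTSIDE_QUOTES = re.compile(r',(?=(?:[^"]*"[^"]*")*[^"]*$)')


def tabs_outside_quotes(line: str) -> str:
    """Replace field-delimiting commas with tabs, leaving quoted commas alone."""
    return COMMA_OUTSIDE_QUOTES.sub("\t", line)


if __name__ == "__main__":
    sample = '2,"1","03-09","BOOK, WHOLESALE, DISTRIBUTOR",22.00'
    print(tabs_outside_quotes(sample))
```

Because the lookahead rescans the remainder of the line for every comma, this is likely to be slow on very long lines; for short records like the sample it should not matter.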
I've tried several different approaches, including a three-step
process: first change the commas that appear within quotes to a known
'escape' value, then change all the remaining commas to tabs, then
change the 'escape' values back to commas. However, I can't seem to
create a regular expression that takes into account the possibility
of several commas appearing between quotes.
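The three-step idea can be sketched as follows, shown in Python as an illustration only. Instead of one expression that matches each comma inside quotes, the sketch matches each whole quoted value and replaces every comma inside it through a callback, which handles any number of commas per value. The placeholder character and the function name `commas_to_tabs` are assumptions made for the example, and the sketch assumes quotes are balanced and the placeholder never occurs in the data.

```python
import re

# Hypothetical 'escape' value; assumed never to occur in the real data.
PLACEHOLDER = "\x00"

# Matches one complete quoted value, e.g. "BOOK, WHOLESALE, DISTRIBUTOR".
QUOTED_VALUE = re.compile(r'"[^"]*"')


def commas_to_tabs(line: str) -> str:
    """Convert delimiting commas to tabs using the three-step approach."""
    # Step 1: inside every quoted value, swap all commas for the placeholder.
    protected = QUOTED_VALUE.sub(
        lambda m: m.group(0).replace(",", PLACEHOLDER), line
    )
    # Step 2: every comma that remains is a field delimiter.
    tabbed = protected.replace(",", "\t")
    # Step 3: restore the commas that were protected in step 1.
    return tabbed.replace(PLACEHOLDER, ",")


if __name__ == "__main__":
    sample = '2,"1","03-09","BOOK, WHOLESALE, DISTRIBUTOR",22.00'
    print(commas_to_tabs(sample))
```

PHP provides `preg_replace_callback`, which could play the same role as the callback in step 1.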
I'm wondering if anyone can help me understand this better?
Many thanks in advance,
Murray