What is the simplest expression for splitting a CSV line with "-protected
fields?

s='"123","a,b,\"c\"",5.640'
Use the csv module. It should have a dialect for this, although I'm not 100%
sure the escaping of the " is handled properly from the csv module's point
of view. It may require the Excel-standard quoting.
import csv

the preferred way is to read the file using that module. if you insist
on processing a single line, you can do

cols = list(csv.reader([string]))

</F>
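A minimal sketch of the single-line approach, written for Python 3. The helper name `split_csv_line` and the sample line are made up for illustration and are not from the thread; the only real API used is `csv.reader`, which accepts any iterable of lines.

```python
import csv

def split_csv_line(line, **fmtparams):
    """Parse one CSV line and return its fields as a list of strings."""
    # csv.reader takes an iterable of lines, so the single line is wrapped
    # in a list; next() pulls out the one parsed row.
    return next(csv.reader([line], **fmtparams))

fields = split_csv_line('"123","a,b",5.640')
print(fields)  # ['123', 'a,b', '5.640']
```

Using `next(...)` instead of `list(...)[0]` is only a matter of taste here; both return the first (and only) row.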
Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit
(Intel)] on win32
| >>> import csv
| >>> s='"123","a,b,\"c\"",5.640'
| >>> cols = list(csv.reader([s]))
| >>> cols
[['123', 'a,b,c""', '5.640']]
# maybe we need a bit more:
| >>> cols = list(csv.reader([s]))[0]
| >>> cols
['123', 'a,b,c""', '5.640']
I'd guess that the OP is expecting 'a,b,"c"' for the second field.
Twiddling with the knobs doesn't appear to help:

| >>> list(csv.reader([s], escapechar='\\'))[0]
['123', 'a,b,c""', '5.640']
| >>> list(csv.reader([s], escapechar='\\', doublequote=False))[0]
['123', 'a,b,c""', '5.640']

Looks like a bug to me; AFAICT from the docs, the last attempt should
have worked.
Given Peter Otten's post, looks like
(1) there's a bug in the "fmtparam" mechanism -- it's ignoring the
escapechar in my first twiddle, which should give the same result as
Peter's.
(2)
| >>> csv.excel.doublequote
True
According to my reading of the docs:
"""
doublequote
Controls how instances of quotechar appearing inside a field should be
themselves be quoted. When True, the character is doubled. When False,
the escapechar is used as a prefix to the quotechar. It defaults to
True.
"""
Peter's example should not have worked.
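The wording of the docs quoted above is easiest to check on the writer side, where doublequote decides how an embedded quote is emitted. The following is a rough Python 3 sketch; the helper `write_row` and the sample row are assumptions made for illustration, not something posted in the thread.

```python
import csv
import io

def write_row(row, **fmtparams):
    """Serialize one row with csv.writer and return it without the newline."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n", **fmtparams).writerow(row)
    return buf.getvalue().rstrip("\n")

row = ["123", 'a,b,"c"', "5.640"]

# Default dialect (doublequote=True): embedded quotes are doubled.
doubled = write_row(row)
# doublequote=False needs an escapechar, which then prefixes the quote.
escaped = write_row(row, doublequote=False, escapechar="\\")

print(doubled)  # 123,"a,b,""c""",5.640
print(escaped)  # 123,"a,b,\"c\"",5.640

# Both forms read back to the original row with matching reader settings.
roundtrip_doubled = next(csv.reader([doubled]))
roundtrip_escaped = next(
    csv.reader([escaped], doublequote=False, escapechar="\\")
)
```

On the reader side, an escapechar is honoured whether or not doublequote is set, which may be why an example with the default doublequote=True can still parse backslash-escaped quotes.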
the documentation also mentions a "quoting" parameter that "controls
when quotes should be generated by the writer and recognised by the
reader.". not sure how that changes things.
anyway, it's either unclear documentation or a bug in the code. better
submit a bug report so someone can fix one of them.
robert wrote:
> s='"123","a,b,\"c\"",5.640'
Doh. The OP's string was a raw string. I need some sleep.
Scrap bug #1!
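The raw-string point can be checked directly. A small Python 3 sketch, assuming the data is meant to contain literal backslashes; the variable names `raw` and `cooked` are mine:

```python
import csv

raw = r'"123","a,b,\"c\"",5.640'    # backslashes are part of the data
cooked = '"123","a,b,\"c\"",5.640'  # Python turns \" into a plain "

# Only the raw literal actually contains a backslash for escapechar to see.
has_backslash = ("\\" in raw, "\\" in cooked)

raw_fields = next(csv.reader([raw], escapechar="\\"))
cooked_fields = next(csv.reader([cooked], escapechar="\\"))

print(raw_fields)     # ['123', 'a,b,"c"', '5.640']
print(cooked_fields)  # ['123', 'a,b,c""', '5.640']
```

With the non-raw literal there is nothing left for escapechar to act on, so the result is the same as in the sessions above.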
> the documentation also mentions a "quoting" parameter that "controls
> when quotes should be generated by the writer and recognised by the
> reader.". not sure how that changes things.
Hi Fredrik, I read that carefully -- "quoting" appears to have no
effect in this situation.
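For what it's worth, here is a quick way to compare quoting settings on the reader side. This is a Python 3 sketch under my own assumptions (raw-string input with escapechar set; the helper name `parse` is made up), not a statement about how the 2.5 module behaves:

```python
import csv

line = r'"123","a,b,\"c\"",5.640'

def parse(quoting):
    """Parse the sample line with the given quoting constant."""
    return next(csv.reader([line], escapechar="\\", quoting=quoting))

minimal = parse(csv.QUOTE_MINIMAL)  # the default
quote_all = parse(csv.QUOTE_ALL)
quote_none = parse(csv.QUOTE_NONE)

print(minimal)    # ['123', 'a,b,"c"', '5.640']
print(quote_all)  # ['123', 'a,b,"c"', '5.640']
print(len(quote_none))  # 5
```

QUOTE_MINIMAL and QUOTE_ALL read the line identically, so they don't help here. QUOTE_NONE stops the reader from recognising quotes at all, so it splits on every comma, which is not what the OP wants either.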