I'm trying to use codecs.open() and I see two issues when I pass
encoding='utf8':
1) Newlines are hardcoded to LINEFEED (ascii 10) instead of the
platform-specific byte(s).
import codecs
f = codecs.open('tmp.txt', 'w', encoding='utf8')
s = u'\u0391\u03b8\u03ae\u03bd\u03b1'
print >>f, s
print >>f, s
f.close()
This doesn't happen for the default encoding (=None).
2) csv.writer doesn't seem to work as expected when being passed a
codecs object; it treats it as if encoding is ascii:
import codecs, csv
f = codecs.open('tmp.txt', 'w', encoding='utf8')
s = u'\u0391\u03b8\u03ae\u03bd\u03b1'
# this works fine
print >>f, s
# this doesn't
csv.writer(f).writerow([s])
f.close()
Traceback (most recent call last):
....
csv.writer(f).writerow([s])
UnicodeEncodeError: 'ascii' codec can't encode character u'\u0391' in
position 0: ordinal not in range(128)
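For issue 1, here is a minimal sketch of how the bytes on disk can be checked. As far as I can tell from the codecs documentation, codecs.open() always opens the underlying file in binary mode, so no newline translation is done. The comparison with io.open(), the explicit newline='\r\n' argument, the scratch directory and the read_bytes helper are all my own assumptions for illustration. None of them is part of the code above.

```python
import codecs
import io
import os
import tempfile

text = u'\u0391\u03b8\u03ae\u03bd\u03b1'
tmp_dir = tempfile.mkdtemp()  # hypothetical scratch directory for this check


def read_bytes(path):
    # Helper (illustrative): return the raw bytes stored in the file.
    f = open(path, 'rb')
    try:
        return f.read()
    finally:
        f.close()


# codecs.open() wraps a file opened in binary mode, so '\n' is written as-is.
codecs_path = os.path.join(tmp_dir, 'codecs.txt')
f = codecs.open(codecs_path, 'w', encoding='utf8')
f.write(text + u'\n')
f.close()

# io.open() translates '\n' on write; newline='\r\n' forces CRLF on any
# platform, which makes the difference visible even on Unix.
io_path = os.path.join(tmp_dir, 'io.txt')
f = io.open(io_path, 'w', encoding='utf8', newline='\r\n')
f.write(text + u'\n')
f.close()

codecs_bytes = read_bytes(codecs_path)
io_bytes = read_bytes(io_path)
print(repr(codecs_bytes))
print(repr(io_bytes))
```

With the default newline=None, io.open() would write os.linesep instead, which is the platform-specific behavior I expected from codecs.open().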
Is this the expected behavior or are these bugs?
George