A 'raw' codec for binary "strings" in Python?

**Erik Max Francis** · Jul 18 '05, 08:56 AM

Re: A 'raw' codec for binary "strings" in Python?

Bill Janssen wrote:
> You'll notice that
> the problem is in *decoding* the string, not in re-encoding it,
> because I'm using the default "C" locale, and "US-ASCII" is presumed
> for strings. But these strings are *not* US-ASCII, they are raw
> bytes. How do I format a string of raw bytes for conversion to a
> recognized charset encoding for printing?

Since the default encoding is ASCII, those 8-bit octets have no meaning
unless you do an explicit conversion. Trying to print them _should_
raise an error, because you're trying to do something that doesn't make
sense.

As Gerrit pointed out, it sounds like what you want is repr.
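
A minimal sketch of the repr suggestion, assuming a modern Python 3 interpreter where raw data is a `bytes` object (the thread itself predates this). The sample value is made up to match the 0xd5 byte at position 1 in the traceback:

```python
# Hypothetical sample line: byte 0xd5 at position 1, as in the reported error.
line = b"a\xd5bc\n"

# repr() renders non-printable and non-ASCII bytes as backslash escapes,
# so it never needs to decode the data and cannot raise a codec error.
shown = repr(line)
print(shown)
```

This shows every byte unambiguously, at the cost of escape sequences appearing in the output.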

--
__ Erik Max Francis && max@alcyone.com && http://www.alcyone.com/max/
/ \ San Jose, CA, USA && 37 20 N 121 53 W && &tSftDotIotE
\__/ Liberty is the right to do whatever the law permits.
-- Charles Louis Montesquieu

**Michael Hudson** · Jul 18 '05, 08:56 AM

Re: A 'raw' codec for binary "strings" in Python?

Bill Janssen <janssen@parc.com> writes:

> I've encountered an issue dealing with strings read from files. I
> read a line from a file, then try to print it out as an ASCII string:
>
> line = fp.readline()
> print line.encode('US-ASCII', 'replace')
>
> and of course I get an error like:
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xd5 in position 1: ordinal not in range(128)
>
> because the file contained some binary character. You'll notice that
> the problem is in *decoding* the string, not in re-encoding it,
> because I'm using the default "C" locale, and "US-ASCII" is presumed
> for strings.

Actually, the "C" locale has precisely nothing to do with it.

> But these strings are *not* US-ASCII, they are raw bytes. How do I
> format a string of raw bytes for conversion to a recognized charset
> encoding for printing?

You don't?

Wouldn't

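    import string

    # Map a character to itself if printable, otherwise to '?'.
    def m(c):
        if c in string.printable:
            return c
        else:
            return '?'

    # 256-entry translation table, one entry per possible byte value.
    t = ''.join([m(chr(o)) for o in range(256)])

    line.translate(t)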

make more sense?
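
For readers on Python 3, here is a rough sketch of the same translation-table idea applied to a `bytes` object. The names `PRINTABLE`, `TABLE`, and `printable_only` are illustrative, not from the thread, and treating only ASCII printables as printable is an assumption:

```python
import string

# Byte values treated as printable (assumption: ASCII printables only).
PRINTABLE = set(string.printable.encode("ascii"))

# 256-entry table: printable bytes map to themselves, all others to '?'.
TABLE = bytes(b if b in PRINTABLE else ord("?") for b in range(256))

def printable_only(raw: bytes) -> str:
    """Replace non-printable bytes with '?' and return the result as text."""
    return raw.translate(TABLE).decode("ascii")

print(printable_only(b"a\xd5bc"))
```

Because every byte left after translation is ASCII, the final decode cannot fail.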

Cheers,
mwh

--
I like silliness in a MP skit, but not in my APIs. :-)
-- Guido van Rossum, python-dev
