UnicodeDecodeError, how to elegantly deal with this?

  • Jorgen Bodde

    UnicodeDecodeError, how to elegantly deal with this?

    Hi All,

    I am relatively new to python unicode pains and I would like to have
    some advice. I have this snippet of code:

    def playFile(cmd, args):
        argstr = list()
        for arg in appcfg.options[appcfg.CFG_PLAYER_ARGS].split():
            thefile = args["file"]
            filemask = u"%file%"
            therep = arg.replace(filemask, thefile) ##### error here
            argstr.append(therep)
        argstr.insert(0, appcfg.options[appcfg.CFG_PLAYER_PATH])

        try:
            subprocess.Popen(argstr)
        except OSError:
            cmd.html = "<h1>Can't play file</h1></br>" + args["file"]
            return

        cmd.redirect = _getBaseURL("series?cmd_get_series=%i" % args["id"])
        cmd.html = ""

    -------------------

    It crashes on this:

    20:03:49: File
    "D:\backup\important\src\airs\webserver\webdispatch.py", line 117, in
    playFile therep = arg.replace(filemask, thefile)

    20:03:49: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in
    position 93: ordinal not in range(128)

    20:03:49: Unhandled Error: <type 'exceptions.UnicodeDecodeError'>:
    'ascii' codec can't decode byte 0xc2 in position 93: ordinal not in
    range(128)

    It chokes on a ` character in a file name. I read this file name from
    disk, and I would like to play the file. However, in the replace action
    it cannot translate this character. How can I transparently deal with
    this issue? In my eyes it is simply replacing a string with a string,
    and I do not want to be bothered with unicode problems. I am not sure
    which encoding it is in, and I am not experienced enough to see how I
    can solve this.

    Can anybody guide me to an elegant solution?

    Thanks in advance!
    - Jorgen
  • John Machin

    #2
    Re: UnicodeDecodeError, how to elegantly deal with this?

    On Aug 5, 4:23 am, "Jorgen Bodde" <jorgen.maill...@gmail.com> wrote:
    > Hi All,
    >
    > I am relatively new to python unicode pains and I would like to have
    > some advice. I have this snippet of code:
    >
    >     thefile = args["file"]
    >     filemask = u"%file%"
    >     therep = arg.replace(filemask, thefile) ##### error here
    >
    > It crashes on this:
    >
    > 20:03:49: File
    > "D:\backup\important\src\airs\webserver\webdispatch.py", line 117, in
    > playFile therep = arg.replace(filemask, thefile)
    >
    > 20:03:49: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in
    > position 93: ordinal not in range(128)
    >
    > 20:03:49: Unhandled Error: <type 'exceptions.UnicodeDecodeError'>:
    > 'ascii' codec can't decode byte 0xc2 in position 93: ordinal not in
    > range(128)
    >
    > It chokes on a ` character in a file name. I read this file from disk,
    > and I would like to play it. However in the replace action it cannot
    > translate this character. How can I transparently deal with this issue
    > because in my eyes it is simply replacing a string with a string, and
    > I do not want to be bothered with unicode problems. I am not sure in
    > which encoding it is in, but I am not experienced enough to see how I
    > can solve this

    If you don't want to be bothered with "unicode problems":
    (1) Don't create a "unicode problem" when one doesn't exist.
    (2) Don't bother other people with *your* "unicode problems".

    > Can anybody guide me to an elegant solution?

    Short path:
    In this case, less is more; remove the u prefix in the line
    filemask = u"%file%"
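
A minimal sketch of why the u prefix triggers the error and why removing it helps. It is written for Python 3 so that it runs today: Python 2's str corresponds to bytes here, and the implicit ASCII decode that Python 2 performed when a str met a unicode object is spelled out in the hypothetical helper implicit_ascii_decode. The file name and the player argument are made-up values, since the thread does not show the real ones.

```python
# Hypothetical values; the real file name and player arguments are not in the thread.
# 0xC2 0xB4 is the UTF-8 encoding of U+00B4, chosen only because it starts with 0xC2.
raw_name = b"D:/media/episode\xc2\xb4s.avi"   # a Python 2 str holding UTF-8 bytes
arg = b"--play=%file%"                         # a Python 2 str from the config


def implicit_ascii_decode(data):
    """Stand-in for what Python 2 did silently when str and unicode were mixed."""
    return data.decode("ascii")


# With filemask = u"%file%", Python 2 had to turn raw_name into unicode first.
try:
    implicit_ascii_decode(raw_name)
    failed = False
    message = ""
except UnicodeDecodeError as exc:
    failed = True
    message = str(exc)

# Short path: no u prefix, so every operand stays a byte string and nothing is decoded.
bytes_result = arg.replace(b"%file%", raw_name)

# Alternative: decode explicitly with the encoding the bytes are actually in
# (assumed to be UTF-8 here), then work with text throughout.
text_result = arg.decode("ascii").replace(u"%file%", raw_name.decode("utf-8"))
```

Either approach avoids the implicit decode; the first keeps the data opaque, the second requires knowing the real encoding of the file name.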

    Long Path:
    Ignorance is not bliss. Lose the attitude. Unicode is your friend, not
    an instrument of Satan. Read this:
    http://www.amk.ca/python/howto/unicode

    By the way, how one's filesystem encodes file names can be a good
    thing to know; in your case it appears to be UTF-8.
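
One way to look this up, sketched for Python 3 under the assumption that the names arrive as raw bytes. sys.getfilesystemencoding() also exists in Python 2; os.fsdecode and os.fsencode are Python 3 additions. The file name below is hypothetical.

```python
import os
import sys

# The encoding Python uses to convert file names between bytes and text.
fs_encoding = sys.getfilesystemencoding()

# Hypothetical file name as raw bytes (valid UTF-8, starting its odd character with 0xC2).
raw_name = b"episode\xc2\xb4s.avi"

# os.fsdecode converts bytes to text using the filesystem encoding;
# os.fsencode reverses it, so the original bytes come back unchanged.
name_text = os.fsdecode(raw_name)
round_tripped = os.fsencode(name_text)
```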

    HTH,
    John


    • Jorgen Bodde

      #3
      Re: UnicodeDecodeError, how to elegantly deal with this?

      Hi John,

      > If you don't want to be bothered with "unicode problems":
      > (1) Don't create a "unicode problem" when one doesn't exist.
      > (2) Don't bother other people with *your* "unicode problems".

      Well, I guess you misunderstood what I meant. I meant I am a simple
      developer, getting a string from the file system that happens to be in
      some kind of encoding. It is totally a mystery to me why it crashes on
      that, so that is what I meant by not wanting to be bothered with it:
      I don't see any obvious reason why. It is not that I am too lazy to
      deal with it; it simply seems strange to me.

      > In this case, less is more; remove the u prefix in the line
      > filemask = u"%file%"

      OK, thanks. I thought that making it unicode, because it is a search
      string that is used in a UTF-8 encoded replacement, would solve it.

      > Long Path:
      > Ignorance is not bliss. Lose the attitude. Unicode is your friend, not
      > an instrument of Satan. Read this:
      > http://www.amk.ca/python/howto/unicode

      I never said that I have an attitude towards unicode; I simply
      misunderstood its inner workings. Thanks for the link, I will look at
      it.

      ps. sorry for the direct mail, I can't get used to one mailinglist
      always replying to the list, and the other replying to the user by
      default ;-)

      With regards,
      - Jorgen


      • John Machin

        #4
        Re: UnicodeDecodeError, how to elegantly deal with this?

        On Aug 5, 8:37 pm, "Jorgen Bodde" <jorgen.maill...@gmail.com> wrote:
        > Hi John,
        >
        > > If you don't want to be bothered with "unicode problems":
        > > (1) Don't create a "unicode problem" when one doesn't exist.
        > > (2) Don't bother other people with *your* "unicode problems".
        >
        > Well I guess you misunderstood what I meant.

        Sorry, it's my ETL (English as a Third Language) problem; my mother
        tongue is the Queensland dialect of Australian :-)

        > > In this case, less is more; remove the u prefix in the line
        > > filemask = u"%file%"
        >
        > Ok thanks. I thought making it unicode because it is a search string
        > that is used in a UTF-8 encoded replacement, would solve it,

        "UTF-8 encoded" implies a str (8 bits per character) object, not a
        unicode object. Solve what? What problem did you have before you put
        the u in there?
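
The distinction can be sketched as follows, in Python 3 terms so that the snippet runs today: Python 2's unicode corresponds to Python 3's str (text), and Python 2's str corresponds to bytes (encoded data). The sample character U+00B4 is an assumption picked only because its UTF-8 form starts with the 0xc2 byte from the traceback.

```python
# Text: a sequence of code points (Python 2 "unicode", Python 3 "str").
text = u"\u00b4"

# Encoded data: a sequence of bytes (Python 2 "str", Python 3 "bytes").
encoded = text.encode("utf-8")

# One code point, but two bytes once UTF-8 encoded.
text_length = len(text)
encoded_length = len(encoded)

# Decoding with the matching codec recovers the original text.
decoded = encoded.decode("utf-8")
```

"UTF-8 encoded" therefore always describes the byte form; a unicode object has no encoding until it is encoded.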

        > > Long Path:
        > > Ignorance is not bliss. Lose the attitude. Unicode is your friend, not
        > > an instrument of Satan. Read this:
        > > http://www.amk.ca/python/howto/unicode
        >
        > I never said that I have an attitude towards unicode, I simply
        > misunderstood its inner workings.

        I must have misunderstood "pains" and "bother", eh?

        Cheers,
        John
