Encoding troubles

  • Xaver Hinterhuber

    Encoding troubles

    Hello pythonistas,

    I am writing a class that stores the source code for an output page in a
    string.
    At request time it compiles the code, executes it and returns the result.
    I have now upgraded the class from Python 2.1 to Python 2.3,
    so I have to do some encoding work that I previously didn't have to do.
    When I execute the code below, it raises this error:

    Error Type: UnicodeDecodeError
    Error Value: 'ascii' codec can't decode byte 0xc2 in position 0: ordinal not
    in range(128)

    What is wrong?

    The code is the following:

    class Report:
        """This class is generating reports as pdf-files with the source code
        provided in the variable content"""
        # default values for test
        content = """
    Story = ['ÄÖÜäöüß?'] #german umlauts for test purposes
    """
        def compileContent(self):
            """Compiles the content of the pdf-pages"""
            content = self.content
            # Here the error occurs
            content = content.decode('iso8859-15')
            codeString = HTML(content, globals())
            codeString = codeString(self._getContext())
            codeString = codeString.replace('\r\n', '\n')
            codeString = codeString.split('\n')
            if codeString[-1].strip() == '\n':
                del codeString[-1]
            # Compile the code as a function
            codeString.append('return Story')
            codeString = '\n\t'.join(codeString)
            codeString += '\n'
            codeString = 'def f():\n\t' + codeString
            codeObject = compile(codeString, 'codeObject', 'exec')
            return codeObject

        def __call__(self):
            dict={}
            dict.update(globals())
            dict.update(locals())
            dict.update(kw)
            codeObject = self.compileContent()
            exec codeObject in dict
            Story = dict['f']() # Execute the function
            return Story
    --
    with kind regards
    Xaver Hinterhuber


  • Peter Otten

    #2
    Re: Encoding troubles

    Xaver Hinterhuber wrote:
    > At request time it compiles the code, executes it and returns the result.
    > I have now upgraded the class from Python 2.1 to Python 2.3,
    > so I have to do some encoding work that I previously didn't have to do.
    > When I execute the code below, it raises this error:
    >
    > Error Type: UnicodeDecodeError
    > Error Value: 'ascii' codec can't decode byte 0xc2 in position 0: ordinal
    > not in range(128)
    >
    > What is wrong?
    >
    > The code is the following:
    >
    > class Report:
    >     """This class is generating reports as pdf-files with the source code
    >     provided in the variable content"""
    >     # default values for test
    >     content = """
    > Story = ['ÄÖÜäöüß?'] #german umlauts for test purposes
    > """
    >     def compileContent(self):
    >         """Compiles the content of the pdf-pages"""
    >         content = self.content
    >         # Here the error occurs
    >         content = content.decode('iso8859-15')

    Or here?
    >         codeString = HTML(content, globals())

    What does HTML() do? Are there any non-unicode strings with non-ascii
    characters that you try to concatenate with content? E. g.:
    >>> u"äöü" + u"äöü"
    u'\xe4\xf6\xfc\xe4\xf6\xfc'
    >>> "äöü" + "äöü"
    '\xe4\xf6\xfc\xe4\xf6\xfc'

    But
    >>> u"äöü" + "äöü"
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0:
    ordinal not in range(128)
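
The traceback comes from Python silently decoding the byte string with the default 'ascii' codec before it joins it to the unicode string. The sketch below spells that hidden step out. It is written for a current Python 3 interpreter, where byte strings are marked with a `b` prefix, and the helper name `implicit_ascii_decode` is made up for illustration:

```python
def implicit_ascii_decode(raw):
    """Mimic the decode that Python 2 performs when a byte string meets a unicode string."""
    try:
        return raw.decode("ascii")
    except UnicodeDecodeError as exc:
        # Return the error text instead of raising, so it can be inspected
        return "UnicodeDecodeError: %s" % exc

# "äöü" stored as ISO-8859-15 bytes, as in the interpreter session
latin_bytes = b"\xe4\xf6\xfc"
message = implicit_ascii_decode(latin_bytes)
print(message)
```

Pure ASCII byte strings pass through this step unnoticed, which is probably why such code can run for a long time before the first umlaut breaks it.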

    The solution is to make sure that either all strings are unicode or all
    strings are non-unicode (hopefully sharing the same encoding).
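
One way to follow that rule is to decode every byte string once, with its real encoding, before combining anything. This is only a rough sketch: it assumes the fragments are ISO-8859-15 as in the original post, `to_text` and `join_fragments` are hypothetical helpers (not part of Zope or the standard library), and it targets Python 3. On Python 2.3 the same idea would check for `unicode` instead.

```python
def to_text(value, encoding="iso8859-15"):
    """Return value as text, decoding it first if it is a byte string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

def join_fragments(fragments, encoding="iso8859-15"):
    """Join a mix of text and byte strings after normalising them all to text."""
    return "".join(to_text(fragment, encoding) for fragment in fragments)

# One text fragment and one ISO-8859-15 byte fragment, both spelling "äöü"
mixed = ["äöü", b"\xe4\xf6\xfc"]
result = join_fragments(mixed)
print(result)  # prints äöüäöü
```

Decoding at the point where data enters the program keeps the encoding assumption in one place instead of scattering it across every concatenation.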

    Peter



    • Xaver Hinterhuber

      #3
      Re: Encoding troubles

      Hi Peter,

      "Peter Otten" <__peter__@web.de> wrote in message
      news:c8anks$tq3$07$1@news.t-online.com...
      [snip]
      > What does HTML() do? Are there any non-unicode strings with non-ascii
      > characters that you try to concatenate with content? E. g.:

      HTML is a Zope class which handles DTML markup.
      > >>> u"äöü" + u"äöü"
      > u'\xe4\xf6\xfc\xe4\xf6\xfc'
      > >>> "äöü" + "äöü"
      > '\xe4\xf6\xfc\xe4\xf6\xfc'
      >
      > But
      >
      > >>> u"äöü" + "äöü"
      > Traceback (most recent call last):
      >   File "<stdin>", line 1, in ?
      > UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0:
      > ordinal not in range(128)
      >
      > The solution is to make sure that either all strings are unicode or all
      > strings are non-unicode (hopefully sharing the same encoding).

      This was the problem.

      Thank you very much for your help.


      --
      with kind regards
      Xaver Hinterhuber

