xml charset translation with xsl

  • reynard.michel@gmail.com

    xml charset translation with xsl

    Hi,
    I would like to convert an ISO-8859-1 XML document to UTF-8.
    For this I wrote the following XSL:

    <?xml version="1.0"?>
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

      <xsl:template match="*">
        <xsl:copy>
          <xsl:apply-templates/>
        </xsl:copy>
      </xsl:template>
    </xsl:stylesheet>

    It works fine except that it does not keep the attributes.
    How can I do the conversion and keep the attributes?

    Furthermore, I have some per mille signs in the XML. If they are written as
    &permil; they are translated to a per mille character, but if they are
    written as &#2030;, which is the per mille character in ISO-8859-1, the
    result is an unknown character. How can I translate &#2030; to &#8240;
    (the UTF-8 per mille) or to a literal per mille character?

    Thank you for your help

    Michel

  • Johannes Koch

    #2
    Re: xml charset translation with xsl

    reynard.michel@gmail.com wrote:

    > Hi,
    > I would like to convert an ISO-8859-1 XML document to UTF-8.
    > For this I wrote the following XSL:
    >
    > <?xml version="1.0"?>
    > <xsl:stylesheet version="1.0"
    >     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    >   <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
    >
    >   <xsl:template match="*">
    >     <xsl:copy>
    >       <xsl:apply-templates/>
    >     </xsl:copy>
    >   </xsl:template>
    > </xsl:stylesheet>
    >
    > It works fine except that it does not keep the attributes.
    > How can I do the conversion and keep the attributes?

    See the XSLT spec for the identity template
    (<http://www.w3.org/TR/xslt#copying>).

    > Furthermore, I have some per mille signs in the XML. If they are written as
    > &permil; they are translated to a per mille character, but if they are
    > written as &#2030;, which is the per mille character in ISO-8859-1

    No, it's not. The XHTML character entity reference &permil; is &#8240;
    (decimal) or &#x2030; (hex). Note the 'x'.
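
    To make the difference concrete, here is a small sketch of a test
    document. The element names are made up for illustration, and the internal
    DTD subset is only there because &permil; is not predefined in plain XML.
    Numeric character references always denote Unicode code points, whatever
    encoding the file is stored in, so decimal 2030 is U+07EE (a code point in
    the NKo block), which is probably the "unknown char" you are seeing.
    ISO-8859-1 itself has no per mille sign at all.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- Hypothetical sample document; element names are made up for illustration. -->
<!DOCTYPE sample [
  <!-- permil is not predefined in plain XML, so it is declared here -->
  <!ENTITY permil "&#x2030;">
]>
<sample>
  <ok-entity>5&permil;</ok-entity>   <!-- U+2030 via the declared entity -->
  <ok-decimal>5&#8240;</ok-decimal>  <!-- U+2030, decimal 8240 -->
  <ok-hex>5&#x2030;</ok-hex>         <!-- U+2030, hexadecimal 2030 -->
  <wrong>5&#2030;</wrong>            <!-- decimal 2030 is U+07EE, not per mille -->
</sample>
```

    The first three elements should all come out as the same per mille
    character after an identity transformation to UTF-8; only the last one
    carries a different character.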

    --
    Johannes Koch
    In te domine speravi; non confundar in aeternum.
    (Te Deum, 4th cent.)


    • George Bina

      #3
      Re: xml charset translation with xsl

      That is not an identity transformation. Delete your current template and
      use something like the following instead:

      <xsl:template match="node() | @*">
        <xsl:copy>
          <xsl:apply-templates select="node() | @*"/>
        </xsl:copy>
      </xsl:template>

      This copies the same content, attributes included, to the output.

      If you want to convert some character to another then you can add a
      rule matching text nodes and output the value of the text node through
      the translate function that converts your character in the initial
      document to the desired character in the output.
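
      A minimal sketch of such a rule, combined with the identity template, is
      shown here. It assumes the character to replace is the one produced by
      &#2030; (U+07EE) and that the desired output is the per mille sign
      U+2030; adjust both arguments of translate() if your data differs. The
      explicit priority is there because node() and text() patterns have the
      same default priority, which would otherwise be a template conflict.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- Identity rule: copy every node and attribute unchanged -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Text nodes: replace U+07EE (assumed input character, decimal 2030)
       with U+2030 (per mille). The priority avoids a conflict with the
       identity rule above. -->
  <xsl:template match="text()" priority="1">
    <xsl:value-of select="translate(., '&#2030;', '&#x2030;')"/>
  </xsl:template>
</xsl:stylesheet>
```

      Note that this only touches text nodes. If the character can also occur
      in attribute values, a similar rule matching @* would be needed.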

      Best Regards,
      George
      ---------------------------------------------------------------------
      George Cristian Bina
      <oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger

