Get € past a XML parser

  • Rutger Claes

    Get € past a XML parser

    I'm having trouble getting the euro sign through an XML parser.

    With the following test code:
    <?php
    $string = "<root><test>€</test></root>";

    $parser = xml_parser_create();
    xml_set_character_data_handler( $parser, 'cdata' );
    xml_set_element_handler( $parser, 'starthandler', 'endhandler' );

    if( !xml_parse( $parser, $string ) ) {
        print xml_error_string( xml_get_error_code( $parser ) );
    }

    function cdata( $p, $data ) {
        print $data."\n";
    }

    function starthandler( $p, $tag, $att ) {
        // print $tag."\n";
    }

    function endhandler( $p, $tag ) {
        // print $tag."\n";
    }
    ?>

    I get the following results for $string:
    $string = "<root><test>€</test></root>";
    ?
    $string = "<root><test>&#x20AC;</test></root>";
    ?
    $string = "<root><test>&euro;</test></root>";
    Undeclared entity error

    Any solutions to this problem?

    Rutger Claes
    --
    Rutger Claes rgc@rgc.tld
    Replace tld with top level domain of belgium to contact me pgp:0x3B7D6BD6
    Do not reply to the from address. It's read by /dev/null and sa-learn only
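
    On the third result above: XML predefines only five entities (lt, gt, amp, apos and quot), so &euro; is undeclared unless the document's DTD declares it. The following is a minimal sketch of declaring it in an internal DTD subset. Note that it swaps in DOMDocument for the xml_parse() code from the post, and it assumes a current PHP version with the DOM extension; it is an illustration, not the setup used in the thread.

```php
<?php
// Sketch only: uses DOMDocument rather than the thread's xml_parse() code.
// The internal DTD subset declares the otherwise-undeclared "euro" entity.
$xml = <<<'XML'
<!DOCTYPE root [
  <!ENTITY euro "&#x20AC;">
]>
<root><test>&euro;</test></root>
XML;

$doc = new DOMDocument();
// LIBXML_NOENT asks libxml to substitute entity references while parsing.
if (!$doc->loadXML($xml, LIBXML_NOENT)) {
    print "Parse error\n";
    exit(1);
}

// DOM strings are UTF-8, so the euro sign comes back as a multi-byte sequence.
$text = $doc->getElementsByTagName('test')->item(0)->textContent;
print bin2hex($text) . "\n";
```

    Printing the bytes in hex avoids depending on how the terminal or browser decodes the output.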

  • ron.g.chaplin@gmail.com

    #2
    Re: Get € past a XML parser

    >> $string = "<root><test>€</test></root>";
    It should be
    $string = "<root><test>&amp;#8364;</test></root>";

    You have to remember that in XML a literal & is written as &amp;

    HTH
    Ron Chaplin
    =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
    T73 Software & Design

    To provide custom and quality
    software, designs and services,
    to our customers, at an affordable rate,
    with minimal delay.


    • Manuel Lemos

      #3
      Re: Get € past a XML parser

      Hello,

      on 01/07/2005 11:05 AM Rutger Claes said the following:
      > I'm having troubles getting the euro sign through an XML parser.
      >
      > With the following test code:
      > <?php
      > $string = "<root><test>€</test></root>";

      You need to explicitly declare that the output encoding is UTF-8, because
      ISO-8859-1 only covers 8-bit Latin characters and has no euro sign.
      ISO-8859-15 would be the matching single-byte encoding, but I don't think
      Expat supports any encoding besides UTF-8 or ISO-8859-1.
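
      A minimal sketch of that suggestion, assuming a current PHP version with the XML extension: the parser is created for UTF-8 input and XML_OPTION_TARGET_ENCODING is set to UTF-8, so the character data handler receives the euro sign as UTF-8 bytes. The variable names and the closure-based handler are illustrative choices, not the code from the original post.

```php
<?php
// Sketch: parse UTF-8 input and request UTF-8 character data from the handler.
$euro = "\xE2\x82\xAC";   // UTF-8 bytes for U+20AC, written out so the file encoding does not matter
$xml  = "<root><test>{$euro} &#x20AC;</test></root>";

$parser = xml_parser_create('UTF-8');
// Without this, older PHP versions hand character data back as ISO-8859-1.
xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, 'UTF-8');

$collected = '';
// The handler may be called several times for one text node, so append.
xml_set_character_data_handler($parser, function ($p, $data) use (&$collected) {
    $collected .= $data;
});

if (!xml_parse($parser, $xml, true)) {
    print xml_error_string(xml_get_error_code($parser)) . "\n";
}

print bin2hex($collected) . "\n";
```

      Both the literal character and the numeric reference should arrive as the same UTF-8 sequence; whether they display correctly then depends on the charset the browser is told to use.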

      --

      Regards,
      Manuel Lemos

      PHP Classes - Free ready to use OOP components written in PHP


      PHP Reviews - Reviews of PHP books and other products


      Metastorage - Data object relational mapping layer generator


      • Rutger Claes

        #4
        Re: Get € past a XML parser

        ron.g.chaplin@gmail.com wrote:

        >>> $string = "<root><test>€</test></root>";
        > It should be
        > $string = "<root><test>&amp;#8364;</test></root>";
        >
        > You have to remember to use the translated &amp; in xml for &
        >
        But I don't think you can use entities inside entities.
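
        To illustrate the point, here is a small sketch that parses both forms and compares what the character data handler receives. With &amp;#8364; the parser expands &amp; to & and stops there, so the handler gets the literal text &#8364; rather than the euro sign. The helper name collect_text() is hypothetical, and the sketch assumes a current PHP version.

```php
<?php
// Sketch: compare a double-escaped reference with a plain numeric reference.
// collect_text() is a hypothetical helper, not part of the thread's code.
function collect_text(string $xml): string
{
    $parser = xml_parser_create('UTF-8');
    xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, 'UTF-8');
    $text = '';
    xml_set_character_data_handler($parser, function ($p, $data) use (&$text) {
        $text .= $data;   // character data can arrive in several chunks
    });
    if (!xml_parse($parser, $xml, true)) {
        $text = 'error: ' . xml_error_string(xml_get_error_code($parser));
    }
    return $text;
}

$escaped = collect_text('<root><test>&amp;#8364;</test></root>');
$plain   = collect_text('<root><test>&#8364;</test></root>');

print $escaped . "\n";          // literal text, not a euro sign
print bin2hex($plain) . "\n";   // UTF-8 bytes of the euro sign
```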

        > HTH
        > Ron Chaplin
        > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
        > T73 Software & Design
        > www.t73-softdesign.com
        > To provide custom and quality
        > software, designs and services,
        > to our customers, at an affordable rate,
        > with minimal delay.

        --
        Rutger Claes rgc@rgc.tld
        Replace tld with top level domain of belgium to contact me pgp:0x3B7D6BD6
        Do not reply to the from address. It's read by /dev/null and sa-learn only


        • Rutger Claes

          #5
          Re: Get € past a XML parser

          Manuel Lemos wrote:

          > Hello,
          >
          > on 01/07/2005 11:05 AM Rutger Claes said the following:
          >> I'm having troubles getting the euro sign through an XML parser.
          >>
          >> With the following test code:
          >> <?php
          >> $string = "<root><test>€</test></root>";
          >
          > You need to explicitly declare that the output encoding is UTF-8 because
          > ISO-8859-1 only comprises 8 bit latin characters. ISO-8859-15 would be
          > the correct encoding but I don't think Expat supports any encoding
          > besides UTF-8 or ISO-8859-1.
          >

          You're right. When I enforce UTF-8 on my XML all the way from the DOM
          object through the SAX parser and Tidy, I get some wrong symbols:
          â,¬. But when I tell my browser (Konqueror) to use charset UTF-8, it works.

          The problem now is that even though I have a
          <meta .... content-type: text/html; charset=UTF-8" /> and a header( '...
          charset=UTF-8' ), the browser still doesn't pick it up when it is set to
          auto charset. I've tried Mozilla Firefox too, same result.

          So now I have a working charset, but nobody will see it. Is there a way to
          fix this?
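
          The thread does not say what the eventual fix was, so the following is only a general sketch of declaring the charset in both places. It relies on two facts: header() has to be called before any output is sent, and a charset in the HTTP Content-Type header takes precedence over a <meta> tag in the page. The helper names content_type_header() and meta_charset_tag() are hypothetical; only header() and headers_sent() are PHP built-ins.

```php
<?php
// Sketch: announce UTF-8 in the HTTP header and in the markup.
// Both helper functions are hypothetical names used for illustration.
function content_type_header(string $charset = 'UTF-8'): string
{
    return 'Content-Type: text/html; charset=' . $charset;
}

function meta_charset_tag(string $charset = 'UTF-8'): string
{
    return '<meta http-equiv="Content-Type" content="text/html; charset='
        . $charset . '" />';
}

// header() only works while no output has been sent yet.
if (!headers_sent()) {
    header(content_type_header());
}

print meta_charset_tag() . "\n";
```

          If the browser still guesses another charset, one thing worth checking is whether the web server adds its own charset to the Content-Type header.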

          Thanks for the answer,
          Rutger Claes
          --
          Rutger Claes rgc@rgc.tld
          Replace tld with top level domain of belgium to contact me pgp:0x3B7D6BD6
          Do not reply to the from address. It's read by /dev/null and sa-learn only


          • Rutger Claes

            #6
            Re: Get € past a XML parser

            Rutger Claes wrote:

            > Manuel Lemos wrote:
            >
            >> Hello,
            >>
            >> on 01/07/2005 11:05 AM Rutger Claes said the following:
            >>> I'm having troubles getting the euro sign through an XML parser.
            >>>
            >>> With the following test code:
            >>> <?php
            >>> $string = "<root><test>€</test></root>";
            >>
            >> You need to explicitly declare that the output encoding is UTF-8 because
            >> ISO-8859-1 only comprises 8 bit latin characters. ISO-8859-15 would be
            >> the correct encoding but I don't think Expat supports any encoding
            >> besides UTF-8 or ISO-8859-1.
            >>
            >
            > You're right. When I enforce UTF-8 on my XML all the way from the DOM
            > object through the SAX parser and Tidy, I get some wrong symbols:
            > â,¬. But when I tell my browser (Konqueror) to use charset UTF-8, it
            > works.
            >
            > The problem now is that even though I have a
            > <meta .... content-type: text/html; charset=UTF-8" /> and a header( '...
            > charset=UTF-8' ), the browser still doesn't pick it up when it is set to
            > auto charset. I've tried Mozilla Firefox too, same result.
            >
            > So now I have a working charset, but nobody will see it. Is there a way
            > to fix this?

            Fixed that too.
            It works now!

            > Thanks for the answer,
            > Rutger Claes

            --
            Rutger Claes rgc@rgc.tld
            Replace tld with top level domain of belgium to contact me pgp:0x3B7D6BD6
            Do not reply to the from address. It's read by /dev/null and sa-learn only
