Xpath encoding problem

  • Moti

    Xpath encoding problem


    Hi,

    I have the following code:

    $d = $xpath->query($myxpath);
    $text = $d->item(0)->nodeValue;
    print $text;

    While this code works well with English characters, non-English
    characters (Hebrew, German, Russian) are not decoded properly and the
    output is unreadable gibberish.

    I know PHP's XML DOM is Unicode-based, but even iconv and other
    functions are unable to display those characters as they should.
  • Moti

    #2
    Re: Xpath encoding problem


    Hi Rik, and thanks for your quick help,

    I tested your code and it works, except for one site.
    The encoding declared on that site is the same as on the other sites
    (whose output is encoded correctly):

    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">

    Another difference is that it has an HTML 4 Transitional DTD (the
    others are XHTML or have no DTD).

    Any ideas?


    • Rik Wasmus

      #3
      Re: Xpath encoding problem

      On Mon, 11 Feb 2008 19:05:20 +0100, Moti <Moti.Ba@gmail.com> wrote:
      > Hi Rik, and thanks for your quick help,
      >
      > I tested your code and it works, except for one site.
      > The encoding declared on that site is the same as on the other sites
      > (whose output is encoded correctly):
      >
      > <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">

      Don't rely on META tags. What are the actual headers?

      > Another difference is that it has an HTML 4 Transitional DTD (the
      > others are XHTML or have no DTD).

      Should not be a problem.
      --
      Rik Wasmus
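
A rough sketch of how the charset in the actual HTTP response header could be checked, as opposed to the META tag. The helper name `charsetFromContentType` is hypothetical and not from the thread; the header value is assumed to have been fetched already, for example with `get_headers($url, true)` or `curl_getinfo($ch, CURLINFO_CONTENT_TYPE)`.

```php
<?php
// Hypothetical helper (not from the thread): extract the charset parameter
// from a Content-Type header value. Returns null when none is declared.
function charsetFromContentType(string $header): ?string
{
    // Accepts both charset=utf-8 and charset="utf-8".
    if (preg_match('/charset\s*=\s*"?([^\s;"]+)/i', $header, $m)) {
        return strtolower($m[1]);
    }
    return null;
}

// Sample header values for illustration only.
var_dump(charsetFromContentType('text/html; charset=UTF-8'));
var_dump(charsetFromContentType('text/html'));
```

If the server sends no charset, or one that disagrees with the META tag, that would be a plausible reason for one site behaving differently from the others.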


      • Moti

        #4
        Re: Xpath encoding problem

        After debugging and comparing files, I found that if I insert

        <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">

        before any other tags in <head>, the encoding is OK; if not, I get
        gibberish.
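
The workaround described above might look roughly like this in code. It is only a sketch under assumptions: the fetched page is assumed to be UTF-8 already, the function name `loadHtmlAsUtf8` is hypothetical, and the sample markup stands in for the HTML grabbed from the remote site.

```php
<?php
// Hypothetical helper (not from the thread): insert a UTF-8 charset
// declaration ahead of every other tag in <head> before parsing.
function loadHtmlAsUtf8(string $html): DOMDocument
{
    $meta = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">';

    // Put the declaration directly after the opening <head> tag.
    $patched = preg_replace('/<head\b[^>]*>/i', '$0' . $meta, $html, 1, $count);
    if ($patched === null || $count === 0) {
        // No <head> found: fall back to putting the declaration first.
        $patched = $meta . $html;
    }

    // Collect parser warnings from real-world markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($patched);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $doc;
}

// Sample markup for illustration only.
$html = '<html><head><title>t</title></head><body><p>שלום Grüße Привет</p></body></html>';

$xpath = new DOMXPath(loadHtmlAsUtf8($html));
$text  = $xpath->query('//p')->item(0)->nodeValue;
print $text;
```

Patching the string before `loadHTML()` leaves the remote page and your own output headers untouched; it only changes what the parser sees.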



        • Rik Wasmus

          #5
          Re: Xpath encoding problem

          On Mon, 11 Feb 2008 19:29:07 +0100, Moti <Moti.Ba@gmail.com> wrote:
          > After debugging and comparing files, I found that if I insert
          >
          > <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">
          >
          > before any other tags in <head>, the encoding is OK; if not, I get
          > gibberish.

          Doing a header('Content-Type: text/html; charset=utf-8'); should take care
          of that...
          --
          Rik Wasmus


          • Moti

            #6
            Re: Xpath encoding problem

            On Feb 11, 9:11 pm, "Rik Wasmus" <luiheidsgoe...@hotmail.com> wrote:
            > On Mon, 11 Feb 2008 19:29:07 +0100, Moti <Moti...@gmail.com> wrote:
            > > After debugging and comparing files, I found that if I insert
            > >
            > > <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">
            > >
            > > before any other tags in <head>, the encoding is OK; if not, I get
            > > gibberish.
            >
            > Doing a header('Content-Type: text/html; charset=utf-8'); should take care
            > of that...
            > --
            > Rik Wasmus

            I meant adding that to the HTML data grabbed from the web site, not to my
            page.
