Hebrew in php

  • BJY

    Hebrew in php

    I can't seem to use PHP code at the start of my Hebrew-language pages.

    If I do this....
    <HTML dir="rtl" lang="he">
    <?php include 'header.inc'; ?>
    ....all is fine.

    However, if I do this at the start of the file...
    <?php include 'header.inc'; ?>
    .... where header.inc includes the opening HTML tag, the resulting
    output is left to right.

    If I View Source, there are 2 non-viewable characters before the HTML
    tag (they appear as ? in TextPad; they don't appear at all in Notepad).
    If I delete these characters and save the file, the output then
    appears correct (RTL).

    The .php and .inc files are saved in UTF-8.

    This is a simplified version. What I really need is to be able to do
    cookies, output buffering, etc. at the start of the script. I would
    also like to use Smarty, but that does the same thing.

    Any ideas????

    Thanks in advance!

  • Chung Leong

    #2
    Re: Hebrew in php

    BJY wrote:
    > I can't seem to use PHP code at the start of my Hebrew-language pages.
    >
    > If I do this....
    > <HTML dir="rtl" lang="he">
    > <?php include 'header.inc'; ?>
    > ...all is fine.
    >
    > However, if I do this at the start of the file...
    > <?php include 'header.inc'; ?>
    > ... where header.inc includes the opening HTML tag, the resulting
    > output is left to right.
    >
    > If I View Source, there are 2 non-viewable characters before the HTML
    > tag (they appear as ? in TextPad; they don't appear at all in Notepad).
    > If I delete these characters and save the file, the output then
    > appears correct (RTL).
    >
    > The .php and .inc files are saved in UTF-8.
    >
    > This is a simplified version. What I really need is to be able to do
    > cookies, output buffering, etc. at the start of the script. I would
    > also like to use Smarty, but that does the same thing.
    >
    > Any ideas????
    >
    > Thanks in advance!
    Looks like your editor is adding an LTR mark at the beginning of the
    doc. Because IE treats text outside of <html></html> as though it's
    inside the document body, the character would switch the default
    direction back to LTR. Set the dir attribute of <body> instead and it
    should work.


    • Benjamin Esham

      #3
      Re: Hebrew in php

      * BJY:
      > I can't seem to use PHP code at the start of my Hebrew-language pages.
      >
      > If I do this....
      > <HTML dir="rtl" lang="he">
      > <?php include 'header.inc'; ?>
      > ...all is fine.
      >
      > However, if I do this at the start of the file...
      > <?php include 'header.inc'; ?>
      > ... where header.inc includes the opening HTML tag, the resulting
      > output is left to right.
      >
      > If I View Source, there are 2 non-viewable characters before the HTML tag
      > (they appear as ? in TextPad; they don't appear at all in Notepad). If I
      > delete these characters and save the file, the output then appears
      > correct (RTL).
      >
      > The .php and .inc files are saved in UTF-8.
      Can you show us the beginning of header.inc? And possibly a hex dump of the
      first line or so? If your file contains a Unicode byte order mark [1],
      those are probably the characters messing your file up.

      [1] http://en.wikipedia.org/wiki/Byte_Order_Mark
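
For anyone without a hex editor handy, here is a minimal sketch in Python of the kind of check being asked for. The helper names `hex_dump` and `leading_boms` are made up for illustration, and the sample bytes only stand in for the start of a file such as header.inc; they are not the poster's actual data.

```python
import codecs

def leading_boms(data: bytes) -> int:
    """Count how many UTF-8 BOMs (EF BB BF) sit at the very start of data."""
    bom = codecs.BOM_UTF8  # b'\xef\xbb\xbf'
    count = 0
    # Keep stepping forward three bytes at a time while another BOM follows.
    while data.startswith(bom, count * len(bom)):
        count += 1
    return count

def hex_dump(data: bytes, limit: int = 16) -> str:
    """Return the first `limit` bytes as space-separated uppercase hex."""
    return " ".join(f"{b:02X}" for b in data[:limit])

# Hypothetical sample: a BOM followed by an opening HTML tag.
sample = codecs.BOM_UTF8 + b'<HTML dir="rtl" lang="he">'
print(hex_dump(sample, 8))    # EF BB BF 3C 48 54 4D 4C
print(leading_boms(sample))   # 1
```

To inspect a real file, read it in binary mode first, e.g. `open("header.inc", "rb").read(16)`, and pass the result to either function.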

      --
      Benjamin D. Esham
      bdesham@gmail.com | AIM: bdesham128 | Jabber: same as e-mail
      WHY DO THEY ALWAYS SEND THE POOR!?


      • BJY

        #4
        Re: Hebrew in php

        Chung Leong wrote:
        > Set the dir attribute of <body> instead and it should work.
        Chung - Thanks, setting the dir on the body tag did solve this specific
        RTL problem... But I'm still trying to solve the main problem of these
        extraneous characters...

        Benjamin Esham wrote:
        > Can you show us the beginning of header.inc? And possibly a hex dump of the
        > first line or so? If your file contains a Unicode byte order mark [1],
        > those are probably the characters messing your file up.
        The hex dump of the output indeed shows 'junk', and I don't know where
        it comes from:
        EF BB BF EF BB BF EF BB BF 3C 48 54 4D 4C
        .... where the 3C indicates the start of the <HTML tag

        Any idea what makes PHP spit out this at the start of output?


        The header.inc is pretty straight-forward:
        <HTML dir="rtl" lang="he">
        <TITLE>Burritos - בוריטוס - מסעדה מקסיקנית</TITLE>
        <script language="javascript" src="burritos.js"></script>
        <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
        <HEAD>
        <style>
        <?php include 'burritos.css'; ?>
        </style>
        ..... and then more html

        Thanks again!


        • Benjamin Esham

          #5
          Re: Hebrew in php

          BJY wrote:
          > Benjamin Esham wrote:
          >> Can you show us the beginning of header.inc? And possibly a hex dump of
          >> the first line or so? If your file contains a Unicode byte order mark
          >> [1], those are probably the characters messing your file up.
          >
          > The hex dump of the output indeed shows 'junk' which i don't know where it
          > comes from: EF BB BF EF BB BF EF BB BF 3C 48 54 4D 4C ... where the 3C
          > indicates the start of the <HTML tag
          >
          > Any idea what makes PHP spit out this at the start of output?
          I think I've found your problem. From [1]:

          | Quite a lot of Windows software (including Windows Notepad) adds one to
          | UTF-8 files. However in Unix-like systems (which make heavy use of text
          | files for configuration) this practice is not recommended, as it will
          | interfere with correct processing of important codes such as the hash-bang
          | at the start of an interpreted script. It may also interfere with source
          | for programming languages that don't recognise it. For example, [...] in
          | PHP, if output buffering is disabled, it has the subtle effect of causing
          | the page to start being sent to the browser, preventing custom headers
          | from being specified by the PHP script. The UTF-8 representation of the
          | BOM is the byte sequence EF BB BF.

          It looks like the phantom "EF BB BF" bytes are Unicode BOMs. (Why there are
          three of them I have no idea.)

          Are you using Notepad? If so, I recommend you use a different editor; if
          not, see if there's an option to turn off the inclusion of the BOM. If
          you're looking for a new editor, I recommend Vim [2], which has great
          support for bidirectional text editing, although it does have a rather steep
          learning curve. You might check out the comp.editors group if you have any
          further questions about text editing and editors.

          [1] http://en.wikipedia.org/wiki/Byte_Order_Mark
          [2] http://www.vim.org
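
One possible explanation for seeing three BOMs, offered only as a guess: PHP sends everything outside `<?php ... ?>` tags to the output verbatim, so each file in the include chain that was saved with a BOM would contribute its own copy. A rough Python sketch for finding which files carry one follows. The function name `files_with_bom` and the demo file names are invented for illustration, and the demo contents are made up rather than taken from the real site.

```python
import codecs
import tempfile
from pathlib import Path

def files_with_bom(root, suffixes=(".php", ".inc", ".css")):
    """Return sorted names of files under root whose first bytes are a UTF-8 BOM."""
    hits = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            with path.open("rb") as fh:
                # Only the first three bytes matter for this check.
                if fh.read(3) == codecs.BOM_UTF8:
                    hits.append(path.name)
    return sorted(hits)

# Demo on throwaway files in a temporary directory (hypothetical contents).
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "index.php").write_bytes(codecs.BOM_UTF8 + b"<?php include 'header.inc'; ?>")
    Path(tmp, "header.inc").write_bytes(codecs.BOM_UTF8 + b"<HTML>")
    Path(tmp, "clean.php").write_bytes(b"<?php echo 'ok'; ?>")
    found = files_with_bom(tmp)
    print(found)  # ['header.inc', 'index.php']
```

Pointing `files_with_bom` at the real document root would list the files that need to be re-saved without a BOM.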

          HTH!
          --
          Benjamin D. Esham
          bdesham@gmail.com | AIM: bdesham128 | Jabber: same as e-mail
          "Whenever a theory appears to you as the only possible one, take
          this as a sign that you have neither understood the theory nor
          the problem which it was intended to solve." — Karl Popper


          • Markus Ernst

            #6
            Re: Hebrew in php

            BJY wrote:
            > The header.inc is pretty straight-forward:
            > <HTML dir="rtl" lang="he">
            > <TITLE>Burritos - בוריטוס - מסעדה מקסיקנית</TITLE>
            > <script language="javascript" src="burritos.js"></script>
            > <meta http-equiv="Content-Type" content="text/html;
            > charset=iso-8859-1" />
            BTW I assume that the charset should rather say utf-8.
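
To illustrate why the declared charset matters, here is a small Python sketch showing what happens when UTF-8 bytes are interpreted as iso-8859-1. It uses one Hebrew word from the page title; the variable names are arbitrary, and a real browser's handling may differ in detail, so treat this as a demonstration of the principle only.

```python
# One Hebrew word taken from the page title in header.inc.
title = "בוריטוס"

# What the file actually contains when saved as UTF-8.
utf8_bytes = title.encode("utf-8")

# What a reader gets if it trusts a charset=iso-8859-1 declaration:
# every byte becomes its own Latin-1 character, so the text is garbled.
misread = utf8_bytes.decode("iso-8859-1")

# What a reader gets if the declaration says utf-8 instead.
correct = utf8_bytes.decode("utf-8")

# Each Hebrew letter takes two bytes in UTF-8, so the misread string
# has twice as many characters as the original.
print(len(title), len(utf8_bytes), len(misread))
print(correct == title)  # True
print(misread == title)  # False
```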

            --
            Markus


            • BJY

              #7
              Re: Hebrew in php

              To summarize this thread for posterity's sake:

              The problem indeed has nothing to do with PHP or with Hebrew/RTL.

              The problem was that saving a file as UTF-8 using Notepad adds phantom
              characters at the start of the file, which PHP properly sent as part of
              the output and which confused the browser.

              Editing the file with vim allowed me to remove the junk.
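
For anyone who would rather script the cleanup than open each file in an editor, a minimal Python sketch is below. The function names `strip_boms` and `strip_boms_in_file` are hypothetical, and the code assumes the files are small enough to read into memory in one go.

```python
import codecs

def strip_boms(data: bytes) -> bytes:
    """Remove every UTF-8 BOM at the start of data; leave the rest untouched."""
    bom = codecs.BOM_UTF8  # b'\xef\xbb\xbf'
    while data.startswith(bom):
        data = data[len(bom):]
    return data

def strip_boms_in_file(path: str) -> bool:
    """Rewrite the file without leading BOMs. Returns True if it was changed."""
    with open(path, "rb") as fh:
        original = fh.read()
    cleaned = strip_boms(original)
    if cleaned != original:
        with open(path, "wb") as fh:
            fh.write(cleaned)
        return True
    return False

# The byte sequence from the hex dump earlier in the thread: three BOMs, then "<HTML".
dumped = bytes.fromhex("EFBBBFEFBBBFEFBBBF3C48544D4C")
print(strip_boms(dumped))  # b'<HTML'
```

In Vim itself, `:set nobomb` followed by `:w` saves the current file without a BOM.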

              Thanks Benjamin and all for your help....

              BJY
