PHP - using mail() and unicode text - text gets disturbed

  • Edo van der Zouwen

    PHP - using mail() and unicode text - text gets disturbed

    I have the following problem. A website has a (simple) feedback
    form. It is also used by Polish visitors, who (of course) type Polish
    text using special characters.

    However, when I receive the text in my mailbox, all the special
    characters have been turned into a mess.

    For example: "współprace" is turned into "wspóÅ‚prace".

    PHP seems to handle the UTF-8 strings quite well (when I 'echo' the
    strings on the site, I see the text correctly), up to the point where
    the text is sent using mail().

    Is this a server configuration issue? Or something else?

    How can I get my text to remain in Unicode?

    I have this problem both on my test server (Apache 1.3.28, PHP 4.3.2 on
    Windows XP) and on my provider's server (Apache under Linux).


    Hope somebody can help.

    Many thanks,


    Edo.
  • Filth

    #2
    Re: PHP - using mail() and unicode text - text gets disturbed

    > For example: "współprace" is turned into "wspóÅ‚prace".
    >
    > It seems PHP is handling the UTF-8 strings quite well

    Are you setting the headers of the email to state something such as:

    Content-Type: text/html; charset=iso-8859-15



    • Chung Leong

      #3
      Re: PHP - using mail() and unicode text - text gets disturbed

      It's an encoding issue. One way to deal with this is to escape the UTF-8
      text as quoted-printable using imap_8bit() and set the charset in the
      email header to UTF-8. Many email clients don't handle this correctly,
      though. I would recommend sending multipart mails. In the plaintext part,
      remove the accent marks (solidarność -> solidarnosc). In the HTML part,
      encode the special characters as HTML entities (dokąd => dok&#261;d).
      This will ensure that everyone sees something that's readable. The same
      strategy is used by Outlook Express. It'll be helpful to send yourself a
      test email and look at the source.

      Here are a couple of functions that do what I suggested:

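      // Keys are the UTF-8 byte sequences of the lowercase Polish letters.
      // The letter ó and the uppercase forms are not covered by these tables.

      // UTF-8 letter => plain ASCII letter (for the plaintext part).
      $pl_markless_tr = array(
          "\xC4\x85" => "a", // ą
          "\xC4\x87" => "c", // ć
          "\xC4\x99" => "e", // ę
          "\xC5\x82" => "l", // ł
          "\xC5\x84" => "n", // ń
          "\xC5\x9b" => "s", // ś
          "\xC5\xba" => "z", // ź
          "\xC5\xbc" => "z"  // ż
      );

      // UTF-8 letter => numeric HTML entity (for the HTML part).
      $pl_uni_entities_tr = array(
          "\xC4\x85" => "&#261;", // ą
          "\xC4\x87" => "&#263;", // ć
          "\xC4\x99" => "&#281;", // ę
          "\xC5\x82" => "&#322;", // ł
          "\xC5\x84" => "&#324;", // ń
          "\xC5\x9b" => "&#347;", // ś
          "\xC5\xba" => "&#378;", // ź
          "\xC5\xbc" => "&#380;"  // ż
      );

      // Replace each Polish letter with its unaccented ASCII counterpart.
      function remove_polish_marks($s) {
          global $pl_markless_tr;
          return strtr($s, $pl_markless_tr);
      }

      // Replace each Polish letter with its numeric HTML entity.
      function escape_polish_marks($s) {
          global $pl_uni_entities_tr;
          return strtr($s, $pl_uni_entities_tr);
      }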
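
The multipart approach described in this post can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the poster's own code: build_alternative_mail(), the boundary string and the addresses are hypothetical, only three letters are mapped to keep it short, and the CRLF line endings follow the MIME format (some Unix mail() setups may expect plain LF in the body instead).

```php
<?php
// Illustrative maps only (three letters); a real form would need the full tables.
$to_ascii    = array("\xC4\x85" => "a", "\xC5\x82" => "l", "\xC3\xB3" => "o");
$to_entities = array("\xC4\x85" => "&#261;", "\xC5\x82" => "&#322;", "\xC3\xB3" => "&#243;");

// Hypothetical helper: returns array(extra headers, body) ready to pass to mail().
function build_alternative_mail($text, $to_ascii, $to_entities, $boundary) {
    $headers = "MIME-Version: 1.0\r\n"
             . "Content-Type: multipart/alternative; boundary=\"" . $boundary . "\"";

    // Plain-text part: accent marks stripped, so it is pure ASCII.
    $plain = strtr($text, $to_ascii);

    // HTML part: escape &, < and > first, then turn Polish letters into entities.
    $safe = str_replace(array("&", "<", ">"), array("&amp;", "&lt;", "&gt;"), $text);
    $html = "<html><body><p>" . strtr($safe, $to_entities) . "</p></body></html>";

    // Each part starts with "--boundary"; the final marker ends with "--".
    $body = "--" . $boundary . "\r\n"
          . "Content-Type: text/plain; charset=US-ASCII\r\n\r\n"
          . $plain . "\r\n"
          . "--" . $boundary . "\r\n"
          . "Content-Type: text/html; charset=US-ASCII\r\n\r\n"
          . $html . "\r\n"
          . "--" . $boundary . "--\r\n";

    return array($headers, $body);
}

list($headers, $body) = build_alternative_mail(
    "dok\xC4\x85d", $to_ascii, $to_entities, "example-boundary-1"
);
// mail("me@example.com", "Feedback", $body, $headers); // left commented: needs a working mail setup
```

Because both parts end up as pure ASCII, a client that ignores charset declarations still shows readable text; the cost is that the plaintext part loses the accent marks.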





      • Andy Hassall

        #4
        Re: PHP - using mail() and unicode text - text gets disturbed

        On Sun, 1 Feb 2004 15:33:30 -0000, "Filth" <p.macdonald@blueyonder.co.uk>
        wrote:

        >> For example: "współprace" is turned into "wspóÅ‚prace".
        >>
        >> It seems PHP is handling the UTF-8 strings quite well
        >
        >are you setting up the headers of the email to state something such as
        >
        >Content-Type: text/html;charset=iso-8859-15

        Content-Type: text/plain; charset=utf-8

        ... sounds like the more appropriate header to send in this case.

        --
        Andy Hassall <andy@andyh.co.uk> / Space: disk usage analysis tool
        <http://www.andyh.co.uk> / <http://www.andyhsoftware.co.uk/space>
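
As a rough illustration of sending such a header with mail(), here is a minimal sketch. It rests on several assumptions that are not from the thread: build_utf8_headers() and encode_utf8_subject() are hypothetical helper names, the addresses are placeholders, and quoted_printable_encode() (PHP 5.3 and later) stands in for the imap_8bit() call mentioned earlier in the thread, since it produces quoted-printable output without needing the IMAP extension.

```php
<?php
// Hypothetical helper: extra headers declaring a UTF-8, quoted-printable text body.
function build_utf8_headers($from) {
    return implode("\r\n", array(
        "From: " . $from,
        "MIME-Version: 1.0",
        "Content-Type: text/plain; charset=UTF-8",
        "Content-Transfer-Encoding: quoted-printable",
    ));
}

// RFC 2047 encoded-word, so non-ASCII characters survive in the Subject header.
// Fine for short subjects; long ones would have to be split into several words.
function encode_utf8_subject($subject) {
    return "=?UTF-8?B?" . base64_encode($subject) . "?=";
}

$headers = build_utf8_headers("webform@example.com");
$subject = encode_utf8_subject("Wsp\xC3\xB3\xC5\x82praca");
$body    = quoted_printable_encode("dok\xC4\x85d"); // bytes above ASCII become =XX: "dok=C4=85d"
// mail("me@example.com", $subject, $body, $headers); // left commented: needs a working mail setup
```

The charset declared in the header has to match the encoding the form data actually arrives in; if the page serving the form is not UTF-8, the declaration would need to change accordingly.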


        • Edo van der Zouwen

          #5
          Re: PHP - using mail() and unicode text - text gets disturbed

          On Sun, 1 Feb 2004 15:33:30 -0000, "Filth"
          <p.macdonald@blueyonder.co.uk> wrote:

          >> For example: "współprace" is turned into "wspóÅ‚prace".
          >>
          >> It seems PHP is handling the UTF-8 strings quite well
          >
          >are you setting up the headers of the email to state something such as
          >
          >Content-Type: text/html;charset=iso-8859-15


          Thanks, this did the trick, except the header should contain:

          "Content-Type: text/html; charset=UNICODE-1-1-UTF-8"

          Cheers,


          Edo.


          • Edo van der Zouwen

            #6
            Re: PHP - using mail() and unicode text - text gets disturbed

            On Sun, 1 Feb 2004 12:06:26 -0500, "Chung Leong"
            <chernyshevsky@hotmail.com> wrote:

            >It's an encoding issue. One way to deal with this is to escape the UTF-8
            >text using imap_8bit() and set the charset in the email header to UTF-8.
            >Many email clients don't handle this correctly though. I would recommend
            >sending multipart mails. In the plaintext part, remove the accent marks
            >(solidarność -> solidarnosc). In the HTML part, encoding the special
            >characters as HTML entities (dokąd => dok&#261;d). This will ensure that
            >everyone see something that's readable. The same strategy is used by Outlook
            >Express. It'll be helpful if you send yourself a test email and look at the
            >source.
            >
            >Here are a couple functions that do what I suggested:
            >
            >[code snipped]

            Thanks, very interesting method. For the time being, the email client
            used by the receiver of the web forms is capable of handling the
            Unicode text, so I'll stick to just using a header which enables
            Unicode text.

            However, I'll definitely save and check your method; it might be very
            useful in the future.

            Dziękuję i do widzenia ("thank you and goodbye") :-)


            Edo.


            • Edo van der Zouwen

              #7
              Re: PHP - using mail() and unicode text - text gets disturbed

              On Sun, 01 Feb 2004 18:20:19 +0000, Andy Hassall <andy@andyh.co.uk>
              wrote:

              >Content-Type: text/plain;charset=utf-8
              >
              > ... sounds like the more appropriate header to send in this case.

              Thx, found that out myself, but appreciate your input.

              Edo.

