Character Set Conversions

  • softap@gmail.com

    Character Set Conversions

    I have a PHP routine which parses my incoming emails and extracts
    certain key data from the body text.

    This works fine normally, but occasionally an email contains the
    name and address of somebody in Europe. When this happens, the
    character set of the email changes from "us-ascii" to "ISO 8859-1",
    which confuses my parsing code: I get sequences like =E4 and =E5 in
    place of the accented European characters in people's names and
    addresses, and where there would normally be a row of "equals"
    characters used as a separator line, I get =3D=3D=3D etc.

    The data is being saved into a MySQL database, so I want the name and
    address to look correct when printed (i.e. the =E4 should be
    printed as the correct accented European character).

    I've looked into the various PHP functions for decoding strings, as
    my ideal plan would be to convert the incoming text before my
    parser examines it, but I'm confused about which function to use.

    Andy

  • Alvaro G. Vicario

    #2
    Re: Character Set Conversions

    softap@gmail.com wrote (3 Aug 2006 02:03:40 -0700):
    > This works fine normally, but occasionally an email contains the
    > name and address of somebody in Europe. When this happens, the
    > character set of the email changes from "us-ascii" to "ISO 8859-1",
    > which confuses my parsing code: I get sequences like =E4 and =E5 in
    > place of the accented European characters in people's names and
    > addresses, and where there would normally be a row of "equals"
    > characters used as a separator line, I get =3D=3D=3D etc.

    This cute function was published in this group some weeks ago:

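    <?php

    // Decode one encoded word of the form =?charset?Q-or-B?payload?=
    // $m[1] = charset, $m[2] = encoding letter, $m[3] = encoded payload.
    function quoted_word_callback($m) {
        switch ($m[2]) {
            // In Q encoding an underscore stands for a space.
            case 'Q': case 'q': return quoted_printable_decode(str_replace('_', ' ', $m[3]));
            case 'B': case 'b': return base64_decode($m[3]);
        }
        return $m[0]; // unknown encoding: leave the text untouched
    }

    $s = "OLED & =?ISO-8859-1?Q?br=E4nsleceller?=";
    // U = ungreedy, i = accept lowercase q/b as the callback already does.
    echo preg_replace_callback('/=\?(.*)\?([BQ])\?(.*)\?=/Ui',
        'quoted_word_callback', $s);

    ?>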

    Hope it helps.
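
    The function above handles encoded words of the =?charset?Q?...?=
    form, which normally appear in headers. For body text like the =E4
    and =3D sequences in the question, something along these lines may
    be closer to what is needed. It is only a minimal sketch: it assumes
    the body is quoted-printable in ISO-8859-1, that the database expects
    UTF-8, and that the iconv extension is available. The name
    decode_qp_body is made up for illustration.

```php
<?php

// Hypothetical helper (not from the original thread): decode a
// quoted-printable body and convert it to UTF-8.
// Assumption: the source charset is ISO-8859-1 unless stated otherwise.
function decode_qp_body($body, $charset = 'ISO-8859-1') {
    // Turn sequences such as =E4 and =3D back into raw bytes.
    $raw = quoted_printable_decode($body);
    // Re-encode the raw bytes from the source charset to UTF-8.
    return iconv($charset, 'UTF-8', $raw);
}

// Sample body text invented for this example.
$body = "Name: J=F6rgen\r\n=3D=3D=3D=3D=3D\r\n";
echo decode_qp_body($body);
```

    Running the decoder over the whole body before the parser sees it
    means the separator line is back to plain "equals" characters. If
    the database column is itself ISO-8859-1, the iconv step can
    probably be skipped.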

    --
    -+ http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
    ++ My site about web programming: http://bits.demogracia.com
    +- My humor site with UVA rays: http://www.demogracia.com
    --
