Latin Capital A with circumflex preceding a pound symbol.

  • junk@junk.com

    Latin Capital A with circumflex preceding a pound symbol.


    Hi,

    Sorry if this has been asked before, and apologies if this is the
    wrong NG.

    I am using PHP 5.0.5 and Apache 2.0.54 in a Win2k environment.

    Lately I have been playing with RSS feeds. I managed to get "lastRSS",
    which is a simple RSS parser.

    When I tried to set up an RSS feed from eBay to get custom searches
    straight to my desktop, I noticed that the UK pound sterling symbol is
    shown preceded by a Latin capital A with circumflex (an 'A' wearing
    a hat).

    I checked the RSS feed and the extra char is not there.

    So, I am unsure how to progress to sort this out. I don't know if PHP
    or Apache is the problem. I can only find one other comment on Google
    where someone is having the same problem, but still no answer.

    I checked the changelogs for the latest versions of PHP and Apache
    and there is no mention of this bug.

    Is it just me?


    Any clues will be much appreciated.

    Kind regards

    Nick Thomas
  • Andy Hassall

    #2
    Re: Latin Capital A with circumflex preceding a pound symbol.

    On Thu, 02 Feb 2006 21:35:22 GMT, junk@junk.com wrote:
    >Sorry if this has been asked before, and apologies if this is the
    >wrong NG.
    >
    >I am using PHP 5.0.5 and Apache 2.0.54 in a Win2k environment.
    >
    >Lately I have been playing with RSS feeds. I managed to get "lastRSS",
    >which is a simple RSS parser.
    >
    >When I tried to set up an RSS feed from eBay to get custom searches
    >straight to my desktop, I noticed that the UK pound sterling symbol is
    >shown preceded by a Latin capital A with circumflex (an 'A' wearing
    >a hat).
    >
    >I checked the RSS feed and the extra char is not there.
    >
    >So, I am unsure how to progress to sort this out. I don't know if PHP
    >or Apache is the problem. I can only find one other comment on Google
    >where someone is having the same problem, but still no answer.
    >
    >I checked the changelogs for the latest versions of PHP and Apache
    >and there is no mention of this bug.
    >
    >Is it just me?
    >
    >Any clues will be much appreciated.

    The first thing to consider is the encoding - what encoding is the RSS feed in?
    As it's XML, the most common encoding is UTF-8.
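
    One way to answer that question is to read the encoding named in the feed's
    XML declaration. The following is a minimal Python sketch, not part of
    lastRSS or PHP; the function name declared_xml_encoding and the sample feed
    bytes are made up for illustration, and it assumes UTF-8 when no encoding is
    declared.

```python
import re


def declared_xml_encoding(raw: bytes, default: str = "utf-8") -> str:
    """Return the encoding named in the XML declaration, or the default."""
    # The declaration, if present, sits at the very start of the document,
    # so only the first few hundred bytes need to be inspected.
    head = raw[:200]
    match = re.match(rb'\s*<\?xml[^>]*?encoding=["\']([A-Za-z0-9._\-]+)["\']', head)
    if match:
        return match.group(1).decode("ascii").lower()
    # Assumption: fall back to UTF-8 when nothing is declared.
    return default


# Hypothetical feed bytes, used only to exercise the function.
feed = b'<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"></rss>'
print(declared_xml_encoding(feed))
```

    Looking at the raw bytes this way avoids relying on a browser or editor that
    silently decodes the feed for you.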

    What did you check the RSS feed with? If you used a browser or a half-decent
    editor, it would most likely have understood the encoding and presented the
    character correctly.

    But your PHP code may be trying to treat UTF-8 as single-byte ISO-8859-1.

    A British pound symbol is two bytes in UTF-8 - it's U+00A3 which is 0xC2 0xA3
    in UTF-8.

    http://www.fileformat.info/info/unic...00A3/index.htm


    If you tried to display this as ISO-8859-1 you'd get:

    0xC2 = Latin capital A with circumflex
    0xA3 = British pound symbol

    http://en.wikipedia.org/wiki/ISO_8859-1
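
    The byte-level behaviour described above can be reproduced in a few lines.
    This is a small Python sketch rather than PHP, and the variable names are
    only illustrative; it encodes U+00A3 as UTF-8, misreads those bytes as
    ISO-8859-1, and then reverses the mistake.

```python
pound = "\u00a3"  # POUND SIGN

# UTF-8 needs two bytes for U+00A3.
utf8_bytes = pound.encode("utf-8")
print(utf8_bytes.hex())

# Misreading those two bytes as ISO-8859-1 yields two separate characters:
# U+00C2 (capital A with circumflex) followed by U+00A3 (pound sign).
misread = utf8_bytes.decode("iso-8859-1")
print([f"U+{ord(c):04X}" for c in misread])

# The damage is reversible as long as the bytes themselves were not altered.
repaired = misread.encode("iso-8859-1").decode("utf-8")
print(repaired == pound)
```

    The extra character is therefore not in the data at all; it only appears
    when the two bytes are interpreted one at a time.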




    --
    Andy Hassall :: andy@andyh.co.uk :: http://www.andyh.co.uk
    http://www.andyhsoftware.co.uk/space :: disk and FTP usage analysis tool


    • junk@junk.com

      #3
      Re: Latin Capital A with circumflex preceding a pound symbol.

      On Thu, 02 Feb 2006 22:00:17 +0000, Andy Hassall <andy@andyh.co.uk>
      wrote:
      >On Thu, 02 Feb 2006 21:35:22 GMT, junk@junk.com wrote:
      >
      >>Sorry if this has been asked before, and apologies if this is the
      >>wrong NG.
      >>
      >>I am using PHP 5.0.5 and Apache 2.0.54 in a Win2k environment.
      >>
      >>Lately I have been playing with RSS feeds. I managed to get "lastRSS",
      >>which is a simple RSS parser.
      >>
      >>When I tried to set up an RSS feed from eBay to get custom searches
      >>straight to my desktop, I noticed that the UK pound sterling symbol is
      >>shown preceded by a Latin capital A with circumflex (an 'A' wearing
      >>a hat).
      >>
      >>I checked the RSS feed and the extra char is not there.
      >>
      >>So, I am unsure how to progress to sort this out. I don't know if PHP
      >>or Apache is the problem. I can only find one other comment on Google
      >>where someone is having the same problem, but still no answer.
      >>
      >>I checked the changelogs for the latest versions of PHP and Apache
      >>and there is no mention of this bug.
      >>
      >>Is it just me?
      >>
      >>Any clues will be much appreciated.
      >
      > The first thing to consider is the encoding - what encoding is the RSS feed in?
      >As it's XML, the most common encoding is UTF-8.
      >
      > What did you check the RSS feed with? If you used a browser or a half-decent
      >editor, it would most likely have understood the encoding and presented the
      >character correctly.
      >
      > But your PHP code may be trying to treat UTF-8 as single-byte ISO-8859-1.
      >
      > A British pound symbol is two bytes in UTF-8 - it's U+00A3 which is 0xC2 0xA3
      >in UTF-8.
      >
      > http://www.fileformat.info/info/unic...00A3/index.htm
      >
      > If you tried to display this as ISO-8859-1 you'd get:
      >
      > 0xC2 = Latin capital A with circumflex
      > 0xA3 = British pound symbol
      >
      > http://en.wikipedia.org/wiki/ISO_8859-1


      Ahh. I realize that my knowledge in this area is somewhat lacking.
      After some more digging (and googling) I now come to the unfortunate
      realisation that I made a mistake.

      To fix my problem I simply needed to add:
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
      to the <head> section of my HTML page.
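
      Why that one line makes the difference can be sketched roughly as
      follows. This is a deliberately simplified Python model, not how any
      real browser is implemented: the function decode_like_browser is
      hypothetical, it looks only at a charset declared in the markup, and it
      assumes ISO-8859-1 as the fallback when none is found. Real browsers
      also consider HTTP headers and other signals.

```python
import re


def decode_like_browser(page: bytes, fallback: str = "iso-8859-1") -> str:
    """Decode HTML bytes using a charset declared in the markup, else the fallback."""
    # Simplification: look for "charset=..." near the top of the page only.
    match = re.search(rb'charset=["\']?([A-Za-z0-9_\-]+)', page[:1024], re.IGNORECASE)
    encoding = match.group(1).decode("ascii") if match else fallback
    return page.decode(encoding)


# The page content is UTF-8 in both cases; only the declaration differs.
body = "<p>Price: \u00a35.00</p>".encode("utf-8")
meta = b'<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">'

without_meta = decode_like_browser(
    b"<html><head></head><body>" + body + b"</body></html>"
)
with_meta = decode_like_browser(
    b"<html><head>" + meta + b"</head><body>" + body + b"</body></html>"
)

print("\u00c2" in without_meta)  # stray capital A with circumflex present?
print("\u00c2" in with_meta)
```

      The bytes sent to the browser are identical in both cases; only the
      declared charset changes how they are read.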

      I now understand that this is not a PHP problem at all, and I
      apologise for suggesting such.

      Thanks to Andy for pointing me in the right direction.

      Regards
      Nick
