Subject: Re: Strange 'Â' character output when using
Author: Toby A Inkster
Date: 2008-02-25, 4:15 am

Andy Hassall wrote:
> bizt <bissatch@yahoo.co.uk> wrote:
>
>
> simplexml always outputs in UTF-8. Is your page's encoding UTF-8?


At a guess, ISO-8859-1 or perhaps ISO-8859-15.

In UTF-8, a "prefix" of an 0xC2 byte is used to access the first half
(U+0080 to U+00BF) of the "Latin-1 Supplement" block, which includes a lot
of juicy characters such as currency symbols, fractions, superscript 2 and
3, the copyright and registered trademark symbols, and the non-breaking
space.

However, in ISO-8859-1 and -15, the byte 0xC2 represents an Â, so if UTF-8
is misinterpreted as one of those, then you get Â followed by some other
nonsense character.
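
To see the effect for yourself, here is a minimal sketch. The function name misread_as_latin1() is made up for illustration; it just simulates a browser decoding UTF-8 bytes as ISO-8859-1, then re-encodes the result as UTF-8 so it can be printed on a UTF-8 terminal.

```php
<?php
// Hypothetical helper (not part of SimpleXML or iconv): simulates a reader
// that wrongly decodes UTF-8 bytes as ISO-8859-1.
function misread_as_latin1($utf8)
{
    // Each input byte is treated as one ISO-8859-1 character, and the
    // result is re-encoded as UTF-8 for display.
    return iconv('ISO-8859-1', 'UTF-8', $utf8);
}

$copyright = "\xC2\xA9";                   // U+00A9 in UTF-8: bytes 0xC2 0xA9
echo bin2hex($copyright), "\n";            // c2a9
echo misread_as_latin1($copyright), "\n";  // Â©
```

The same thing happens with a non-breaking space (0xC2 0xA0): the reader sees an Â followed by what looks like an ordinary space.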

Probably the easiest solution would be to take the output from SimpleXML
and pass it through iconv():

$xmlout = iconv('UTF-8', 'ISO-8859-15//TRANSLIT', $xmlout);

Note that UTF-8 is capable of representing a far greater range of
characters than ISO-8859-1/-15 are, so certain characters may not properly
survive conversion. (The '//TRANSLIT' option tells iconv to do its best:
if, say, a particular accented character is not available in the target
encoding, it substitutes an unaccented one in its place.)
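
As a rough illustration of what survives and what does not, the sketch below wraps the iconv() call from above in a hypothetical helper, to_latin9(). It assumes PHP is built against glibc's iconv or GNU libiconv; other implementations may reject or ignore '//TRANSLIT', so treat the ellipsis case as something to inspect rather than rely on.

```php
<?php
// Hypothetical wrapper around the iconv() call shown above.
function to_latin9($utf8)
{
    // //TRANSLIT asks iconv to approximate characters the target charset
    // lacks; the exact substitutions depend on the iconv implementation.
    return iconv('UTF-8', 'ISO-8859-15//TRANSLIT', $utf8);
}

// Both characters exist in ISO-8859-15, so each becomes a single byte.
echo bin2hex(to_latin9("\xC3\xA9")), "\n";      // e-acute    -> e9
echo bin2hex(to_latin9("\xE2\x82\xAC")), "\n";  // euro sign  -> a4

// U+2026 (horizontal ellipsis) has no ISO-8859-15 code point. What comes
// back is implementation-dependent, so dump it instead of assuming.
var_dump(to_latin9("\xE2\x80\xA6"));
```

Since iconv() returns false on failure, it is worth checking the return value before echoing the converted string.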

--
Toby A Inkster BSc (Hons) ARCS
[G of HTML/SQL/Perl/PHP/Python/Apache/Linux]
[OS: Linux 2.6.17.14-mm-desktop-9mdvsmp, up 26 days, 15:55.]

Bottled Water
http://tobyinkster.co.uk/blog/2008/02/18/bottled-water/

Copyright 2008 codecomments.com