
Subject: Re: [PHP-DB] PHP + PostgreSQL: invalid byte sequence for encoding
Author: Neil Smith [MVP, Digital media]
Date: 2007-07-23, 6:59 pm

At 01:18 23/07/2007, you wrote:
>Message-ID: <D0.83.42241.94332A64@pb1.pair.com>
>From: aldnin <aldnin@yahoo.de>
>Subject: Re: [PHP-DB] PHP + PostgreSQL: invalid byte sequence for encoding
>
>Well, I've set the default_charset to UTF8, it
>was set before to "" (empty) - but the output on
>console (cli) and the problem is still the same
>also after changing this to UTF8, so: this is
>not the problem, and I don't need proper output
>on console without utf8_decode() - if I want
>proper output there I just do a decode, like I
>do when I want it to be output in the browser properly.
>
>Maybe a cleaner explanation of the problem:
>
>I fetch something from database, which looks
>like "lacarriÃ¨re" when I output it in PHP -
>well don't let us get from PHPs output.
>Then I fetch something from another resource
>looking like "lacarrière" - when I compare both
>strings in PHP it tells me that they are "not equal".
>
>The default_charset seems to work only on output
>buffer, so the solution for that problem could
>only be a mechanism to tell PHP handling all
>strings UTF8 byte encoded, which should mean a
>lot more resources to be taken for this
>process - I understand that this is not a solution.
>
>So the only solutions could be:
>
>a) Decode and encode properly utf8 stuff and to
>take care if the content is utf8-byte encoded so
>it needs to be decoded before using it properly with other strings
>
>b) A mechanism to tell the pg-functions in PHP
>to decode all data which is UTF8-Encoded. The
>ADODB-Layers seems to do that properly, but the
>pg-functions don't do that as I can see.
>
>Try to send "select 'lacarrière' as test;" with
>pg_query to any postgres database, you'll get an
>error, if not... well, then I'm wrong and I've
>set up PHP wrong to handle UTF8-stuff.




There are several areas where encoding issues can
arise between PHP (the client) and the DB server. One
which you've not considered is the client
connection, that is, the encoding used when transferring result sets to PHP.
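
The mismatch described in the quoted message can be reproduced without a database. The sketch below assumes the database value arrives as UTF-8 and the other source delivers ISO-8859-1 (an assumption drawn from the byte values quoted above, not something the thread confirms). The variable names are made up for illustration.

```php
<?php
// Same visible word, two different byte sequences.
$fromDb    = "lacarri\xC3\xA8re"; // "è" as two bytes: UTF-8
$fromOther = "lacarri\xE8re";     // "è" as one byte: ISO-8859-1 (assumed)

echo bin2hex($fromDb), "\n";      // 6c616361727269c3a87265
echo bin2hex($fromOther), "\n";   // 6c616361727269e87265

// PHP compares strings byte by byte, so these are not equal.
var_dump($fromDb === $fromOther); // bool(false)

// Convert one side so both use the same encoding, then compare.
$normalized = iconv('ISO-8859-1', 'UTF-8', $fromOther);
var_dump($fromDb === $normalized); // bool(true)
```

Whichever encoding you settle on, the point is that both strings must be in the same one before they are compared.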

I met this a few weeks ago in MySQL while
stashing XML recordsets with non-ISO-8859-1 content.

The solution is pretty simple once you hit it,
and works in both MySQL and PGSQL because it's standard SQL-92:

$query="SET NAMES 'UTF8'";

Issue that at the time you first make your
connection in your DB abstraction library - you
can send the query immediately after establishing
the connection, and all subsequent queries using
that connection will have the charset for transfer correctly stated.
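
As a rough sketch of what that could look like with PHP's pg_* functions: the helper name open_utf8_connection() is made up for illustration, and the connection string you pass in is whatever your own setup needs. 'UTF8' is the spelling PostgreSQL documents for this encoding.

```php
<?php
// Hypothetical helper: open a connection and declare the client
// encoding before any other query is sent on it.
function open_utf8_connection(string $connString)
{
    $conn = pg_connect($connString);
    if ($conn === false) {
        throw new RuntimeException('Could not connect to PostgreSQL');
    }

    // Tell the server which encoding this client sends and expects.
    if (pg_query($conn, "SET NAMES 'UTF8'") === false) {
        throw new RuntimeException(pg_last_error($conn));
    }

    return $conn;
}
```

The pgsql extension also offers pg_set_client_encoding($conn, 'UTF8') for the same purpose, and pg_client_encoding($conn) reports the encoding currently in effect, which is handy for checking that the setting took hold.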

@see :
'21.2.3. Automatic Character Set Conversion Between Server and Client'
http://www.postgresql.org/docs/8.1/.../multibyte.html


HTH
Cheers - Neil