Big unicode problem with Perl 5.8.8 with MySQL 5.0 (i.e. Debian 4.0)
| Karjala 2007-08-30, 3:59 am |
| I just upgraded from Debian 3 (Perl 5.8.4 and MySQL 4.1) to Debian 4
(Perl 5.8.8 and MySQL 5.0 and DBD::mysql 3.0008) and have the following
problem:
In the following lines I create a table with a varchar field of length 4
and then try, using a Perl script, to populate it with a string of 4
Unicode characters. Only the first 2 characters end up stored, in a
"double-encoded" form (thus taking the space of 4 characters). Needless
to say, this is a huge problem.
First the table:
mysql> create table bbb (a int primary key auto_increment, b varchar(4));
Query OK, 0 rows affected (0.00 sec)
Then the perl script to populate the field:
#!/usr/bin/perl -w
use DBI;
my $dbh = DBI->connect("DBI:mysql:aaa", 'username', 'password', {
RaiseError => 1 });
$dbh->do("insert into bbb set b = 'Αθήν'");
And then checking the result:
mysql> select * from bbb;
+---+------------+
| a | b |
+---+------------+
| 1 | ΡÆÎ¸ |
+---+------------+
1 row in set (0.00 sec)
That was with default_character_set=utf8 under the [mysql] section of
my.cnf.
Commenting out that line and viewing the table again, we get:
mysql> select *, char_length(b) from bbb;
+---+------+----------------+
| a | b | char_length(b) |
+---+------+----------------+
| 1 | Αθ | 4 |
+---+------+----------------+
1 row in set (0.00 sec)
i.e. we only got the first two letters in the table, but doubly-encoded
to take up the space of 4 chars.
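
The arithmetic behind that result can be reproduced without a database. What follows is only a minimal sketch, assuming the connection treats the script's UTF-8 bytes as latin1 characters (modeled here with ISO-8859-1, which is close to but not identical to MySQL's latin1). The thread does not confirm that this is the actual cause, and the variable names are purely illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode decode);

# The four Greek characters from the insert statement, written as code
# points so the example does not depend on this file's own encoding.
my $text = "\x{0391}\x{03B8}\x{03AE}\x{03BD}";    # Alpha, theta, eta-tonos, nu

# Step 1: the script sends the UTF-8 bytes of the string (2 bytes per letter).
my $utf8_bytes = encode('UTF-8', $text);

# Step 2 (assumption): each byte is read as one single-byte latin1 character.
my $misread = decode('ISO-8859-1', $utf8_bytes);

# Step 3: varchar(4) keeps only the first 4 of those characters.
my $stored = substr($misread, 0, 4);

# Step 4: reading back over the same kind of connection reverses step 2,
# and a UTF-8 terminal then interprets the resulting bytes.
my $recovered = decode('UTF-8', encode('ISO-8859-1', $stored));

printf "characters in original string: %d\n", length $text;         # 4
printf "bytes sent to the server:      %d\n", length $utf8_bytes;   # 8
printf "characters after truncation:   %d\n", length $stored;       # 4
printf "Greek letters that survive:    %d\n", length $recovered;    # 2
```

Under that assumption the 4 stored "characters" are really the 4 bytes of the first two Greek letters, which matches the char_length(b) of 4 next to a two-letter value shown above.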
I'm desperate for a solution, a hint, or if you run Debian to please try
these short scripts on your machine to tell me whether you're getting
the same results (or better ones).
Thanks.
P.S. I'm 99.9% positive the problem is not in my terminal's encoding:
I uploaded the Perl script from another machine (one that's known to
have no problem) and inserted a 'use encoding "utf8";' pragma as well.
And thanks again.
| Chas Owens 2007-08-31, 7:27 pm |
| On 8/29/07, Karjala <karjala_lists@karjala.org> wrote:
snip
> I'm desperate for a solution, a hint, or if you run Debian to please try
> these short scripts on your machine to tell me whether you're getting
> the same results (or better ones).
snip
Have you tried doing the insert from the mysql command-line client to
see whether it is the database that is at fault and not Perl?
You are probably already aware of this URL, but I am including it just
in case you haven't read this stuff:
http://www.mysql.org/doc/refman/5.0/en/charset.html
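
The manual linked above covers the connection character set, which a script can also set for itself. The following is a hedged sketch of one thing that might be worth trying, not a confirmed fix for this thread. It assumes the installed DBD::mysql supports the mysql_enable_utf8 connect attribute (not verified for 3.0008) and that the script file is saved as UTF-8. The database name, user name and password are the placeholders from the original post, and it needs a live server to run.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;    # the string literal below is UTF-8 in the source file
use DBI;

# Connection details are the placeholders used in the original post.
my $dbh = DBI->connect(
    "DBI:mysql:aaa", 'username', 'password',
    {
        RaiseError        => 1,
        mysql_enable_utf8 => 1,    # assumption: supported by this DBD::mysql
    },
);

# Tell the server that this connection sends and expects UTF-8.
$dbh->do("SET NAMES utf8");

# Bind the value rather than interpolating it into the SQL text.
$dbh->do("INSERT INTO bbb SET b = ?", undef, "Αθήν");

# Check how many characters the newest row actually holds.
my ($chars) = $dbh->selectrow_array(
    "SELECT CHAR_LENGTH(b) FROM bbb ORDER BY a DESC LIMIT 1"
);
print "char_length: $chars\n";

$dbh->disconnect;
```

If the connection character set is the culprit, the reported length would be expected to cover all four letters. If it still shows a truncated, double-encoded value, the column or table character set is the next thing to inspect.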